├── .gitignore ├── 01-binary-search ├── README.md ├── binary-search.go ├── binary-search.py ├── binary-search.rb └── images │ ├── log-n.png │ ├── logarithms.png │ └── running-time.png ├── 02-big-o-notation ├── README.md └── images │ ├── big-o-notation.png │ ├── example1.png │ ├── example2.png │ ├── example3.png │ ├── example4.png │ └── pigeon-versus-internet.png ├── 03-selection-sort ├── README.md ├── images │ ├── selection-sort-explanation.png │ └── selection-sort-time.png ├── selection-sort.py └── selection-sort.rb ├── 04-arrays-and-linked-lists ├── README.md └── images │ ├── doubly-linked-lists.png │ ├── facebook-exercise.png │ ├── run-time-1.png │ ├── run-time-2.png │ └── singly-linked-lists.png ├── 05-recursion ├── README.md ├── array-items-count-with-recursion.py └── array-items-sum-with-recursion.py ├── 06-stack ├── README.md └── images │ ├── call-stack-1.png │ ├── call-stack-2.png │ ├── call-stack-3.png │ ├── call-stack-4.png │ ├── call-stack-5.png │ ├── call-stack-6.png │ ├── call-stack-example-1.png │ ├── call-stack-example-2.png │ └── stack.png ├── 07-quicksort ├── README.md ├── images │ ├── pivot.png │ └── recursion.png ├── quicksort.py └── quicksort.rb ├── 08-divide-and-conquer ├── README.md └── images │ ├── dc-1.png │ ├── dc-2.png │ └── dc-3.jpg ├── 09-big-o-revisited ├── README.md └── images │ ├── big-o-runtimes.png │ ├── n-logn.png │ ├── n-logn2.png │ ├── pivot-first.png │ └── pivot-middle.png ├── 10-hash-tables ├── README.md └── images │ ├── good-and-bad-hash-function.png │ ├── linear-probing.png │ ├── load-factor.png │ ├── performance.png │ ├── resized-hash.png │ ├── resizing.png │ └── worst-case.png ├── 11-hash-tables-security-and-bcrypt ├── README.md └── images │ └── bcrypt.png ├── 12-breadth-first-search ├── README.md ├── array-queue.py ├── array-queue.rb ├── images │ ├── complex-graph.png │ ├── dequeue.png │ ├── directed-undirected.png │ ├── duplicate-queue-1.png │ ├── duplicate-queue-2.png │ ├── fifo-lifo.png │ ├── node-edge.png │ ├── queue-algorithm.png │ ├── queues-1.png │ ├── queues-2.png │ ├── simple-graph.png │ ├── topological-sort.png │ └── tree.png ├── list-queue.py └── list-queue.rb ├── 13-dijkstras-algorithm ├── README.md └── images │ ├── algorithm-code.png │ ├── dijkstra-1.png │ ├── dijkstra-2.png │ ├── fastest-path.png │ ├── find-lowest-cost-node.png │ ├── implementation.png │ ├── parents.png │ └── weighted-unweighted.png ├── 14-greedy-algorithms ├── README.md └── images │ ├── .DS_Store │ ├── approximation.png │ ├── classroom-schedule.png │ ├── greedy-algorithms.png │ ├── knapsack.png │ ├── set-covering-1.png │ └── set-covering-2.png ├── 15-dynamic-programming ├── README.md └── images │ ├── dynamic-programming-1.png │ ├── dynamic-programming-2.png │ ├── dynamic-programming-3.png │ ├── dynamic-programming-4.png │ ├── dynamic-programming-5.png │ ├── dynamic-programming-6.png │ ├── dynamic-programming-7.png │ ├── formula.png │ ├── knapsack-1.png │ └── knapsack-2.png ├── 16-k-nearest-neighbors ├── README.md └── images │ ├── feature-extract.png │ ├── fruit.png │ ├── k-nearest-neighbors.png │ ├── netflix.png │ ├── pythagorean-formula-2.png │ └── pythagorean-formula.png ├── 17-bellman-ford-algorithm └── README.md ├── 18-where-to-go-next ├── README.md └── images │ ├── inbalanced-tree.png │ ├── inverted-index.png │ ├── map.png │ ├── reduce.png │ ├── trees-1.png │ ├── trees-2.png │ └── trees-3.png └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | 
--------------------------------------------------------------------------------
/01-binary-search/README.md:
--------------------------------------------------------------------------------
# Binary Search

- Binary search is an example of the famous divide-and-conquer approach; its input is a **sorted** list of elements. If the element you're looking for is in that list, binary search returns its position. Otherwise, it returns **null**.

- You can only conduct binary search on ordered lists.

- In general, for any list of `n` elements, binary search will take **log₂ n** steps to run in the worst case, whereas simple search will take **n** steps.

![logarithms](images/logarithms.png)

In these documents, log always means log₂.

## Running Time

| Running Time | Log(n) vs n |
| ------------ | ----------- |
| ![running-time](images/running-time.png) | ![log-n](images/log-n.png) |

When binary search makes an incorrect guess, the portion of the array that contains reasonable guesses is reduced by at least half. If the reasonable portion had 32 elements, an incorrect guess cuts it down to at most 16. Binary search halves the size of the reasonable portion on every incorrect guess.

## Interesting Fact

> According to 'Programming Pearls', only 10% of professional programmers are able to implement binary search correctly in their code. They can explain it very well, but coding it is a challenge for them.

--------------------------------------------------------------------------------
/01-binary-search/binary-search.go:
--------------------------------------------------------------------------------
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	var (
		min        = 0
		max        = 100
		tries      = 0
		seed       = rand.New(rand.NewSource(time.Now().UnixNano()))
		randNumber = seed.Intn(max)
	)

	for min <= max {
		tries++
		guess := (min + max) / 2

		if guess == randNumber {
			fmt.Printf("The number has been found in %v tries and it was %v\n", tries, randNumber)
			break
		} else if guess > randNumber {
			max = guess - 1
		} else {
			min = guess + 1
		}
	}
}

--------------------------------------------------------------------------------
/01-binary-search/binary-search.py:
--------------------------------------------------------------------------------
# python3

def binary_search(array, item):
    # positions to search
    minimum = 0
    maximum = len(array) - 1

    while minimum <= maximum:
        middle = (minimum + maximum) // 2
        guess = array[middle]

        if guess == item:
            return middle
        elif guess > item:
            maximum = middle - 1
        else:
            minimum = middle + 1

    return None  # the item is not in the array
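# (Added note, not in the original file.) The same search can also be written
# recursively, which makes the divide-and-conquer structure explicit. This is
# a minimal sketch; the function name and keyword arguments are my own.
def binary_search_recursive(array, item, minimum=0, maximum=None):
    if maximum is None:
        maximum = len(array) - 1
    if minimum > maximum:
        return None  # base case: nothing left to search
    middle = (minimum + maximum) // 2
    guess = array[middle]
    if guess == item:
        return middle
    elif guess > item:
        return binary_search_recursive(array, item, minimum, middle - 1)
    return binary_search_recursive(array, item, middle + 1, maximum)

# print(binary_search_recursive([10, 20, 30], 30))  # -> 2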
items = [10, 20, 23, 24, 30, 35, 80, 85, 86, 87, 100]
print(binary_search(items, 10))

--------------------------------------------------------------------------------
/01-binary-search/binary-search.rb:
--------------------------------------------------------------------------------
# encoding: UTF-8

def binary_search(array, item)
  # positions to search
  minimum = 0
  maximum = array.length - 1

  while minimum <= maximum
    middle = (minimum + maximum) / 2
    guess = array[middle]

    if guess == item
      return middle # return the position, not the item itself
    elsif guess > item
      maximum = middle - 1
    else
      minimum = middle + 1
    end
  end

  return nil # the item is not in the array
end

list = [10, 20, 23, 24, 30, 35, 80, 85, 86, 87, 100]
puts binary_search(list, 35)

--------------------------------------------------------------------------------
/01-binary-search/images/log-n.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/01-binary-search/images/log-n.png

--------------------------------------------------------------------------------
/01-binary-search/images/logarithms.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/01-binary-search/images/logarithms.png

--------------------------------------------------------------------------------
/01-binary-search/images/running-time.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/01-binary-search/images/running-time.png

--------------------------------------------------------------------------------
/02-big-o-notation/README.md:
--------------------------------------------------------------------------------
# Big O Notation

- Big O notation is a special notation that tells you how fast an algorithm is.

- Simple search needs to check each element, so it will take **n** operations. The run time in Big O notation is **O(n)**.

- Binary search needs **log(n)** operations to check a list of size n. The running time in Big O notation is **O(log n)**.

- Big O will not tell you how long an algorithm takes in seconds; it only tells you how fast the algorithm grows, in other words how your algorithm scales.

- Big O describes the worst case (an upper bound). Even if you are lucky and find the value you are searching for on the first try, Big O still describes the worst case. Big Omega (Ω) describes the best case (a lower bound), and Big Theta (Θ) is used when the upper and lower bounds coincide.

![big-o-notation](images/big-o-notation.png)

## Some Common Big O Run Times

- O(log n), also known as log time. Binary search.
- O(n), also known as linear time. Simple search.
- O(n log n). A fast sorting algorithm, like quicksort.
- O(n²). A slow sorting algorithm, like selection sort.
- O(n!). A really slow algorithm, like the traveling salesperson.

## Summary

- Algorithm speed isn't measured in seconds, but in the growth of the number of operations.
- It says how the run time scales with respect to some variables.
- We talk about how quickly the run time of an algorithm increases as the size of the input increases.
- The run time of algorithms is expressed in Big O notation.

## Pigeon and Internet Example

![pigeon-versus-internet](images/pigeon-versus-internet.png)

## Algorithm Analysis

![example-1](images/example1.png)

![example-2](images/example2.png)

![example-3](images/example3.png)

![example-4](images/example4.png)

--------------------------------------------------------------------------------
/02-big-o-notation/images/big-o-notation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/big-o-notation.png

--------------------------------------------------------------------------------
/02-big-o-notation/images/example1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/example1.png

--------------------------------------------------------------------------------
/02-big-o-notation/images/example2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/example2.png

--------------------------------------------------------------------------------
/02-big-o-notation/images/example3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/example3.png

--------------------------------------------------------------------------------
/02-big-o-notation/images/example4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/example4.png

--------------------------------------------------------------------------------
/02-big-o-notation/images/pigeon-versus-internet.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/02-big-o-notation/images/pigeon-versus-internet.png

--------------------------------------------------------------------------------
/03-selection-sort/README.md:
--------------------------------------------------------------------------------
# Selection Sort

![selection-sort](images/selection-sort-explanation.png)

- Selection sort takes O(n²) time.
- More precisely, it takes about n(n+1)/2 operations, i.e. (n² + n) / 2; since Big O drops constants and lower-order terms, that is **O(n²)**. (A small counting experiment follows at the end of this chapter.)

![selection-sort-time](images/selection-sort-time.png)

- Selection sort is a neat algorithm, but it's not very fast. Quicksort is a faster sorting algorithm that only takes **O(n log n)** time.
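To see that quadratic growth concretely, here is a small experiment, not part of the original notes, that counts the comparisons selection sort actually makes: each find-the-smallest pass scans one element fewer than the last, so the total is (n-1) + (n-2) + ... + 0 = n(n-1)/2 comparisons, the same order of growth as the n(n+1)/2 figure above.

```python
# python3
# Counts element comparisons made by selection sort (sketch, my own helper).

def selection_sort_with_count(array):
    comparisons = 0
    ordered = []
    while array:
        smallest_index = 0
        for i in range(1, len(array)):
            comparisons += 1  # one comparison per inspected element
            if array[i] < array[smallest_index]:
                smallest_index = i
        ordered.append(array.pop(smallest_index))
    return ordered, comparisons

for n in (10, 100, 1000):
    _, comparisons = selection_sort_with_count(list(range(n, 0, -1)))
    print(n, comparisons, n * (n - 1) // 2)  # measured vs. n(n-1)/2
```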
--------------------------------------------------------------------------------
/03-selection-sort/images/selection-sort-explanation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/03-selection-sort/images/selection-sort-explanation.png

--------------------------------------------------------------------------------
/03-selection-sort/images/selection-sort-time.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/03-selection-sort/images/selection-sort-time.png

--------------------------------------------------------------------------------
/03-selection-sort/selection-sort.py:
--------------------------------------------------------------------------------
# python3

def find_smallest(array):
    smallest = array[0]
    smallest_index = 0

    for i in range(1, len(array)):
        if array[i] < smallest:
            smallest = array[i]
            smallest_index = i
    return smallest_index

def selection_sort(array):
    ordered_list = []

    for i in range(len(array)):
        smallest_index = find_smallest(array)
        ordered_list.append(array.pop(smallest_index))
    return ordered_list

items = [10, 1, 20, 23, 24, 100, 30, 35, 80, 85, 86, 87, 100, 0]

print(selection_sort(items))

--------------------------------------------------------------------------------
/03-selection-sort/selection-sort.rb:
--------------------------------------------------------------------------------
# encoding: UTF-8

def find_smallest(array)
  smallest = array[0]
  smallest_index = 0

  for i in (1..(array.length - 1))
    if array[i] < smallest
      smallest = array[i]
      smallest_index = i
    end
  end

  return smallest_index
end

def selection_sort(array)
  ordered_list = []

  array.length.times do
    smallest_index = find_smallest(array)
    ordered_list << array.delete_at(smallest_index)
  end

  return ordered_list
end

list = [10, 1, 20, 23, 24, 100, 30, 35, 80, 85, 86, 87, 100, 0]

print selection_sort(list)

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/README.md:
--------------------------------------------------------------------------------
# Arrays and Linked Lists

Arrays need a single memory block big enough to hold the whole array. In other words, the whole array must be stored in one contiguous memory block; it can't be distributed between multiple blocks. Linked lists, on the other hand, can be stored across different memory blocks, since each element knows the location of (holds a pointer to) the next element in the list.

This can be explained with a "going to the cinema with friends" analogy. In the array scenario, all friends want to sit together, while in the linked list scenario, group members may sit separately, but each member knows the location of the next member in the cinema.

That's why adding or removing an element from an array isn't cheap. Each addition or removal may require finding a new suitable block in memory. Say you are a group of 7 sitting together in a row of 7 seats.
If another friend joins you, you will all have to move to another row together, and you will also need a row with 8 empty seats.

As a solution to this problem, reserving seats might be offered. For example, buying 10 seats instead of 7 guarantees that 3 more people can join the group. This approach has 2 main disadvantages. First, it's going to be more expensive; second, if the expected friends don't join you, you will still be blocking 3 seats for no reason. In terms of memory usage, you'll be wasting memory and blocking other processes from using it.

Besides, this approach isn't feasible. What if 6 more friends decide to join you? What if 10 more? When the number of possible joiners isn't known, this approach can't really work.

Adding to or removing from a linked list is much easier, since each item only knows the address of the next item in the list. Then, if linked lists are so great, why do arrays exist at all? The unique advantage of an array becomes visible when you read an element. For example, thanks to indexes it's easy to jump straight to the 6th element of an array, whereas in a linked list you would have to start from the first item and follow all the links.

## Terminology

- Here are the run times for common operations on arrays and lists:

![run-time](images/run-time-1.png)

```
O(n) -> Linear time
O(1) -> Constant time
```

## Inserting into the middle of a list

- **Lists** are better if you want to insert elements into the middle. If you use an array, you have to shift all the following elements to make room.

## Deletions

- **Lists** are better when it comes to deletion too, because you just need to change what the previous element points to. With arrays, everything after the deleted element needs to be moved up.

- Unlike insertions, deletions always work. Insertions can sometimes fail when there is no space left in memory, but you can always delete an element.

- Here are the run times for common operations on arrays and linked lists:

![run-time-2](images/run-time-2.png)

- Which are used more, arrays or lists? Arrays see a lot of use because they allow random access. There are two types of access: **random access** and **sequential access**.

- **Sequential access** means reading the elements one by one, starting at the first element. Linked lists can only do sequential access. If you want to read the 10th element of a linked list, you have to read the first 9 elements and follow the links to the 10th.

- **Random access** means you can jump directly to the 10th element. Arrays provide random access, therefore they are faster at reading.

## Exercise

- Facebook uses neither an array nor a linked list to store user information. Let's consider a hybrid data structure: an array of linked lists. You have an array with 26 slots. Each slot points to a linked list. For example, the first slot in the array points to a linked list containing all the usernames starting with a. The second slot points to a linked list containing all the usernames starting with b, and so on.

![facebook-exercise](images/facebook-exercise.png)

- Question: Suppose Adit B signs up for Facebook, and you want to add them to the list.
You go to slot 1 in the array, go to the linked list for slot 1, and add Adit B at the end. Now, suppose you want to search for Zakhir H. You go to slot 26, which points to a linked list of all the Z names. Then you search through that list to find Zakhir H. Compare this hybrid data structure to arrays and linked lists. Is it slower or faster than each for searching and inserting? You don't have to give Big O run times, just whether the new data structure would be faster or slower.

- Answer: Searching is slower than arrays but faster than linked lists. Inserting is faster than arrays and takes the same amount of time as linked lists. So it's slower for searching than an array, but faster than or the same as linked lists for everything.

## Summary

- When accessing the first element, arrays and linked lists are equally fast, since there are no pointers to follow yet. For any other element, the array will be faster.
- When we create an array, the computer allocates a contiguous memory block; for a linked list it doesn't have to.
- The last node in a linked list points to NULL, so a program knows where the list ends and doesn't run past the last item.
- Adding an item to the middle of a linked list is just a matter of breaking one link and creating two new ones. Adding an item to the middle of an array, however, requires shifting all the elements after it.
- Doubly linked lists have two pointers per node, one pointing forward and one pointing backward:

![singly-linked-lists](images/singly-linked-lists.png)

![doubly-linked-lists](images/doubly-linked-lists.png)

A small runnable linked-list sketch follows at the end of this chapter.

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/images/doubly-linked-lists.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/04-arrays-and-linked-lists/images/doubly-linked-lists.png

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/images/facebook-exercise.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/04-arrays-and-linked-lists/images/facebook-exercise.png

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/images/run-time-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/04-arrays-and-linked-lists/images/run-time-1.png

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/images/run-time-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/04-arrays-and-linked-lists/images/run-time-2.png

--------------------------------------------------------------------------------
/04-arrays-and-linked-lists/images/singly-linked-lists.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/04-arrays-and-linked-lists/images/singly-linked-lists.png

--------------------------------------------------------------------------------
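Since this chapter ships no code files, here is a minimal singly linked list sketch in Python (the class and method names are my own, not from the notes). It shows the two properties discussed above: O(1) insertion at the head, and sequential O(n) access.

```python
# python3
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None  # the last node points to None (NULL)

class LinkedList:
    def __init__(self):
        self.head = None

    def prepend(self, value):   # O(1): just re-link the head
        node = Node(value)
        node.next = self.head
        self.head = node

    def index_of(self, value):  # O(n): sequential access only
        i, current = 0, self.head
        while current:
            if current.value == value:
                return i
            current = current.next
            i += 1
        return None

items = LinkedList()
for v in [3, 2, 1]:
    items.prepend(v)
print(items.index_of(3))  # 2: we had to follow the links to find it
```

--------------------------------------------------------------------------------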
/05-recursion/README.md:
--------------------------------------------------------------------------------
# Recursion

- Recursion is a simple concept, and the divide-and-conquer strategy uses it to solve hard problems.

- Recursion is where a function calls itself.

- Recursion is used when it makes the solution clearer. There's no performance benefit to using recursion; in fact, loops are sometimes better for performance. I like this quote by Leigh Caldwell on Stack Overflow:

> "Loops may achieve a performance gain for your program. Recursion may achieve a performance gain for your programmer. Choose which is more important in your situation!"

- Because a recursive function calls itself, it's easy to accidentally write a function that never stops calling itself:

```ruby
def countdown(i)
  print i
  countdown(i - 1)
end
```

- When you write a recursive function, you have to tell it when to stop recursing. That's why every recursive function has two parts: the base case and the recursive case. The recursive case is when the function calls itself. The base case is when the function doesn't call itself again, so it doesn't go into an infinite loop.

- Let's add a base case to the countdown function:

```ruby
def countdown(i)
  print i

  if i <= 0
    return
  else
    countdown(i - 1)
  end
end
```

--------------------------------------------------------------------------------
/05-recursion/array-items-count-with-recursion.py:
--------------------------------------------------------------------------------
# python3

def count(items):
    if items == []:
        return 0  # base case: an empty list has zero items
    return 1 + count(items[1:])  # recursive case

items = [1, 2, 3, 4, 5]

print(count(items))

--------------------------------------------------------------------------------
/05-recursion/array-items-sum-with-recursion.py:
--------------------------------------------------------------------------------
# python3

def sum_items(items):
    if items == []:
        return 0  # base case: an empty list sums to zero
    return items[0] + sum_items(items[1:])  # recursive case

items = [1, 2, 3, 4, 5]

print(sum_items(items))

--------------------------------------------------------------------------------
/06-stack/README.md:
--------------------------------------------------------------------------------
# Stack

- You can think of a stack with a sticky-notes analogy!

- When you insert an item, it gets added to the top of the list. When you read an item, you only read the topmost item, and it's taken off the list. So your sticky-note todo list has only two actions: push (insert) and pop (remove and read).

![stack](images/stack.png)

## The Call Stack

- The computer uses a stack internally called the call stack. Let's see it in action:

```ruby
def greet(name)
  puts "hello, #{name}!"
  say_cheers(name) # prints "how are you, #{name}?"
  puts "getting ready to say bye..."
  bye(name)        # prints "ok bye!"
end
```

- This function (`greet`) prints something and then calls two other functions (`say_cheers` and `bye`).

- Suppose you call `greet("maggie")`. First, your computer allocates a box of memory for that function call.

![call-stack-1](images/call-stack-1.png)

- Now let's use the memory. The variable name is set to "maggie". That needs to be saved in memory.

![call-stack-2](images/call-stack-2.png)

- Every time you make a function call, your computer saves the values of all the variables for that call in memory like this.
Next, you print `hello, maggie!`. Then you call `say_cheers("maggie")`. Again, your computer allocates a box of memory for this function call.

![call-stack-3](images/call-stack-3.png)

- Your computer is using a stack for these boxes. The second box is added on top of the first one. You print `how are you, maggie?`. Then you return from the function call. When this happens, the box on top of the stack gets popped off.

![call-stack-4](images/call-stack-4.png)

- Now the topmost box on the stack is for the `greet` function, which means you have returned to the `greet` function.

- When you called the `say_cheers` function, the `greet` function was only partially completed. This is the big idea behind this section: **when you call a function from another function, the calling function is paused in a partially completed state**.

- All the values of the variables for that function are still stored in memory. Now that you're done with the `say_cheers` function, you're back in the `greet` function, and you pick up where you left off. First, you print `getting ready to say bye...`. Then you call the `bye` function.

![call-stack-5](images/call-stack-5.png)

- A box for that function is added to the top of the stack. Then you print `ok bye!` and return from the function call.

![call-stack-6](images/call-stack-6.png)

- And you're back in the `greet` function. There's nothing else to be done, so you return from the `greet` function too. This stack, used to save the variables for multiple functions, is called the **call stack**.

## The call stack with recursion

- Let's look at this in action with the factorial function. factorial(5) is written as 5!, and it's defined like this: 5! = 5 * 4 * 3 * 2 * 1. Similarly, factorial(3) is 3 * 2 * 1. Here's a recursive function to calculate the factorial of a number:

```python
def fact(x):
    if x == 1:
        return 1
    else:
        return x * fact(x - 1)
```

![call-stack-example-1](images/call-stack-example-1.png)

![call-stack-example-2](images/call-stack-example-2.png)

- Using the stack is convenient, but there is a cost. Saving all that info can take up a lot of memory. Each of those function calls takes up some memory, and when your stack is too big, that means your computer is saving information for many function calls. **At this point you have two options** (see the loop-based sketch at the end of this chapter):

  1. You can rewrite your code to use a loop instead.
  1. You can use something called tail recursion. That's an advanced topic, and it's only supported by some languages.

- Question: What if you accidentally write a recursive function that runs forever?
- Answer: When the program runs out of stack space, it will exit with a stack-overflow error.

## Recap

- Recursion is when a function calls itself.
- Every recursive function has two cases: the base case and the recursive case.
- A stack has two operations: push and pop.
- All function calls go onto the call stack (remember the sticky notes).
- The call stack can get very large, which takes up a lot of memory.
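As a sketch of option 1 above, here is the same `fact` function rewritten with a loop (my rewrite, not from the book). It returns the same results without growing the call stack:

```python
# python3

def fact(x):
    result = 1
    for i in range(2, x + 1):  # multiply result by 2, 3, ..., x
        result *= i
    return result

print(fact(5))  # 120, same as the recursive version, but with constant stack depth
```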
--------------------------------------------------------------------------------
/06-stack/images/call-stack-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-1.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-2.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-3.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-4.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-5.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-6.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-example-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-example-1.png

--------------------------------------------------------------------------------
/06-stack/images/call-stack-example-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/call-stack-example-2.png

--------------------------------------------------------------------------------
/06-stack/images/stack.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/06-stack/images/stack.png

--------------------------------------------------------------------------------
/07-quicksort/README.md:
--------------------------------------------------------------------------------
# Quicksort

- When you get a new problem, you don't have to be stumped. Instead, you can ask, "Can I solve this if I use divide and conquer?"

- Quicksort is an example of divide and conquer, and it is much faster than selection sort.

- Binary search was also an example of divide and conquer;
in this sense they are similar, since both split a problem into smaller parts.

- Let's use quicksort to sort an array. What is the simplest array that a sorting algorithm can handle? Well, some arrays don't need to be sorted at all: empty arrays and arrays with just one element. They will be the **base case**:

```python
def quicksort(array):
    if len(array) < 2:
        return array
```

- For arrays with two or more items, we have to pick an item from the array; we call it the **pivot**.

- Now find the elements smaller than the pivot and the elements larger than the pivot:

![pivot](images/pivot.png)

- This is called **partitioning**. Now you have:

- The pivot
- A sub-array of all the numbers less than the pivot
- A sub-array of all the numbers greater than the pivot

- The two sub-arrays aren't sorted. They're just partitioned. But if they were sorted, then sorting the whole array would be pretty easy. If the sub-arrays are sorted, you can combine the whole thing like this: left array + pivot + right array, and you get a sorted array. How do you sort the sub-arrays? Well, quicksort already knows how to sort arrays of two elements (the left sub-array) and empty arrays (the right sub-array), so recursion takes care of them.

- What about an array of four elements?

![recursion](images/recursion.png)

--------------------------------------------------------------------------------
/07-quicksort/images/pivot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/07-quicksort/images/pivot.png

--------------------------------------------------------------------------------
/07-quicksort/images/recursion.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/07-quicksort/images/recursion.png

--------------------------------------------------------------------------------
/07-quicksort/quicksort.py:
--------------------------------------------------------------------------------
# python3

def quicksort(array):
    less = []
    greater = []

    if len(array) < 2:
        return array  # base case
    else:
        pivot = array[0]  # pick a pivot
        for i in array[1:]:
            if i <= pivot:
                less.append(i)
            else:
                greater.append(i)

        return quicksort(less) + [pivot] + quicksort(greater)

items = [10, 5, 20, 13, 99, 24, 80, 30, 35, 80, 85, 86, 87, 100]

print(quicksort(items))

--------------------------------------------------------------------------------
/07-quicksort/quicksort.rb:
--------------------------------------------------------------------------------
# encoding: UTF-8

def quicksort(array)
  less = []
  greater = []

  if array.length < 2
    return array
  else
    pivot = array[0]

    array[1..-1].each do |i|
      if i <= pivot
        less << i
      else
        greater << i
      end
    end

    return quicksort(less) + [pivot] + quicksort(greater)
  end
end

list = [10, 5, 20, 13, 99, 24, 80, 30, 35, 80, 85, 86, 87, 100]

print quicksort(list)

--------------------------------------------------------------------------------
/08-divide-and-conquer/README.md:
--------------------------------------------------------------------------------
# Divide and Conquer

D&C algorithms are recursive algorithms. To solve a problem using D&C, there are two steps:

1. Figure out the base case. This should be the simplest possible case.
1. Divide or decrease your problem until it becomes the base case.

## Question

- Into how many equal squares can you divide a 1680x640 field, using the minimum number of squares? In other words, aim for the maximum possible square size.

![dc-1](images/dc-1.png)

- First, figure out the base case: the simplest possible case. Here, the easiest case would be if one side were a multiple of the other side. Then keep dividing the problem into smaller pieces until you reach that case.

![dc-2](images/dc-2.png)

- You can fit two 640x640 boxes in there, and there's some land still left to be divided. Now here comes the "Aha!" moment. There's a farm segment left to divide. Why not apply the same algorithm to this segment? Recursively splitting the remaining area into the biggest possible boxes, we end up with 80 as the side length of a box.

![dc-3](images/dc-3.jpg)

- This is Euclid's algorithm for the greatest common divisor, a classic divide-and-conquer result: gcd(1680, 640) = 80.

## Sneak Peek at Functional Programming

- "Why would I do this recursively if I can do it easily with a loop?" you may be thinking. Well, this is a sneak peek into functional programming!

- Functional programming languages like Haskell don't have loops, so you have to use recursion to write functions like this. If you have a good understanding of recursion, functional languages will be easier to learn.

--------------------------------------------------------------------------------
/08-divide-and-conquer/images/dc-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/08-divide-and-conquer/images/dc-1.png

--------------------------------------------------------------------------------
/08-divide-and-conquer/images/dc-2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/08-divide-and-conquer/images/dc-2.png

--------------------------------------------------------------------------------
/08-divide-and-conquer/images/dc-3.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/08-divide-and-conquer/images/dc-3.jpg

--------------------------------------------------------------------------------
/09-big-o-revisited/README.md:
--------------------------------------------------------------------------------
# Big O Revisited

- Quicksort is unique because its speed depends on the pivot you choose.

- Here are the most common Big O run times:

![big-o-runtimes](images/big-o-runtimes.png)

- There is another sorting algorithm called **merge sort**, which is O(n log n). **Quicksort** is a tricky case. In the worst case, quicksort takes O(n²) time. That's as slow as **selection sort**. But that's the worst case; in the average case, quicksort takes **O(n log n)** time.
- You might be wondering:

  1. What do worst case and average case mean here?
  1. If quicksort is O(n log n) on average, but merge sort is always O(n log n), why not use merge sort? Isn't it faster?

- We omit constants when stating Big O, but sometimes the constant can make a difference. Quicksort versus merge sort is one example. Quicksort has a smaller constant than merge sort, so if they are both O(n log n), quicksort is faster. And quicksort is faster in practice because it hits the average case far more often than the worst case.

- The worst case for quicksort is sorting an array that is already sorted while always picking the first element as the pivot. The pivot is then the smallest (or biggest) number in the array, so one partition is always empty. (To prevent this, we can pick a few random items from the array and take their median as the pivot.)

![pivot-first](images/pivot-first.png)

- If we pick the middle element as the pivot:

![pivot-middle](images/pivot-middle.png)

- The first example is the worst-case scenario, and the second example is the best-case scenario. In the worst case, the stack size is O(n). In the best case, the stack size is O(log n).

## WTF is `O(n log n)`

Now look at the first level in the stack. You pick one element as the pivot, and the rest of the elements are divided into sub-arrays. You touch all eight elements in the array. So this first operation takes O(n) time. You touched all eight elements on this level of the call stack. But actually, you touch O(n) elements on every level of the call stack.

![n-logn](images/n-logn.png)

Even if you partition the array differently, you're still touching O(n) elements every time.

![n-logn2](images/n-logn2.png)

In this example, there are `O(log n)` levels (the technical way to say that is, "The height of the call stack is `O(log n)`"), and each level takes `O(n)` time. The entire algorithm will take `O(n) * O(log n) = O(n log n)` time. This is the best-case scenario. In the worst case, there are `O(n)` levels, so the algorithm will take `O(n) * O(n) = O(n²)` time.

Quicksort is one of the fastest sorting algorithms out there, and it's a very good example of D&C.

## Recap

- D&C works by breaking a problem down into smaller and smaller pieces. If you're using D&C on a list, the base case is probably an empty array or an array with one element.

- If you're implementing quicksort, choose a random element as the pivot. The average runtime of quicksort is `O(n log n)`!

- The constant in Big O notation can matter sometimes. That's why quicksort is faster than merge sort (see the merge sort sketch below).
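Merge sort is the comparison point throughout this chapter, but these notes don't implement it anywhere, so here is a minimal sketch in the same style as the quicksort examples (the function names are mine):

```python
# python3

def merge(left, right):
    # merge two already-sorted lists in O(n) by repeatedly taking the smaller head
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]  # one side may still have leftovers

def merge_sort(array):
    if len(array) < 2:
        return array  # base case, same as quicksort
    middle = len(array) // 2
    return merge(merge_sort(array[:middle]), merge_sort(array[middle:]))

print(merge_sort([10, 5, 20, 13, 99, 24, 80, 30]))
```

Unlike quicksort, the split is always exactly down the middle, so there are always O(log n) levels; the price is the extra `merged` list and a bigger constant factor.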
--------------------------------------------------------------------------------
/09-big-o-revisited/images/big-o-runtimes.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/09-big-o-revisited/images/big-o-runtimes.png

--------------------------------------------------------------------------------
/09-big-o-revisited/images/n-logn.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/09-big-o-revisited/images/n-logn.png

--------------------------------------------------------------------------------
/09-big-o-revisited/images/n-logn2.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/09-big-o-revisited/images/n-logn2.png

--------------------------------------------------------------------------------
/09-big-o-revisited/images/pivot-first.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/09-big-o-revisited/images/pivot-first.png

--------------------------------------------------------------------------------
/09-big-o-revisited/images/pivot-middle.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/09-big-o-revisited/images/pivot-middle.png

--------------------------------------------------------------------------------
/10-hash-tables/README.md:
--------------------------------------------------------------------------------
# Hash Tables

A hash function is a function where you put in a string and you get back a number. In technical terminology, we'd say that a hash function "maps strings to numbers". But there are some requirements for a hash function:

- It needs to be consistent. For example, suppose you put in "apple" and get back "4". Every time you put in "apple", you should get "4" back.
- It should map different words to different numbers. For example, a hash function is no good if it always returns "1" for any word you put in. In the best case, every different word maps to a different number.

Put a hash function and an array together, and you get a data structure called a **hash table**.

Arrays and lists map straight to memory, but hash tables are smarter. They use a hash function to intelligently figure out where to store elements. Hash tables are probably the most useful complex data structure you'll learn. They're also known as hash maps, maps, dictionaries, and associative arrays. And hash tables are fast! They use an array to store the data, so reading from them is as fast as reading from an array.

You'll probably never have to implement hash tables yourself; any good language has an implementation built in. Python has hash tables; they're called dictionaries. In Ruby, they're called Hash.

```python
phone_book = {}
phone_book["jenny"] = 8675309
phone_book["emergency"] = 911
```

## Using hash tables as a cache

Another use case for hash tables is caching.
If you work on a website, you may have heard of caching as a good thing to do.

- Caching has two advantages:

  1. You get the web page a lot faster, just like when you memorized the distance from Earth to the Moon: the next time your niece asks you, you won't have to Google it, you can answer instantly.
  2. The website has to do less work.

Caching is a common way to make things faster. All big websites use caching, and that data is cached in a hash!

```python
cache = {}

def get_page(url):
    if url in cache:
        return cache[url]
    else:
        data = get_data_from_server(url)
        cache[url] = data
        return data
```

## Recap

Hashes are good for:

- Modeling relationships from one thing to another thing
- Filtering out duplicates
- Caching/memoizing data instead of making your server do work

## Performance

- Performance comparison:

![performance](images/performance.png)

- O(1) means it doesn't matter whether your hash table has 1 element or 1 billion elements: getting something out of it takes the same amount of time. Actually, you've seen constant time before. Getting an item out of an array takes constant time; it doesn't matter how big your array is. In the average case, hash tables are really fast.

- The worst case looks like this for hashes:

![worst-case](images/worst-case.png)

- This is a collision situation. To avoid collisions, you need:

  1. A low load factor
  2. A good hash function

## Load factor

I'm going to talk about how to implement a hash table, but you'll never have to do that yourself. Whatever programming language you use will have an implementation of hash tables built in. You can use the built-in hash table and assume it will have good performance. This section gives you a peek under the hood.

> Load factor = number of items in the hash table / total number of slots

![load-factor](images/load-factor.png)

- Suppose you need to store the prices of 100 produce items in your hash table, and your hash table has 50 slots. In the best case, each slot will hold 2 items. Then we can say the load factor is 100/50 = 2.

- A load factor greater than 1 means you have more items than slots in your array. Once the load factor starts to grow, you need to add more slots to your hash table. This is called resizing. For example, suppose you have this hash table, which is getting pretty full.

![resizing](images/resizing.png)

- The rule of thumb is to make an array that is **twice the size** when resizing.

- Now you need to re-insert all of those items into the new hash table using the hash function:

![resized-hash](images/resized-hash.png)

- This new table has a load factor of 3/8. Much better! With a lower load factor, you'll have fewer collisions, and your table will perform better. A good rule of thumb is: **resize when your load factor is greater than 0.7**.

- You might be thinking, "This resizing business takes a lot of time!" And you're right. Resizing is expensive, and you don't want to resize too often.

![good-and-bad-hash-function](images/good-and-bad-hash-function.png)

- What is a good hash function?
That's something you'll never have to worry about; old men (and women) with big beards sit in dark rooms and worry about that.

## Recap

You might soon find that you're using hash tables all the time:

- You can make a hash table by combining a hash function with an array.
- Collisions are bad. You need a hash function that minimizes collisions.
- Hash tables have really fast search, insert, and delete.
- Hash tables are good for modeling relationships from one item to another item.
- Once your load factor is greater than 0.7, it's time to resize your hash table.
- Hash tables are used for caching data (for example, with a web server).
- Hash tables are great for catching duplicates.
- Hashing is widely used in database indexing, compilers, caching, password authentication, and more.

## Bonus: Linear Probing

- Linear probing is another way to resolve collisions. When an item hashes to a slot that is already taken, we check the next slot, and the next, until an empty slot is found, and the item goes there. For example, if an item hashes to the third slot but that slot is full, we look at the fourth, fifth, sixth, and so on, until we find an empty place. This is an alternative to hanging a linked list off each slot (an approach called **chaining**).

![linear-probing](images/linear-probing.png)

--------------------------------------------------------------------------------
/10-hash-tables/images/good-and-bad-hash-function.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/good-and-bad-hash-function.png

--------------------------------------------------------------------------------
/10-hash-tables/images/linear-probing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/linear-probing.png

--------------------------------------------------------------------------------
/10-hash-tables/images/load-factor.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/load-factor.png

--------------------------------------------------------------------------------
/10-hash-tables/images/performance.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/performance.png

--------------------------------------------------------------------------------
/10-hash-tables/images/resized-hash.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/resized-hash.png

--------------------------------------------------------------------------------
/10-hash-tables/images/resizing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/resizing.png
--------------------------------------------------------------------------------
/10-hash-tables/images/worst-case.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/10-hash-tables/images/worst-case.png

--------------------------------------------------------------------------------
/11-hash-tables-security-and-bcrypt/README.md:
--------------------------------------------------------------------------------
# Hash Tables and Security

## How Not to Store Passwords

1. Bad Solution 1: Plain-text passwords

   - So hackable.

2. Bad Solution 2: Plain hashing [`sha1(password)`, `md5(password)`]

   (Note: SHA-1 and MD5 are hash functions, not encryption.)

   - Very open to brute-force and rainbow-table attacks.
   - When multiple users use the same password, all of them get hacked if one hash can be cracked, because the hashed strings are identical.

3. Bad Solution 3: Hashing with a fixed salt [`sha1(FIXED_SALT + password)`]

   - Fixed salts are extremely risky: once the salt leaks, you are back to solution 2.

4. Bad Solution 4: Hashing with a per-user salt [`sha1(PER_USER_SALT + password)`]

   - Better, but the salts stored in the DB sit right next to the hashes, and a fast hash still leaves the passwords open to brute force.

5. Simple hashing in general

   - If the algorithm is simple and fast, a hash can even be cracked with a simple Google search. Again, quite open to brute-force and rainbow-table attacks.

## How to Store Passwords

- Hashing + salting, with a slow hash.

- A salt is a random string, unique for each user. So even if two different users use the same password, their hash strings will differ, and cracking one doesn't crack the other. However, this method still doesn't protect passwords from brute-force attacks. It's an effective way to stop rainbow-table attacks, but not brute force.

## Bcrypt

- Bcrypt works like this:

```ruby
hash(salt + just_entered_password)
```

- Bcrypt is designed to be slow! For example, in a fixed amount of time (N), an attacker might generate 150,000 MD5 hash strings but only 500 bcrypt hash strings. In other words, generating a bcrypt hash is roughly 300 times slower than MD5. This makes building a rainbow table for bcrypt impractical.

- But wait: if bcrypt generates the salt randomly, how do we compare hash strings at login time, since we can't generate the same random salt again?

- The Ruby BCrypt library defines its own `==` method, which knows how to extract that salt value from the stored string so it can take it into account when comparing passwords. Bcrypt is not an encryption algorithm; it is a hashing algorithm, and you cannot reverse a hash. For example, if a hashing algorithm uses the modulo operation somewhere, it's practically impossible to reverse, since many different numbers share the same remainder (e.g., 1000 mod 100 and 100 mod 100 are both 0).

![bcrypt](images/bcrypt.png)

- Bcrypt places the random salt into the hash string itself, so there is no need to store the salt anywhere else. The comparison function built into bcrypt already knows how to extract the salt from the hash string, as the sketch below shows.
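The article linked below walks through this in Ruby; here is a minimal sketch of the same flow in Python, assuming the third-party `bcrypt` package (`pip install bcrypt`). Note that the salt never has to be stored separately:

```python
# python3, requires the third-party "bcrypt" package
import bcrypt

password = b"correct horse battery staple"

# gensalt() produces a fresh random salt; hashpw() embeds it in the hash string
hashed = bcrypt.hashpw(password, bcrypt.gensalt())

# checkpw() extracts the salt from `hashed`, re-hashes the attempt, and compares
print(bcrypt.checkpw(b"correct horse battery staple", hashed))  # True
print(bcrypt.checkpw(b"wrong guess", hashed))                   # False
```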
49 | 50 | - Resource: http://dustwell.com/how-to-handle-passwords-bcrypt.html 51 | -------------------------------------------------------------------------------- /11-hash-tables-security-and-bcrypt/images/bcrypt.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/11-hash-tables-security-and-bcrypt/images/bcrypt.png -------------------------------------------------------------------------------- /12-breadth-first-search/README.md: -------------------------------------------------------------------------------- 1 | # Breadth First Search 2 | 3 | Breadth-first search allows you to find the shortest distance between two things. But shortest distance can mean a lot of things! 4 | 5 | You can use breadth-first search to: 6 | 7 | - Write a checkers AI that calculates the fewest moves to victory. 8 | - Write a spell checker (fewest edits from your misspelling to a real word—for example, READED -> READER is one edit). 9 | - Find the doctor closest to you in your network. 10 | 11 | ## Graph 12 | 13 | ![simple-graph](images/simple-graph.png) 14 | 15 | Alex owes Rama money, Tom owes Adit money, and so on. Each graph is made up of nodes and edges. 16 | 17 | ![node-edge](images/node-edge.png) 18 | 19 | ## Tree 20 | 21 | ![tree](images/tree.png) 22 | 23 | This is called a **tree**. A tree is a special type of graph, where no edges ever point back. 24 | 25 | ## Breadth-first search 26 | 27 | It can help answer two types of questions: 28 | 29 | 1. Question type 1: Is there a path from node A to node B? 30 | 2. Question type 2: What is the shortest path from node A to node B? 31 | 32 | You'd prefer a first-degree connection to a second-degree connection, and a second-degree connection to a third-degree connection, and so on. So you shouldn't search any second-degree connections before you make sure you don't have a first-degree connection who is a mango seller. Well, breadth-first search already does this! The way breadth-first search works, the search radiates out from the starting point. So you'll check first-degree connections before second-degree connections. 33 | 34 | ![queues-1](images/queues-1.png) 35 | 36 | Another way to see this: first-degree connections are added to the search list before second-degree connections. 37 | 38 | You need to search people in the order that they're added. There's a data structure for this: it's called a **queue**. 39 | 40 | ## Queues 41 | 42 | A queue works exactly like it does in real life. Suppose you and your friend are queueing up at the bus stop. If you're before him in the queue, you get on the bus first. 43 | 44 | ![queues-2](images/queues-2.png) 45 | 46 | Queues are similar to **stacks**. You can't access random elements in the queue. Instead, there are only two operations, **enqueue** and **dequeue**. 47 | 48 | A queue is an abstract data type (ADT). 49 | 50 | If you enqueue two items to the list, the first item you added will be dequeued before the second item. You can use this for your search list! People who are added to the list first will be dequeued and searched first. 51 | 52 | ![dequeue](images/dequeue.png) 53 | 54 | The queue is called a **FIFO** data structure: **First In, First Out**. In contrast, a stack is a **LIFO** data structure: **Last In, First Out**. 55 | 56 | ![fifo-lifo](images/fifo-lifo.png) 57 | 58 | ## Implementing the graph 59 | 60 | First, you need to implement the graph in code.
A graph consists of several nodes. And each node is connected to neighboring nodes. 61 | 62 | How do you express a relationship like "you -> bob"? Luckily, you know a data structure that lets you express relationships: a **hash table**! Remember, a hash table allows you to map a key to a value. In this case, you want to map a node to all of its neighbors. 63 | 64 | ```python 65 | graph = {} 66 | graph["you"] = ["alice", "bob", "claire"] 67 | ``` 68 | 69 | Notice that "you" is mapped to an array. So graph["you"] will give you an array of all the neighbors of "you". 70 | 71 | What about a bigger graph, like this one? 72 | 73 | ![complex-graph](images/complex-graph.png) 74 | 75 | ```python 76 | graph = {} 77 | graph["you"] = ["alice", "bob", "claire"] 78 | graph["bob"] = ["anuj", "peggy"] 79 | graph["alice"] = ["peggy"] 80 | graph["claire"] = ["thom", "jonny"] 81 | graph["anuj"] = [] 82 | graph["peggy"] = [] 83 | graph["thom"] = [] 84 | graph["jonny"] = [] 85 | ``` 86 | 87 | ![directed-undirected](images/directed-undirected.png) 88 | 89 | - When updating queues, I use the terms **enqueue** and **dequeue**. You'll also encounter the terms **push** and **pop**. Push is almost always the same thing as enqueue, and pop is almost always the same thing as dequeue. 90 | 91 | - Enqueue and dequeue (or push and pop) take O(1) constant time. 92 | 93 | - We can implement queues as arrays or as linked lists. 94 | 95 | ## Implementing the Algorithm 96 | 97 | ![queue-algorithm](images/queue-algorithm.png) 98 | 99 | Alice and Bob share a friend: Peggy. So Peggy will be added to the queue twice: once when you add Alice's friends, and again when you add Bob's friends. You'll end up with two Peggys in the search queue. 100 | 101 | ![duplicate-queue-1](images/duplicate-queue-1.png) 102 | 103 | But you only need to check Peggy once to see whether she's a mango seller. If you check her twice, you're doing unnecessary, extra work. So once you search a person, you should mark that person as searched and not search them again. 104 | 105 | If you don't do this, you could also end up in an infinite loop. Suppose the mango seller graph looked like this. 106 | 107 | ![duplicate-queue-2](images/duplicate-queue-2.png) 108 | 109 | - Before checking a person, it's important to make sure they haven't been checked already. To do that, you'll keep a list of people you've already checked. 110 | 111 | ## Running time 112 | 113 | If you search your entire network for a mango seller, that means you'll follow each edge (remember, an edge is the arrow or connection from one person to another). So the running time is at least O(number of edges). 114 | 115 | You also keep a queue of every person to search. Adding one person to the queue takes constant time: O(1). Doing this for every person will take O(number of people) total. Breadth-first search takes O(number of people + number of edges), and it's more commonly written as O(V+E) (V for number of vertices, E for number of edges). 116 | 117 | ## Topological sort 118 | 119 | ![topological-sort](images/topological-sort.png) 120 | 121 | You could say that this list is sorted, in a way. If task A depends on task B, task A shows up later in the list. This is called a **topological sort**, and it's a way to make an ordered list out of a graph. 122 | 123 | ## Recap 124 | 125 | - Breadth-first search tells you if there's a path from A to B. 126 | - If there's a path, breadth-first search will find the shortest path.
127 | - If you have a problem like "find the shortest X", try modeling your problem as a graph, and use breadth-first search to solve it. 128 | - A directed graph has arrows, and the relationship follows the direction of the arrow (rama -> adit means "rama owes adit money"). 129 | - Undirected graphs don't have arrows, and the relationship goes both ways (ross - rachel means "ross dated rachel and rachel dated ross"). 130 | - Queues are FIFO (First In, First Out). 131 | - Stacks are LIFO (Last In, First Out). 132 | - You need to check people in the order they were added to the search list, so the search list needs to be a queue. Otherwise, you won't get the shortest path. 133 | - Once you check someone, make sure you don't check them again. Otherwise, you might end up in an infinite loop. 134 | 135 | # Queues in Ruby 136 | 137 | - Ruby's `Queue` class is especially useful in threaded programming, when information must be exchanged safely between multiple threads. The `Queue` class implements all the required locking semantics. 138 | 139 | - The `Queue` class is used to synchronize communication between threads. You would use this if you were doing something with concurrency. 140 | 141 | ```ruby 142 | require 'thread' 143 | queue = Queue.new 144 | 145 | producer = Thread.new do 146 | 5.times do |i| 147 | sleep rand(i) # simulate expense 148 | queue << i 149 | puts "#{i} produced" 150 | end 151 | end 152 | 153 | consumer = Thread.new do 154 | 5.times do |i| 155 | value = queue.shift 156 | sleep rand(i/2) # simulate expense 157 | puts "consumed #{value}" 158 | end 159 | end 160 | [producer, consumer].each(&:join) # wait for both threads, or the script may exit before they finish 161 | ``` 162 | 163 | ## Methods 164 | 165 | - `new`: creates a new queue 166 | - `<<`: same as push. 167 | - `clear`: remove all objects from queue. 168 | - `close`: closes the queue. A closed queue cannot be re-opened. 169 | - `closed?`: returns true if the queue is closed. 170 | - `empty?`: returns true if the queue is empty. 171 | - `length`: returns the size of the queue. 172 | - `num_waiting`: returns the number of threads waiting on the queue. 173 | - `push`: add an item to the end of the queue. 174 | - `shift`: remove the first object from queue. 175 | - `pop`: same as shift - despite the name, it removes the first object, not the last (`Queue` is strictly FIFO). 176 | - `size`: same as length. 177 | 178 | ## Stack versus Queue 179 | 180 | An array can act as a stack or a queue by limiting yourself to stack or queue methods (push, pop, shift, unshift). 181 | 182 | - Stack behavior: Using **push / pop** gives **LIFO (last in first out)** behavior (stack). 183 | - Queue behavior: Using **push / shift** gives **FIFO (first in first out)** behavior (queue).
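A quick sketch of both behaviors with a plain Ruby array:

```ruby
arr = []

# Stack (LIFO): push adds to the end, pop removes from the end.
arr.push(1)
arr.push(2)
arr.push(3)
puts arr.pop   # => 3 (last in, first out)

# Queue (FIFO): push adds to the end, shift removes from the front.
arr.push(4)    # arr is now [1, 2, 4]
puts arr.shift # => 1 (first in, first out)
```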
184 | -------------------------------------------------------------------------------- /12-breadth-first-search/array-queue.py: -------------------------------------------------------------------------------- 1 | # python3 2 | 3 | class Queue: 4 | # constructor creates a list 5 | def __init__(self): 6 | self.queue = list() 7 | 8 | # adding elements to queue (returns True if the element was added) 9 | def enqueue(self, data): 10 | if data not in self.queue: 11 | self.queue.insert(0, data) 12 | print("%s %s" % (data, 'queued!')) 13 | return True 14 | return False 15 | 16 | def dequeue(self): 17 | if len(self.queue) > 0: 18 | item = self.queue.pop() 19 | print("%s %s" % (item, 'dequeued!')) 20 | return item 21 | return "Queue Empty!" 22 | 23 | # getting the size of the queue 24 | def size(self): 25 | return len(self.queue) 26 | 27 | myQueue = Queue() 28 | 29 | for i in range(1, 11): 30 | myQueue.enqueue(i) 31 | 32 | while myQueue.size() > 0: 33 | myQueue.dequeue() 34 | -------------------------------------------------------------------------------- /12-breadth-first-search/array-queue.rb: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | require 'thread' 4 | 5 | @queue = Queue.new 6 | 7 | def produce 8 | (1..10).each do |i| 9 | sleep 0.2 10 | @queue << i 11 | puts "#{i} enqueued!" 12 | end 13 | end 14 | 15 | def consume 16 | @queue.close if @queue.empty? 17 | 18 | until @queue.empty? 19 | sleep 0.3 20 | value = @queue.shift 21 | puts "#{value} dequeued!" 22 | end 23 | end 24 | 25 | produce 26 | consume 27 | -------------------------------------------------------------------------------- /12-breadth-first-search/images/complex-graph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/complex-graph.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/dequeue.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/dequeue.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/directed-undirected.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/directed-undirected.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/duplicate-queue-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/duplicate-queue-1.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/duplicate-queue-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/duplicate-queue-2.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/fifo-lifo.png:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/fifo-lifo.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/node-edge.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/node-edge.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/queue-algorithm.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/queue-algorithm.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/queues-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/queues-1.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/queues-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/queues-2.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/simple-graph.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/simple-graph.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/topological-sort.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/topological-sort.png -------------------------------------------------------------------------------- /12-breadth-first-search/images/tree.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/12-breadth-first-search/images/tree.png -------------------------------------------------------------------------------- /12-breadth-first-search/list-queue.py: -------------------------------------------------------------------------------- 1 | # python3 2 | 3 | from collections import deque 4 | 5 | graph = {} 6 | graph["you"] = ["alice", "bob", "claire"] 7 | graph["bob"] = ["anuj", "peggy"] 8 | graph["alice"] = ["peggy"] 9 | graph["claire"] = ["thom", "jonny"] 10 | graph["anuj"] = [] 11 | graph["peggy"] = [] 12 | graph["thom"] = [] 13 | graph["jonny"] = [] 14 | 15 | def search(name): 16 | search_queue = deque() # create a queue 17 | search_queue += graph[name] # add all your neighbours (alice, bob and claire) to the queue. 
18 | searched = [] 19 | 20 | while search_queue: # While the queue isn't empty 21 | person = search_queue.popleft() # dequeue the first person from queue. 22 | if person not in searched: 23 | if person_is_seller(person): 24 | print(person + " is a mango seller!") 25 | return True 26 | else: 27 | search_queue += graph[person] # Add all of this person's friends to the search queue. 28 | searched.append(person) 29 | return False 30 | 31 | # This function checks whether the person's name ends with the letter m. If it does, they're a mango seller. 32 | def person_is_seller(name): 33 | return name[-1] == 'm' 34 | 35 | search("you") -------------------------------------------------------------------------------- /12-breadth-first-search/list-queue.rb: -------------------------------------------------------------------------------- 1 | # encoding: UTF-8 2 | 3 | require 'thread' 4 | 5 | @graph = [["alice", "bob", "claire"], ["anuj", "peggy"], ["peggy"], ["thom", "jonny"]] 6 | @search_queue = Queue.new 7 | 8 | def enqueue(arr) 9 | added = [] 10 | 11 | arr.each do |item| 12 | item.each do |element| 13 | unless added.include?(element) 14 | @search_queue.push(element) 15 | added << element 16 | puts "#{element} queued!" 17 | end 18 | end 19 | end 20 | end 21 | 22 | def dequeue(queue) 23 | until queue.empty? 24 | item = queue.shift 25 | puts "#{item} dequeued!" 26 | sleep 0.5 27 | end 28 | end 29 | 30 | def search(name, queue) 31 | searched = [] 32 | 33 | unless queue.empty? || searched.include?(name) 34 | if person_is_seller(name) 35 | puts "#{name} is a mango seller!" 36 | return true 37 | else 38 | searched << name 39 | end 40 | end 41 | end 42 | 43 | # this function checks whether the person's name ends with the letter t. If it does, they're a mango seller. 44 | def person_is_seller(name) 45 | name[-1] == 't' 46 | end 47 | 48 | enqueue(@graph) 49 | dequeue(@search_queue) 50 | enqueue([["serhat"]]) 51 | search("serhat", @search_queue) -------------------------------------------------------------------------------- /13-dijkstras-algorithm/README.md: -------------------------------------------------------------------------------- 1 | # Dijkstra's Algorithm 2 | 3 | - Dijkstra's algorithm lets you answer "What's the shortest path to X?" for weighted graphs. 4 | 5 | ![fastest-path](images/fastest-path.png) 6 | 7 | - Breadth-first search will find you the path with the fewest steps. What if you want the fastest path instead? You can do that with a different algorithm called Dijkstra's algorithm. 8 | 9 | - Dijkstra's algorithm has four steps: 10 | 11 | 1. Find the cheapest node. This is the node you can get to in the least amount of time. 12 | 2. Check whether there's a cheaper path to the neighbors of this node. If so, update their costs. 13 | 3. Repeat until you've done this for every node in the graph. 14 | 4. Calculate the final path. 15 | 16 | - In Dijkstra's algorithm, each edge in the graph has a number associated with it. These are called weights. A graph with weights is called a weighted graph. A graph without weights is called an unweighted graph. 17 | 18 | ![weighted-unweighted](images/weighted-unweighted.png) 19 | 20 | - Dijkstra's algorithm only works with directed acyclic graphs, called DAGs for short. In other words, Dijkstra's algorithm cannot work with graphs that contain cycles. 21 | 22 | - You can't use Dijkstra's algorithm if you have negative-weight edges. Negative-weight edges break the algorithm, because it assumes that once a node has been processed, no cheaper path to it can show up later - a negative-weight edge can violate that assumption.
23 | 24 | ## Implementation 25 | 26 | ![implementation](images/implementation.png) 27 | 28 | - You'll update the costs and parents hash tables as the algorithm progresses. First, you need to implement the graph: 29 | 30 | ```python 31 | graph = {} 32 | ``` 33 | 34 | - This time, we need to store the neighbors and the cost for getting to that neighbor. For example, Start has two neighbors, A and B. How do you represent the weights of those edges? Why not just use another hash table? 35 | 36 | ```python 37 | graph["start"] = {} 38 | graph["start"]["a"] = 6 39 | graph["start"]["b"] = 2 40 | graph["a"] = {"fin": 1} 41 | graph["b"] = {"a": 3, "fin": 5} 42 | graph["fin"] = {} 43 | ``` 44 | 45 | - So graph["start"] is a hash table. You can get all the neighbors for Start like this: 46 | 47 | ```python 48 | >>> print(graph["start"].keys()) 49 | dict_keys(['a', 'b']) 50 | ``` 51 | 52 | - Next you need a hash table to store the costs for each node. The cost of a node is how long it takes to get to that node from the start. If you don't know the cost yet, you put down infinity. Can you represent infinity in Python? Turns out, you can: 53 | 54 | ```python 55 | infinity = float("inf") 56 | ``` 57 | 58 | - Here's the code to make the costs table: 59 | 60 | ```python 61 | infinity = float("inf") 62 | costs = {} 63 | costs["a"] = 6 64 | costs["b"] = 2 65 | costs["fin"] = infinity 66 | ``` 67 | 68 | - You also need another hash table for the parents: 69 | 70 | ![parents](images/parents.png) 71 | 72 | - Here's the code to make the hash table for the parents: 73 | 74 | ```python 75 | parents = {} 76 | parents["a"] = "start" 77 | parents["b"] = "start" 78 | parents["fin"] = None 79 | ``` 80 | 81 | - Finally, you need an array to keep track of all the nodes you've already processed, because you don't need to process a node more than once: 82 | 83 | ```python 84 | processed = [] 85 | ``` 86 | 87 | - I'll show you the code first and then walk through it. Here's the code: 88 | 89 | ![algorithm-code](images/algorithm-code.png) 90 | 91 | - First, let's see this `find_lowest_cost_node` algorithm code in action: 92 | 93 | ![dijkstra-1](images/dijkstra-1.png) 94 | ![dijkstra-2](images/dijkstra-2.png) 95 | 96 | - Once you've processed all the nodes, the algorithm is over. I hope the walkthrough helped you understand the algorithm a little better. Finding the lowest-cost node is pretty easy with the `find_lowest_cost_node` function. Here it is in code: 97 | 98 | ![find_lowest_cost_node](images/find-lowest-cost-node.png) 99 | 100 | ## Recap 101 | 102 | - Breadth-first search is used to calculate the shortest path for an unweighted graph. 103 | - Dijkstra's algorithm is used to calculate the shortest path for a weighted graph. 104 | - Dijkstra's algorithm works when all the weights are positive. 105 | - If you have negative weights, use the Bellman-Ford algorithm.
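- Since the algorithm itself only appears as images above, here's a minimal runnable sketch that follows them, using the `graph`, `costs`, and `parents` tables defined above:

```python
infinity = float("inf")

def find_lowest_cost_node(costs, processed):
    lowest_cost = infinity
    lowest_cost_node = None
    for node in costs:  # Go through every node...
        cost = costs[node]
        # ...and pick the cheapest one that hasn't been processed yet.
        if cost < lowest_cost and node not in processed:
            lowest_cost = cost
            lowest_cost_node = node
    return lowest_cost_node

processed = []
node = find_lowest_cost_node(costs, processed)

while node is not None:  # Run until every node has been processed.
    cost = costs[node]
    for neighbor, weight in graph[node].items():
        new_cost = cost + weight
        # If it's cheaper to reach this neighbor through the current node...
        if new_cost < costs[neighbor]:
            costs[neighbor] = new_cost  # ...update its cost...
            parents[neighbor] = node    # ...and make this node its parent.
    processed.append(node)
    node = find_lowest_cost_node(costs, processed)

print(costs["fin"])  # => 6 (start -> b -> a -> fin: 2 + 3 + 1)
```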
106 | -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/algorithm-code.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/algorithm-code.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/dijkstra-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/dijkstra-1.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/dijkstra-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/dijkstra-2.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/fastest-path.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/fastest-path.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/find-lowest-cost-node.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/find-lowest-cost-node.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/implementation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/implementation.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/parents.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/parents.png -------------------------------------------------------------------------------- /13-dijkstras-algorithm/images/weighted-unweighted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/13-dijkstras-algorithm/images/weighted-unweighted.png -------------------------------------------------------------------------------- /14-greedy-algorithms/README.md: -------------------------------------------------------------------------------- 1 | # Greedy Algorithms 2 | 3 | - Greedy algorithms are easy. 4 | 5 | - A greedy algorithm always takes the locally optimal step at each stage. But why is it called greedy? Let's look at the example here: 6 | 7 | ![greedy-algorithms](images/greedy-algorithms.png) 8 | 9 | In this image, we were asked to find the biggest possible sum. If you follow the greedy algorithm, you would choose 7 in the first iteration, and then 11.
Finally, the sum would be 3+7+11=21. However, if you had chosen 4 in the first iteration, and then 20, the sum would be 27. Being greedy in the first step didn't help us in this example. 10 | 11 | ## The classroom scheduling problem 12 | 13 | - Suppose you have a classroom and want to hold as many classes here as possible. You get a list of classes. 14 | 15 | ![classroom-schedule](images/classroom-schedule.png) 16 | 17 | - You want to hold as many classes as possible in this classroom. How do you pick what set of classes to hold, so that you get the biggest set of classes possible? 18 | 19 | - Sounds like a hard problem, right? Actually, the algorithm is so easy, it might surprise you. Here's how it works: 20 | 21 | 1. Pick the class that ends the soonest. This is the first class you'll hold in this classroom. 22 | 2. Now, you have to pick a class that starts after the first class. Again, pick the class that ends the soonest. This is the second class you'll hold. 23 | 24 | ## The knapsack problem 25 | 26 | - Suppose you're a greedy thief. You're in a store with a knapsack, and there are all these items you can steal. But you can only take what you can fit in your knapsack. The knapsack can hold 35 pounds. 27 | 28 | - You're trying to maximize the value of the items you put in your knapsack. What algorithm do you use? Again, the greedy strategy is pretty simple: 29 | 30 | 1. Pick the most expensive thing that will fit in your knapsack. 31 | 2. Pick the next most expensive thing that will fit in your knapsack. And so on. 32 | 33 | ![knapsack](images/knapsack.png) 34 | 35 | Your knapsack can hold 35 pounds of items. The stereo system is the most expensive, so you steal that. Now you don't have space for anything else. You got $3,000 worth of goods. But wait! If you'd picked the laptop and the guitar instead, you could have had $3,500 worth of loot! 36 | 37 | Clearly, the greedy strategy doesn't give you the optimal solution here. But it gets you pretty close. In the next chapter, I'll explain how to calculate the correct solution. But if you're a thief in a shopping center, you don't care about perfect. “Pretty good” is good enough. 38 | 39 | Here's the takeaway from this second example: sometimes, perfect is the enemy of good. Sometimes all you need is an algorithm that solves the problem pretty well. And that's where greedy algorithms shine, because they're simple to write and usually get pretty close. 40 | 41 | ## The set-covering problem 42 | 43 | - This is an example where greedy algorithms are absolutely necessary. 44 | 45 | Suppose you're starting a radio show. You want to reach listeners in all 50 states. You have to decide what stations to play on to reach all those listeners. It costs money to be on each station, so you're trying to minimize the number of stations you play on. You have a list of stations. Each station covers a region, and there's overlap. 46 | 47 | ![set-covering-1](images/set-covering-1.png) 48 | ![set-covering-2](images/set-covering-2.png) 49 | 50 | How do you figure out the smallest set of stations you can play on to cover all 50 states? Sounds easy, doesn't it? Turns out it's extremely hard. Here's how to do it: 51 | 52 | 1. List every possible subset of stations. This is called the power set. There are `2^n` possible subsets. For example, if we have Station 1, Station 2, Station 3, and Station 4, we can make 16 different combinations: 53 | 54 | `0, 1, 2, 3, 4, 12, 13, 14, 23, 24, 34, 123, 124, 134, 234, 1234` 55 | 56 | 2.
From these, pick the set with the smallest number of stations that covers all 50 states. 57 | 58 | The problem is, it takes a long time to calculate every possible subset of stations. It takes `O(2^n)` time, because there are `2^n` possible subsets. It's possible to do if you have a small set of 5 to 10 stations. But think about what will happen if you have 100 stations: that's `2^100` subsets. Even if you could check 10 subsets per second, you'd never get through them all. There's no algorithm that solves it fast enough! What can you do? 59 | 60 | ## Approximation algorithms 61 | 62 | Greedy algorithms to the rescue! Here's a greedy algorithm that comes pretty close: 63 | 64 | 1. Pick the station that covers the most states that haven't been covered yet. It's OK if the station covers some states that have been covered already. 65 | 2. Repeat until all the states are covered. 66 | 67 | This is called an approximation algorithm. When calculating the exact solution will take too much time, an approximation algorithm will work. Approximation algorithms are judged by: 68 | 69 | - How fast they are 70 | - How close they are to the optimal solution 71 | 72 | Greedy algorithms are a good choice because not only are they simple to come up with, but that simplicity means they usually run fast, too. In this case, the greedy algorithm runs in `O(n^2)` time, where n is the number of radio stations. 73 | 74 | In Python, sets are like lists, except sets can't have duplicates. 75 | 76 | ## Exercises 77 | 78 | For each of these algorithms, say whether it's a greedy algorithm or not. 79 | 80 | - Quicksort: No. It computes the exact answer, rather than making locally optimal (approximate) choices. 81 | - Breadth-first search: Yes. 82 | - Dijkstra's algorithm: Yes. 83 | 84 | ## NP-complete problems 85 | 86 | To solve the set-covering problem, you had to calculate every possible set. 87 | 88 | Maybe you were reminded of the traveling salesperson problem from chapter 1. In this problem, a salesperson has to visit five different cities. And he's trying to figure out the shortest route that will take him to all five cities. To find the shortest route, you first have to calculate every possible route. **How many routes do you have to calculate for five cities?** 89 | 90 | It's 5! = 120. Suppose you have 10 cities. How many possible routes are there? 10! = 3,628,800. You have to calculate over 3 million possible routes for 10 cities. As you can see, the number of possible routes becomes big very fast! This is why it's impossible to compute the "correct" solution for the traveling-salesperson problem if you have a large number of cities. 91 | 92 | The traveling-salesperson problem and the set-covering problem both have something in common: you calculate every possible solution and pick the smallest/shortest one. Both of these problems are NP-complete. 93 | 94 | ## Approximating 95 | 96 | What's a good approximation algorithm for the traveling salesperson? Something simple that finds a short path. See if you can come up with an answer before reading on. Here's how I would do it: arbitrarily pick a start city. Then, each time the salesperson has to pick the next city to visit, they pick the closest unvisited city. Suppose they start in Marin. 97 | 98 | ![approximation](images/approximation.png) 99 | 100 | Total distance: 71 miles. Maybe it's not the shortest path, but it's still pretty short. 101 | 102 | - Here's the short explanation of NP-completeness: **some problems are famously hard to solve**. The traveling salesperson and the set-covering problem are two examples.
A lot of smart people think that it's not possible to write an algorithm that will solve these problems quickly. 103 | 104 | - It's nice to know if the problem you're trying to solve is NP-complete. At that point, you can stop trying to solve it perfectly, and solve it using an **approximation** algorithm instead. But **it's hard to tell if a problem you're working on is NP-complete**. Usually there's a very small difference between a problem that's easy to solve and an NP-complete problem. 105 | 106 | The short answer: **there's no easy way to tell if the problem you're working on is NP-complete**. Here are some giveaways: 107 | 108 | - Your algorithm runs quickly with a handful of items but really slows down with more items. 109 | - Do you have to calculate "every possible version" of X because you can't break it down into smaller sub-problems? Might be NP-complete. 110 | - If your problem involves a sequence (such as a sequence of cities, like traveling salesperson), and it's hard to solve, it might be NP-complete. 111 | - If your problem involves a set (like a set of radio stations) and it's hard to solve, it might be NP-complete. 112 | - Can you restate your problem as the set-covering problem or the traveling-salesperson problem? Then your problem is definitely NP-complete. 113 | 114 | ## Exercises 115 | 116 | - A postman needs to deliver to 20 homes. He needs to find the shortest route that goes to all 20 homes. Is this an NP-complete problem? 117 | - YES 118 | 119 | - Finding the largest clique in a set of people (a clique is a set of people who all know each other). Is this an NP-complete problem? 120 | - YES 121 | 122 | - You're making a map of the USA, and you need to color adjacent states with different colors. You have to find the minimum number of colors you need so that no two adjacent states are the same color. Is this an NP-complete problem? 123 | - YES 124 | 125 | ## Recap 126 | 127 | - Greedy algorithms optimize locally, hoping to end up with a global optimum. 128 | - NP-complete problems have no known fast solution. 129 | - If you have an NP-complete problem, your best bet is to use an approximation algorithm. 130 | - Greedy algorithms are easy to write and fast to run, so they make good approximation algorithms.
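To make the greedy set-covering approximation above concrete, here's a minimal Python sketch (the station and state data are made-up example inputs):

```python
# States we still need to cover, and which states each station covers.
states_needed = set(["mt", "wa", "or", "id", "nv", "ut", "ca", "az"])

stations = {
    "kone":   set(["id", "nv", "ut"]),
    "ktwo":   set(["wa", "id", "mt"]),
    "kthree": set(["or", "nv", "ca"]),
    "kfour":  set(["nv", "ut"]),
    "kfive":  set(["ca", "az"]),
}

final_stations = set()
while states_needed:
    best_station = None
    states_covered = set()
    for station, states_for_station in stations.items():
        # States this station covers that still need covering:
        covered = states_needed & states_for_station
        if len(covered) > len(states_covered):
            best_station = station
            states_covered = covered
    states_needed -= states_covered   # Greedily take the best station...
    final_stations.add(best_station)  # ...and repeat until all are covered.

print(final_stations)  # e.g. {'kone', 'ktwo', 'kthree', 'kfive'}
```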
131 | -------------------------------------------------------------------------------- /14-greedy-algorithms/images/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/.DS_Store -------------------------------------------------------------------------------- /14-greedy-algorithms/images/approximation.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/approximation.png -------------------------------------------------------------------------------- /14-greedy-algorithms/images/classroom-schedule.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/classroom-schedule.png -------------------------------------------------------------------------------- /14-greedy-algorithms/images/greedy-algorithms.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/greedy-algorithms.png -------------------------------------------------------------------------------- /14-greedy-algorithms/images/knapsack.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/knapsack.png -------------------------------------------------------------------------------- /14-greedy-algorithms/images/set-covering-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/set-covering-1.png -------------------------------------------------------------------------------- /14-greedy-algorithms/images/set-covering-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/14-greedy-algorithms/images/set-covering-2.png -------------------------------------------------------------------------------- /15-dynamic-programming/README.md: -------------------------------------------------------------------------------- 1 | # The Knapsack Problem 2 | 3 | You have three items that you can put into the knapsack. 4 | 5 | ![knapsack-1](images/knapsack-1.png) 6 | 7 | What items should you steal so that you steal the maximum money's worth of goods? The simplest algorithm is this: you try every possible set of goods and find the set that gives you the most value. 8 | 9 | ![knapsack-2](images/knapsack-2.png) 10 | 11 | This works, but it's really slow. For 3 items, you have to calculate 8 possible sets. For 4 items, you have to calculate 16 sets. With every item you add, the number of sets you have to calculate doubles! This algorithm takes `O(2^n)` time, which is very, very slow. 12 | 13 | In chapter 8, you saw how to calculate an approximate solution. That solution will be close to the optimal solution, but it may not be the optimal solution. 
So how do you calculate the optimal solution? 14 | 15 | ## Dynamic programming 16 | 17 | Answer: With dynamic programming! Let's see how the dynamic-programming algorithm works here. Dynamic programming starts by solving subproblems and builds up to solving the big problem. 18 | 19 | For the knapsack problem, you'll start by solving the problem for smaller knapsacks (or "sub-knapsacks") and then work up to solving the original problem. 20 | 21 | **Every dynamic-programming algorithm starts with a grid.** Here's a grid for the knapsack problem: 22 | 23 | ![dynamic-programming](images/dynamic-programming-1.png) 24 | 25 | The grid starts out empty. You're going to fill in each cell of the grid. Once the grid is filled in, you'll have your answer to this problem! 26 | 27 | ### The guitar row 28 | 29 | ![dynamic-programming-2](images/dynamic-programming-2.png) 30 | 31 | This is the guitar row, which means you're trying to fit the guitar into the knapsack. At each cell, there's a simple decision: **do you steal the guitar or not?** 32 | 33 | The first cell has a knapsack of capacity 1 lb. The guitar is also 1 lb, which means it fits into the knapsack! So the value of this cell is $1,500, and it contains a guitar. 34 | 35 | Like this, each cell in the grid will contain a list of all the items that fit into the knapsack at that point. Let's look at the next cell. Here you have a knapsack of capacity 2 lb. Well, the guitar will definitely fit in there! The same for the rest of the cells in this row. 36 | 37 | ![dynamic-programming-3](images/dynamic-programming-3.png) 38 | 39 | ### The stereo row 40 | 41 | Now that you're on the second row, you can steal the stereo or the guitar. At every row, you can steal the item at that row or the items in the rows above it. So you can't 42 | choose to steal the laptop right now, but you can steal the stereo and/or the guitar. Let's start with the first cell, a knapsack of capacity 1 lb. 43 | 44 | You have a knapsack of capacity 1 lb. Will the stereo fit in there? Nope, it's too heavy! Because you can't fit the stereo, $1,500 remains the max guess for a 1 lb knapsack. 45 | 46 | What if you have a knapsack of capacity 4 lb? Aha: the stereo finally fits! The old max value was $1,500, but if you put the stereo in there instead, the value is $3,000! Let's take the stereo. 47 | 48 | ![dynamic-programming-4](images/dynamic-programming-4.png) 49 | 50 | You just updated your estimate! If you have a 4 lb knapsack, you can fit at least $3,000 worth of goods in it. 51 | 52 | ### The laptop row 53 | 54 | Let's do the same thing with the laptop! The laptop weighs 3 lb, so it won't fit into a 1 lb or a 2 lb knapsack. The estimate for the first two cells stays at $1,500. At 3 lb, the old estimate was $1,500. But you can choose the laptop instead, and that's worth $2,000. So the new max estimate is $2,000! At 4 lb, things get really interesting. This is an important part. The current estimate is $3,000. You can put the laptop in the knapsack, but it's only worth $2,000. 55 | 56 | ![dynamic-programming-5](images/dynamic-programming-5.png) 57 | 58 | Hmm, that's not as good as the old estimate. But wait! The laptop weighs only 3 lb, so you have 1 lb free! You could put something in this 1 lb. 59 | 60 | ![dynamic-programming-6](images/dynamic-programming-6.png) 61 | 62 | What's the maximum value you can fit into 1 lb of space? Well, you've been calculating it all along. 
63 | 64 | ![dynamic-programming-7](images/dynamic-programming-7.png) 65 | 66 | According to the last best estimate, you can fit the guitar into that 1 lb space, and that's worth $1,500. So the real comparison is as follows: 67 | 68 | $3,000 (stereo) VERSUS $2,000 (laptop) + $1,500 (guitar) = $3,500? 69 | 70 | You might have been wondering why you were calculating max values for smaller knapsacks. I hope now it makes sense! When you have space left over, you can use the answers to those subproblems to figure out what will fit in that space. 71 | 72 | Here is the formula for this problem: 73 | 74 | ![formula](images/formula.png) 75 | 76 | ## The Knapsack Problem FAQ 77 | 78 | - Can you steal fractions of an item? 79 | 80 | You can't. With the dynamic-programming solution, you either take the item or not. There's no way for it to figure out that you should take half an item. 81 | 82 | - Handling items that depend on each other. Suppose you want to go to Paris, so you add a couple of items to the list. These places take a lot of time, because first you have to travel from London to Paris. That takes half a day. If you want to do all three items, it will take four and a half days. Wait, that's not right. You don't have to travel to Paris for each item. Once you're in Paris, each item should only take a day. So it should be one day per item + half a day of travel = 3.5 days, not 4.5 days. If you put the Eiffel Tower in your knapsack, then the Louvre becomes "cheaper" - it will only cost you a day instead of 1.5 days. How do you model this in dynamic programming? 83 | 84 | You can't. Dynamic programming is powerful because it can solve subproblems and use those answers to solve the big problem. Dynamic programming only works when each subproblem is discrete - when it doesn't depend on other subproblems. That means there's no way to account for Paris using the dynamic-programming algorithm. 85 | 86 | # Longest Common Substring 87 | 88 | You've seen one dynamic programming problem so far. What are the takeaways? 89 | 90 | - Dynamic programming is useful when you're trying to optimize something given a constraint. In the knapsack problem, you had to maximize the value of the goods you stole, constrained by the size of the knapsack. 91 | 92 | - You can use dynamic programming when the problem can be broken into discrete subproblems, and they don't depend on each other. 93 | 94 | It can be hard to come up with a dynamic-programming solution. That's what we'll focus on in this section. Some general tips follow: 95 | 96 | - Every dynamic-programming solution involves a grid. 97 | - The values in the cells are usually what you're trying to optimize. For the knapsack problem, the values were the value of the goods. 98 | - Each cell is a subproblem, so think about how you can divide your problem into subproblems. That will help you figure out what the axes are. 99 | 100 | Computer scientists sometimes joke about using the **Feynman algorithm**. The Feynman algorithm is named after the famous physicist Richard Feynman, and it works like this: 101 | 102 | 1. Write down the problem. 103 | 2. Think real hard. 104 | 3. Write down the solution. 105 | 106 | So is dynamic programming ever really used? Yes: 107 | 108 | - Longest common subsequence and longest common substring are examples of dynamic programming, which can be solved with a grid. 109 | 110 | - Biologists use the longest common subsequence to find similarities in DNA strands. They can use this to tell how similar two animals or two diseases are.
The longest common subsequence is being used to find a cure for multiple sclerosis. 111 | 112 | - Have you ever used diff (like `git diff`)? Diff tells you the differences between two files, and it uses dynamic programming to do so. 113 | 114 | - We talked about string similarity. Levenshtein distance measures how similar two strings are, and it uses dynamic programming. Levenshtein distance is used for everything from spell-check to figuring out whether a user is uploading copyrighted data. 115 | 116 | - Have you ever used an app that does word wrap, like Microsoft Word? How does it figure out where to wrap so that the line length stays consistent? Dynamic programming! 117 | -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-1.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-2.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-3.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-4.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-5.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-5.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-6.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-6.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/dynamic-programming-7.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/dynamic-programming-7.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/formula.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/formula.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/knapsack-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/knapsack-1.png -------------------------------------------------------------------------------- /15-dynamic-programming/images/knapsack-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/15-dynamic-programming/images/knapsack-2.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/README.md: -------------------------------------------------------------------------------- 1 | # K-nearest Neighbors 2 | 3 | - It's about taking a look at the neighbors and using them to make a decision when classifying data. 4 | 5 | ![fruit](images/fruit.png) 6 | 7 | - More neighbors are oranges than grapefruit. So this fruit is probably an orange. Congratulations: You just used the k-nearest neighbors (KNN) algorithm for classification! The whole algorithm is pretty simple. 8 | 9 | ![k-nearest-neighbors](images/k-nearest-neighbors.png) 10 | 11 | But there's still a big piece missing. You graphed the fruits by similarity. How do you figure out how similar two fruits are? 12 | 13 | ## Feature extraction 14 | 15 | - In the grapefruit example, you compared fruit based on how big they are and how red they are. Size and color are the features you're comparing. 16 | 17 | ![feature-extract](images/feature-extract.png) 18 | 19 | - From the graph, you can tell visually that fruits A and B are similar. Let's measure how close they are. To find the distance between two points, you use the Pythagorean formula. 20 | 21 | ![pythagorean-formula](images/pythagorean-formula.png) 22 | 23 | - For the Netflix situation, once you can graph users, you can measure the distance between them. Here's how you can convert users into a set of numbers. When users sign up for Netflix, have them rate some categories of movies based on how much they like those categories. For each user, you now have a set of ratings! 24 | 25 | ![netflix](images/netflix.png) 26 | 27 | - Remember how in oranges versus grapefruit, each fruit was represented by a set of two numbers? Here, each user is represented by a set of five numbers. A mathematician would say, instead of calculating the distance in two dimensions, you're now calculating the distance in five dimensions. But the distance formula remains the same. 28 | 29 | ![pythagorean-formula-2](images/pythagorean-formula-2.png) 30 | 31 | The distance tells you how similar those sets of numbers are. 32 | 33 | ## Regression 34 | 35 | Suppose you want to do more than just recommend a movie: you want to guess how Priyanka will rate this movie. Take the five people closest to her. By the way, I keep talking about the closest five people. There's nothing special about the number 5: you could do the closest 2, or 10, or 10,000. That's why the algorithm is called k-nearest neighbors and not five-nearest neighbors! 36 | 37 | Suppose you're trying to guess a rating for Pitch Perfect. Well, how did Justin, JC, Joey, Lance, and Chris rate it?
You could take the **average** of their ratings and get 4.2 stars. That's called **regression**. These are the two basic things you'll do with KNN—classification and regression: 38 | 39 | - Classification = categorization into a group 40 | - Regression = predicting a response (like a number) 41 | 42 | ## Picking good features 43 | 44 | To figure out recommendations, you had users rate categories of movies. What if you had them rate pictures of cats instead? Then you'd find users who rated those pictures similarly. This would probably be a worse recommendation engine, because the "features" don't have a lot to do with taste in movies! 45 | 46 | Or suppose you ask users to rate movies so you can give them recommendations—but you only ask them to rate Toy Story, Toy Story 2, and Toy Story 3. This won't tell you a lot about the users' movie tastes! 47 | 48 | When you're working with KNN, it's really important to pick the **right features to compare against**. Picking the right features means: 49 | 50 | - Features that directly correlate to the movies you're trying to recommend 51 | - Features that don't have a bias (for example, if you ask the users to only rate comedy movies, that doesn't tell you whether they like action movies) 52 | 53 | ## Exercise 54 | 55 | **Question:** Netflix has millions of users. The earlier example looked at the five closest neighbors for building the recommendation system. Is this too low? Too high? 56 | 57 | **Answer:** It's too low. If you look at fewer neighbors, there's a bigger chance that the results will be skewed. A good rule of thumb is, **if you have N users, you should look at `sqrt(N)` neighbors.** 58 | 59 | ## Introduction to machine learning 60 | 61 | ### OCR 62 | 63 | How would you automatically figure out what number this is (7)? You can use KNN for this: 64 | 65 | 1. Go through a lot of images of numbers, and extract features of those numbers like curves, points and lines. 66 | 2. When you get a new image, extract the features of that image, and see what its nearest neighbors are! 67 | 68 | The first step of OCR, where you go through images of numbers and extract features, is called **training**. Most machine-learning algorithms have a training step: before your computer can do the task, it must be trained. 69 | 70 | ### Building a spam filter 71 | 72 | Spam filters use another simple algorithm called the **Naive Bayes classifier**. First, you train your Naive Bayes classifier on some data. 73 | 74 | Suppose you get an email with the subject "collect your million dollars now!" Is it spam? You can break this sentence into words. Then, for each word, see what the probability is for that word to show up in a spam email. 75 | 76 | ## Recap 77 | 78 | - KNN is used for classification and regression and involves looking at the k-nearest neighbors. 79 | - Classification = categorization into a group. 80 | - Regression = predicting a response (like a number). 81 | - Feature extraction means converting an item (like a fruit or a user) into a list of numbers that can be compared. 82 | - Picking good features is an important part of a successful KNN algorithm.
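A small Python sketch of both ideas (the five-dimensional rating vectors are made-up illustrations):

```python
from math import sqrt

# Each user is a set of five category ratings.
priyanka = [3, 4, 4, 1, 4]
justin   = [4, 3, 5, 1, 5]

def distance(a, b):
    # The Pythagorean formula, generalized to any number of dimensions.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance(priyanka, justin))  # smaller distance = more similar taste

# Regression: predict a rating as the average of the k nearest
# neighbors' ratings for the same movie.
neighbor_ratings = [5, 4, 4, 5, 3]
print(sum(neighbor_ratings) / len(neighbor_ratings))  # => 4.2
```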
83 | -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/feature-extract.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/feature-extract.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/fruit.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/fruit.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/k-nearest-neighbors.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/k-nearest-neighbors.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/netflix.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/netflix.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/pythagorean-formula-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/pythagorean-formula-2.png -------------------------------------------------------------------------------- /16-k-nearest-neighbors/images/pythagorean-formula.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/16-k-nearest-neighbors/images/pythagorean-formula.png -------------------------------------------------------------------------------- /17-bellman-ford-algorithm/README.md: -------------------------------------------------------------------------------- 1 | # Bellman-Ford Algorithm 2 | 3 | - Shortest path from one node to all other nodes. 4 | - It works with negative edge weights; Dijkstra's doesn't. Neither works on graphs with negative cycles. 5 | - It examines every edge in each iteration and updates the costs. 6 | - Dijkstra's is a greedy algorithm; Bellman-Ford isn't. 7 | - Time complexity is `O(v*e)`, where v is the number of vertices and e is the number of edges. 8 | - We need v-1 iterations to find the result, where v is the number of vertices. 9 | 10 | ## A detailed explanation 11 | 12 | https://www.youtube.com/watch?v=obWXjtg0L64 13 | -------------------------------------------------------------------------------- /18-where-to-go-next/README.md: -------------------------------------------------------------------------------- 1 | # Where to go next 2 | 3 | ## Trees 4 | 5 | - Let's go back to the binary search example. When a user logs in to Facebook, Facebook has to look through a big array to see if the username exists. We said the fastest way to search through this array is to run binary search. But there's a problem: every time a new user signs up, you insert their username into the array.
-------------------------------------------------------------------------------- /18-where-to-go-next/README.md: -------------------------------------------------------------------------------- 1 | # Where to go next 2 | 3 | ## Trees 4 | 5 | - Let's go back to the binary search example. When a user logs in to Facebook, Facebook has to look through a big array to see if the username exists. We said the fastest way to search through this array is to run binary search. But there's a problem: every time a new user signs up, you insert their username into the array. Then you have to re-sort the array, because binary search only works with sorted arrays. Wouldn't it be nice if you could insert the username into the right slot in the array right away, so you don't have to sort the array afterward? That's the idea behind the **binary search tree** data structure. 6 | 7 | For every node, the nodes to its left are smaller in value, and the nodes to its right are larger in value. 8 | 9 | ![trees-1](images/trees-1.png) 10 | 11 | Suppose you're searching for Maggie. You start at the root node. 12 | 13 | ![trees-2](images/trees-2.png) 14 | 15 | ![trees-3](images/trees-3.png) 16 | 17 | On average, a binary search tree is **a lot faster** than a sorted array for insertions and deletions. 18 | 19 | | | Array | Binary Search Tree | 20 | |------- |---------|--------------------| 21 | | Search | O(logn) | O(logn) | 22 | | Insert | O(n) | O(logn) | 23 | | Delete | O(n) | O(logn) | 24 | 25 | Binary search trees have some **downsides** too: for one thing, **you don't get random access**. You can't say, "Give me the fifth element of this tree". Those running times are also averages, and they rely on the tree being balanced. Suppose you have an imbalanced tree like the one shown next. 26 | 27 | ![inbalanced-tree](images/inbalanced-tree.png) 28 | 29 | See how it's leaning to the right? This tree doesn't have very good performance, because it isn't balanced. 30 | 31 | So when are binary search trees used? B-trees, a generalization of binary search trees, are commonly used to store data in databases. If you're interested in databases or more-advanced data structures, check these out: 32 | 33 | - B-trees 34 | - Red-black trees 35 | - Heaps 36 | - Splay trees 37 | 38 | ## Inverted indexes 39 | 40 | A hash table that maps words to the places where they appear. This data structure is called an **inverted index**, and it's commonly used to build search engines. 41 | 42 | ![inverted-index](images/inverted-index.png) 43 | 44 | ## Parallel algorithms 45 | 46 | Laptops and computers ship with multiple cores. To make your algorithms faster, you need to change them to run in parallel across all the cores at once! 47 | 48 | The best you can do with a comparison-based sorting algorithm is roughly `O(n logn)`. It's well known that you can't sort an array in `O(n)` time—unless you use a parallel algorithm! There's a parallel version of quicksort that will sort an array in `O(n)` time. 49 | 50 | **Parallel algorithms are hard to design**. It's also **hard to make sure they work correctly** and to figure out what type of speed boost you'll see. One thing is for sure—the **time gains aren't linear**. So **if you have two cores in your laptop** instead of one, that almost **never means your algorithm will magically run twice as fast**. There are a couple of reasons for this (see the sketch after this list): 51 | 52 | - Overhead of managing the parallelism: Suppose you have to sort an array of 1,000 items. How do you divide this task among the two cores? Do you give each core 500 items to sort and then merge the two sorted arrays into one big sorted array? Merging the two arrays takes time. 53 | 54 | - Load balancing: Suppose you have 10 tasks to do, so you give each core 5 tasks. But core A gets all the easy tasks, so it's done in 10 seconds, whereas core B gets all the hard tasks, so it takes a minute. That means core A was sitting idle for 50 seconds while core B was doing all the work! How do you distribute the work evenly so both cores are working equally hard?
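As a toy sketch of the split-and-merge idea (the two-worker split and the tiny array are illustrative assumptions; real speedups depend on data size and hardware):

```python
from heapq import merge
from multiprocessing import Pool

def sort_chunk(chunk):
    return sorted(chunk)

if __name__ == "__main__":
    data = [5, 3, 8, 1, 9, 2, 7, 4, 6, 0]
    mid = len(data) // 2

    # give each of the two workers half of the array
    with Pool(processes=2) as pool:
        left, right = pool.map(sort_chunk, [data[:mid], data[mid:]])

    # merging the two sorted halves is the extra work that eats into the speedup
    print(list(merge(left, right)))  # [0, 1, 2, ..., 9]
```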
55 | 56 | ## MapReduce 57 | 58 | - There's a special type of parallel algorithm that is becoming increasingly popular: **the distributed algorithm**. It's fine to run a parallel algorithm on your laptop if you need two to four cores, but what if you need hundreds of cores? Then you can write your algorithm to **run across multiple machines**. The MapReduce algorithm is a popular **distributed algorithm**. You can use it through the popular open source tool **Apache Hadoop**. 59 | 60 | - Suppose you have a table with billions or trillions of rows, and you want to run a complicated SQL query on it. You can't run it on MySQL, because it struggles after **a few billion rows**. Use MapReduce through Hadoop! 61 | 62 | - Distributed algorithms are great when you have a lot of work to do and want to speed up the time required to do it. MapReduce in particular is built up from two simple ideas: the **map function** and the **reduce function**. 63 | 64 | ### The map function 65 | 66 | - The map function is simple: it takes an array and applies the same function to each item in the array. For example: doubling every item in the array. 67 | 68 | - Wouldn't it be great if you had 100 machines, and map could automatically spread out the work across all of them? This is the idea behind the "map" in MapReduce. 69 | 70 | ```python
arr1 = [1, 2, 3, 4, 5]
arr2 = list(map(lambda x: 2 * x, arr1))  # [2, 4, 6, 8, 10]; in Python 3, map returns a lazy iterator
``` 74 | 75 | ### The reduce function 76 | 77 | - The idea behind `reduce` is that you "reduce" a whole list of items down to one item. With `map`, you go from one array to another: 78 | 79 | ![map](images/map.png) 80 | 81 | - With reduce, you transform an array to a single item: 82 | 83 | ![reduce](images/reduce.png) 84 | 85 | - Summing up all the elements in an array is an example of reduce. 86 | 87 | ```python
from functools import reduce  # in Python 3, reduce lives in functools

arr1 = [1, 2, 3, 4, 5]
reduce(lambda x, y: x + y, arr1)  # 15
``` 91 | 92 | ## Bloom Filters 93 | 94 | Bloom filters are **probabilistic** data structures. They give you an answer that **could be wrong but is probably correct**. Instead of a hash table, you can ask your Bloom filter if you've crawled a URL before. A hash table would give you an accurate answer. A Bloom filter will give you an answer that's **probably correct**: 95 | 96 | - False positives are possible. Google might say, "You've already crawled this site" even though you haven't. 97 | - False negatives aren't possible. If the Bloom filter says, "You haven't crawled this site", then you definitely haven't crawled this site. 98 | 99 | Bloom filters are great because they take up very little space. A hash table would have to store every URL crawled by Google, but a Bloom filter doesn't have to do that. They're great when you don't need an exact answer, as in all of these examples. 100 | 101 | ## Comparing files 102 | 103 | SHA is a **hash function**. It generates a hash, which is just a short string. The hash function for hash tables goes from **string to array index**, whereas **SHA goes from string to string**. For all practical purposes, SHA generates a different hash for every string. 104 | 105 | You can use SHA to tell whether two files are the same. This is useful when you have very large files. Suppose you have a 4 GB file, and you want to check whether your friend has the same file. You don't have to email them the whole thing; instead, you can both calculate the SHA hash of your copy and compare the results. 106 | 107 | ## Simhash 108 | 109 | In SHA, if you change just one character of the string and regenerate the hash, it's totally different!
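For example, with Python's built-in `hashlib` (the two sample strings are arbitrary):

```python
import hashlib

print(hashlib.sha256(b"dog").hexdigest())
print(hashlib.sha256(b"dot").hexdigest())  # one character changed, a completely different digest
```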
110 | 111 | This is good because an attacker can't compare hashes to see whether they're close to cracking a password. Sometimes, you want the opposite: you want a locality-sensitive hash function. That's where Simhash comes in. If you make a small change to a string, Simhash generates a hash that's **only a little different**. This allows you to compare hashes and see **how similar two strings are**, which is pretty useful! 112 | 113 | - **Google uses Simhash** to detect duplicates while crawling the web. 114 | - A teacher could use Simhash to see whether a student was copying an essay from the web. 115 | 116 | Simhash is useful when you want to check for similar items. 117 | 118 | ## Diffie-Hellman key exchange 119 | 120 | How do you encrypt a message so it can only be read by the person you sent the message to? 121 | 122 | The easiest way is to come up with a cipher, like a = 1, b = 2, and so on. Then if I send you the message "4, 15, 7", you can translate it to "d, o, g". But for this to work, we both have to agree on the cipher. We can't agree over email, because someone might hack into your email, figure out the cipher, and decode our messages. Heck, even if we meet in person, someone might guess the cipher—it's not complicated. So we should change it every day. But then we have to meet in person to change it every day! Even if we did manage to change it every day, a simple cipher like this is easy to crack with a brute-force attack. 123 | 124 | Diffie-Hellman solves both problems: 125 | 126 | - Neither party needs to know the cipher in advance. So we don't have to meet and agree on what the cipher should be. 127 | - The encrypted messages are extremely hard to decode. 128 | 129 | Diffie-Hellman has two keys: a public key and a private key. The public key is exactly that: public. You can post it on your website, email it to friends, or do anything you want with it. You don't have to hide it. When someone wants to send you a message, they encrypt it using the public key. An encrypted message can only be decrypted using the private key. As long as you're the only person with the private key, only you will be able to decrypt this message! The Diffie-Hellman algorithm is still used in practice, along with RSA (a toy sketch of the key-exchange arithmetic appears at the end of this chapter). 130 | 131 | ## Linear programming 132 | 133 | Linear programming is used to maximize (or minimize) some quantity given a set of constraints. Linear programming uses the Simplex algorithm. It's a complex algorithm, which is why it's not covered here.
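As referenced above, here's a toy sketch of the arithmetic behind the classic Diffie-Hellman exchange. The numbers are deliberately tiny and the variable names are illustrative; real deployments use very large primes:

```python
p, g = 23, 5                        # public prime modulus and generator (toy values)
alice_secret, bob_secret = 6, 15    # private keys; never shared

alice_public = pow(g, alice_secret, p)  # exchanged in the open
bob_public = pow(g, bob_secret, p)

# each side combines the other's public value with its own secret
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)

assert alice_shared == bob_shared
print(alice_shared)  # 2: the shared secret, never transmitted
```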
134 | -------------------------------------------------------------------------------- /18-where-to-go-next/images/inbalanced-tree.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/inbalanced-tree.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/inverted-index.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/inverted-index.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/map.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/map.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/reduce.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/reduce.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/trees-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/trees-1.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/trees-2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/trees-2.png -------------------------------------------------------------------------------- /18-where-to-go-next/images/trees-3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/msdundar/notes-algorithms/92cfc92b653345ee4de53fd99cd335e60437661b/18-where-to-go-next/images/trees-3.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Notes - Algorithms 2 | 3 | This repository includes notes about algorithms, mostly taken from [Grokking Algorithms](https://www.amazon.com/Grokking-Algorithms-illustrated-programmers-curious/dp/1617292230) and from other Internet resources. 4 | 5 | Each folder includes: 6 | 7 | 1. Content as markdown 8 | 1. Code examples in Python, Ruby, and Go 9 | 1. Images and charts related to the chapter 10 | 11 | ## Contributing 12 | 13 | This repository is open to contributions: simply fork the repo and create a PR/MR. 14 | --------------------------------------------------------------------------------