├── 1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms
    ├── Week 1
    │   ├── 1wk1_a1.py
    │   └── README.md
    ├── Week 2
    │   ├── 1wk2_a1.py
    │   └── README.md
    ├── Week 3
    │   ├── 1wk3_a1.py
    │   └── README.md
    └── Week 4
    │   ├── 1wk4_a1.py
    │   └── README.md
├── 2 Graph Search, Shortest Paths, and Data Structures
    ├── Week 1
    │   ├── 2wk1_a1.py
    │   └── README.md
    ├── Week 2
    │   ├── 2wk2_a1.py
    │   └── README.md
    ├── Week 3
    │   └── README.md
    └── Week 4
    │   ├── 2wk4_a1.py
    │   └── README.md
├── 3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming
    ├── Week 1
    │   ├── 3wk1_a1.py
    │   └── README.md
    ├── Week 2
    │   ├── 3wk2_a1.py
    │   ├── 3wk2_a2.py
    │   └── README.md
    ├── Week 3
    │   ├── 3wk3_a1.py
    │   └── README.md
    └── Week 4
    │   ├── 3wk4_a1.py
    │   ├── 3wk4_a2.py
    │   └── README.md
├── 4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them
    ├── Week 1
    │   ├── 4wk1_a1.py
    │   └── README.md
    ├── Week 2
    │   ├── 4wk1_a2.py
    │   └── README.md
    ├── Week 3
    │   ├── 4wk1_a3.py
    │   └── README.md
    └── Week 4
    │   ├── 4wk1_a4.py
    │   └── README.md
└── README.md


/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 1/1wk1_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Recursive integer multiplication
 3 | """
 4 | 
 5 | 
 6 | def multi(num1, num2):
 7 |     n1, n2 = len(num1), len(num2)
 8 |     # Base case: stop splitting once either operand is a single digit.
 9 |     if min(n1, n2) == 1:
10 |         return str(int(num1)*int(num2))
11 |     # Split each operand into a high part (a, c) and a low part (b, d).
12 |     a, b, c, d = num1[:n1//2], num1[n1//2:], num2[:n2//2], num2[n2//2:]
13 |     ac = multi(a, c)
14 |     ad = multi(a, d)
15 |     bc = multi(b, c)
16 |     bd = multi(b, d)
17 |     # Shift each partial product by the lengths of the low parts it skips.
18 |     hi = int(ac)*10**(len(b)+len(d))
19 |     return str(hi+int(ad)*10**len(b)+int(bc)*10**len(d)+int(bd))
20 | 


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 1/README.md:
--------------------------------------------------------------------------------
 1 | # Recursive Integer Multiplication
 2 | 
 3 | In this programming assignment you will implement one or more of the integer multiplication algorithms described in lecture.
 4 | 
 5 | To get the most out of this assignment, your program should restrict itself to multiplying only pairs of single-digit numbers. You can implement the grade-school algorithm if you want, but to get the most out of the assignment you'll want to implement recursive integer multiplication and/or Karatsuba's algorithm.
 6 | 
 7 | So: what's the product of the following two 64-digit numbers?
 8 | 
 9 | 3141592653589793238462643383279502884197169399375105820974944592
10 | 
11 | 2718281828459045235360287471352662497757247093699959574966967627
12 | 
13 | [TIP: before submitting, first test the correctness of your program on some small test cases of your own devising. Then post your best test cases to the discussion forums to help your fellow students!]
14 | 
15 | [Food for thought: the number of digits in each input number is a power of 2. Does this make your life easier? Does it depend on which algorithm you're implementing?]
16 | 
17 | The numeric answer should be typed in the space below. So if your answer is 1198233847, then just type 1198233847 in the space provided without any space / commas / any other punctuation marks.
18 | 
19 | (We do not require you to submit your code, so feel free to use any programming language you want --- just type the final numeric answer in the following space.)
20 | 
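A minimal sketch of Karatsuba's algorithm, which the assignment mentions as an alternative to the four-product recursion. It works on Python integers rather than digit strings, and the function name `karatsuba` is an illustrative choice, not part of the assignment. Note that its base case multiplies a single-digit number by a possibly longer one, which is slightly looser than the single-digit-pairs restriction suggested above.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using three recursive products."""
    # Base case: once an operand is a single digit, multiply directly.
    if x < 10 or y < 10:
        return x * y
    # Split both numbers around the same power of ten.
    half = max(len(str(x)), len(str(y))) // 2
    shift = 10 ** half
    a, b = divmod(x, shift)   # x = a * shift + b
    c, d = divmod(y, shift)   # y = c * shift + d
    ac = karatsuba(a, c)
    bd = karatsuba(b, d)
    # (a + b)(c + d) - ac - bd equals ad + bc, so one product replaces two.
    cross = karatsuba(a + b, c + d) - ac - bd
    return ac * shift * shift + cross * shift + bd
```

Comparing the result against Python's built-in `*` on a few small inputs is a quick way to check an implementation before running it on the two 64-digit numbers.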


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 2/1wk2_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Inversions: num of (i, j) in A where i<j and A[i]>A[j]
 3 | """
 4 | 
 5 | ia = []
 6 | f = open('IntegerArray.txt', 'r')
 7 | ls = f.readlines()
 8 | f.close()
 9 | ia = [int(i) for i in ls]
10 | 
11 | 
12 | def inversion(a):
13 |     n = len(a)
14 |     if n <= 1:  # an empty or single-element array has no inversions
15 |         return (a, 0)
16 |     else:
17 |         b1, nleft = inversion(a[:n//2])
18 |         b2, nright = inversion(a[n//2:])
19 |         cross = 0
20 |         b = []
21 |         i, j = 0, 0
22 |         while i < len(b1) or j < len(b2):
23 |             if i == len(b1):
24 |                 b += b2[j:]
25 |                 j = len(b2)
26 |             elif j == len(b2):
27 |                 b += b1[i:]
28 |                 i = len(b1)
29 |             elif b1[i] < b2[j]:
30 |                 b += [b1[i]]
31 |                 i += 1
32 |             else:
33 |                 b += [b2[j]]
34 |                 j += 1
35 |                 cross += len(b1)-i
36 |         return (b, nleft+nright+cross)
37 | 
38 | 
39 | _, num = inversion(ia)
40 | print(num)
41 | 


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 2/README.md:
--------------------------------------------------------------------------------
 1 | # Inversions: num of (i, j) in A where i<j and A[i]>A[j]
 2 | 
 3 | This file contains all of the 100,000 integers between 1 and 100,000 (inclusive) in some order, with no integer repeated.
 4 | 
 5 | Your task is to compute the number of inversions in the file given, where the ith row of the file indicates the ith entry of an array.
 6 | 
 7 | Because of the large size of this array, you should implement the fast divide-and-conquer algorithm covered in the video lectures.
 8 | 
 9 | The numeric answer for the given input file should be typed in the space below.
10 | 
11 | So if your answer is 1198233847, then just type 1198233847 in the space provided without any space / commas / any other punctuation marks. You can make up to 5 attempts, and we'll use the best one for grading.
12 | 
13 | (We do not require you to submit your code, so feel free to use any programming language you want --- just type the final numeric answer in the following space.)
14 | 
15 | [TIP: before submitting, first test the correctness of your program on some small test files of your own devising. Then post your best test cases to the discussion forums to help your fellow students!]
16 | 
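For the small test files suggested in the tip, a brute-force counter is a convenient reference to compare the divide-and-conquer result against. The sketch below is a hypothetical helper (the name `count_inversions_brute` is not from the assignment) and is only practical for short arrays because it takes quadratic time.

```python
def count_inversions_brute(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] in O(n^2) time."""
    return sum(
        1
        for i in range(len(a))
        for j in range(i + 1, len(a))
        if a[i] > a[j]
    )
```

For example, `[1, 3, 5, 2, 4, 6]` has three inversions: (3, 2), (5, 2) and (5, 4).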


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 3/1wk3_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | QuickSort with different pivot
 3 | """
 4 | 
 5 | ia = []
 6 | f = open('QuickSort.txt', 'r')
 7 | ls = f.readlines()
 8 | f.close()
 9 | ia = [int(i) for i in ls]
10 | 
11 | 
12 | def QS1(a):
13 |     if len(a) <= 1:
14 |         return (a, 0)
15 |     pivot = a[0]
16 |     i, j = 1, 1
17 |     n = len(a)-1
18 |     while j < len(a):
19 |         if a[j] > pivot:
20 |             j += 1
21 |         else:
22 |             a[i], a[j] = a[j], a[i]
23 |             i += 1
24 |             j += 1
25 |     a[0], a[i-1] = a[i-1], a[0]
26 |     a[:i-1], nleft = QS1(a[:i-1])
27 |     a[i:], nright = QS1(a[i:])
28 |     return (a, n+nleft+nright)
29 | 
30 | 
31 | def QS2(a):
32 |     if len(a) <= 1:
33 |         return (a, 0)
34 |     pivot = a[-1]
35 |     a[0], a[-1] = a[-1], a[0]
36 |     i, j = 1, 1
37 |     n = len(a)-1
38 |     while j < len(a):
39 |         if a[j] > pivot:
40 |             j += 1
41 |         else:
42 |             a[i], a[j] = a[j], a[i]
43 |             i += 1
44 |             j += 1
45 |     a[0], a[i-1] = a[i-1], a[0]
46 |     a[:i-1], nleft = QS2(a[:i-1])
47 |     a[i:], nright = QS2(a[i:])
48 |     return (a, n+nleft+nright)
49 | 
50 | 
51 | def QS3(a):
52 |     if len(a) <= 1:
53 |         return (a, 0)
54 |     findpivot = [a[0], a[(len(a)-1)//2], a[-1]]
55 |     k = findpivot.copy()
56 |     k.remove(min(k))
57 |     knum = findpivot.index(min(k))
58 |     pivotnum = 0 if knum == 0 else (len(a)-1)//2 if knum == 1 else len(a)-1
59 |     pivot = a[pivotnum]
60 |     a[0], a[pivotnum] = a[pivotnum], a[0]
61 |     i, j = 1, 1
62 |     n = len(a)-1
63 |     while j < len(a):
64 |         if a[j] > pivot:
65 |             j += 1
66 |         else:
67 |             a[i], a[j] = a[j], a[i]
68 |             i += 1
69 |             j += 1
70 |     a[0], a[i-1] = a[i-1], a[0]
71 |     a[:i-1], nleft = QS3(a[:i-1])
72 |     a[i:], nright = QS3(a[i:])
73 |     return (a, n+nleft+nright)
74 | 
75 | 
76 | ia1, ia2, ia3 = ia.copy(), ia.copy(), ia.copy()
77 | ia1, num1 = QS1(ia1)
78 | ia2, num2 = QS2(ia2)
79 | ia3, num3 = QS3(ia3)
80 | print(num1, num2, num3)


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 3/README.md:
--------------------------------------------------------------------------------
 1 | # QuickSort with different pivot
 2 | 
 3 | 1. GENERAL DIRECTIONS:
 4 | 
 5 | Download the following text file:
 6 | 
 7 | QuickSort.txt
 8 | 
 9 | The file contains all of the integers between 1 and 10,000 (inclusive, with no repeats) in unsorted order. The integer in the ith row of the file gives you the ith entry of an input array.
10 | 
11 | Your task is to compute the total number of comparisons used to sort the given input file by QuickSort. As you know, the number of comparisons depends on which elements are chosen as pivots, so we'll ask you to explore three different pivoting rules.
12 | 
13 | You should not count comparisons one-by-one. Rather, when there is a recursive call on a subarray of length m, you should simply add m−1 to your running total of comparisons. (This is because the pivot element is compared to each of the other m−1 elements in the subarray in this recursive call.)
14 | 
15 | WARNING: The Partition subroutine can be implemented in several different ways, and different implementations can give you differing numbers of comparisons. For this problem, you should implement the Partition subroutine exactly as it is described in the video lectures (otherwise you might get the wrong answer).
16 | 
17 | DIRECTIONS FOR THIS PROBLEM:
18 | 
19 | For the first part of the programming assignment, you should always use the first element of the array as the pivot element.
20 | 
21 | HOW TO GIVE US YOUR ANSWER:
22 | 
23 | Type the numeric answer in the space provided.
24 | 
25 | So if your answer is 1198233847, then just type 1198233847 in the space provided without any space / commas / other punctuation marks. You have 5 attempts to get the correct answer.
26 | 
27 | (We do not require you to submit your code, so feel free to use the programming language of your choice, just type the numeric answer in the following space.)
28 | 
29 | 2. GENERAL DIRECTIONS AND HOW TO GIVE US YOUR ANSWER:
30 | 
31 | See the first question.
32 | 
33 | DIRECTIONS FOR THIS PROBLEM:
34 | 
35 | Compute the number of comparisons (as in Problem 1), always using the final element of the given array as the pivot element. Again, be sure to implement the Partition subroutine exactly as it is described in the video lectures.
36 | 
37 | Recall from the lectures that, just before the main Partition subroutine, you should exchange the pivot element (i.e., the last element) with the first element.
38 | 
39 | 3. GENERAL DIRECTIONS AND HOW TO GIVE US YOUR ANSWER:
40 | 
41 | See the first question.
42 | 
43 | DIRECTIONS FOR THIS PROBLEM:
44 | 
45 | Compute the number of comparisons (as in Problem 1), using the "median-of-three" pivot rule. [The primary motivation behind this rule is to do a little bit of extra work to get much better performance on input arrays that are nearly sorted or reverse sorted.] In more detail, you should choose the pivot as follows. Consider the first, middle, and final elements of the given array. (If the array has odd length it should be clear what the "middle" element is; for an array with even length 2k, use the kth element as the "middle" element. So for the array 4 5 6 7, the "middle" element is the second one ---- 5 and not 6!) Identify which of these three elements is the median (i.e., the one whose value is in between the other two), and use this as your pivot. As discussed in the first and second parts of this programming assignment, be sure to implement Partition exactly as described in the video lectures (including exchanging the pivot element with the first element just before the main Partition subroutine).
46 | 
47 | EXAMPLE: For the input array 8 2 4 5 7 1 you would consider the first (8), middle (4), and last (1) elements; since 4 is the median of the set {1,4,8}, you would use 4 as your pivot element.
48 | 
49 | SUBTLE POINT: A careful analysis would keep track of the comparisons made in identifying the median of the three candidate elements. You should NOT do this. That is, as in the previous two problems, you should simply add m−1 to your running total of comparisons every time you recurse on a subarray with length m.
50 | 
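The "median-of-three" rule from Problem 3 can be sketched as a small helper that returns the index of the pivot. The name `median_of_three_index` is hypothetical; the middle position follows the convention stated above (for even length 2k, the kth element, i.e. index `(len(a) - 1) // 2` with 0-based indexing).

```python
def median_of_three_index(a):
    """Return the index of the median of the first, middle and last elements."""
    first, middle, last = 0, (len(a) - 1) // 2, len(a) - 1
    # Order the three candidate positions by value and take the middle one.
    candidates = sorted([first, middle, last], key=lambda idx: a[idx])
    return candidates[1]
```

On the example array 8 2 4 5 7 1 this selects the element 4, and on 4 5 6 7 it selects 5. The chosen element would then be swapped with the first element before Partition runs.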


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 4/1wk4_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Min Cut
 3 | """
 4 | 
 5 | from random import choice
 6 | 
 7 | 
 8 | f = open('kargerMinCut.txt', 'r')
 9 | ls = f.readlines()
10 | f.close()
11 | graph = [list(map(int, i.split('\t')[:-1])) for i in ls]
12 | 
13 | 
14 | def create():
15 |     global graph
16 |     return [i.copy() for i in graph]
17 | 
18 | 
19 | def mincut(g):
20 |     while len(g) > 2:
21 |         c1 = choice(range(len(g)))
22 |         v_del = g.pop(c1)
23 |         c2 = choice(range(1, len(v_del)))
24 |         v1, v2 = v_del[0], v_del[c2]
25 |         while v2 in v_del:
26 |             v_del.remove(v2)
27 |         for i in range(len(g)):
28 |             if g[i][0] == v2:
29 |                 g[i] += v_del
30 |                 while v1 in g[i]:
31 |                     g[i].remove(v1)
32 |             for j in range(len(g[i])):
33 |                 g[i][j] = v2 if g[i][j] == v1 else g[i][j]
34 |     return len(g[0])-1
35 | 
36 | 
37 | N = 1000
38 | cut = []
39 | for i in range(N):
40 |     cut += [mincut(create())]
41 | 
42 | print(min(cut))
43 | 


--------------------------------------------------------------------------------
/1 Divide and Conquer, Sorting and Searching, and Randomized Algorithms/Week 4/README.md:
--------------------------------------------------------------------------------
1 | # Min Cut
2 | 
3 | The file contains the adjacency list representation of a simple undirected graph. There are 200 vertices labeled 1 to 200. The first column in the file is the vertex label, and the remaining entries in that row list all the vertices that the vertex is adjacent to. For example, the 6th row looks like: "6 155 56 52 120 ......". This just means that the vertex with label 6 is adjacent to (i.e., shares an edge with) the vertices with labels 155, 56, 52, 120, and so on.
4 | 
5 | Your task is to code up and run the randomized contraction algorithm for the min cut problem and use it on the above graph to compute the min cut. (HINT: Note that you'll have to figure out an implementation of edge contractions. Initially, you might want to do this naively, creating a new graph from the old every time there's an edge contraction. But you should also think about more efficient implementations.) (WARNING: As per the video lectures, please make sure to run the algorithm many times with different random seeds, and remember the smallest cut that you ever find.) Write your numeric answer in the space provided. So e.g., if your answer is 5, just type 5 in the space provided.
6 | 
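One possible "more efficient implementation" of edge contraction, sketched under stated assumptions: vertices are numbered from 0, the graph is connected and given as a list of edges, and merged vertices are tracked with a union-find structure instead of rebuilding the graph. Processing the edges in a uniformly random order and skipping those that have become self-loops plays the role of repeatedly picking a random remaining edge. The names `karger_cut` and `min_cut` are illustrative.

```python
import random


def karger_cut(num_vertices, edges, rng):
    """One run of random contraction; returns the size of the cut it finds."""
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent links to the representative, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    remaining = num_vertices
    order = edges[:]
    rng.shuffle(order)
    for u, v in order:
        if remaining == 2:
            break
        ru, rv = find(u), find(v)
        if ru != rv:          # skip self-loops of the contracted graph
            parent[ru] = rv   # contract the edge
            remaining -= 1
    # Edges whose endpoints lie in different super-vertices cross the cut.
    return sum(1 for u, v in edges if find(u) != find(v))


def min_cut(num_vertices, edges, trials=200, seed=0):
    """Repeat the contraction and keep the smallest cut seen."""
    rng = random.Random(seed)
    return min(karger_cut(num_vertices, edges, rng) for _ in range(trials))
```

A single run can return a cut that is larger than the minimum, which is why the result is taken over many trials, as the warning above recommends.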


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 1/2wk1_a1.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Strongly Connected Component (SCC) Search
  3 | """
  4 | 
  5 | f = open('SCC.txt', 'r')
  6 | N = 875714
  7 | graph = []
  8 | graph_r = []
  9 | for i in range(N):
 10 |     graph += [[]]
 11 |     graph_r += [[]]
 12 | ls = f.readline()
 13 | while ls:
 14 |     data = list(map(int, ls.split(' ')[:-1]))
 15 |     ls = f.readline()
 16 |     if data[0] == data[1]:
 17 |         continue
 18 |     graph[data[0]-1] += [data[1]]
 19 |     graph_r[data[1]-1] += [data[0]]
 20 | f.close()
 21 | 
 22 | 
 23 | def create(g):
 24 |     return [i.copy() for i in g]
 25 | 
 26 | 
 27 | g1 = create(graph)
 28 | g2 = create(graph_r)
 29 | 
 30 | 
 31 | def DFS(g, i):
 32 |     global t, s, ex, f, leader, N
 33 |     i -= 1
 34 |     ex[i] = True
 35 |     leader[i] = s
 36 |     # Rows hold only neighbours (no leading vertex label), so visit them all.
 37 |     for j in g[i]:
 38 |         if not ex[j-1]:
 39 |             DFS(g, j)
 40 |     t += 1
 41 |     f[i] = t
 42 |     print('V %i t= %i' % (i+1, t))
 43 | 
 44 | 
 45 | m = 0
 46 | 
 47 | 
 48 | def record(i):
 49 |     global t, m
 50 |     if t >= m*100000:
 51 |         print('Point %i t=%i' % (i+1, t))
 52 |         m += 1
 53 | 
 54 | 
 55 | def tdone(g, i):
 56 |     global f, t, leader, s, ex
 57 |     i -= 1
 58 |     for j in g[i]:
 59 |         if not ex[j-1]:
 60 |             ex[j-1] = True
 61 |             leader[j-1] = s
 62 |             return j
 63 |     t += 1
 64 |     f[i] = t
 65 |     record(i)
 66 |     return 0
 67 | 
 68 | 
 69 | def DFS_loop(g):
 70 |     global t, s, ex, f, N
 71 |     for i in list(range(N))[::-1]:
 72 |         if not ex[i]:
 73 |             s = i+1
 74 |             ex[i] = True
 75 |             leader[i] = s
 76 |             exlist = [i+1]
 77 |             while True:
 78 |                 if len(exlist) == 0:
 79 |                     break
 80 |                 j = tdone(g, exlist[-1])
 81 |                 if j == 0:
 82 |                     exlist.pop(-1)
 83 |                 else:
 84 |                     exlist += [j]
 85 | 
 86 | 
 87 | t = 0
 88 | s = None
 89 | ex = [False]*N
 90 | f = [0]*N
 91 | leader = [0]*N
 92 | print('Loop 1')
 93 | DFS_loop(g2)
 94 | 
 95 | g1 = [[] for _ in range(N)]  # rows re-indexed by finishing time
 96 | for i in range(N):
 97 |     g1[f[i]-1] = [f[j-1] for j in graph[i]]
 98 | 
 99 | t = 0
100 | ex = [False]*N
101 | f = [0]*N
102 | leader = [0]*N
103 | print('Loop 2')
104 | m = 0
105 | DFS_loop(g1)
106 | 
107 | sccdic = {}
108 | for i in leader:
109 |     if i in sccdic:
110 |         sccdic[i] += 1
111 |     else:
112 |         sccdic[i] = 1
113 | 
114 | sccrank = list(sccdic.values())
115 | sccrank.sort(reverse=True)
116 | print(sccrank[:5])
117 | 


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 1/README.md:
--------------------------------------------------------------------------------
 1 | # Strongly Connected Component (SCC)
 2 | 
 3 | The file contains the edges of a directed graph. Vertices are labeled as positive integers from 1 to 875714. Every row indicates an edge: the vertex label in the first column is the tail and the vertex label in the second column is the head (recall the graph is directed, and the edges are directed from the first column vertex to the second column vertex). So for example, the 11th row looks like: "2 47646". This just means that the vertex with label 2 has an outgoing edge to the vertex with label 47646.
 4 | 
 5 | Your task is to code up the algorithm from the video lectures for computing strongly connected components (SCCs), and to run this algorithm on the given graph.
 6 | 
 7 | Output Format: You should output the sizes of the 5 largest SCCs in the given graph, in decreasing order of sizes, separated by commas (avoid any spaces). So if your algorithm computes the sizes of the five largest SCCs to be 500, 400, 300, 200 and 100, then your answer should be "500,400,300,200,100" (without the quotes). If your algorithm finds fewer than 5 SCCs, then write 0 for the remaining terms. Thus, if your algorithm computes only 3 SCCs whose sizes are 400, 300, and 100, then your answer should be "400,300,100,0,0" (without the quotes). (Note also that your answer should not have any spaces in it.)
 8 | 
 9 | WARNING: This is the most challenging programming assignment of the course. Because of the size of the graph you may have to manage memory carefully. The best way to do this depends on your programming language and environment, and we strongly suggest that you exchange tips for doing this on the discussion forums.
10 | 
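A compact sketch of the two-pass approach (first pass on the reversed graph, second pass on the original graph in decreasing finishing time), written with explicit stacks so it does not hit Python's recursion limit. It assumes vertices are numbered 1..n and edges are `(tail, head)` pairs; `scc_sizes` and `format_top5` are hypothetical names, and no attempt is made here to tune memory use for the full input.

```python
from collections import Counter


def scc_sizes(n, edges):
    """Sizes of the SCCs of a directed graph on vertices 1..n, largest first."""
    adj = [[] for _ in range(n + 1)]
    radj = [[] for _ in range(n + 1)]
    for tail, head in edges:
        adj[tail].append(head)
        radj[head].append(tail)

    # Pass 1: DFS on the reversed graph, recording vertices as they finish.
    seen = [False] * (n + 1)
    order = []
    for start in range(1, n + 1):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]          # (vertex, index of next neighbour to try)
        while stack:
            v, k = stack.pop()
            if k < len(radj[v]):
                stack.append((v, k + 1))
                w = radj[v][k]
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, 0))
            else:
                order.append(v)       # all neighbours explored: v is finished

    # Pass 2: DFS on the original graph in decreasing finishing time.
    leader = [0] * (n + 1)            # 0 means "not yet assigned"
    for start in reversed(order):
        if leader[start]:
            continue
        leader[start] = start
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if not leader[w]:
                    leader[w] = start
                    stack.append(w)
    return sorted(Counter(leader[1:]).values(), reverse=True)


def format_top5(sizes):
    """Five largest sizes, comma-separated, padded with zeros."""
    top = sorted(sizes, reverse=True)[:5]
    return ','.join(map(str, top + [0] * (5 - len(top))))
```

For a graph with a 3-cycle, a 2-cycle reachable from it, and one isolated vertex, `format_top5(scc_sizes(...))` yields the answer string in the required format, with zeros filling the unused slots.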


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 2/2wk2_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Dijkstra's shortest path
 3 | """
 4 | 
 5 | f = open('dijkstraData.txt', 'r')
 6 | N = 200
 7 | graph = []
 8 | ls = f.readline()
 9 | while ls:
10 |     data = ls.split('\t')[:-1]
11 |     v = int(data.pop(0))-1
12 |     graph += [{}]
13 |     for i in data:
14 |         edge = list(map(int, i.split(',')))
15 |         graph[v][edge[0]] = edge[1]
16 |     ls = f.readline()
17 | f.close()
18 | 
19 | groupX = [1]
20 | path = {1: 0}
21 | groupV = list(range(2, 201))
22 | 
23 | while len(groupV) > 0:
24 |     new = []
25 |     for x in groupX:
26 |         for edge in graph[x-1]:
27 |             if edge in groupV:
28 |                 newpath = graph[x-1][edge]+path[x]
29 |                 if new == [] or newpath < new[1]:
30 |                     new = [edge, newpath]
31 |     if len(new) > 0:
32 |         path[new[0]] = new[1]
33 |         groupX += [new[0]]
34 |         groupV.remove(new[0])
35 |     else:
36 |         break
37 | 
38 | answerx = [7, 37, 59, 82, 99, 115, 133, 165, 188, 197]
39 | print([path.get(x, 1000000) for x in answerx])  # 1000000 = no path from 1
40 | 


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 2/README.md:
--------------------------------------------------------------------------------
 1 | # Dijkstra's shortest path
 2 | 
 3 | The file contains an adjacency list representation of an undirected weighted graph with 200 vertices labeled 1 to 200. Each row consists of the node tuples that are adjacent to that particular vertex along with the length of that edge. For example, the 6th row has 6 as the first entry indicating that this row corresponds to the vertex labeled 6. The next entry of this row "141,8200" indicates that there is an edge between vertex 6 and vertex 141 that has length 8200. The rest of the pairs of this row indicate the other vertices adjacent to vertex 6 and the lengths of the corresponding edges.
 4 | 
 5 | Your task is to run Dijkstra's shortest-path algorithm on this graph, using 1 (the first vertex) as the source vertex, and to compute the shortest-path distances between 1 and every other vertex of the graph. If there is no path between a vertex v and vertex 1, we'll define the shortest-path distance between 1 and v to be 1000000.
 6 | 
 7 | You should report the shortest-path distances to the following ten vertices, in order: 7,37,59,82,99,115,133,165,188,197. You should encode the distances as a comma-separated string of integers. So if you find that all ten of these vertices except 115 are at distance 1000 away from vertex 1 and 115 is 2000 distance away, then your answer should be 1000,1000,1000,1000,1000,2000,1000,1000,1000,1000. Remember the order of reporting DOES MATTER, and the string should be in the same order in which the above ten vertices are given. The string should not contain any spaces. Please type your answer in the space provided.
 8 | 
 9 | IMPLEMENTATION NOTES: This graph is small enough that the straightforward O(mn) time implementation of Dijkstra's algorithm should work fine. OPTIONAL: For those of you seeking an additional challenge, try implementing the heap-based version. Note this requires a heap that supports deletions, and you'll probably need to maintain some kind of mapping between vertices and their positions in the heap.
10 | 
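A hedged sketch of a heap-based variant using Python's `heapq`. Note that `heapq` does not support deletions, so this version swaps in a different technique from the one described above: it leaves outdated entries in the heap and skips them when they are popped ("lazy deletion"). The graph is assumed to be a dict mapping each vertex to a dict of `{neighbour: edge_length}`, and the function name `dijkstra` is illustrative.

```python
import heapq


def dijkstra(graph, source, unreachable=1000000):
    """Return a dict vertex -> shortest-path distance from source."""
    dist = {}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in dist:
            continue              # stale entry: v was already finalised
        dist[v] = d
        for w, length in graph[v].items():
            if w not in dist:
                heapq.heappush(heap, (d + length, w))
    # Vertices never reached get the sentinel distance from the assignment.
    return {v: dist.get(v, unreachable) for v in graph}
```

The answer string could then be built with `','.join(str(result[v]) for v in [7, 37, 59, 82, 99, 115, 133, 165, 188, 197])`.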


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 3/README.md:
--------------------------------------------------------------------------------
1 | # Median Maintenance
2 | 
3 | The goal of this problem is to implement the "Median Maintenance" algorithm (covered in the Week 3 lecture on heap applications). The text file contains a list of the integers from 1 to 10000 in unsorted order; you should treat this as a stream of numbers, arriving one by one. Letting $x_i$ denote the $i$th number of the file, the $k$th median $m_k$ is defined as the median of the numbers $x_1, \ldots, x_k$. (So, if $k$ is odd, then $m_k$ is the $((k+1)/2)$th smallest number among $x_1, \ldots, x_k$; if $k$ is even, then $m_k$ is the $(k/2)$th smallest number among $x_1, \ldots, x_k$.)
4 | 
5 | In the box below you should type the sum of these 10000 medians, modulo 10000 (i.e., only the last 4 digits). That is, you should compute $(m_1 + m_2 + m_3 + \cdots + m_{10000}) \bmod 10000$.
6 | 
7 | OPTIONAL EXERCISE: Compare the performance achieved by heap-based and search-tree-based implementations of the algorithm.
8 | 
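A minimal sketch of the heap-based approach, assuming the stream is any iterable of integers. Python's `heapq` only provides a min-heap, so the lower half is stored with negated values to behave as a max-heap. The function name `median_sum` is hypothetical.

```python
import heapq


def median_sum(stream, modulus=10000):
    """Sum of the running medians of the stream, reduced modulo `modulus`."""
    low = []    # max-heap (values negated) holding the smaller half
    high = []   # min-heap holding the larger half
    total = 0
    for x in stream:
        if low and x > -low[0]:
            heapq.heappush(high, x)
        else:
            heapq.heappush(low, -x)
        # Rebalance so that len(low) is len(high) or len(high) + 1.
        if len(low) > len(high) + 1:
            heapq.heappush(high, -heapq.heappop(low))
        elif len(high) > len(low):
            heapq.heappush(low, -heapq.heappop(high))
        # The top of low is the median under the definition above.
        total += -low[0]
    return total % modulus
```

For the stream 5, 1, 3, 2, 4 the running medians are 5, 1, 3, 2, 3, so the function returns 14.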


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 4/2wk4_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | 2-sum
 3 | """
 4 | 
 5 | with open('algo1-programming_prob-2sum.txt', 'r') as f:
 6 |     data = list(map(int, f.readlines()))
 7 | 
 8 | delta = 9999999
 9 | h = {}
10 | for i in range(-delta, delta+1):
11 |     h[i] = []
12 | 
13 | data = list(set(data))
14 | 
15 | for i in data:
16 |     h[i//10000] += [i]
17 | 
18 | t = []
19 | for i in range(-delta, delta+1):
20 |     if len(h[i]) > 0:
21 |         find = h.get(-i-2, [])+h.get(-i-1, [])+h.get(-i, [])+h.get(-i+1, [])
22 |         for x in h[i]:
23 |             for y in find:
24 |                 if x != y and abs(x+y) <= 10000 and x+y not in t:
25 |                     t += [x+y]
26 | 
27 | print(len(t))
28 | 


--------------------------------------------------------------------------------
/2 Graph Search, Shortest Paths, and Data Structures/Week 4/README.md:
--------------------------------------------------------------------------------
 1 | # 2-sum
 2 | 
 3 | The goal of this problem is to implement a variant of the 2-SUM algorithm covered in this week's lectures.
 4 | 
 5 | The file contains 1 million integers, both positive and negative (there might be some repetitions!). This is your array of integers, with the ith row of the file specifying the ith entry of the array.
 6 | 
 7 | Your task is to compute the number of target values t in the interval [-10000,10000] (inclusive) such that there are distinct numbers x,y in the input file that satisfy x+y=t. (NOTE: ensuring distinctness requires a one-line addition to the algorithm from lecture.)
 8 | 
 9 | Write your numeric answer (an integer between 0 and 20001) in the space provided.
10 | 
11 | OPTIONAL CHALLENGE: If this problem is too easy for you, try implementing your own hash table for it. For example, you could compare performance under the chaining and open addressing approaches to resolving collisions.
12 | 
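As an alternative to the hash-table method from lecture (and not a replacement for it), the same count can be obtained by sorting the distinct values and using binary search to find, for each x, the contiguous slice of partners y with x + y inside the target interval. This is a sketch using the standard `bisect` module; the name `count_targets` and its parameters are assumptions for illustration.

```python
from bisect import bisect_left, bisect_right


def count_targets(numbers, lo=-10000, hi=10000):
    """Count targets t in [lo, hi] with t = x + y for distinct x, y in numbers."""
    values = sorted(set(numbers))   # repeated values cannot form a distinct pair
    found = set()
    for x in values:
        # Partners y with lo <= x + y <= hi occupy one contiguous slice.
        start = bisect_left(values, lo - x)
        stop = bisect_right(values, hi - x)
        for y in values[start:stop]:
            if y != x:
                found.add(x + y)
    return len(found)
```

Because the input values are spread over a range far wider than the target interval, each slice is expected to be short, but that is a property of the data rather than a guarantee of the method.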


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 1/3wk1_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Greedy algorithms
 3 | """
 4 | 
 5 | f = open('jobs.txt', 'r')
 6 | 
 7 | data = []
 8 | f.readline()
 9 | ls = f.readline()
10 | while ls:
11 |     data += [list(map(int, ls.split(' ')))]
12 |     ls = f.readline()
13 | f.close()
14 | 
15 | 
16 | def getdata(a):
17 |     return [i.copy() for i in a]
18 | 
19 | 
20 | jobs1 = getdata(data)
21 | jobs1.sort(key=lambda x: x[0], reverse=True)
22 | jobs1.sort(key=lambda x: x[0]-x[1], reverse=True)
23 | c1, l1 = 0, 0
24 | for j in jobs1:
25 |     l1 += j[1]
26 |     c1 += l1*j[0]
27 | 
28 | jobs2 = getdata(data)
29 | jobs2.sort(key=lambda x: x[0]/x[1], reverse=True)
30 | c2, l2 = 0, 0
31 | for j in jobs2:
32 |     l2 += j[1]
33 |     c2 += l2*j[0]
34 | 
35 | f = open('edges.txt', 'r')
36 | graph = []
37 | for i in range(500):
38 |     graph += [{}]
39 | f.readline()
40 | ls = f.readline()
41 | while ls:
42 |     edge = list(map(int, ls.split(' ')))
43 |     graph[edge[0]-1][edge[1]] = edge[2]
44 |     graph[edge[1]-1][edge[0]] = edge[2]
45 |     ls = f.readline()
46 | f.close()
47 | 
48 | X = [1]
49 | G = list(range(2, 501))
50 | tree = 0
51 | while len(G) > 0:
52 |     newx, newl = 0, 0
53 |     for vx in X:
54 |         for eg in graph[vx-1]:
55 |             if eg in G:
56 |                 if newx == 0 or graph[vx-1][eg] < newl:
57 |                     newx, newl = eg, graph[vx-1][eg]
58 |     tree += newl
59 |     X += [newx]
60 |     G.remove(newx)
61 | 
62 | print('Answers: %i, %i, %i' % (c1, c2, tree))
63 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 1/README.md:
--------------------------------------------------------------------------------
 1 | # Greedy algorithms
 2 | 
 3 | 1. In this programming problem and the next you'll code up the greedy algorithms from lecture for minimizing the weighted sum of completion times.
 4 | 
 5 | Download the text file below.
 6 | 
 7 | jobs.txt
 8 | 
 9 | This file describes a set of jobs with positive and integral weights and lengths. It has the format
10 | 
11 | [number_of_jobs]
12 | 
13 | [job_1_weight] [job_1_length]
14 | 
15 | [job_2_weight] [job_2_length]
16 | 
17 | For example, the third line of the file is "74 59", indicating that the second job has weight 74 and length 59.
18 | 
19 | You should NOT assume that job weights or lengths are distinct.
20 | 
21 | Your task in this problem is to run the greedy algorithm that schedules jobs in decreasing order of the difference (weight - length). Recall from lecture that this algorithm is not always optimal. IMPORTANT: if two jobs have equal difference (weight - length), you should schedule the job with higher weight first. Beware: if you break ties in a different way, you are likely to get the wrong answer. You should report the sum of weighted completion times of the resulting schedule --- a positive integer --- in the box below.
22 | 
23 | ADVICE: If you get the wrong answer, try out some small test cases to debug your algorithm (and post your test cases to the discussion forum).
24 | 
25 | 2. For this problem, use the same data set as in the previous problem. Your task now is to run the greedy algorithm that schedules jobs (optimally) in decreasing order of the ratio (weight/length). In this algorithm, it does not matter how you break ties. You should report the sum of weighted completion times of the resulting schedule --- a positive integer --- in the box below.
26 | 
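The two orderings from Problems 1 and 2 can be sketched with a shared helper that sums weighted completion times for a given sort key. Jobs are assumed to be `(weight, length)` tuples; the names `weighted_completion`, `by_difference` and `by_ratio` are hypothetical. The ratio key uses floating-point division for brevity, which is an assumption that ties and near-ties do not affect the total here.

```python
def weighted_completion(jobs, key):
    """Sum of weight * completion time when jobs are scheduled in key order."""
    total, elapsed = 0, 0
    for weight, length in sorted(jobs, key=key):
        elapsed += length              # completion time of this job
        total += weight * elapsed
    return total


def by_difference(job):
    weight, length = job
    # Larger (weight - length) first; ties broken by larger weight.
    return (-(weight - length), -weight)


def by_ratio(job):
    weight, length = job
    return -weight / length            # larger weight/length first
```

With the two jobs (3, 5) and (1, 2), the difference rule schedules (1, 2) first for a total of 23, while the ratio rule schedules (3, 5) first for a total of 22, which illustrates why the difference rule is not always optimal.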
27 | 3.In this programming problem you'll code up Prim's minimum spanning tree algorithm. Download the text file below. edges.txt This file describes an undirected graph with integer edge costs. It has the format
28 | 
29 | [number_of_nodes] [number_of_edges]
30 | 
31 | [one_node_of_edge_1] [other_node_of_edge_1] [edge_1_cost]
32 | 
33 | [one_node_of_edge_2] [other_node_of_edge_2] [edge_2_cost] For example, the third line of the file is "2 3 -8874", indicating that there is an edge connecting vertex #2 and vertex #3 that has cost -8874.
34 | 
35 | You should NOT assume that edge costs are positive, nor should you assume that they are distinct.
36 | 
37 | Your task is to run Prim's minimum spanning tree algorithm on this graph. You should report the overall cost of a minimum spanning tree --- an integer, which may or may not be negative --- in the box below.
38 | 
39 | IMPLEMENTATION NOTES: This graph is small enough that the straightforward O(mn) time implementation of Prim's algorithm should work fine. OPTIONAL: For those of you seeking an additional challenge, try implementing a heap-based version. The simpler approach, which should already give you a healthy speed-up, is to maintain relevant edges in a heap (with keys = edge costs). The superior approach stores the unprocessed vertices in the heap, as described in lecture. Note this requires a heap that supports deletions, and you'll probably need to maintain some kind of mapping between vertices and their positions in the heap.
40 | 
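
The simpler heap-based variant mentioned in the implementation notes (edges in a heap, keyed by cost) might look like the sketch below. It assumes a connected graph with vertices numbered 1..n and edges given as `(u, v, cost)` tuples; the function name and the sample graph are hypothetical.

```python
import heapq


def prim_mst_cost(n, edges):
    """Cost of an MST of a connected undirected graph on vertices 1..n."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v, cost in edges:
        adj[u].append((cost, v))
        adj[v].append((cost, u))

    visited = {1}
    heap = list(adj[1])          # edges leaving the tree, keyed by cost
    heapq.heapify(heap)
    total = 0
    while heap and len(visited) < n:
        cost, v = heapq.heappop(heap)
        if v in visited:         # stale entry: endpoint already in the tree
            continue
        visited.add(v)
        total += cost
        for edge in adj[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return total


edges = [(1, 2, 1), (2, 3, -2), (1, 3, 4), (3, 4, 3)]  # hypothetical graph
print(prim_mst_cost(4, edges))
```

Stale heap entries are skipped lazily instead of being deleted, which avoids the vertex-to-heap-position mapping that the vertex-keyed version needs.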


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 2/3wk2_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | k-clustering
 3 | """
 4 | 
 5 | f = open('clustering1.txt', 'r')
 6 | 
 7 | graph = {}
 8 | f.readline()
 9 | ls = f.readline()
10 | while ls:
11 |     data = list(map(int, ls.split(' ')))
12 |     graph[(data[0], data[1])] = data[2]
13 |     ls = f.readline()
14 | f.close()
15 | 
16 | g = graph.copy()
17 | 
18 | c = list(range(1, 501))
19 | cnum = 500
20 | while True:
21 |     edge = min(g, key=g.get)
22 |     d = g.pop(edge)
23 |     l1, l2 = c[edge[0]-1], c[edge[1]-1]
24 |     if l1 != l2 and cnum > 4:
25 |         cnum -= 1
26 |         for i in range(500):
27 |             c[i] = l1 if c[i] == l2 else c[i]
28 |     elif l1 != l2 and cnum == 4:
29 |         print(edge, l1, l2, d)
30 |         break
31 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 2/3wk2_a2.py:
--------------------------------------------------------------------------------
 1 | """
 2 | k-clustering
 3 | """
 4 | 
 5 | f = open('clustering_big.txt', 'r')
 6 | 
 7 | f.readline()
 8 | ls = f.readline()
 9 | graph = []
10 | while ls:
11 |     graph += [ls[:-1].replace(' ', '')]
12 |     ls = f.readline()
13 | f.close()
14 | 
15 | graph = list(set(graph))
16 | N = len(graph)
17 | gn = [int(i, 2) for i in graph]
18 | 
19 | 
20 | def nb(v):
21 |     n = 24
22 |     data = []
23 |     for i in range(n):
24 |         s = list(v)
25 |         s[i] = '1' if s[i] == '0' else '0'
26 |         data += [int(''.join(s), 2)]
27 |         for j in range(i+1, n):
28 |             ss = s.copy()
29 |             ss[j] = '1' if ss[j] == '0' else '0'
30 |             data += [int(''.join(ss), 2)]
31 |     return data
32 | 
33 | 
34 | uf = {i: i for i in gn}
35 | rank = {i: 0 for i in gn}
36 | 
37 | 
38 | def find(i):
39 |     global uf
40 |     if uf[i] == i:
41 |         return i
42 |     elif uf[uf[i]] == uf[i]:
43 |         return uf[i]
44 |     else:
45 |         newi = uf[i]
46 |         while uf[newi] != newi:
47 |             newi = uf[newi]
48 |         uf[i] = newi
49 |         return newi
50 | 
51 | 
52 | def merge(i, j):
53 |     global uf, rank
54 |     i, j = find(i), find(j)
55 |     if i != j:
56 |         if rank[i] > rank[j]:
57 |             uf[j] = i
58 |         elif rank[i] < rank[j]:
59 |             uf[i] = j
60 |         else:
61 |             uf[j] = i
62 |             rank[i] += 1
63 | 
64 | 
65 | for s in range(N):
66 |     if s % 10000 == 0:
67 |         print('scan %i, size=%i' % (s+1, len(set(uf.values()))))
68 |     for vn in nb(graph[s]):
69 |         if vn in uf:
70 |             merge(gn[s], vn)
71 | 
72 | for i in uf:
73 |     find(i)
74 | print('DONE, cluster size=%i' % len(set(uf.values())))
75 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 2/README.md:
--------------------------------------------------------------------------------
 1 | # k-clustering
 2 | 
 3 | 1. In this programming problem and the next you'll code up the clustering algorithm from lecture for computing a max-spacing k-clustering.
 4 | 
 5 | Download the text file below.
 6 | 
 7 | clustering1.txt This file describes a distance function (equivalently, a complete graph with edge costs). It has the following format:
 8 | 
 9 | [number_of_nodes]
10 | 
11 | [edge 1 node 1] [edge 1 node 2] [edge 1 cost]
12 | 
13 | [edge 2 node 1] [edge 2 node 2] [edge 2 cost]
14 | 
15 | ...
16 | 
17 | There is one edge (i,j) for each choice of 1≤i<j≤n, where n is the number of nodes.
18 | 
19 | For example, the third line of the file is "1 3 5250", indicating that the distance between nodes 1 and 3 (equivalently, the cost of the edge (1,3)) is 5250. You can assume that distances are positive, but you should NOT assume that they are distinct.
20 | 
21 | Your task in this problem is to run the clustering algorithm from lecture on this data set, where the target number k of clusters is set to 4. What is the maximum spacing of a 4-clustering?
22 | 
23 | ADVICE: If you're not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
24 | 
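
A small sketch of the greedy clustering procedure, assuming edges arrive as `(u, v, cost)` tuples over nodes 1..n. It uses a plain union-find with path halving, which is one possible choice; the names and the four-node instance are illustrative only.

```python
def max_spacing(n, edges, k):
    """Spacing of the greedy k-clustering; edges are (u, v, cost), nodes 1..n."""
    parent = list(range(n + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    clusters = n
    for u, v, cost in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # already in the same cluster
        if clusters == k:
            return cost                     # cheapest edge between clusters
        parent[ru] = rv
        clusters -= 1
    return None                             # no edge separates k clusters


edges = [(1, 2, 1), (1, 3, 5), (1, 4, 6), (2, 3, 4), (2, 4, 7), (3, 4, 2)]
print(max_spacing(4, edges, 2))
```

Sorting the edges once replaces a repeated scan for the minimum remaining edge, so each edge is examined a single time.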
25 | 2. In this question your task is again to run the clustering algorithm from lecture, but on a MUCH bigger graph. So big, in fact, that the distances (i.e., edge costs) are only defined implicitly, rather than being provided as an explicit list.
26 | 
27 | The data set is below.
28 | 
29 | clustering_big.txt The format is:
30 | 
31 | [# of nodes] [# of bits for each node's label]
32 | 
33 | [first bit of node 1] ... [last bit of node 1]
34 | 
35 | [first bit of node 2] ... [last bit of node 2]
36 | 
37 | ...
38 | 
39 | For example, the third line of the file "0 1 1 0 0 1 1 0 0 1 0 1 1 1 1 1 1 0 1 0 1 1 0 1" denotes the 24 bits associated with node #2.
40 | 
41 | The distance between two nodes u and v in this problem is defined as the Hamming distance--- the number of differing bits --- between the two nodes' labels. For example, the Hamming distance between the 24-bit label of node #2 above and the label "0 1 0 0 0 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 1 0 1" is 3 (since they differ in the 3rd, 7th, and 21st bits).
42 | 
43 | The question is: what is the largest value of k such that there is a k-clustering with spacing at least 3? That is, how many clusters are needed to ensure that no pair of nodes with all but 2 bits in common get split into different clusters?
44 | 
45 | NOTE: The graph implicitly defined by the data file is so big that you probably can't write it out explicitly, let alone sort the edges by cost. So you will have to be a little creative to complete this part of the question. For example, is there some way you can identify the smallest distances without explicitly looking at every pair of nodes?
46 | 
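
One way to find the small distances without looking at every pair is to store each label as an integer and XOR it with every mask that has one or two bits set. The sketch below assumes labels have already been parsed into integers; `count_clusters` and the 6-bit sample labels are hypothetical.

```python
from itertools import combinations


def count_clusters(labels, bits):
    """Number of clusters after merging all labels at Hamming distance <= 2."""
    # Every XOR mask with one or two bits set (distance 1 or 2).
    masks = [1 << i for i in range(bits)]
    masks += [(1 << i) | (1 << j) for i, j in combinations(range(bits), 2)]

    parent = {x: x for x in set(labels)}    # duplicates are at distance 0

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = len(parent)
    for x in list(parent):
        for mask in masks:
            y = x ^ mask
            if y in parent:                 # a real node at distance 1 or 2
                rx, ry = find(x), find(y)
                if rx != ry:
                    parent[rx] = ry
                    clusters -= 1
    return clusters


labels = [0b000000, 0b000011, 0b111111, 0b111100]   # hypothetical 6-bit labels
print(count_clusters(labels, 6))
```

With 24-bit labels there are 24 + 276 = 300 masks per node, so the work grows linearly with the number of nodes instead of quadratically.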


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 3/3wk3_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Huffman coding and Dynamic programming
 3 | """
 4 | 
 5 | f = open('huffman.txt', 'r')
 6 | data = list(map(int, f.readlines()[1:]))
 7 | f.close()
 8 | cr = {i: data[i] for i in range(1000)}
 9 | meta = [[i] for i in range(1000)]
10 | newid = 1000
11 | while len(cr) > 2:
12 |     k1 = min(cr, key=cr.get)
13 |     w1 = cr.pop(k1)
14 |     k2 = min(cr, key=cr.get)
15 |     w2 = cr.pop(k2)
16 |     cr[newid] = w1+w2
17 |     newid += 1
18 |     meta += [meta[k1]+meta[k2]]
19 | byte = [0]*1000
20 | for i in meta:
21 |     for j in i:
22 |         byte[j] += 1
23 | print(max(byte), min(byte))
24 | 
25 | f = open('mwis.txt', 'r')
26 | path = list(map(int, f.readlines()))
27 | f.close()
28 | mwis = [0]*1001
29 | solution = [[] for i in range(1001)]
30 | mwis[0] = 0
31 | mwis[1] = path[1]
32 | solution[1] = [1]
33 | for i in range(2, 1001):
34 |     mwis[i] = max(mwis[i-2]+path[i], mwis[i-1])
35 |     if mwis[i-2]+path[i] > mwis[i-1]:
36 |         solution[i] = solution[i-2]+[i]
37 |     else:
38 |         solution[i] = solution[i-1].copy()
39 | ans = solution[-1]
40 | ask = [1, 2, 3, 4, 17, 117, 517, 997]
41 | ans = ['1' if i in solution[-1] else '0' for i in ask]
42 | print(''.join(ans))
43 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 3/README.md:
--------------------------------------------------------------------------------
 1 | # Huffman coding and Dynamic programming
 2 | 
 3 | 1. In this programming problem and the next you'll code up the greedy algorithm from the lectures on Huffman coding.
 4 | 
 5 | Download the text file below.
 6 | 
 7 | huffman.txt This file describes an instance of the problem. It has the following format:
 8 | 
 9 | [number_of_symbols]
10 | 
11 | [weight of symbol #1]
12 | 
13 | [weight of symbol #2]
14 | 
15 | ...
16 | 
17 | For example, the third line of the file is "6852892," indicating that the weight of the second symbol of the alphabet is 6852892. (We're using weights instead of frequencies, like in the "A More Complex Example" video.)
18 | 
19 | Your task in this problem is to run the Huffman coding algorithm from lecture on this data set. What is the maximum length of a codeword in the resulting Huffman code?
20 | 
21 | ADVICE: If you're not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
22 | 
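
A compact sketch of the greedy Huffman merge that tracks only the shallowest and deepest leaf of each subtree, which is enough for problems 1 and 2. It uses `heapq` from the standard library; the function name and the sample weights are illustrative, and tie-breaking by insertion order is an assumption of this sketch.

```python
import heapq


def codeword_lengths(weights):
    """Return (max, min) codeword length of a Huffman code for the weights."""
    # Heap entries: (subtree weight, tie-breaker, min leaf depth, max leaf depth).
    heap = [(w, i, 0, 0) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    next_id = len(weights)
    while len(heap) > 1:
        w1, _, lo1, hi1 = heapq.heappop(heap)
        w2, _, lo2, hi2 = heapq.heappop(heap)
        # Merging pushes every leaf of both subtrees one level deeper.
        merged = (w1 + w2, next_id, min(lo1, lo2) + 1, max(hi1, hi2) + 1)
        heapq.heappush(heap, merged)
        next_id += 1
    _, _, lo, hi = heap[0]
    return hi, lo


print(codeword_lengths([3, 2, 6, 8, 2, 6]))
```

The heap makes each merge logarithmic, where a repeated scan for the two smallest weights is linear per merge.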
23 | 2. Continuing the previous problem, what is the minimum length of a codeword in your Huffman code?
24 | 
25 | 3. In this programming problem you'll code up the dynamic programming algorithm for computing a maximum-weight independent set of a path graph.
26 | 
27 | Download the text file below.
28 | 
29 | mwis.txt This file describes the weights of the vertices in a path graph (with the weights listed in the order in which vertices appear in the path). It has the following format:
30 | 
31 | [number_of_vertices]
32 | 
33 | [weight of first vertex]
34 | 
35 | [weight of second vertex]
36 | 
37 | ...
38 | 
39 | For example, the third line of the file is "6395702," indicating that the weight of the second vertex of the graph is 6395702.
40 | 
41 | Your task in this problem is to run the dynamic programming algorithm (and the reconstruction procedure) from lecture on this data set. The question is: of the vertices 1, 2, 3, 4, 17, 117, 517, and 997, which ones belong to the maximum-weight independent set? (By "vertex 1" we mean the first vertex of the graph---there is no vertex 0.) In the box below, enter an 8-bit string, where the ith bit should be 1 if the ith of these 8 vertices is in the maximum-weight independent set, and 0 otherwise. For example, if you think that the vertices 1, 4, 17, and 517 are in the maximum-weight independent set and the other four vertices are not, then you should enter the string 10011010 in the box below.
42 | 
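
The table fill and the backward reconstruction pass can be sketched as follows, assuming non-negative weights passed as a plain list (first vertex first). The function name and the four-vertex path are hypothetical.

```python
def mwis_path(weights):
    """Vertices (1-indexed) of a max-weight independent set of a path graph."""
    n = len(weights)
    best = [0] * (n + 1)                 # best[i]: optimum on first i vertices
    if n:
        best[1] = weights[0]
    for i in range(2, n + 1):
        best[i] = max(best[i - 1], best[i - 2] + weights[i - 1])

    chosen, i = set(), n
    while i >= 1:                        # walk the table backwards
        prev2 = best[i - 2] if i >= 2 else 0
        if best[i - 1] >= prev2 + weights[i - 1]:
            i -= 1                       # vertex i is not needed
        else:
            chosen.add(i)                # vertex i is in the solution
            i -= 2
    return chosen


weights = [1, 4, 5, 4]                   # hypothetical path graph
picked = mwis_path(weights)
print(''.join('1' if v in picked else '0' for v in (1, 2, 3, 4)))
```

Reconstructing from the filled table keeps memory linear, since no partial solution list is stored per prefix.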


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 4/3wk4_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Knapsack algorithm
 3 | """
 4 | 
 5 | import numpy as np
 6 | f = open('knapsack1.txt', 'r')
 7 | f.readline()
 8 | ls = f.readline()
 9 | v = []
10 | w = []
11 | while ls:
12 |     data = list(map(int, ls.split(' ')))
13 |     v += [data[0]]
14 |     w += [data[1]]
15 |     ls = f.readline()
16 | f.close()
17 | Wall = 10000
18 | N = 100
19 | A = np.zeros([N+1, Wall+1])
20 | for i in range(1, N+1):
21 |     for x in range(0, Wall+1):
22 |         A[i, x] = max(A[i-1, x], A[i-1, x-w[i-1]]+v[i-1]) if x >= w[i-1] else A[i-1, x]
23 | print(A[N, Wall])
24 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 4/3wk4_a2.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Knapsack algorithm 2
 3 | """
 4 | 
 5 | f = open('knapsack_big.txt', 'r')
 6 | f.readline()
 7 | ls = f.readline()
 8 | v = []
 9 | w = []
10 | while ls:
11 |     data = list(map(int, ls.split(' ')))
12 |     v += [data[0]]
13 |     w += [data[1]]
14 |     ls = f.readline()
15 | f.close()
16 | Wall = 2000000
17 | N = 2000
18 | 
19 | solution = [[N, Wall]]
20 | ans = solution[0]
21 | soludic = {(N, Wall): 0}
22 | i = 0
23 | ni = N
24 | while True:
25 |     ni, wi = ans
26 |     x, y = [ni-1, wi], [ni-1, wi-w[ni-1]]
27 |     if ni >= 1:
28 |         if tuple(x) not in soludic:
29 |             solution += [x]
30 |             soludic[tuple(x)] = 0
31 |         if wi >= w[ni-1]:
32 |             if tuple(y) not in soludic:
33 |                 solution += [y]
34 |                 soludic[tuple(y)] = 0
35 |     i += 1
36 |     if i == len(solution):
37 |         break
38 |     else:
39 |         ans = solution[i]
40 |     if i % 1000000 == 0:
41 |         print(i)
42 | 
43 | nn = len(solution)
44 | for i in list(range(nn))[::-1]:
45 |     ni, wi = solution[i]
46 |     if i % 1000000 == 0:
47 |         print(i)
48 |     if ni == 0:
49 |         continue
50 |     soludic[(ni, wi)] = max(soludic[(ni-1, wi)], soludic[(ni-1, wi-w[ni-1])]+v[ni-1]) if wi >= w[ni-1] else soludic[(ni-1, wi)]
51 | print(soludic[(N, Wall)])
52 | 


--------------------------------------------------------------------------------
/3 Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming/Week 4/README.md:
--------------------------------------------------------------------------------
 1 | # Knapsack algorithm
 2 | 
 3 | 1. In this programming problem and the next you'll code up the knapsack algorithm from lecture.
 4 | 
 5 | Let's start with a warm-up. Download the text file below.
 6 | 
 7 | knapsack1.txt This file describes a knapsack instance, and it has the following format:
 8 | 
 9 | [knapsack_size] [number_of_items]
10 | 
11 | [value_1] [weight_1]
12 | 
13 | [value_2] [weight_2]
14 | 
15 | ...
16 | 
17 | For example, the third line of the file is "50074 659", indicating that the second item has value 50074 and size 659, respectively.
18 | 
19 | You can assume that all numbers are positive. You should assume that item weights and the knapsack capacity are integers.
20 | 
21 | In the box below, type in the value of the optimal solution.
22 | 
23 | ADVICE: If you're not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
24 | 
25 | 2. This problem also asks you to solve a knapsack instance, but a much bigger one.
26 | 
27 | Download the text file below.
28 | 
29 | knapsack_big.txt This file describes a knapsack instance, and it has the following format:
30 | 
31 | [knapsack_size] [number_of_items]
32 | 
33 | [value_1] [weight_1]
34 | 
35 | [value_2] [weight_2]
36 | 
37 | ...
38 | 
39 | For example, the third line of the file is "50074 834558", indicating that the second item has value 50074 and size 834558, respectively. As before, you should assume that item weights and the knapsack capacity are integers.
40 | 
41 | This instance is so big that the straightforward iterative implementation uses an infeasible amount of time and space. So you will have to be creative to compute an optimal solution. One idea is to go back to a recursive implementation, solving subproblems --- and, of course, caching the results to avoid redundant work --- only on an "as needed" basis. Also, be sure to think about appropriate data structures for storing and looking up solutions to subproblems.
42 | 
43 | In the box below, type in the value of the optimal solution.
44 | 
45 | ADVICE: If you're not getting the correct answer, try debugging your algorithm using some small test cases. And then post them to the discussion forum!
46 | 
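
The recursive, cached approach suggested for the big instance could be sketched as below. This is only an illustration under stated assumptions: `functools.lru_cache` serves as the lookup structure, the recursion limit is raised because the depth equals the number of items, and the four-item instance is made up.

```python
import sys
from functools import lru_cache


def knapsack(values, weights, capacity):
    """Optimal knapsack value, solving only the subproblems actually reached."""
    # Recursion depth equals the item count, so raise the limit if needed.
    sys.setrecursionlimit(max(sys.getrecursionlimit(), 2 * len(values) + 100))

    @lru_cache(maxsize=None)
    def best(i, room):
        """Optimal value using the first i items with `room` capacity left."""
        if i == 0:
            return 0
        skip = best(i - 1, room)
        if weights[i - 1] > room:
            return skip                  # item i does not fit
        take = best(i - 1, room - weights[i - 1]) + values[i - 1]
        return max(skip, take)

    return best(len(values), capacity)


print(knapsack([3, 2, 4, 4], [4, 3, 2, 3], 6))   # hypothetical instance
```

Only `(i, room)` pairs reachable from the full problem are cached, which is what makes the large instance tractable compared with filling the whole table.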


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 1/4wk1_a1.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Shortest Path
 3 | """
 4 | 
 5 | import numpy as np
 6 | 
 7 | 
 8 | def readgraph(file):
 9 |     f = open(file, 'r')
10 |     f.readline()
11 |     g = {i: {} for i in range(1, 1001)}
12 |     ls = f.readline()
13 |     while ls:
14 |         data = list(map(int, ls.split(' ')))
15 |         g[data[0]][data[1]] = data[2]
16 |         ls = f.readline()
17 |     f.close()
18 |     return g
19 | 
20 | 
21 | g1 = readgraph('g1.txt')
22 | g2 = readgraph('g2.txt')
23 | g3 = readgraph('g3.txt')
24 | 
25 | 
26 | def askmin(g):
27 |     n = 1000
28 |     A = np.full([n, n], np.inf)  # 2-D table; an n*n*n array needs ~8 GB
29 |     for i in range(n):
30 |         A[i, i] = 0
31 |         for j in g[i+1]:
32 |             A[i, j-1] = min(A[i, j-1], g[i+1][j])
33 |     for k in range(n):  # every vertex, index 0 included, is an intermediate
34 |         if k % 100 == 0:
35 |             print(k)
36 |         # relax all pairs through vertex k in one vectorized step
37 |         A = np.minimum(A, A[:, k:k+1]+A[k:k+1, :])
38 |     for i in range(n):
39 |         if A[i, i] < 0:
40 |             print('error at %i' % (i+1))
41 |     print('min=%i' % A.min())
42 | 
43 | 
44 | print('g1')
45 | askmin(g1)
46 | print('g2')
47 | askmin(g2)
48 | print('g3')
49 | askmin(g3)
50 | 


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 1/README.md:
--------------------------------------------------------------------------------
 1 | # Shortest Path
 2 | 
 3 | In this assignment you will implement one or more algorithms for the all-pairs shortest-path problem. Here are data files describing three graphs:
 4 | 
 5 | g1.txt g2.txt g3.txt The first line indicates the number of vertices and edges, respectively. Each subsequent line describes an edge (the first two numbers are its tail and head, respectively) and its length (the third number). NOTE: some of the edge lengths are negative. NOTE: These graphs may or may not have negative-cost cycles.
 6 | 
 7 | Your task is to compute the "shortest shortest path". Precisely, you must first identify which, if any, of the three graphs have no negative cycles. For each such graph, you should compute all-pairs shortest paths and remember the smallest one (i.e., compute min u,v∈V d(u,v), where d(u,v) denotes the shortest-path distance from u to v).
 8 | 
 9 | If each of the three graphs has a negative-cost cycle, then enter "NULL" in the box below. If exactly one graph has no negative-cost cycles, then enter the length of its shortest shortest path in the box below. If two or more of the graphs have no negative-cost cycles, then enter the smallest of the lengths of their shortest shortest paths in the box below.
10 | 
11 | OPTIONAL: You can use whatever algorithm you like to solve this question. If you have extra time, try comparing the performance of different all-pairs shortest-path algorithms!
12 | 
13 | OPTIONAL: Here is a bigger data set to play with.
14 | 
15 | large.txt For fun, try computing the shortest shortest path of the graph in the file above.
16 | 
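
As a small reference to test against, the sketch below runs Floyd-Warshall on an edge list and reports `None` when a negative-cost cycle exists. Floyd-Warshall is one possible choice here, since the assignment leaves the algorithm open. The function name and the sample graphs are hypothetical.

```python
import math


def shortest_shortest_path(n, edges):
    """Smallest d(u, v) over all pairs, or None if a negative cycle exists."""
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for tail, head, length in edges:            # vertices are 1-indexed
        dist[tail - 1][head - 1] = min(dist[tail - 1][head - 1], length)

    for k in range(n):                          # allow k as an intermediate
        row_k = dist[k]
        for i in range(n):
            via = dist[i][k]
            if via == math.inf:
                continue                        # i cannot reach k
            row_i = dist[i]
            for j in range(n):
                if via + row_k[j] < row_i[j]:
                    row_i[j] = via + row_k[j]

    if any(dist[v][v] < 0 for v in range(n)):   # negative cycle detected
        return None
    return min(min(row) for row in dist)


print(shortest_shortest_path(3, [(1, 2, 2), (2, 3, -3), (3, 1, 4)]))
print(shortest_shortest_path(2, [(1, 2, 1), (2, 1, -2)]))
```

A negative entry on the diagonal after the main loop is the signal for a negative-cost cycle, so one pass both detects the cycle and produces the distances.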


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 2/4wk1_a2.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Traveling salesman problem
 3 | """
 4 | 
 5 | import numpy as np
 6 | from itertools import combinations
 7 | 
 8 | f = open('tsp.txt', 'r')
 9 | ls = f.readlines()[1:]
10 | graph = [list(map(float, i.split(' '))) for i in ls]
11 | 
12 | 
13 | def dis(i, j):
14 |     return np.sqrt((graph[i][0]-graph[j][0])**2+(graph[i][1]-graph[j][1])**2)
15 | 
16 | 
17 | N = len(graph)
18 | dic1 = {frozenset([0]): {0: 0}}
19 | 
20 | for m in range(1, N):
21 |     comb = list(combinations(range(1, N), m))
22 |     dic2 = {frozenset(comb[i]): {list(comb[i])[j]: 0 for j in range(m)} for i in range(len(comb))}
23 |     print(m, len(dic2))
24 |     for s in dic2:
25 |         for j in s:
26 |             ans = []
27 |             if m == 1:
28 |                 dic2[s][j] = dis(0, j)
29 |             else:
30 |                 sj = set(s)
31 |                 sj.remove(j)
32 |                 dic2[s][j] = min([dic1[frozenset(sj)][k]+dis(k, j) for k in sj if k != j])
33 |     dic1 = dic2.copy()
34 | 
35 | tsp = min([dic2[frozenset(comb[0])][j]+dis(0, j) for j in range(1, N)])
36 | print(tsp)
37 | 


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 2/README.md:
--------------------------------------------------------------------------------
 1 | # Traveling salesman problem
 2 | 
 3 | In this assignment you will implement one or more algorithms for the traveling salesman problem, such as the dynamic programming algorithm covered in the video lectures. Here is a data file describing a TSP instance.
 4 | 
 5 | tsp.txt The first line indicates the number of cities. Each city is a point in the plane, and each subsequent line indicates the x-coordinates and y-coordinates of a single city.
 6 | 
 7 | The distance between two cities is defined as the Euclidean distance --- that is, two cities at locations (x,y) and (z,w) have distance ((x − z)^2 + (y − w)^2)^(1 / 2) between them.
 8 | 
 9 | In the box below, type in the minimum cost of a traveling salesman tour for this instance, rounded down to the nearest integer.
10 | 
11 | OPTIONAL: If you want bigger data sets to play with, check out the TSP instances from around the world here. The smallest data set (Western Sahara) has 29 cities, and most of the data sets are much bigger than that. What's the largest of these data sets that you're able to solve --- using dynamic programming or, if you like, a completely different method?
12 | 
13 | HINT: You might experiment with ways to reduce the data set size. For example, try plotting the points. Can you infer any structure of the optimal solution? Can you use that structure to speed up your algorithm?
14 | 
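
A minimal sketch of the dynamic programming algorithm, assuming subsets are encoded as bitmasks and using `math.dist` (Python 3.8+). It keeps every layer of the table for clarity, so it only suits small inputs; the function name and the four-city rectangle are illustrative.

```python
from itertools import combinations
from math import dist


def tsp_cost(points):
    """Minimum tour length through all points, starting and ending at point 0."""
    n = len(points)
    if n < 2:
        return 0.0
    # best[(mask, j)]: shortest path from city 0 that visits exactly the
    # cities in mask (city 0 excluded) and ends at city j.
    best = {(1 << j, j): dist(points[0], points[j]) for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)           # same set without city j
                best[(mask, j)] = min(
                    best[(prev, k)] + dist(points[k], points[j])
                    for k in subset if k != j
                )
    full = (1 << n) - 2                          # every city except the start
    return min(best[(full, j)] + dist(points[j], points[0])
               for j in range(1, n))


print(tsp_cost([(0, 0), (0, 3), (4, 3), (4, 0)]))   # hypothetical rectangle
```

Integer bitmasks are cheaper to hash than frozensets. For larger inputs, only the layers for subset sizes m-1 and m need to be kept at a time.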


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 3/4wk1_a3.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Traveling salesman problem (large)
 3 | """
 4 | 
 5 | import numpy as np
 6 | 
 7 | f = open('nn.txt', 'r')
 8 | ls = f.readlines()[1:]
 9 | graph = [list(map(float, i.split(' ')))[1:] for i in ls]
10 | graph = {i: graph[i] for i in range(len(graph))}
11 | 
12 | N = len(graph)
13 | 
14 | 
15 | def dis(i, j):
16 |     return (graph[i][0]-graph[j][0])**2+(graph[i][1]-graph[j][1])**2
17 | 
18 | 
19 | tour = [0]
20 | travel = 0
21 | g = graph.copy()
22 | g.pop(0)
23 | 
24 | while len(g) > 0:
25 |     plan = float('inf')
26 |     for c in g:
27 |         d = dis(tour[-1], c)
28 |         if d < plan:
29 |             plan = d
30 |             city = c
31 |     travel += np.sqrt(plan)
32 |     tour += [city]
33 |     g.pop(city)
34 |     if len(g) % 1000 == 0:
35 |         print('Travel %i, %i cities left' % (city, len(g)))
36 | 
37 | travel += np.sqrt(dis(0, tour[-1]))
38 | print(travel)
39 | 


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 3/README.md:
--------------------------------------------------------------------------------
 1 | # Traveling salesman problem (large)
 2 | 
 3 | In this assignment we will revisit an old friend, the traveling salesman problem (TSP). This week you will implement a heuristic for the TSP, rather than an exact algorithm, and as a result will be able to handle much larger problem sizes. Here is a data file describing a TSP instance (original source: http://www.math.uwaterloo.ca/tsp/world/bm33708.tsp).
 4 | 
 5 | The first line indicates the number of cities. Each city is a point in the plane, and each subsequent line indicates the x- and y-coordinates of a single city.
 6 | 
 7 | The distance between two cities is defined as the Euclidean distance --- that is, two cities at locations (x,y) and (z,w) have distance ((x − z) ^ 2 + (y − w) ^ 2) ^ (1 / 2) between them.
 8 | 
 9 | You should implement the nearest neighbor heuristic:
10 | 
11 | 1. Start the tour at the first city.
12 | 2. Repeatedly visit the closest city that the tour hasn't visited yet. In case of a tie, go to the closest city with the lowest index. For example, if both the third and fifth cities have the same distance from the first city (and are closer than any other city), then the tour should begin by going from the first city to the third city.
13 | 3. Once every city has been visited exactly once, return to the first city to complete the tour. In the box below, enter the cost of the traveling salesman tour computed by the nearest neighbor heuristic for this instance, rounded down to the nearest integer.
14 | 
15 | [Hint: when constructing the tour, you might find it simpler to work with squared Euclidean distances (i.e., the formula above but without the square root) than Euclidean distances. But don't forget to report the length of the tour in terms of standard Euclidean distance.]
16 | 
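
The three steps of the heuristic can be sketched as follows, assuming cities are `(x, y)` tuples in file order. Squared distances are used for comparisons, as the hint suggests, and the square root is taken only when adding to the tour length. The function name and the four-city square are hypothetical.

```python
from math import sqrt


def nearest_neighbor_tour(points):
    """Length of the nearest-neighbor tour that starts and ends at points[0]."""
    unvisited = list(range(1, len(points)))    # kept in ascending index order
    current, length = 0, 0.0
    while unvisited:
        cx, cy = points[current]

        def squared(c):
            return (points[c][0] - cx) ** 2 + (points[c][1] - cy) ** 2

        # min() returns the first minimum, so ties go to the lowest index.
        nxt = min(unvisited, key=squared)
        length += sqrt(squared(nxt))
        unvisited.remove(nxt)
        current = nxt
    x0, y0 = points[0]
    cx, cy = points[current]
    return length + sqrt((x0 - cx) ** 2 + (y0 - cy) ** 2)   # close the tour


points = [(0, 0), (3, 0), (0, 3), (3, 3)]      # hypothetical instance
print(int(nearest_neighbor_tour(points)))
```

In the sample, cities 2 and 3 are equally close to the start, and the tour goes to the lower index first, matching the tie rule in step 2.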


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 4/4wk1_a4.py:
--------------------------------------------------------------------------------
  1 | """
  2 | 2-SAT
  3 | """
  4 | 
  5 | f = open('2sat6.txt', 'r')
  6 | ls = f.readline()
  7 | n = int(ls)
  8 | N = n*2
  9 | graph = {i: [] for i in range(-n, n+1) if i != 0}
 10 | graph_r = {i: [] for i in range(-n, n+1) if i != 0}
 11 | ls = f.readline()
 12 | while ls:
 13 |     data = list(map(int, ls.split(' ')))
 14 |     graph[-data[0]] += [data[1]]
 15 |     graph[-data[1]] += [data[0]]
 16 |     graph_r[data[1]] += [-data[0]]
 17 |     graph_r[data[0]] += [-data[1]]
 18 |     ls = f.readline()
 19 | f.close()
 20 | 
 21 | 
 22 | def record(i):
 23 |     global t, m
 24 |     if t >= m*100000:
 25 |         print('Point %i t=%i' % (i+1, t))
 26 |         m += 1
 27 | 
 28 | 
 29 | def tdone(g, i):
 30 |     global f, t, leader, s, ex
 31 |     for j in g[i]:
 32 |         if not ex[j]:
 33 |             ex[j] = True
 34 |             leader[j] = s
 35 |             return j
 36 |     t += 1
 37 |     t += 1 if t == 0 else 0
 38 |     f[i] = t
 39 |     record(i)
 40 |     return 0
 41 | 
 42 | 
 43 | def DFS_loop(g):
 44 |     global t, s, ex, f, N
 45 |     for i in list(g.keys())[::-1]:
 46 |         if not ex[i]:
 47 |             s = i
 48 |             ex[i] = True
 49 |             leader[i] = s
 50 |             exlist = [i]
 51 |             while True:
 52 |                 if len(exlist) == 0:
 53 |                     break
 54 |                 j = tdone(g, exlist[-1])
 55 |                 if j == 0:
 56 |                     exlist.pop(-1)
 57 |                 else:
 58 |                     exlist += [j]
 59 | 
 60 | 
 61 | t = -n-1
 62 | s = None
 63 | m = 0
 64 | ex = {i: False for i in graph}
 65 | f = {i: 0 for i in graph}
 66 | leader = {i: 0 for i in graph}
 67 | print('Loop 1')
 68 | DFS_loop(graph_r)
 69 | 
 70 | gnew = {i: [] for i in graph}
 71 | for i in graph:
 72 |     for j in graph[i]:
 73 |         gnew[f[i]] += [f[j]]
 74 | 
 75 | fr = {f[i]: i for i in graph}
 76 | 
 77 | t = -n-1
 78 | m = 0
 79 | ex = {i: False for i in graph}
 80 | f = {i: 0 for i in graph}
 81 | leader = {i: 0 for i in graph}
 82 | print('Loop 2')
 83 | DFS_loop(gnew)
 84 | 
 85 | sccdic = {}
 86 | for i in leader:
 87 |     if leader[i] in sccdic:
 88 |         sccdic[leader[i]] += [fr[i]]
 89 |     else:
 90 |         sccdic[leader[i]] = [fr[i]]
 91 | 
 92 | num = 0
 93 | for i in sccdic:
 94 |     if len(sccdic[i]) > 1:
 95 |         num += 1
 96 |         for j in sccdic[i]:
 97 |             if -j in sccdic[i]:
 98 |                 print('error')
 99 |                 break
100 |         print(sccdic[i])
101 | print(num)
102 | 


--------------------------------------------------------------------------------
/4 Shortest Paths Revisited, NP-Complete Problems and What To Do About Them/Week 4/README.md:
--------------------------------------------------------------------------------
1 | # 2SAT
2 | 
3 | The file format is as follows. In each instance, the number of variables and the number of clauses is the same, and this number is specified on the first line of the file. Each subsequent line specifies a clause via its two literals, with a number denoting the variable and a "-" sign denoting logical "not". For example, the second line of the first data file is "-16808 75250", which indicates the clause ¬x16808∨x75250.
4 | 
5 | Your task is to determine which of the 6 instances are satisfiable, and which are unsatisfiable. In the box below, enter a 6-bit string, where the ith bit should be 1 if the ith instance is satisfiable, and 0 otherwise. For example, if you think that the first 3 instances are satisfiable and the last 3 are not, then you should enter the string 111000 in the box below.
6 | 
7 | DISCUSSION: This assignment is deliberately open-ended, and you can implement whichever 2SAT algorithm you want. For example, 2SAT reduces to computing the strongly connected components of a suitable graph (with two vertices per variable and two directed edges per clause, you should think through the details). This might be an especially attractive option for those of you who coded up an SCC algorithm in Part 2 of this specialization. Alternatively, you can use Papadimitriou's randomized local search algorithm. (The algorithm from lecture is probably too slow as stated, so you might want to make one or more simple modifications to it --- even if this means breaking the analysis given in lecture --- to ensure that it runs in a reasonable amount of time.) A third approach is via backtracking. In lecture we mentioned this approach only in passing; see Chapter 9 of the Dasgupta-Papadimitriou-Vazirani book, for example, for more details.
8 | 
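
A sketch of the SCC-based option from the discussion: each clause (a or b) contributes the edges not-a -> b and not-b -> a, and the instance is unsatisfiable exactly when some variable shares a component with its negation. It uses a two-pass (Kosaraju-style) search with explicit stacks, which is one possible SCC algorithm; all names are illustrative.

```python
def two_sat_satisfiable(n, clauses):
    """True if the 2-SAT instance over variables 1..n is satisfiable."""
    def node(lit):                      # literal -> vertex index in [0, 2n)
        return 2 * (abs(lit) - 1) + (lit < 0)

    graph = [[] for _ in range(2 * n)]
    rev = [[] for _ in range(2 * n)]
    for a, b in clauses:                # (a or b): not a -> b, not b -> a
        for src, dst in ((node(-a), node(b)), (node(-b), node(a))):
            graph[src].append(dst)
            rev[dst].append(src)

    # Pass 1: record finishing order on the graph, using an explicit stack.
    order, seen = [], [False] * (2 * n)
    for root in range(2 * n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(graph[root]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(graph[w])))
                    break
            else:                       # all neighbours done: v is finished
                order.append(v)
                stack.pop()

    # Pass 2: label components on the reversed graph, latest finisher first.
    comp = [-1] * (2 * n)
    for label, root in enumerate(reversed(order)):
        if comp[root] != -1:
            continue
        comp[root] = label
        stack = [root]
        while stack:
            v = stack.pop()
            for w in rev[v]:
                if comp[w] == -1:
                    comp[w] = label
                    stack.append(w)

    return all(comp[2 * i] != comp[2 * i + 1] for i in range(n))


print(two_sat_satisfiable(2, [(1, 2), (-1, 2)]))
print(two_sat_satisfiable(1, [(1, 1), (-1, -1)]))
```

Explicit stacks avoid Python's recursion limit, which matters when an instance has a very large number of variables.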


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Algorithms-Stanford
 2 | 
 3 | Python solutions to the programming assignments in Stanford University's Algorithms courses on Coursera
 4 | 
 5 | ## Divide and Conquer, Sorting and Searching, and Randomized Algorithms
 6 | 
 7 | - [Week 1](1%20Divide%20and%20Conquer,%20Sorting%20and%20Searching,%20and%20Randomized%20Algorithms/Week%201)
 8 | - [Week 2](1%20Divide%20and%20Conquer,%20Sorting%20and%20Searching,%20and%20Randomized%20Algorithms/Week%202)
 9 | - [Week 3](1%20Divide%20and%20Conquer,%20Sorting%20and%20Searching,%20and%20Randomized%20Algorithms/Week%203)
10 | - [Week 4](1%20Divide%20and%20Conquer,%20Sorting%20and%20Searching,%20and%20Randomized%20Algorithms/Week%204)
11 | 
12 | ## Graph Search, Shortest Paths, and Data Structures
13 | 
14 | - [Week 1](2%20Graph%20Search,%20Shortest%20Paths,%20and%20Data%20Structures/Week%201)
15 | - [Week 2](2%20Graph%20Search,%20Shortest%20Paths,%20and%20Data%20Structures/Week%202)
16 | - [Week 3](2%20Graph%20Search,%20Shortest%20Paths,%20and%20Data%20Structures/Week%203)
17 | - [Week 4](2%20Graph%20Search,%20Shortest%20Paths,%20and%20Data%20Structures/Week%204)
18 | 
19 | ## Greedy Algorithms, Minimum Spanning Trees, and Dynamic Programming
20 | 
21 | - [Week 1](3%20Greedy%20Algorithms,%20Minimum%20Spanning%20Trees,%20and%20Dynamic%20Programming/Week%201)
22 | - [Week 2](3%20Greedy%20Algorithms,%20Minimum%20Spanning%20Trees,%20and%20Dynamic%20Programming/Week%202)
23 | - [Week 3](3%20Greedy%20Algorithms,%20Minimum%20Spanning%20Trees,%20and%20Dynamic%20Programming/Week%203)
24 | - [Week 4](3%20Greedy%20Algorithms,%20Minimum%20Spanning%20Trees,%20and%20Dynamic%20Programming/Week%204)
25 | 
26 | ## Shortest Paths Revisited, NP-Complete Problems and What To Do About Them
27 | 
28 | - [Week 1](4%20Shortest%20Paths%20Revisited,%20NP-Complete%20Problems%20and%20What%20To%20Do%20About%20Them/Week%201)
29 | - [Week 2](4%20Shortest%20Paths%20Revisited,%20NP-Complete%20Problems%20and%20What%20To%20Do%20About%20Them/Week%202)
30 | - [Week 3](4%20Shortest%20Paths%20Revisited,%20NP-Complete%20Problems%20and%20What%20To%20Do%20About%20Them/Week%203)
31 | - [Week 4](4%20Shortest%20Paths%20Revisited,%20NP-Complete%20Problems%20and%20What%20To%20Do%20About%20Them/Week%204)
32 | 


--------------------------------------------------------------------------------