HOW TO DETERMINE COMPLEXITIES

We can determine complexity based on the types of statements a program uses. The following examples are in Java, but they can be easily followed if you have basic programming experience. They use big O notation; we will explain later why big O notation is the one most commonly used:

Constant time: O(1)

The following operations take constant time:

- Assigning a value to some variable
- Inserting an element in an array
- Determining if a binary number is even or odd
- Retrieving element i from an array
- Retrieving a value from a hash table (dictionary) with a key

They take constant time because they are "simple" statements. In this case we say the statement's time is O(1):

int example = 1; // a simple assignment: O(1)
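
Like the other snippets on this page, the following fragment is only an illustrative sketch (the variable and map names are ours, chosen for the example); it shows a few more of the constant-time operations listed above:

int[] numbers = {4, 8, 15, 16, 23, 42};
int third = numbers[2];            // retrieving element i from an array: O(1)
boolean isOdd = (7 & 1) == 1;      // the lowest bit tells us whether a binary number is odd: O(1)

java.util.Map<String, Integer> ages = new java.util.HashMap<>();
ages.put("alice", 30);             // inserting a value into a hash table: O(1) on average
int age = ages.get("alice");       // retrieving a value by key: O(1) on average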

As you can see in the graph below, constant time is independent of input size. Declaring a variable, inserting an element into a stack, or inserting an element into an unsorted linked list: all of these statements take constant time.

Linear time: O(n)

The next loop executes N times. If we assume the statement inside the loop is O(1), then the total time for the loop is N * O(1), which equals O(N), also known as linear time:
130 |
131 |
132 |
133 |
134 |
135 | for (int i = 0; i < N; i++) {
136 |
137 | }
138 |

In the following graph we can see how running time increases linearly in relation to the number of elements n:

More examples of linear time are:

- Finding an item in an unsorted collection or in an unbalanced tree (worst case); see the sketch below
- Finding the minimum or maximum value in an unsorted array

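As a sketch of the first item above (our own helper method, not part of any library), a linear search has to inspect the elements one by one; in the worst case the target is at the end or not present, so the loop runs n times:

// Returns the index of target, or -1 if it is not present.
// Worst case: target is last or missing, so we make n comparisons -> O(n).
static int linearSearch(int[] items, int target) {
    for (int i = 0; i < items.length; i++) {
        if (items[i] == target) {
            return i;
        }
    }
    return -1;
}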

Quadratic time: O(n²)

In this example the outer loop executes N times, and for each of those iterations the inner loop executes N times. Therefore, the statement in the nested loop executes a total of N * N times. Here the complexity is O(N * N), which equals O(N²). This should be avoided, as the running time grows quadratically:

for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        // this statement executes N * N times in total
    }
}

Some extra examples of quadratic time are:

- Performing a linear search in a matrix
- Insertion sort and bubble sort (see the sketch below)
- The worst case of quicksort, which is highly improbable, as we will see in the Algorithms section of this website

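As an illustrative sketch of the nested-loop pattern above (our own helper, not a library routine), here is a minimal bubble sort; the two nested loops compare and swap neighbouring elements, doing on the order of N * N comparisons in the worst case:

// Bubble sort: repeatedly swap adjacent elements that are out of order.
// The nested loops perform roughly N * N comparisons -> O(N²).
static void bubbleSort(int[] items) {
    for (int i = 0; i < items.length - 1; i++) {
        for (int j = 0; j < items.length - 1 - i; j++) {
            if (items[j] > items[j + 1]) {
                int temp = items[j];
                items[j] = items[j + 1];
                items[j + 1] = temp;
            }
        }
    }
}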

Algorithms that scale in quadratic time are better avoided. Assuming a machine that performs roughly a billion simple operations per second, once the input size reaches n = 100,000 elements a quadratic algorithm can take about 10 seconds to complete. For an input size of n = 1,000,000 it can take ~16 minutes to complete, and for an input size of n = 10,000,000 it could take ~1.1 days...you get the idea.


Logarithmic time: O(log n)

Logarithmic running time grows much more slowly than the input size N. An easy way to check whether a loop is O(log n) is to see whether the counting variable (in this case i) doubles instead of incrementing by 1. In the following example i doesn't increase by 1 (i++); it doubles on each pass, so the loop finishes after about log(n) iterations:

for (int i = 1; i < n; i *= 2) {
    // i takes the values 1, 2, 4, 8, ... so the loop runs about log2(n) times
}

Some common examples of logarithmic time are:

- Binary search (see the sketch below)
- Inserting or deleting an element in a heap

Don't feel intimidated by logarithms. Just remember that a logarithm is the inverse operation of exponentiation. Logarithms appear whenever things are repeatedly halved or doubled.

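Binary search is the classic example: because the remaining range is halved on every step, at most about log2(n) comparisons are needed. Here is a minimal sketch (our own helper, assuming the array is already sorted):

// Binary search over a sorted array: each step halves the remaining range,
// so at most ~log2(n) iterations are needed -> O(log n).
static int binarySearch(int[] sorted, int target) {
    int low = 0;
    int high = sorted.length - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (sorted[mid] == target) {
            return mid;
        } else if (sorted[mid] < target) {
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }
    return -1; // not present
}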

Logarithmic algorithms have excellent performance in large data sets:

Linearithmic time: O(n log n)

Linearithmic algorithms are capable of good performance with very large data sets. Some examples of linearithmic algorithms are:

- Heapsort
- Merge sort
- Quicksort (on average)

We'll see custom implementations of merge sort and quicksort in the Algorithms section. For now, the following example helps illustrate the point: the outer loop runs n times and, on each of those iterations, the inner loop keeps doubling j until it reaches n, which takes about log(n) steps, so the total work is n * O(log n) = O(n log n):

for (int i = 0; i < n; i++) {
    for (int j = 1; j < n; j *= 2) {
        // this statement executes about n * log2(n) times in total
    }
}


Conclusion

As you might have noticed, big O notation describes the worst case possible. When you loop through an array to find out whether it contains item X, the worst case is that the item is at the end or not present at all, which makes you iterate through all n items: thus O(n). The best case would be for the item we search for to be at the very beginning, so that the search takes constant time, but this is highly uncommon and becomes ever more improbable as the list of items grows. In the next section we'll look deeper into why big O focuses on worst-case analysis.

A comparison of the first four complexities might help you understand why, for large data sets, we should avoid quadratic time and strive towards logarithmic or linearithmic time:

BIG O NOTATION AND WORST CASE ANALYSIS

Big O notation is simply a measure of how well an algorithm scales (or its rate of growth). This way we can describe the performance or complexity of an algorithm. Big O notation focuses on the worst-case scenario.

Why focus on worst-case performance? At first glance it might seem counter-intuitive: why not focus on the best case, or at least on average-case performance? I really like the answer given in The Algorithm Design Manual by S. Skiena:

Imagine you go to a casino: what will happen if you bring n dollars?

The best case is that you walk out owning the casino. It's possible, but so improbable that you don't even think about it.

The average case is a little trickier to establish, as you need domain knowledge in order to identify what the average case even is. For example, the average case in our example is that the typical bettor loses ~87% of the money they bring to the casino, but people who are drunk surely lose even more, and what about experienced professional players? What exactly is the average, how was it determined, and by whom? Are the metrics correct? The average case just makes the task of analyzing an algorithm even more complex.

The worst case is that you lose all your n dollars; this is easy to calculate and very likely to happen.

Now think of this in the context of a program with a .search() method that takes linear time to execute. The worst case is O(n): the key is at the end of the list or not present at all, which can easily happen. The best case is O(1), which happens only when the key is the very first element of the list, and that becomes ever more unlikely as n grows.
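
To tie this back to code, here is a tiny usage sketch of the hypothetical linearSearch helper from the linear-time section, showing the two extremes:

int[] data = {7, 3, 9, 4, 12, 8};

int best = linearSearch(data, 7);   // best case: the key is the first element, one comparison -> O(1)
int worst = linearSearch(data, 8);  // worst case: the key is last (or absent), n comparisons -> O(n)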