├── .gitignore
├── assets
│   ├── heapsort_animation.gif
│   ├── polynomial_growth.png
│   ├── insertion_sort_animation.gif
│   └── 1_practical_heapsort
│       ├── kcachegrind_heapsort_profile.png
│       └── kcachegrind_insertion_profile.png
├── notes
│   ├── 3_3d_printing_with_solidoodle_3_on_gnu_linux.md
│   ├── 2_vim_tricks.md
│   └── 1_practical_heapsort.md
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
*.swp
--------------------------------------------------------------------------------
/assets/heapsort_animation.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rzetterberg/case_studies/HEAD/assets/heapsort_animation.gif
--------------------------------------------------------------------------------
/assets/polynomial_growth.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rzetterberg/case_studies/HEAD/assets/polynomial_growth.png
--------------------------------------------------------------------------------
/assets/insertion_sort_animation.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rzetterberg/case_studies/HEAD/assets/insertion_sort_animation.gif
--------------------------------------------------------------------------------
/notes/3_3d_printing_with_solidoodle_3_on_gnu_linux.md:
--------------------------------------------------------------------------------
#3D printing with Solidoodle 3 on GNU/Linux

**Linux, 3D printing**

##1. Project specific configs
--------------------------------------------------------------------------------
/assets/1_practical_heapsort/kcachegrind_heapsort_profile.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rzetterberg/case_studies/HEAD/assets/1_practical_heapsort/kcachegrind_heapsort_profile.png
--------------------------------------------------------------------------------
/assets/1_practical_heapsort/kcachegrind_insertion_profile.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/rzetterberg/case_studies/HEAD/assets/1_practical_heapsort/kcachegrind_insertion_profile.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
#README

##What is this?

This is a collection of notes from different projects I have done. They will
serve as educational material with a practical approach. Hopefully someone will
find them interesting and learn something!

I chose GitHub because I would like others to be able to easily make changes
if they find something that is wrong or have a better solution.

##Notes

###1. Practical heapsort (C, gcc, linux, valgrind)

This note is about how I implemented heapsort instead of the existing
insertion sort to speed up sorting of statistical data points by date. The
application was written in C and compiled with gcc. I used valgrind/callgrind
to check for memory leaks and to produce profiling information to view in
kcachegrind.

[Read](https://github.com/rzetterberg/case_studies/blob/master/notes/1_practical_heapsort.md)

###2. Vim tricks

This is a collection of productivity tricks for vim that I use.

[Read](https://github.com/rzetterberg/case_studies/blob/master/notes/2_vim_tricks.md)

###3. 3D printing with Solidoodle 3 on GNU/Linux

This note is about how I have set up Slic3r and Pronterface on Debian to be
able to use my Solidoodle 3. There are also some tips and tricks for 3D
printing related tasks on Linux.

[Read](https://github.com/rzetterberg/case_studies/blob/master/notes/3_3d_printing_with_solidoodle_3_on_gnu_linux.md)
--------------------------------------------------------------------------------
/notes/2_vim_tricks.md:
--------------------------------------------------------------------------------
#Vim tricks

**Linux, Vim**

**Disclaimer:** These are some tricks I have collected over the years; they
are by no means the only way to do things, just the way I do them. If you have
a better solution I would love to hear it, as I am always looking to learn
something new.

##1. Productive shell scripts

I like to write scripts to automate common development tasks, such as running
tests and compiling. I also want to be able to run these scripts without
having to switch to my shell and type the commands. To do that I set up
hotkeys for my most common tasks. For example, I might bind F5 to run my
tests.

There are 3 different ways I run shell scripts:

- Hidden
- Visible
- Send to shell in a tmux pane

###Hidden

Some tasks are very common, almost never fail and produce output I do not need
to see. Those tasks I run hidden and use a notification daemon to notify me
when they are complete.

To run a shell command hidden you use the vim function `system()`, as in
`call system("yourcommand")`.

For example I would have a script that cleans the working directory of build
files called `clean_src.sh`, and I want to be able to run this command by
pressing F5.

I first set up a command called `CleanSrc`:

```
:command CleanSrc :call system("./clean_src.sh")
```

I then bind F5 to run this `CleanSrc` command:

```
map <F5> :CleanSrc<CR>
```

When you run commands like this you will not know if the command failed. What
I like to do is have the script use `notify-send` to tell me whether the
command failed or succeeded.

This is especially useful when the command runs for a couple of minutes. I can
do something else while the command is running and then get a notification
when it completes.

###Visible

Some tasks are very common, often fail and produce output I do need to see.
Those tasks I run visible and use a notification daemon to notify me when they
are complete.

To run a shell command visible you use the vim command `:! yourcommand`.

For example I would have `make` compile my project and I want to be able to
run this command by pressing F4.

Like with hidden I set up a command, this time called `Compile`:

```
:command Compile :! make; false
```

The reason I execute the command `false` after my command is that otherwise
vim would run the command and immediately return to editing. By appending
`false`, vim runs the command and then waits for me to press a key before
bringing me back to editing.

I then bind F4 to run this `Compile` command:

```
map <F4> :Compile<CR>
```

For commands like this I also like to use `notify-send` in my scripts to notify
me when the long running command is complete.
This way I can do something else
while the command is running and also see the output after it is complete.

###Send to shell in a tmux pane

Some tasks are very common, often fail, produce output I need to see and
produce a state I need to be able to manipulate.

For example I would have a REPL opened in a tmux pane that I want to send
commands to.

One way of doing this is using the plugin `vim-slime`. Using `vim-slime` you
can select code in visual mode and send it to a tmux pane. This is very nice
when you have a piece of code you want to try out.

One scenario I have encountered often is wanting to send the same piece of
code many times. Having to select the code in visual mode every time can be
tedious.

I looked at the source code of the `vim-slime` plugin to figure out whether
you could send the contents of a file instead of the visual selection. It
turns out `vim-slime` does exactly that each time, by creating a temporary
file.

Tmux has two commands, `load-buffer` and `paste-buffer`, that provide this
functionality. `load-buffer` takes a filepath, and `paste-buffer` sends that
buffer to the pane specified by the `-t` flag. Here is what a script looks
like that reads the content of the file `tmp/repl_buffer` and sends it to the
second pane in the first window of the session `vim_tricks`:

```
#!/bin/bash
tmux -L default load-buffer tmp/repl_buffer
tmux -L default paste-buffer -d -t vim_tricks:0.1
```

For example, say I am developing a python project. I have a tmux window split
vertically. On the left side I have vim and on the right side I have python
open.

I am currently working on a module called `parsing` that contains a function
called `extract_person` that I want to be able to run quickly with test data
to see how it behaves.
Here is what my `tmp/repl_buffer` contains:

```
import parsing

reload(parsing)

parsing.extract_person("Hello, {{person}}! How are you?")
```

I want to run this code in the REPL by pressing F1, so I add this to my rc:

```
:command SendRepl :call system("./send_repl.sh")
map <F1> :SendRepl<CR>
```

Since I am not interested in the output of `send_repl.sh` I run it hidden.
Now I can just press F1 to run my function in the REPL. This becomes very
efficient when you want to test a very small part of your codebase often. You
could even develop your unit test in `tmp/repl_buffer`, testing everything
before putting the unit test in the appropriate module.

##2. Same shell and environment variables

I often find myself wanting to use the same shell and environment variables
when running shell commands in vim. A way I found to do that is to use a local
bashrc file that I tell bash to use.

For example, say I am working on a python project which uses `virtualenv` to
locally install all modules. I want the scripts that I execute in vim to also
use `virtualenv`.

I then create a file called `.bashrc.local` which contains:

```
#!/bin/bash

source env/bin/activate
```

By using the vim option `shell` I can tell vim to use bash and source my
`.bashrc.local`:

```
set shell=/bin/bash\ --rcfile\ .bashrc.local\ -i
```

Now every time I use `call system("./mycommand")` or `! ./mycommand; false`
bash is using my `virtualenv`.

You can put other environment variables in there too, by using
`export VAR_NAME="value"`.

##3. Project specific vimrc

Hotkey bindings and scripts are often very specific to a project.
Therefore I like to set up a project specific vimrc.

In vim you can read rc-files with the command `source`. You can put that
command in your global .vimrc:

```
source .vimrc.local
```

The problem is that when you open vim in a directory without a `.vimrc.local`
file it will give you an error. To avoid that, only try to source it when the
file actually exists, like so:

```
if filereadable(glob(".vimrc.local"))
    source .vimrc.local
endif
```

Now you can put a `.vimrc.local` with project specific hotkeys and whatnot!
--------------------------------------------------------------------------------
/notes/1_practical_heapsort.md:
--------------------------------------------------------------------------------
#Practical heapsort

**C, Linux, Gcc, Valgrind, Performance**

The project shown in this case study was an application that tracks personal
statistics such as: how much sleep each night, how many meals consumed, how
much coffee consumed, etc.

I started working on this application because I wanted to improve my skills in
these areas:

* C programming in general
* Data structures
* Algorithms
* Parsing, loading and saving binary data from/to disk
* Valgrind

One of the most interesting things I learned during this project was how
different algorithms affected performance. The biggest performance boost I got
was from switching from [Insertion Sort](http://en.wikipedia.org/wiki/Insertion_sort) to [Heap sort](http://en.wikipedia.org/wiki/Heapsort) when
sorting data points by date.

The computer used in this case study has the following specs:

* Pentium(R) Dual-Core CPU T4500 @ 2.30GHz
* Linux 3.0.0-15-generic
* Ubuntu 11.10

##The data

Each data point looked like this:

```c
typedef struct Data_Point{
    size_t refs;
    uint8_t weight;
    uint8_t slept;
    uint8_t coffee_consumed;
    uint8_t meals_consumed;
    bool slept_during_day;
    /*
        A bunch of other fields ...
    */
    uint8_t day;
    uint8_t month;
    uint16_t year;
} Data_Point;
```

Nothing special, just some data and a field containing how many pointers point
to that memory. See [reference counting](http://en.wikipedia.org/wiki/Reference_counting)

Each point is contained in a fixed-length array:

```c
typedef struct Data_Point_Array{
    size_t refs;
    size_t length;
    Data_Point **items;
} Data_Point_Array;
```

A reference count, an item count and a pointer to the memory containing the
pointers that point to each data point.

##The original algorithm

The original algorithm used to sort these points was insertion sort. It's a
very simple algorithm, but it is slow. If it were to sort 100 data points, in
the worst case it has to do 10 000 (100^2) comparisons. Compare that to its
replacement (heapsort), which in the worst case has to do ~664 (100 *
log2(100)) comparisons.

Both these algorithms can do the sorting in-place, which means they don't have
to create a new array to put the results in; they just shift the items around
in the given array. However, since I'm a novice I had implemented the
insertion sort so that it creates a new array to put the results in. So not
only is the algorithm slow, I had also made a sub-optimal implementation of it.

Here is what the first part of the code looks like:

```c
Data_Point_Array *Data_Point_Array_insertion_sort(Data_Point_Array *array)
{
    Data_Point_Array *sorted = Data_Point_Array_create(array->length);
    Data_Point *insertee = NULL;
    size_t i = 0;

    for (; i < array->length; i++) {
        insertee = array->items[i];
        Data_Point_incref(insertee);

        insertion_sort_insert(sorted, insertee);
    }

    return sorted;
}
```

As you can see the array which will contain the sorted items is created with
the same length as the original one. It will only sort, not filter out any
items, so it's safe to assume that the length will be the same in the result.

Then all it does is iterate the original array, extract each item, increase
its reference count and insert it into the sorted array. Since each item is
going to be inserted into a new array, its reference count is increased by 1
so that when we remove the original array the memory for each item will not be
freed.

Here is what the insert function looks like:

```c
static inline void insertion_sort_insert(
    Data_Point_Array *sorted,
    Data_Point *insertee)
{
    Data_Point *current = NULL;
    Data_Point *tmp = NULL;
    size_t i = 0;

    for (; i < sorted->length; i++) {
        if ((current = sorted->items[i]) == NULL) {
            sorted->items[i] = insertee;
            break;
        }

        if (Data_Point_older(current, insertee)) {
            tmp = current;
            sorted->items[i] = insertee;
            insertee = tmp;
        }
    }
}
```

This is the meat of the algorithm. For each item it receives, it checks it
against the items already inserted in the sorted array.

The `Data_Point_older` function checks whether the first item
is older than the second.
It returns True if A is older than B, and False both if A is not older than B
and if A and B are equally old. It is important that the compare function
returns False when the items are equal. I leave the reason as an exercise for
the reader.

Here are the steps involved in inserting a new item:

1. Start at the beginning of the array and check the first item.
2. If the current item is NULL, insert the given item.
3. If not, check if the current item is older than the given item.
4. If it is, insert the given item in its place and continue with the current
item as the given item.
5. Repeat from step 2.

Now, that might not be the best explanation of the algorithm, but here is a
good animation that shows visually what happens:

![Insertion sort animation](https://raw.github.com/rzetterberg/case_studies/master/assets/insertion_sort_animation.gif)

The animation shows what the algorithm looks like when it sorts in-place, so
imagine that the black boxes are items in the sorted array and the other ones
are the items in the original array.

##Measuring the original algorithm

Now comes the interesting part: measuring the performance to get an idea of
how slow this implementation of the algorithm is.

When doing these measurements the program was compiled with `-O3` optimization
and *7300* points were used. The data was created using `dd` to get bytes from
`/dev/urandom`.

Since the datafile is just a flat binary file where the points are stored
sequentially, it is easy to generate random data with `dd`.
Each point is stored as **8 bytes**, so here is the command used to generate
the test file:

```bash
dd if=/dev/urandom of=random.data bs=8 count=7300
```

Now that we have a fairly large file we will use valgrind to generate
profiling data for us to look at. But first we check the program for memory
leaks. Valgrind can do that for us too! Just compile the program with `-O3`
and `-g`, then run the executable with valgrind:

```bash
valgrind --leak-check=full --log-file=valgrind.log ./compiled_program
```

If there are any leaks the output can become quite long, so I like to write
the output to a file to view with my editor. Here is what my output looks
like:

```
==7339== Memcheck, a memory error detector
==7339== Copyright (C) 2002-2010, and GNU GPL'd, by Julian Seward et al.
==7339== Using Valgrind-3.6.1-Debian and LibVEX; rerun with -h for copyright info
==7339== Command: ./compiled_program
==7339== Parent PID: 7338
==7339==
==7339==
==7339== HEAP SUMMARY:
==7339==     in use at exit: 0 bytes in 0 blocks
==7339==   total heap usage: 172 allocs, 172 frees, 4,189 bytes allocated
==7339==
==7339== All heap blocks were freed -- no leaks are possible
==7339==
==7339== For counts of detected and suppressed errors, rerun with: -v
==7339== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 11 from 6)
```

Great! No leaks or any other memory problems were found.

Now it's time to generate the profiling data:

```bash
valgrind --tool=callgrind --dump-instr=yes --simulate-cache=yes \
    --collect-jumps=yes ./compiled_program
```

This will output a file called **callgrind.out.XXX** where XXX is the pid
of the executed application.
Now we can view this file in `kcachegrind` to get a nice visual presentation
of the number of calls to different functions in the application. Here is what
the output looks like using the insertion sort on these 7300 data points:

![Kcachegrind insertion sort output](https://raw.github.com/rzetterberg/case_studies/master/assets/1_practical_heapsort/kcachegrind_insertion_profile.png)

As you can see, most of the time is spent doing the comparisons of the items.
Look at the call count. That's right, the function is run **26 641 350
times**! This took ~20 seconds for my old laptop to complete. And that's not
even the worst case; that is about half the number of calls the worst case
would require.

I wonder what heap sort can do for us?

##The replacement algorithm

The heapsort algorithm is a bit more complex, but it is much faster. The main
idea of heapsort is that you build a heap of the data and then use the heap to
order the items sequentially. It works in-place, and so does the
implementation I will show you.

A heap is basically a special type of binary tree, but it doesn't require a
separate data structure; it is created within the array, by ordering the items
in a special way that represents the tree. Here is a good visual
representation of how it does this:

![Heapsort animation](https://raw.github.com/rzetterberg/case_studies/master/assets/heapsort_animation.gif)

I won't go into detail about how the heapsort algorithm works, because that is
out of the scope of this article. I'll leave reading up on how heapsort works
as an exercise for the reader.

On to how I have implemented the algorithm!
Here is the first part of the code for the algorithm:

```c
void Data_Point_Array_heap_sort(Data_Point_Array *array)
{
    int32_t i = (array->length / 2) - 1;

    for (; i >= 0; i--) {
        heap_sort_sift_down(array, i, array->length - 1);
    }

    for (i = array->length - 1; i >= 1; i--) {
        Data_Point_Array_items_switch(array, 0, i);
        heap_sort_sift_down(array, 0, i - 1);
    }
}
```

Here we can see that no array is created to place the sorted result in, we
just use the existing one. First the heap is built by calling the sift down
function on the first half of the array, going from the middle to the start.

When the heap is created and all items are in place, the heap is then
flattened into a sequentially sorted array.

Here is what the second part of the code looks like:

```c
/* Children of node I in a 0-indexed array live at 2I + 1 and 2I + 2. */
#define HEAP_LEFT(I) ((I) * 2 + 1)
#define HEAP_RIGHT(I) (HEAP_LEFT(I) + 1)

static void heap_sort_sift_down(
    Data_Point_Array *array,
    size_t current_root,
    size_t bottom)
{
    size_t left = HEAP_LEFT(current_root);
    size_t right = HEAP_RIGHT(current_root);
    size_t root_candidate;

    while (left <= bottom) {
        if (left == bottom) {
            root_candidate = left;
        }else if (Data_Point_older(array->items[left], array->items[right])) {
            root_candidate = left;
        }else{
            root_candidate = right;
        }

        if (Data_Point_older(array->items[root_candidate], array->items[current_root])) {
            Data_Point_Array_items_switch(array, current_root, root_candidate);

            current_root = root_candidate;
            left = HEAP_LEFT(current_root);
            right = HEAP_RIGHT(current_root);
        }else{
            return;
        }
    }
}
```

If you have read up on heapsort you will probably have seen pseudo-code of
heapsort being implemented as a recursive
algorithm. This implementation is iterative instead, for two reasons:

1. Every function call places data on stack memory. Eventually, when there are
too many recursive calls, the stack will overflow and the program will crash.
See [stack overflow](http://en.wikipedia.org/wiki/Stack_overflow)
2. Each recursive call also needs to run instructions that set up the function
call; by using a loop instead we don't have to run those instructions, which
makes the algorithm faster.

##Measuring the replacement algorithm

Assume that the same steps were performed before profiling this algorithm as
with the original algorithm. Here is what the kcachegrind output looks like
for the heapsort algorithm:

![Kcachegrind heapsort output](https://raw.github.com/rzetterberg/case_studies/master/assets/1_practical_heapsort/kcachegrind_heapsort_profile.png)

That is a big improvement! From **26 641 350 comparisons** we are now down to
**179 688**. Using this algorithm the sort took less than 1 second to run on
my old laptop.
--------------------------------------------------------------------------------