├── .gitignore ├── LICENSE.md ├── README.md ├── chapter-1 ├── 1.1-names │ └── README.md ├── 1.2-expressions-and-statements │ ├── README.md │ └── eval.c ├── 1.3-consistency-and-idioms │ └── README.md ├── 1.4-function-macros │ └── README.md ├── 1.5-magic-numbers │ └── README.md ├── 1.6-comments │ └── README.md ├── 1.7-why-bother │ └── README.md └── README.md ├── chapter-2 ├── 2.1-searching │ ├── README.md │ ├── binary_search.c │ ├── linear_search.c │ └── values.c ├── 2.2-sorting │ ├── README.md │ ├── quick_sort.c │ └── values.c ├── 2.3-libraries │ ├── README.md │ ├── libraries.c │ ├── quick_sort_iter_vs_rec.c │ └── values.c ├── 2.4-java-quicksort │ ├── Main.java │ └── README.md ├── 2.5-o-notation │ ├── README.md │ ├── quick_sort_worst_cases.c │ └── slow_sort.c ├── 2.6-growing-arrays │ ├── README.md │ └── growing_arrays.c ├── 2.7-lists │ ├── README.md │ ├── generic_lists │ │ └── generic_list.c │ ├── lists.c │ ├── reverse.c │ └── tests.c ├── 2.8-trees │ ├── README.md │ ├── rec_vs_iter.c │ ├── sort.c │ ├── tests.c │ └── trees.c ├── 2.9-hash-tables │ ├── README.md │ ├── coordinates.c │ ├── hash_func.c │ ├── hash_tables.c │ └── lookup.c └── README.md ├── chapter-3 ├── 3.1-the-markov-chain-algorithm │ └── README.md ├── 3.2-data-structure-alternatives │ └── README.md ├── 3.3-building-the-data-structure-in-c │ ├── README.md │ └── structs.c ├── 3.4-generating-output │ ├── README.md │ ├── output.c │ └── text.txt ├── 3.5-java │ ├── Markov.java │ └── README.md ├── 3.6-c++ │ ├── README.md │ ├── markov.cpp │ └── structs.cpp ├── 3.7-awk-and-perl │ ├── README.md │ ├── markov.awk │ └── markov.pl ├── 3.8-performance │ └── README.md ├── 3.9-lessons │ ├── README.md │ └── go │ │ ├── go.mod │ │ └── main │ │ └── main.go └── README.md ├── chapter-4 ├── 4.1-comma-separated-values │ ├── README.md │ └── data.csv ├── 4.2-a-prototype-library │ ├── README.md │ └── csv.c ├── 4.3-a-library-for-others │ ├── README.md │ ├── csv.c │ ├── csv.h │ ├── csv_generator.c │ ├── csv_generator.h │ └── 
values.txt ├── 4.4-a-c++-implementation │ ├── README.md │ └── csv.cpp ├── 4.5-interface-principles │ └── README.md ├── 4.6-resource-management │ └── README.md ├── 4.7-abort-retry-fail │ └── README.md ├── 4.8-user-interfaces │ └── README.md └── README.md ├── chapter-5 ├── 5.1-debuggers │ └── README.md ├── 5.2-good-clues-easy-bugs │ └── README.md ├── 5.3-no-clues-hard-bugs │ └── README.md ├── 5.4-last-resorts │ └── README.md ├── 5.5-non-reproducible-bugs │ └── README.md ├── 5.6-debugging-tools │ ├── README.md │ ├── strings.c │ ├── vis.c │ └── xoa.txt ├── 5.7-other-peoples-bugs │ └── README.md ├── 5.8-summary │ └── README.md └── README.md ├── chapter-6 ├── 6.1-test-as-you-write-the-code │ ├── 6-1-a.c │ ├── 6-1-b.c │ ├── 6-1-c.c │ ├── 6-1-d.c │ ├── 6-1-e.c │ ├── 6-1-f.c │ └── README.md ├── 6.2-systematic-testing │ ├── README.md │ └── freq.c ├── 6.3-test-automation │ └── README.md ├── 6.4-test-scaffolds │ └── README.md ├── 6.5-stress-tests │ └── README.md ├── 6.6-tips-for-testing │ └── README.md ├── 6.7-who-does-the-testing │ └── README.md ├── 6.8-testing-the-markov-program │ └── README.md ├── 6.9-summary │ └── README.md └── README.md ├── chapter-7 ├── 7-2 │ └── main.go └── README.md ├── chapter-8 └── README.md └── chapter-9 └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .vscode 2 | *.class 3 | executable 4 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Anton Sankov 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons 
to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # The Practice of Programming 2 | 3 | This repository contains the exercises from the book "The Practice of Programming" by Brian W. Kernighan and Rob Pike. 4 | 5 | The book can be bought from [Amazon](https://www.amazon.de/-/en/Brian-W-Kernighan-ebook/dp/B00HU50A12/) or other bookstores. 6 | 7 | ## Table of Contents 8 | 9 | - [Chapter 1: Style](chapter-1) 10 | - [Chapter 2: Algorithms and Data structures](chapter-2) 11 | - [Chapter 3: Design and Implementation](chapter-3) 12 | - [Chapter 4: Interfaces](chapter-4) 13 | - [Chapter 5: Debugging](chapter-5) 14 | - [Chapter 6: Testing](chapter-6) 15 | - [Chapter 7: Performance](chapter-7) 16 | - Chapter 8: Portability 17 | - Chapter 9: Notation 18 | 19 | ## License 20 | 21 | This work is licensed under MIT license. 
For more info see [LICENSE.md](LICENSE.md) 22 | -------------------------------------------------------------------------------- /chapter-1/1.1-names/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.1 Names 4 | 5 | ### Exercise 1-1 6 | 7 | Comment on the choice of names and values in the following code: 8 | 9 | ```c 10 | #define TRUE 0 11 | #define FALSE 1 12 | 13 | if ((ch = getchar()) == EOF) 14 | not_eof = FALSE; 15 | ``` 16 | 17 | _Answer:_ I think the naming here is all wrong, because: 18 | 19 | - `TRUE` should be `1`, and `FALSE` should be `0`. Written as they are now, `while(FALSE){ /* code */ }` would result in an infinite loop 20 | and `while(TRUE){ /* code */ }` would never be executed. This is misleading and would lead to someone spending hours debugging this problem. 21 | 22 | For the second part, while the logic works, the naming is weird and may lead to confusion. A much better version would be: 23 | 24 | ```c 25 | if ((ch = getchar()) == EOF) 26 | eof = TRUE; 27 | ``` 28 | 29 | The other version may lead to double negation like `if (!not_eof) { /* code */ }` which is more confusing than the simple `if (eof) { /* code */ }`. 30 | 31 | ### Exercise 1-2 32 | 33 | Improve this function. 34 | 35 | ```c 36 | int smaller(char *s, char *t) { 37 | if (strcmp(s, t) < 1) 38 | return 1; 39 | else 40 | return 0; 41 | } 42 | ``` 43 | 44 | _Answer:_ 45 | 46 | ```c 47 | /* is_smaller_than: returns 1 if s is smaller than t, otherwise 0 */ 48 | int is_smaller_than(char *s, char *t) { 49 | if (strcmp(s, t) >= 0) 50 | return 0; 51 | return 1; 52 | } 53 | ``` 54 | 55 | Improvements: 56 | 57 | - naming - this way it is clear what the function does 58 | - documentation - what is not clear from the name (which string is smaller than which) is made clear by the doc comment 59 | - code - reversed the condition to fit the name and docs.
This way the function behaves differently from the original (which also returned 1 for equal strings), but it does what it says 60 | 61 | ### Exercise 1-3 62 | 63 | Read this code aloud. 64 | 65 | ```c 66 | if ((falloc(SMRHHSHSCRTCH, S_IFEXT|0644, MAXRODDHSH)) < 0) 67 | ... 68 | ``` 69 | 70 | As hard as it was - done. 71 | -------------------------------------------------------------------------------- /chapter-1/1.2-expressions-and-statements/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.2 Expressions and Statements 4 | 5 | ### Exercise 1-4 6 | 7 | Improve each of these fragments: 8 | 9 | ```c 10 | if (!(c == 'y' || c == 'Y')) 11 | return; 12 | ``` 13 | 14 | ```c 15 | length = (length < BUFFSIZE) ? length : BUFFSIZE; 16 | ``` 17 | 18 | ```c 19 | flag = flag ? 0 : 1; 20 | ``` 21 | 22 | ```c 23 | quote = (*line == '"') ? 1 : 0; 24 | ``` 25 | 26 | ```c 27 | if (val & 1) 28 | bit = 1; 29 | else 30 | bit = 0; 31 | ``` 32 | 33 | _Answer:_ 34 | 35 | ```c 36 | if (c != 'y' && c != 'Y') 37 | return; 38 | ``` 39 | 40 | Because this way reads more naturally. 41 | 42 | ```c 43 | length = (length > BUFFSIZE) ? BUFFSIZE : length; 44 | ``` 45 | 46 | Because this way reads more naturally. 47 | 48 | ```c 49 | flag = !flag; 50 | ``` 51 | 52 | This is simpler and more obvious. 53 | 54 | ```c 55 | is_quote = *line == '"'; 56 | ``` 57 | 58 | This is simpler and more obvious. 59 | 60 | ```c 61 | bit = val & 1; 62 | ``` 63 | 64 | This is simpler and more obvious. 65 | 66 | ### Exercise 1-5 67 | 68 | What is wrong with this excerpt? 69 | 70 | ```c 71 | int read(int *ip) { 72 | scanf("%d", ip); 73 | return *ip; 74 | } 75 | ... 76 | insert(&graph[vert], read(&val), read(&ch)); 77 | ``` 78 | 79 | _Answer:_ The thing that is wrong with the `read` function is that it both scans the value into `ip` and returns the value. 80 | This seems weird, and people may find clever ways to utilize this. As we know from the book - clear is better than clever.
81 | The problem with the call to `insert` follows from that problem with `read`. The statement does three things at once: 82 | it reads two values from stdin, assigns them to `val` and `ch`, and calls `insert` with them. Worse, C does not specify the order in which function arguments are evaluated, so the two `read` calls may run in either order and the values may end up swapped. This may lead to nasty bugs 83 | and more hours of debugging. 84 | 85 | ### Exercise 1-6 86 | 87 | List all the different outputs this could produce with various orders of evaluation: 88 | 89 | ```c 90 | n = 1; 91 | printf("%d %d\n", n++, n++); 92 | ``` 93 | 94 | _Answer:_ The plausible outputs are: 95 | 96 | - `1 2` (arguments evaluated left to right) 97 | - `2 1` (arguments evaluated right to left) 98 | - `1 1` (both arguments read before either increment is stored) 99 | Strictly speaking, modifying `n` twice without an intervening sequence point is undefined behaviour in C, so a compiler is free to produce anything else as well. 100 | 101 | However, in my opinion the most common one would be `1 2`, because the `++` operator has a clear definition, and that is: 102 | `increment the variable and return the old value`. So the first `n++` would increment `n`, but would return the old value of `n`, hence `1`. 103 | The second `n++` would increment the variable again and return the old value, which is now `2`. If we try to use `n` one more time, the value would now be `3`. 104 | 105 | For experimentation I created [eval.c](eval.c), which I compiled twice - with `gcc` and `clang` on macOS (where `gcc` is typically an alias for Apple clang, which would explain the identical results). 106 | The output of both compilations was identical - a warning for multiple unsequenced modifications - 107 | 108 | ```text 109 | eval.c:6:24: warning: multiple unsequenced modifications to 'n' [-Wunsequenced] 110 | printf("%d %d\n", n++, n++); 111 | ^ ~~ 112 | 1 warning generated. 113 | ``` 114 | 115 | and running both executables many times always produced the same output: 116 | 117 | ```text 118 | 1 2 119 | ``` 120 | 121 | In conclusion, this code is bad and must not be run in production. On this particular setup the output was stable and no surprises occurred while running it, but that is an observation about one compiler, not a guarantee.
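One way to remove the ambiguity is to give each increment its own statement, so the order is fixed by the language rather than left to the compiler. The following is only a minimal sketch: the helper name `format_pair` is made up for illustration and is not part of `eval.c`, and it writes into a buffer instead of printing so the result is easy to check.

```c
#include <assert.h>
#include <stdio.h>

/* format_pair: hypothetical helper, not part of eval.c.
 * Writes "<first> <second>\n" into buf and increments *n twice.
 * Each increment is a separate statement, so the order is well defined. */
int format_pair(char *buf, size_t size, int *n)
{
    assert(buf != NULL && n != NULL);

    int first = (*n)++;  /* takes the old value, then *n is incremented */
    int second = (*n)++; /* sequenced after the statement above */

    return snprintf(buf, size, "%d %d\n", first, second);
}
```

With `n` starting at `1`, the buffer holds `1 2` and `n` ends up as `3`, whichever compiler is used.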
122 | -------------------------------------------------------------------------------- /chapter-1/1.2-expressions-and-statements/eval.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | 3 | int main() 4 | { 5 | int n = 1; 6 | printf("%d %d\n", n++, n++); 7 | 8 | return 0; 9 | } -------------------------------------------------------------------------------- /chapter-1/1.3-consistency-and-idioms/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.3 Consistency and Idioms 4 | 5 | ### Exercise 1-7 6 | 7 | Rewrite these C/C++ excerpts more clearly: 8 | 9 | ```c 10 | if (istty(stdin)) ; 11 | else if (istty(stdout)) ; 12 | else if (istty(stderr)) ; 13 | else return(0); 14 | ``` 15 | 16 | ```c 17 | if (retval != SUCCESS) 18 | { 19 | return (retval); 20 | } 21 | /* All went well! */ 22 | return SUCCESS; 23 | ``` 24 | 25 | ```c 26 | for (k = 0; k++ < 5; x += dx) 27 | scanf("%lf", &dx); 28 | ``` 29 | 30 | _Answer:_ 31 | 32 | ```c 33 | if (istty(stdin)) {} 34 | else if (istty(stdout)) {} 35 | else if (istty(stderr)) {} 36 | else 37 | return (0); 38 | ``` 39 | 40 | Here, I just replaced the empty statements with braces for the empty bodies of the `if` conditions and corrected the indentation. This should be enough to ease readability. 41 | 42 | ```c 43 | return retval; 44 | ``` 45 | 46 | Here, the return value is always `retval`, so there is no sense in checking it, since no other action is executed anyway. 47 | 48 | ```c 49 | for (k = 0; k < 5; k++) { 50 | scanf("%lf", &dx); 51 | x += dx; 52 | } 53 | ``` 54 | Here, I moved the incrementation of `k` to the last part of the `for` loop definition, as this is the idiomatic choice. 55 | Moreover, I moved `x += dx` into the body of the for loop (the braces are needed so that it still runs on every iteration), since this way it is even clearer where `dx` is coming from and what happens with `x`. Also, now the `for` loop looks like the standard `for` that a seasoned programmer would recognise at a glance.
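The last loop can also be tried out in isolation. The sketch below is an illustration only: the function name `accumulate_steps` is hypothetical, and it parses from a string with `sscanf` rather than from stdin with `scanf`, so that it can be exercised without typing input. It additionally checks the return value of the scan, which the exercise fragment does not do.

```c
#include <assert.h>
#include <stdio.h>

/* accumulate_steps: hypothetical helper for illustration.
 * Reads up to count numbers from input and adds each one to x.
 * Stops early if a value cannot be parsed. */
double accumulate_steps(const char *input, double x, int count)
{
    double dx;
    int k, used;

    assert(input != NULL);
    for (k = 0; k < count; k++) {
        /* %n stores how many characters were consumed so far */
        if (sscanf(input, "%lf%n", &dx, &used) != 1)
            break; /* end of input or malformed number */
        x += dx;
        input += used;
    }
    return x;
}
```

For example, `accumulate_steps("1 2 3 4 5 6", 0.0, 5)` adds only the first five values, just as the loop in the exercise reads exactly five numbers.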
56 | 57 | ### Exercise 1-8 58 | 59 | Identify the errors in this Java fragment and repair it by rewriting with an idiomatic loop. 60 | 61 | ```java 62 | int count = 0; 63 | while (count < total) { 64 | count++; 65 | if (this.getName(count) == nametable.userName()) { 66 | return (true); 67 | } 68 | } 69 | ``` 70 | 71 | _Answer:_ 72 | 73 | ```java 74 | for (int count = 0; count < total; count++) { 75 | if (this.getName(count).equals(nametable.userName())) { 76 | return true; 77 | } 78 | } 79 | ``` 80 | 81 | There are two errors. The original increments `count` before using it, so index `0` is skipped and index `total` is accessed. It also compares strings with `==`, which compares references rather than contents; `equals` is the right tool. Making the loop idiomatic makes the code much easier to read and understand. 82 | -------------------------------------------------------------------------------- /chapter-1/1.4-function-macros/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.4 Function Macros 4 | 5 | ### Exercise 1-9 6 | 7 | Identify the problem with this macro definition: 8 | 9 | ```c 10 | #define ISDIGIT(c) ((c >= '0') && (c <= '9')) ? 1 : 0 11 | ``` 12 | 13 | _Answer:_ 14 | Problems: 15 | 16 | - multiple evaluation - this macro uses its argument twice, so an expression passed to it is evaluated twice, and any side effects happen twice. _Example:_ 17 | 18 | ```c 19 | int is_digit = ISDIGIT(getchar()) 20 | ``` 21 | 22 | would expand to 23 | 24 | ```c 25 | int is_digit = ((getchar() >= '0') && (getchar() <= '9')) ? 1 : 0 26 | ``` 27 | 28 | which reads two characters and will likely yield a wrong result. 29 | 30 | - macro not wrapped in parentheses - which means that when it is used inside a larger expression it may be evaluated in an unexpected way and the whole code will have a different meaning. _Example:_ 31 | 32 | ```c 33 | int is_digit_or_letter = ISDIGIT(c) ? 1 : ISLETTER(c) ? 1 : 0 34 | ``` 35 | 36 | would expand to: 37 | 38 | ```c 39 | int is_digit_or_letter = ((c >= '0') && (c <= '9')) ? 1 : 0 ? 1 : ISLETTER(c) ? 
1 : 0 40 | ``` 41 | 42 | which could again yield an unexpected result and lead to nasty bugs and hours of debugging. This particular expansion happens to evaluate as intended, but `ISDIGIT(c) + 1` would expand to `((c >= '0') && (c <= '9')) ? 1 : 0 + 1`, which is always `1`. 43 | -------------------------------------------------------------------------------- /chapter-1/1.5-magic-numbers/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.5 Magic Numbers 4 | 5 | **Summary:** Magic numbers should be extracted into constants/enum to ease readability and to prevent having to change a lot of code if something changes (for example an array size). 6 | 7 | ### Exercise 1-10 8 | 9 | How would you rewrite these definitions to minimize potential errors? 10 | 11 | ```c 12 | #define FT2METER 0.3048 13 | #define METER2FT 3.28084 14 | #define MI2FT 5280.0 15 | #define MI2KM 1.609344 16 | #define SQMI2SQKM 2.589988 17 | ``` 18 | 19 | _Answer:_ 20 | 21 | ```c 22 | const double FT2METER = 0.3048; 23 | const double METER2FT = 1.0 / FT2METER; 24 | const double MI2KM = 1.609344; 25 | const double MI2FT = MI2KM * 1000 * METER2FT; 26 | const double SQMI2SQKM = MI2KM * MI2KM; 27 | ``` 28 | 29 | The first option to look at when defining constants should be the `enum`. However, this won't work here, because in C/C++ enum values can only be integers. 30 | The next option is `const`. This allows us to define only 2 of them and compute the other 3 based on these 2 values. The constants must be `double`, since an integer type would truncate `0.3048` to `0`. 31 | This computation is cheap, since it is typically done only once, at compile time. In C++ these definitions are valid at file scope; in C, an initializer that refers to another `const` variable is not a constant expression, so there the derived values would have to be defined inside a function (or stay macros). 32 | This is safer than hardcoding all the values by hand. 33 | Macros should not be used for constants, because according to the book, they may change the lexical structure of a program. 34 | -------------------------------------------------------------------------------- /chapter-1/1.6-comments/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.6 Comments 4 | 5 | **Summary:** Comments should clarify, not confuse.
They should agree with the code, not contradict it. They should describe why, not how, and should not repeat the code. They should not be everywhere, just where clarification is needed. 6 | 7 | ### Exercise 1-11 8 | 9 | Comment on these comments _(answers inline)_: 10 | 11 | ```c 12 | void dict::insert(string& w) 13 | // returns 1 if w in dictionary, otherwise returns 0 14 | ``` 15 | 16 | This comment is obviously wrong and should be deleted. While it is possible for an `insert` method on `dict` to return a result, based on the presence of the inserted value, the code clearly shows that this function is `void`, hence it returns no result. 17 | 18 | ```c 19 | if (n > MAX || n % 2 > 0) // test for even number 20 | ``` 21 | 22 | Here the comment is fully wrong. First, the code tests for an odd number, not an even one. Second, there is one more check - if `n` is bigger than `MAX`, which is not documented. In my opinion, this comment should be deleted, because it is very obvious what's going on from the code itself. 23 | 24 | ```c 25 | // Write a message 26 | // Add to line counter for each line written 27 | 28 | void write_message() 29 | { 30 | // increment line counter 31 | line_number = line_number + 1; 32 | fprintf(fout, "%d %s\n%d %s\n%d %s\n", 33 | line_number, HEADER, 34 | line_number + 1, BODY, 35 | line_number + 2, TRAILER); 36 | // increment line counter 37 | line_number = line_number + 2; 38 | } 39 | ``` 40 | 41 | Starting with the first comment - 42 | 43 | ```c 44 | // Write a message 45 | ``` 46 | 47 | this is not wrong, but is the same as the function name, so it is redundant. 48 | A better comment would be 49 | 50 | ```c 51 | // Write a message to fout 52 | ``` 53 | 54 | or no comment at all. 55 | 56 | ```c 57 | // Add to line counter for each line written 58 | ``` 59 | 60 | This is misleading, because it does not say how much is added to the line counter.
61 | A better and more descriptive comment would be: 62 | 63 | ```c 64 | // Increments line_number by 3 65 | ``` 66 | 67 | Onto the next comments: 68 | 69 | - the first `// increment line counter` is redundant, because that is evident from the code. 70 | - the second `// increment line counter` is downright wrong, because the value of `line_number` is increased by `2`, 71 | while `increment` implies an increase by 1 (or at least that is the most common use of the term). 72 | -------------------------------------------------------------------------------- /chapter-1/1.7-why-bother/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## 1.7 Why Bother? 4 | 5 | **Summary:** Because good code is easier to read and understand. And most probably it is more likely to be correct than bad code. 6 | Writing good code is a habit which, once learned, you will apply everywhere, so you will write good code even under time pressure, without that slowing you down. 7 | -------------------------------------------------------------------------------- /chapter-1/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 1: Style 2 | 3 | ## Table of Contents 4 | 5 | - [1.1 Names](1.1-names) 6 | - [1.2 Expressions and Statements](1.2-expressions-and-statements) 7 | - [1.3 Consistency and Idioms](1.3-consistency-and-idioms) 8 | - [1.4 Function Macros](1.4-function-macros) 9 | - [1.5 Magic Numbers](1.5-magic-numbers) 10 | - [1.6 Comments](1.6-comments) 11 | - [1.7 Why Bother?](1.7-why-bother) 12 | 13 | ## Supplementary Reading 14 | 15 | - _The Elements of Style_ by Strunk and White 16 | - _The Elements of Programming Style_ by Brian Kernighan and P. J.
Plauger 17 | - _Writing Solid Code_ by Steve Maguire 18 | - _Code Complete_ by Steve McConnell 19 | - _Expert C Programming: Deep C Secrets_ by Peter van der Linden 20 | -------------------------------------------------------------------------------- /chapter-2/2.1-searching/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data structures 2 | 3 | ## 2.1 Searching 4 | -------------------------------------------------------------------------------- /chapter-2/2.1-searching/binary_search.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <string.h> 3 | #include "values.c" 4 | 5 | int lookup(char *name, char *tab[], int ntab); 6 | 7 | int main() 8 | { 9 | int res = lookup("B", letters, NELEMS(letters)); 10 | if (res != 1) 11 | { 12 | printf("ERROR: Got %d for `lookup(\"B\", letters, NELEMS(letters)), expected %d`\n", res, 1); 13 | return 1; 14 | } 15 | 16 | res = lookup("G", letters, NELEMS(letters)); 17 | if (res != 6) 18 | { 19 | printf("ERROR: Got %d for `lookup(\"G\", letters, NELEMS(letters)), expected %d`\n", res, 6); 20 | return 1; 21 | } 22 | 23 | res = lookup("Q", letters, NELEMS(letters)); 24 | if (res != -1) 25 | { 26 | printf("ERROR: Got %d for `lookup(\"Q\", letters, NELEMS(letters)), expected %d`\n", res, -1); 27 | return 1; 28 | } 29 | 30 | printf("SUCCESS.\n"); 31 | return 0; 32 | } 33 | 34 | /* lookup: binary search for name in the sorted array tab */ 35 | int lookup(char *name, char *tab[], int ntab) 36 | { 37 | int low, high, mid, cmp; 38 | 39 | low = 0; 40 | high = ntab - 1; 41 | while (low <= high) 42 | { 43 | mid = (low + high) / 2; 44 | cmp = strcmp(name, tab[mid]); 45 | if (cmp < 0) 46 | high = mid - 1; 47 | else if (cmp > 0) 48 | low = mid + 1; 49 | else /* found match */ 50 | return mid; 51 | } 52 | return -1; /* no match */ 53 | } -------------------------------------------------------------------------------- /chapter-2/2.1-searching/linear_search.c: 
-------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <string.h> 3 | 4 | int lookup(char *word, char *array[]); 5 | 6 | int main() 7 | { 8 | char *flab[] = { 9 | "actually", 10 | "just", 11 | "quite", 12 | "really", 13 | NULL}; 14 | 15 | int res = lookup("just", flab); 16 | if (res != 1) 17 | { 18 | printf("ERROR: Got %d for `lookup(\"just\", flab), expected %d`\n", res, 1); 19 | return 1; 20 | } 21 | 22 | res = lookup("not present", flab); 23 | if (res != -1) 24 | { 25 | printf("ERROR: Got %d for `lookup(\"not present\", flab), expected %d`\n", res, -1); 26 | return 1; 27 | } 28 | 29 | printf("SUCCESS.\n"); 30 | return 0; 31 | } 32 | 33 | /* lookup: sequential search for word in a NULL-terminated array */ 34 | int lookup(char *word, char *array[]) 35 | { 36 | for (int i = 0; array[i] != NULL; i++) 37 | { 38 | if (strcmp(word, array[i]) == 0) 39 | return i; 40 | } 41 | return -1; 42 | } -------------------------------------------------------------------------------- /chapter-2/2.1-searching/values.c: -------------------------------------------------------------------------------- 1 | #define NELEMS(array) (sizeof(array) / sizeof(array[0])) 2 | 3 | char *letters[] = { 4 | "A", 5 | "B", 6 | "C", 7 | "D", 8 | "E", 9 | "F", 10 | "G", 11 | "H", 12 | "I", 13 | "J"}; 14 | 15 | int numbers[] = { 16 | 100, 17 | 12, 18 | 11, 19 | 10, 20 | 9, 21 | 15, 22 | 18, 23 | 23, 24 | 27, 25 | 77, 26 | 909, 27 | 101, 28 | 1}; -------------------------------------------------------------------------------- /chapter-2/2.2-sorting/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data structures 2 | 3 | ## 2.2.
Sorting 4 | -------------------------------------------------------------------------------- /chapter-2/2.2-sorting/quick_sort.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include "values.c" 4 | 5 | void quicksort(int v[], int n); 6 | void swap(int v[], int i, int j); 7 | 8 | int main() 9 | { 10 | quicksort(numbers, NELEMS(numbers)); 11 | printf("sorted numbers:\n"); 12 | for (int i = 0; i < NELEMS(numbers); i++) 13 | { 14 | printf("%d, ", numbers[i]); 15 | } 16 | printf("\n"); 17 | 18 | return 0; 19 | } 20 | 21 | void quicksort(int v[], int n) 22 | { 23 | int last = 0; 24 | if (n <= 1) 25 | return; 26 | 27 | swap(v, 0, rand() % n); /* move pivot to v[0] */ 28 | for (int i = 1; i < n; i++) /* partition around the pivot */ 29 | if (v[i] < v[0]) 30 | swap(v, ++last, i); 31 | 32 | swap(v, 0, last); /* restore pivot */ 33 | quicksort(v, last); 34 | quicksort(v + last + 1, n - last - 1); 35 | } 36 | 37 | /* swap: interchange v[i] and v[j] */ 38 | void swap(int v[], int i, int j) 39 | { 40 | int temp; 41 | 42 | temp = v[i]; 43 | v[i] = v[j]; 44 | v[j] = temp; 45 | } -------------------------------------------------------------------------------- /chapter-2/2.2-sorting/values.c: -------------------------------------------------------------------------------- 1 | #define NELEMS(array) (sizeof(array) / sizeof(array[0])) 2 | 3 | char *letters[] = { 4 | "A", 5 | "B", 6 | "C", 7 | "D", 8 | "E", 9 | "F", 10 | "G", 11 | "H", 12 | "I", 13 | "J"}; 14 | 15 | int numbers[] = { 16 | 100, 17 | 12, 18 | 11, 19 | 10, 20 | 9, 21 | 15, 22 | 18, 23 | 23, 24 | 27, 25 | 77, 26 | 909, 27 | 101, 28 | 1}; -------------------------------------------------------------------------------- /chapter-2/2.3-libraries/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data structures 2 | 3 | ## 2.3 Libraries 4 | 5 | ### Exercise 2.1 6 | 7 | Quicksort is most naturally expressed recursively.
Write it iteratively and compare the two versions. 8 | 9 | _Answer:_ Definitely, the more natural way was the recursive one. I had to put a lot more thinking and Googling into the iterative implementation. 10 | 11 | Both implementations can be found at [`quick_sort_iter_vs_rec.c`](quick_sort_iter_vs_rec.c) 12 | -------------------------------------------------------------------------------- /chapter-2/2.3-libraries/libraries.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | #include "values.c" 5 | 6 | void sort_numbers(); 7 | void sort_letters(); 8 | void binary_search(); 9 | int scmp(const void *p1, const void *p2); 10 | int icmp(const void *p1, const void *p2); 11 | int lookup(char *name, char *tab[], int ntab); 12 | 13 | int main() 14 | { 15 | sort_numbers(); 16 | sort_letters(); 17 | binary_search(); 18 | 19 | return 0; 20 | } 21 | 22 | void sort_numbers() 23 | { 24 | qsort(numbers, NELEMS(numbers), sizeof(numbers[0]), icmp); 25 | printf("sorted numbers:\n"); 26 | for (int i = 0; i < NELEMS(numbers); i++) 27 | { 28 | printf("%d, ", numbers[i]); 29 | } 30 | printf("\n"); 31 | } 32 | 33 | void sort_letters() 34 | { 35 | qsort(letters, NELEMS(letters), sizeof(letters[0]), scmp); 36 | printf("sorted letters:\n"); 37 | for (int i = 0; i < NELEMS(letters); i++) 38 | { 39 | printf("%s, ", letters[i]); 40 | } 41 | printf("\n"); 42 | } 43 | 44 | void binary_search() 45 | { 46 | int res = lookup("B", letters, NELEMS(letters)); 47 | if (res != 1) 48 | { 49 | printf("ERROR: Got %d for `lookup(\"B\", letters, NELEMS(letters)), expected %d`\n", res, 1); 50 | return; 51 | } 52 | 53 | res = lookup("G", letters, NELEMS(letters)); 54 | if (res != 6) 55 | { 56 | printf("ERROR: Got %d for `lookup(\"G\", letters, NELEMS(letters)), expected %d`\n", res, 6); 57 | return; 58 | } 59 | 60 | res = lookup("Q", letters, NELEMS(letters)); 61 | if (res != -1) 62 | { 63 | printf("ERROR: Got %d for `lookup(\"Q\", 
letters, NELEMS(letters)), expected %d`\n", res, -1); 64 | return; 65 | } 66 | 67 | printf("binary search: success\n"); 68 | } 69 | 70 | /* lookup: binary search for el in tab, using bsearch */ 71 | int lookup(char *el, char *tab[], int ntab) 72 | { 73 | char **res = (char **)bsearch(&el, tab, ntab, sizeof(tab[0]), scmp); 74 | if (res == NULL) 75 | return -1; 76 | else 77 | return res - tab; 78 | } 79 | 80 | int scmp(const void *p1, const void *p2) 81 | { 82 | char *v1 = *(char **)p1; 83 | char *v2 = *(char **)p2; 84 | 85 | return strcmp(v1, v2); 86 | } 87 | 88 | int icmp(const void *p1, const void *p2) 89 | { 90 | int v1 = *(int *)p1; 91 | int v2 = *(int *)p2; 92 | 93 | if (v1 > v2) 94 | return 1; 95 | else if (v2 > v1) 96 | return -1; 97 | else 98 | return 0; 99 | } -------------------------------------------------------------------------------- /chapter-2/2.3-libraries/quick_sort_iter_vs_rec.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include "values.c" 4 | 5 | void swap(int v[], int i, int j); 6 | void quicksort_rec(int v[], int n); 7 | void quicksort_iter(int v[], int n); 8 | 9 | int main() 10 | { 11 | // make two copies of the numbers array 12 | int *numbers_rec = malloc(sizeof(numbers)); 13 | int *numbers_iter = malloc(sizeof(numbers)); 14 | if (numbers_rec == NULL || numbers_iter == NULL) 15 | { 16 | printf("error: cannot copy array content\n"); 17 | return 1; 18 | } 19 | for (int i = 0; i < NELEMS(numbers); i++) 20 | { 21 | numbers_iter[i] = numbers[i]; 22 | numbers_rec[i] = numbers[i]; 23 | } 24 | 25 | quicksort_rec(numbers_rec, NELEMS(numbers)); /* NELEMS works only on the real array, not on a pointer */ 26 | quicksort_iter(numbers_iter, NELEMS(numbers)); 27 | 28 | for (int i = 0; i < NELEMS(numbers); i++) 29 | { 30 | if (numbers_iter[i] != numbers_rec[i]) 31 | { 32 | printf("error: expected element of both arrays to be equal.\n"); 33 | return 1; 34 | } 35 | } 36 | 37 | printf("SUCCESS.\n"); 38 | return 0; 
39 | } 40 | 41 | void quicksort_rec(int v[], int n) 42 | { 43 | int last = 0; 44 | if (n <= 1) 45 | return; 46 | 47 | swap(v, 0, rand() % n); /* move pivot to v[0] */ 48 | for (int i = 1; i < n; i++) 49 | if (v[i] < v[0]) 50 | swap(v, ++last, i); 51 | 52 | swap(v, 0, last); 53 | quicksort_rec(v, last); 54 | quicksort_rec(v + last + 1, n - last - 1); 55 | } 56 | 57 | /* partition: split arr[l..h] around the pivot arr[h], return the pivot's final index */ 58 | int partition(int arr[], int l, int h) 59 | { 60 | int x = arr[h]; 61 | int i = (l - 1); 62 | 63 | for (int j = l; j <= h - 1; j++) 64 | { 65 | if (arr[j] <= x) 66 | { 67 | i++; 68 | swap(arr, i, j); 69 | } 70 | } 71 | swap(arr, ++i, h); 72 | return i; 73 | } 74 | 75 | void quicksort_iter(int v[], int n) 76 | { 77 | int top = -1; 78 | int *stack; 79 | 80 | if (n <= 1) 81 | return; 82 | 83 | // explicit stack of pending [low, high] ranges; the ranges are disjoint 84 | // and hold at least two elements each, so 2 * n ints are more than enough 85 | stack = malloc(2 * n * sizeof(stack[0])); 86 | if (stack == NULL) 87 | { 88 | fprintf(stderr, "error: cannot allocate stack\n"); 89 | return; 90 | } 91 | 92 | stack[++top] = 0; 93 | stack[++top] = n - 1; 94 | while (top >= 0) 95 | { 96 | int h = stack[top--]; 97 | int l = stack[top--]; 98 | int p = partition(v, l, h); 99 | 100 | // if there are at least two elements on the left side of the pivot, 101 | // remember that range so it gets sorted later 102 | if (p - 1 > l) 103 | { 104 | stack[++top] = l; 105 | stack[++top] = p - 1; 106 | } 107 | // same for the right side of the pivot 108 | if (p + 1 < h) 109 | { 110 | stack[++top] = p + 1; 111 | stack[++top] = h; 112 | } 113 | } 114 | free(stack); 115 | } 116 | 117 | /* swap: interchange v[i] and v[j] */ 118 | void swap(int v[], int i, int j) 119 | { 120 | int temp; 121 | 122 | temp = v[i]; 123 | v[i] = v[j]; 124 | v[j] = temp; 125 | } -------------------------------------------------------------------------------- /chapter-2/2.3-libraries/values.c: -------------------------------------------------------------------------------- 1 | #define NELEMS(array) (sizeof(array) / sizeof(array[0])) 2 | 3 | char *letters[] = { 4 | "A", 5 | "B", 6 | "C", 7 | "D", 8 | "E", 9 | "F", 10 | "G", 11 | "H", 12 | "I", 13 | "J"}; 14 | 15 | int numbers[] = { 16 | 100, 17 | 12, 18 | 11, 19 | 10, 20 | 9, 21 | 15, 22 | 18, 23 | 23, 24 | 27, 25 | 77, 26 | 909, 27 | 101, 28 | 1}; -------------------------------------------------------------------------------- 
/chapter-2/2.4-java-quicksort/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## 2.4. A Java Quicksort 4 | 5 | ### Exercise 2.2 6 | 7 | Our Java quicksort does a fair amount of type conversion as items are cast from their original type (like `Integer`) to `Object` and back again. 8 | Experiment with a version of `QuickSort.sort` that uses the specific type being sorted, to estimate what performance penalty is incurred by type conversions. 9 | 10 | _Answer:_ Running the code in `Main.java` gave the following results: 11 | 12 | - `ObjectQuickSorter` took 5 ms 13 | - `GenericQuickSorter` took 13 ms 14 | - `IntegerQuickSorter` took 3 ms 15 | 16 | Therefore, the generic sorter is the slowest one and the type-specific one (`IntegerQuickSorter`) is the quickest one. 17 | The difference between `IntegerQuickSorter` and `ObjectQuickSorter` is much smaller than the difference between these two and `GenericQuickSorter`. 18 | 19 | All implementations can be found at [`Main.java`](Main.java) 20 | -------------------------------------------------------------------------------- /chapter-2/2.5-o-notation/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## 2.5. O-Notation 4 | 5 | ### Exercise 2.3 6 | 7 | What are some input sequences that might cause a quicksort implementation to display worst-case behaviour? 8 | Try to find some that provoke the library version into running slowly. 9 | Automate the process so that you can specify and perform a large number of experiments easily. 
10 | 11 | _Answer:_ My internet search found that bad input for quicksort can be: 12 | 13 | - already sorted arrays 14 | - arrays with all values equal 15 | - arrays with all values equal, but one 16 | - arrays sorted in reverse order 17 | 18 | Timing the library `qsort` on such inputs showed no measurable difference, probably because the input data was too small. 19 | 20 | Doing this with my own, slightly modified algorithm, which let me count the number of iterations, showed 21 | that the biggest difference between the smaller and the bigger input 22 | did indeed appear when the input was in the last 3 categories (arrays with all values equal, arrays with all values 23 | equal but one, arrays sorted in reverse order). 24 | 25 | The implementation can be found in [`quick_sort_worst_cases.c`](quick_sort_worst_cases.c). 26 | 27 | ### Exercise 2.4 28 | 29 | Design and implement an algorithm that will sort an array of _n_ integers as slowly as possible. 30 | You have to play fair: the algorithm must make progress and eventually terminate, and the implementation 31 | must not cheat with tricks like time-wasting loops. What is the complexity of your algorithm as a function of _n_? 32 | 33 | _Answer:_ For this I went with the classic bubble sort. This is an algorithm that repeatedly walks through the 34 | array, compares each pair of adjacent elements and swaps them if they are out of order, until every element 35 | has reached its place. The complexity of this is _O(n^2)_, because adding one element means about _n_ more comparisons. 36 | 37 | Another one that could be even slower than this one would be one that tries all possible orderings of the array until it finds a sorted one. 38 | The complexity of that would be _O(n · n!)_, because there are _n!_ orderings and checking whether one of them is sorted takes up to _n_ comparisons. 
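The quadratic cost of the bubble sort can be checked by counting comparisons instead of measuring time. Below is a minimal sketch, assuming the same loop structure as `sort` in `slow_sort.c`; the name `bubble_sort_count` is made up for this example and is not part of the repository.

```c
#include <assert.h>

/* bubble_sort_count: sort arr[0..n-1] in ascending order and return the
   number of element comparisons performed (hypothetical helper) */
long bubble_sort_count(int arr[], int n)
{
    long comparisons = 0;

    assert(n >= 0);
    for (int i = 0; i < n - 1; i++) {
        /* after pass i, the last i + 1 elements are in their final place */
        for (int j = 0; j < n - i - 1; j++) {
            comparisons++;
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
        }
    }
    return comparisons; /* (n-1) + (n-2) + ... + 1 = n * (n - 1) / 2 */
}
```

Because there is no early exit, the count does not depend on the input order: 12 elements need 66 comparisons and 24 elements need 276, so doubling _n_ roughly quadruples the work.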
39 | 40 | The implementation can be found at [`slow_sort.c`](slow_sort.c) 41 | -------------------------------------------------------------------------------- /chapter-2/2.5-o-notation/quick_sort_worst_cases.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | int icmp(const void *p1, const void *p2); 6 | void cmp_arr(int first[], int second[], char *name); 7 | 8 | int main() 9 | { 10 | int random[] = {6, 4, 5, 9, 2, 7, 3, 1, 10, 8}; 11 | int random_bigger[] = {6, 4, 5, 9, 2, 7, 3, 1, 10, 8, 6, 4, 5, 9, 2, 7, 3, 1, 10, 8}; 12 | int sorted[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; 13 | int sorted_bigger[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}; 14 | int sorted_reverse[] = {10, 9, 8, 7, 6, 5, 4, 3, 2, 1}; 15 | int sorted_reverse_bigger[] = {20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1}; 16 | int all_equal[] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1}; 17 | int all_equal_bigger[] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}; 18 | int all_equal_but_one[] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 10}; 19 | int all_equal_but_one_bigger[] = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 10}; 20 | int all_equal_but_one_reverse[] = {10, 1, 1, 1, 1, 1, 1, 1, 1, 1}; 21 | int all_equal_but_one_reverse_bigger[] = {10, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}; 22 | 23 | cmp_arr(random, random_bigger, "random"); 24 | cmp_arr(sorted, sorted_bigger, "sorted"); 25 | cmp_arr(sorted_reverse, sorted_reverse_bigger, "sorted_reverse"); 26 | cmp_arr(all_equal, all_equal_bigger, "all_equal"); 27 | cmp_arr(all_equal_but_one, all_equal_but_one_bigger, "all_equal_but_one"); 28 | cmp_arr(all_equal_but_one_reverse, all_equal_but_one_reverse_bigger, "all_equal_but_one_reverse"); 29 | } 30 | 31 | void cmp_arr(int first[], int second[], char *name) 32 | { 33 | struct timespec start, end; 34 | clock_gettime(CLOCK_MONOTONIC_RAW, &start); 35 | 
qsort(first, 10, sizeof(first[0]), icmp); 36 | clock_gettime(CLOCK_MONOTONIC_RAW, &end); 37 | uint64_t delta_first = (end.tv_sec - start.tv_sec) * 1000 + (end.tv_nsec - start.tv_nsec) / 1000000; 38 | printf("time for sorting %s: %llu\n", name, delta_first); 39 | 40 | clock_gettime(CLOCK_MONOTONIC_RAW, &start); 41 | qsort(second, 20, sizeof(second[0]), icmp); 42 | clock_gettime(CLOCK_MONOTONIC_RAW, &end); 43 | uint64_t delta_second = (end.tv_sec - start.tv_sec) * 1000 + (end.tv_nsec - start.tv_nsec) / 1000000; 44 | printf("time for sorting %s_bigger: %llu\n", name, delta_second); 45 | printf("for twice the more values, sorting %s array, the difference in iterations was %llu ms slower\n", name, delta_second - delta_first); 46 | } 47 | 48 | int icmp(const void *p1, const void *p2) 49 | { 50 | int v1 = *(int *)p1; 51 | int v2 = *(int *)p2; 52 | 53 | if (v1 > v2) 54 | return 1; 55 | else if (v2 > v1) 56 | return -1; 57 | else 58 | return 0; 59 | } -------------------------------------------------------------------------------- /chapter-2/2.5-o-notation/slow_sort.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | void swap(int *xp, int *yp); 4 | void sort(int arr[], int n); 5 | 6 | int main() 7 | { 8 | int arr[] = {10, 12, 45, 38, 57, 12, 27, 129, 17, 27, 49, 48}; 9 | sort(arr, sizeof(arr) / sizeof(arr[0])); 10 | for (int i = 0; i < sizeof(arr) / sizeof(arr[0]); i++) 11 | { 12 | printf("%d, ", arr[i]); 13 | } 14 | printf("\n"); 15 | return 0; 16 | } 17 | 18 | void sort(int arr[], int n) 19 | { 20 | for (int i = 0; i < n - 1; i++) 21 | for (int j = 0; j < n - i - 1; j++) 22 | if (arr[j] > arr[j + 1]) 23 | swap(&arr[j], &arr[j + 1]); 24 | } 25 | 26 | void swap(int *xp, int *yp) 27 | { 28 | int temp = *xp; 29 | *xp = *yp; 30 | *yp = temp; 31 | } -------------------------------------------------------------------------------- /chapter-2/2.6-growing-arrays/README.md: 
-------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## 2.6 Growing Arrays 4 | 5 | ### Exercise 2.5 6 | 7 | In the code above, `delname` doesn't call `realloc` to return the memory freed by the deletion. 8 | Is this worthwhile? How would you decide whether to do so? 9 | 10 | _Answer:_ Not calling `realloc` after a deletion is not a real memory leak, because the array still owns the memory and `addname` reuses it, but it does keep memory allocated that nobody is using. 11 | I think this can be fine when working with small amounts of data. However, if the data grows a lot and then shrinks after many 12 | delete operations, the unused capacity could become a problem, and in this case it's better to 13 | return the memory we are not using. 14 | 15 | ### Exercise 2.6 16 | 17 | Implement the necessary changes to `addname` and `delname` to delete items by marking deleted items as unused. How isolated is the rest of the program from this change? 18 | 19 | _Answer:_ The implementation can be found in `growing_arrays.c`, namely the `addname_marked` and `delname_marked` functions. 20 | As for the second question, the rest of the program is just a caller of these functions, and thus the way we delete and add elements is just an implementation detail that the other parts of the program should not care about. 21 | Provided that the interface is the same (which it is), the other parts of the program are well isolated from this change. 
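As an illustration of the marking approach from Exercise 2.6, here is a minimal sketch. It is not the code from `growing_arrays.c`: the `Table` type and the `table_add` and `table_del` names are made up for the example, and the table is passed as a parameter instead of being a global. It assumes that a `NULL` name marks an unused slot and that marked slots still count towards `nval`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Nameval {
    char *name;   /* NULL marks an unused (deleted) slot */
    int value;
} Nameval;

typedef struct Table {   /* hypothetical stand-in for nvtab */
    int nval;            /* slots in use, including marked ones */
    int max;             /* allocated number of slots */
    Nameval *nameval;    /* array of name-value pairs */
} Table;

/* table_del: mark the first entry called name as unused; returns 1 if found */
int table_del(Table *t, const char *name)
{
    for (int i = 0; i < t->nval; i++) {
        if (t->nameval[i].name != NULL &&
            strcmp(t->nameval[i].name, name) == 0) {
            t->nameval[i].name = NULL;  /* slot stays in place, nval unchanged */
            return 1;
        }
    }
    return 0;
}

/* table_add: reuse a marked slot if there is one, otherwise append;
   returns the index of the slot used, or -1 if memory runs out */
int table_add(Table *t, Nameval newname)
{
    assert(newname.name != NULL);   /* NULL is reserved as the marker */
    for (int i = 0; i < t->nval; i++) {
        if (t->nameval[i].name == NULL) {
            t->nameval[i] = newname;
            return i;
        }
    }
    if (t->nval == t->max) {        /* need to grow the array */
        int newmax = (t->max == 0) ? 1 : 2 * t->max;
        Nameval *nvp = realloc(t->nameval, newmax * sizeof(Nameval));
        if (nvp == NULL)
            return -1;
        t->nameval = nvp;
        t->max = newmax;
    }
    t->nameval[t->nval] = newname;
    return t->nval++;
}
```

Starting from `Table t = {0, 0, NULL};`, adding three names, deleting the second one with `table_del` and then adding a fourth puts the new entry at index 1. The trade-off is that every routine walking the array, such as printing or lookup, has to skip entries whose name is `NULL`.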
22 | -------------------------------------------------------------------------------- /chapter-2/2.6-growing-arrays/growing_arrays.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | typedef struct Nameval 6 | { 7 | char *name; 8 | int value; 9 | } Nameval; 10 | 11 | struct NVtab 12 | { 13 | int nval; /* current number of values */ 14 | int max; /* allocated number of values*/ 15 | Nameval *nameval; /* array of name-value pairs*/ 16 | } nvtab; 17 | 18 | enum 19 | { 20 | NVINIT = 1, 21 | NVGROW = 2 22 | }; 23 | 24 | int delname(char *name); 25 | int delname_no_memmove(char *name); 26 | int delname_marked(char *name); 27 | int addname(Nameval newname); 28 | int addname_marked(Nameval newname); 29 | void print_nvtab(); 30 | 31 | int main() 32 | { 33 | Nameval *n = (Nameval *)malloc(sizeof(Nameval)); 34 | (*n).name = "Ivan"; 35 | (*n).value = 1; 36 | addname(*n); 37 | 38 | (*n).name = "Georgi"; 39 | (*n).value = 2; 40 | addname(*n); 41 | 42 | (*n).name = "Anton"; 43 | (*n).value = 4; 44 | addname(*n); 45 | 46 | (*n).name = "Pesho"; 47 | (*n).value = 4; 48 | addname(*n); 49 | 50 | (*n).name = "Dragan"; 51 | (*n).value = 5; 52 | addname(*n); 53 | 54 | (*n).name = "Petkan"; 55 | (*n).value = 6; 56 | addname(*n); 57 | 58 | print_nvtab(); 59 | 60 | delname("Dragan"); 61 | printf("after deleting Dragan\n"); 62 | 63 | print_nvtab(); 64 | 65 | delname("Dragan"); 66 | delname_no_memmove("Petkan"); 67 | printf("after deleting Petkan\n"); 68 | 69 | print_nvtab(); 70 | 71 | delname_marked("Georgi"); 72 | (*n).name = "Gosho"; 73 | (*n).value = 5; 74 | addname_marked(*n); 75 | 76 | printf("Gosho should be in second place, instead of Georgi\n"); 77 | print_nvtab(); 78 | 79 | return 0; 80 | } 81 | 82 | void print_nvtab() 83 | { 84 | for (int i = 0; i < nvtab.nval; i++) 85 | { 86 | printf("%s-%d, ", nvtab.nameval[i].name, nvtab.nameval[i].value); 87 | } 88 | printf("\n"); 89 | } 90 | 91 | /* addname: add new 
name and value to nvtab */ 92 | int addname(Nameval newname) 93 | { 94 | Nameval *nvp; 95 | 96 | if (nvtab.nameval == NULL) /* first time */ 97 | { 98 | printf("initializing nvtab.nameval\n"); 99 | nvtab.nameval = (Nameval *)malloc(NVINIT * sizeof(Nameval)); 100 | if (nvtab.nameval == NULL) 101 | return -1; /* not enough memory to allocate */ 102 | nvtab.max = NVINIT; 103 | nvtab.nval = 0; 104 | } 105 | else if (nvtab.nval == nvtab.max) /* need to grow the array */ 106 | { 107 | printf("growing nvtab.nameval - new size: %d\n", NVGROW * nvtab.max); 108 | nvp = (Nameval *)realloc(nvtab.nameval, (NVGROW * nvtab.max) * sizeof(Nameval)); 109 | if (nvp == NULL) 110 | return -1; /* not enough memory to reallocate */ 111 | nvtab.max *= NVGROW; 112 | nvtab.nameval = nvp; 113 | } 114 | 115 | nvtab.nameval[nvtab.nval] = newname; 116 | return nvtab.nval++; 117 | } 118 | 119 | int delname(char *name) 120 | { 121 | for (int i = 0; i < nvtab.nval; i++) 122 | { 123 | if (strcmp(nvtab.nameval[i].name, name) == 0) 124 | { 125 | memmove(nvtab.nameval + i, nvtab.nameval + i + 1, (nvtab.nval - (i + 1)) * sizeof(Nameval)); 126 | nvtab.nval--; 127 | return 1; 128 | } 129 | } 130 | return 0; 131 | } 132 | 133 | int delname_no_memmove(char *name) 134 | { 135 | for (int i = 0; i < nvtab.nval; i++) 136 | { 137 | if (strcmp(nvtab.nameval[i].name, name) == 0) 138 | { 139 | for (int j = i; j < nvtab.nval; j++) 140 | { 141 | nvtab.nameval[j] = nvtab.nameval[j + 1]; 142 | } 143 | nvtab.nval--; 144 | return 1; 145 | } 146 | } 147 | return 0; 148 | } 149 | 150 | int addname_marked(Nameval newname) 151 | { 152 | Nameval *nvp; 153 | 154 | if (nvtab.nameval == NULL) /* first time */ 155 | { 156 | printf("initializing nvtab.nameval\n"); 157 | nvtab.nameval = (Nameval *)malloc(NVINIT * sizeof(Nameval)); 158 | if (nvtab.nameval == NULL) 159 | return -1; /* not enough memory to allocate */ 160 | nvtab.max = NVINIT; 161 | nvtab.nval = 0; 162 | } 163 | else 164 | { 165 | for (int i = 0; i < nvtab.nval; i++) 
166 | { 167 | if (nvtab.nameval[i].name == NULL) /* reuse a slot marked as unused */ 168 | { 169 | nvtab.nameval[i] = newname; 170 | return i; 171 | } 172 | } 173 | } 174 | 175 | if (nvtab.nval == nvtab.max) /* need to grow the array */ 176 | { 177 | printf("growing nvtab.nameval - new size: %d\n", NVGROW * nvtab.max); 178 | nvp = (Nameval *)realloc(nvtab.nameval, (NVGROW * nvtab.max) * sizeof(Nameval)); 179 | if (nvp == NULL) 180 | return -1; /* not enough memory to reallocate */ 181 | nvtab.max *= NVGROW; 182 | nvtab.nameval = nvp; 183 | } 184 | 185 | nvtab.nameval[nvtab.nval] = newname; 186 | return nvtab.nval++; 187 | } 188 | 189 | int delname_marked(char *name) 190 | { 191 | for (int i = 0; i < nvtab.nval; i++) 192 | { 193 | if (nvtab.nameval[i].name != NULL && strcmp(nvtab.nameval[i].name, name) == 0) 194 | { 195 | nvtab.nameval[i].name = NULL; 196 | /* nval is unchanged: the marked slot stays in the array */ 197 | return 1; 198 | } 199 | } 200 | return 0; 201 | } 202 | -------------------------------------------------------------------------------- /chapter-2/2.7-lists/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## 2.7 Lists 4 | 5 | ### Exercise 2-7 6 | 7 | Implement some of the other list operations: copy, merge, split, insert before or after a specific item. 8 | How do the two insertion operations differ in difficulty? How much can you use the routines we've written, 9 | and how much must you create yourself? 10 | 11 | _Answer:_ The solution can be found in [`chapter-2/2.7-lists/lists.c`](lists.c). 12 | 13 | - Copy - the `copy` function. This one is straightforward: it initializes a new list and adds 14 | copies of the elements of the old list to the new one. For this purpose we can reuse the `newitem` function, 15 | which creates a new element. 16 | - Merge - the `merge` function. Here we reuse `copy` to make copies of both lists and then just join them 17 | together via the `addend` function. Neat. 
18 | - Insert before - the `insert_before` function. This one is a bit trickier, because there are 2 cases: 19 | - when the element we want to insert before is the first one - in this case, we make a copy of the original list, 20 | point the `next` field of the new element to this copy, and then overwrite the first element of the original list with the contents of the new element. 21 | - when the element we want to insert before is not the first one - in this case we point the `next` field of the new element to the `next` field of the previous element of the list (to which we hold a pointer), and then point the `next` field of the previous element to the new element. Apart from `copy` in the first case, we don't reuse any of the existing functionality. 22 | - Insert after - the `insert_after` function. Here we iterate over the elements of the passed list until we find the matching one. 23 | Then we point the `next` field of the new element to the element that follows the matching one, and the `next` field of the matching element to the new element. 24 | In this function we don't reuse any of the other ones, because it's just pointer changes and comparisons. 25 | 26 | ### Exercise 2-8 27 | 28 | Write recursive and iterative versions of `reverse`, which reverses a list. 29 | Do not create new list items; re-use the existing ones. 30 | 31 | _Answer:_ The solution can be found in [`chapter-2/2.7-lists/reverse.c`](reverse.c) 32 | 33 | ### Exercise 2-9 34 | 35 | Write a generic List type for C. The easiest way is to have each list item hold a `void*` that points to the data. 36 | Do the same for C++ by defining a template and for Java by defining a class that holds lists of type Object. 37 | What are the strengths and weaknesses of the various languages for this job? 38 | 39 | _Answer:_ The solutions can be found at [`generic_lists`](generic_lists). 40 | 41 | - The C solution is really weird, because in order to get a generic list, we must use `void*`. 
42 | That means that the type-checking responsibility is shifted from the compiler to the consumer of the list. 43 | Also, when I tried it out, it produced some warnings of the sort `[-Wincompatible-pointer-types]`. 44 | So it should be compiled with flags to ignore this warning. 45 | All in all, I think that it is better if the 46 | consumers implement their own list, instead of using this generic one. 47 | - C++ solution - TODO 48 | - Java solution - in Java this is pretty much implemented with `List` and all of its implementations, 49 | so I'd rather not spend time reimplementing those or implementing wrappers around them. 50 | 51 | ### Exercise 2-10 52 | 53 | Devise and implement a set of tests for verifying that the list routines you write are correct. 54 | Chapter 6 discusses strategies for testing. 55 | 56 | _Answer:_ The solution can be found at [`tests.c`](tests.c). 57 | 58 | I came up with a mini test framework, where all of the tests are executed in the `int main()` function. 59 | The framework consists of a generic `int test(int (*fn)(Nameval *), char *description)` function, which accepts as arguments: 60 | 61 | - `int (*fn)(Nameval *)` - the function where the actual test is executed. It accepts a list as a parameter 62 | and returns an integer. It accepts a list, so that the test invoker is responsible for constructing the input data and cleaning it up afterwards. 63 | It returns an integer to indicate whether the test has executed successfully or not. 64 | - `char *description` - the description of the test that is to be executed. 65 | 66 | With this in place, `main` invokes `test` once for each test function we have. The test is executed and a 67 | response code is returned. If the response code is an error, `test` displays an error message; 68 | if it is a success, it displays a success message. In both cases it propagates the code further on. 
69 | This allows us to aggregate all error codes in `main` and display the number of failing tests (because all error codes are `1` at the moment). 70 | -------------------------------------------------------------------------------- /chapter-2/2.7-lists/generic_lists/generic_list.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | 4 | typedef struct 5 | { 6 | void *object; /* the data held by this item */ 7 | void **next; /* declared as void ** rather than List *, which is what triggers -Wincompatible-pointer-types */ 8 | } List; 9 | 10 | typedef struct 11 | { 12 | char *name; 13 | int age; 14 | } Person; 15 | 16 | int main() 17 | { 18 | List *ll = (List *)malloc(sizeof(List)); 19 | if (ll == NULL) 20 | { 21 | printf("Not enough memory to allocate for the list."); 22 | return 1; 23 | } 24 | 25 | Person *p = (Person *)malloc(sizeof(Person)); 26 | if (p == NULL) 27 | { 28 | printf("Not enough memory to allocate for the person."); 29 | return 1; 30 | } 31 | 32 | p->name = "Peter"; 33 | p->age = 21; 34 | 35 | ll->object = p; 36 | ll->next = NULL; 37 | 38 | p = (Person *)malloc(sizeof(Person)); 39 | if (p == NULL) 40 | { 41 | printf("Not enough memory to allocate for the person."); 42 | return 1; 43 | } 44 | p->name = "Georgi"; 45 | p->age = 22; 46 | 47 | List *new = (List *)malloc(sizeof(List)); 48 | if (new == NULL) 49 | { 50 | printf("Not enough memory to allocate for the list."); 51 | return 1; 52 | } 53 | new->object = p; 54 | new->next = NULL; 55 | 56 | ll->next = new; 57 | 58 | for (; ll != NULL; ll = ll->next) 59 | { 60 | printf("%s: %d, ", ((Person *)ll->object)->name, ((Person *)ll->object)->age); 61 | } 62 | printf("\n"); 63 | 64 | return 0; 65 | } -------------------------------------------------------------------------------- /chapter-2/2.7-lists/lists.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | 5 | typedef struct Nameval Nameval; 6 | struct Nameval 7 | { 8 | char *name; 9 | int value; 10 | Nameval *next; 11 | }; 12 | 13 | void freeall(Nameval 
*listp); 14 | Nameval *copy(Nameval *listp); 15 | void printnv(Nameval *p, void *arg); 16 | Nameval *newitem(char *name, int value); 17 | void incounter(Nameval *listp, void *arg); 18 | Nameval *lookup(Nameval *listp, char *name); 19 | int delitem(Nameval *listp, char *name); 20 | Nameval *merge(Nameval *list, Nameval *other); 21 | Nameval *addend(Nameval *listp, Nameval *newp); 22 | Nameval *addfront(Nameval *listp, Nameval *newp); 23 | int insert_after(Nameval *list, char *name, Nameval *newelem); 24 | int insert_before(Nameval *list, char *name, Nameval *newelem); 25 | void apply(Nameval *listp, void (*fn)(Nameval *, void *), void *arg); 26 | 27 | void apply(Nameval *listp, void (*fn)(Nameval *, void *), void *arg) 28 | { 29 | for (; listp != NULL; listp = listp->next) 30 | (*fn)(listp, arg); 31 | } 32 | 33 | /* newitem: create new Nameval with name and value and return it */ 34 | Nameval *newitem(char *name, int value) 35 | { 36 | Nameval *newp = (Nameval *)malloc(sizeof(Nameval)); 37 | if (newp == NULL) 38 | return NULL; 39 | 40 | newp->name = name; 41 | newp->value = value; 42 | newp->next = NULL; 43 | 44 | return newp; 45 | } 46 | 47 | /* addfront: add newp to the front of listp */ 48 | Nameval *addfront(Nameval *listp, Nameval *newp) 49 | { 50 | newp->next = listp; 51 | return newp; 52 | } 53 | 54 | Nameval *addend(Nameval *listp, Nameval *newp) 55 | { 56 | if (listp == NULL) 57 | return newp; 58 | 59 | Nameval *p; 60 | for (p = listp; p->next != NULL; p = p->next) 61 | ; /* do nothing, just go to the end of the list */ 62 | p->next = newp; 63 | return listp; 64 | } 65 | 66 | Nameval *lookup(Nameval *listp, char *name) 67 | { 68 | while (listp != NULL) 69 | { 70 | if (strcmp(listp->name, name) == 0) 71 | { 72 | return listp; 73 | } 74 | listp = listp->next; 75 | } 76 | return NULL; /* no mathc found */ 77 | } 78 | 79 | int delitem(Nameval *listp, char *name) 80 | { 81 | Nameval *prev = NULL; 82 | for (Nameval *p = listp; p != NULL; p = p->next) 83 | { 84 | 
if (strcmp(p->name, name) == 0) 85 | { 86 | if (prev == NULL) 87 | listp = p->next; /* only the local copy changes; the caller's head pointer is not updated */ 88 | else 89 | prev->next = p->next; /* unlink p from its predecessor */ 90 | free(p); 91 | return 1; 92 | } 93 | prev = p; 94 | } 95 | return 0; 96 | } 97 | 98 | /* freeall: destroy the list */ 99 | void freeall(Nameval *listp) 100 | { 101 | Nameval *next; 102 | for (; listp != NULL; listp = next) 103 | { 104 | next = listp->next; 105 | /* assumes name is freed elsewhere */ 106 | free(listp); 107 | } 108 | } 109 | 110 | /* copy: copies the elements of listp into a new list, without modifying the original list 111 | returns: the beginning of the new list 112 | */ 113 | Nameval *copy(Nameval *listp) 114 | { 115 | Nameval *begin = NULL; 116 | 117 | for (; listp != NULL; listp = listp->next) 118 | { 119 | Nameval *new = newitem(listp->name, listp->value); 120 | if (new == NULL) 121 | return NULL; /* not enough memory to allocate new element */ 122 | 123 | begin = addend(begin, new); 124 | } 125 | 126 | return begin; 127 | } 128 | 129 | /* merge: returns a new list that contains all elements from list and other, without modifying the original lists */ 130 | Nameval *merge(Nameval *list, Nameval *other) 131 | { 132 | return addend(copy(list), copy(other)); 133 | } 134 | 135 | int insert_before(Nameval *list, char *name, Nameval *newelem) 136 | { 137 | Nameval *prev = NULL; 138 | 139 | for (; list != NULL; list = list->next) 140 | { 141 | if (strcmp(list->name, name) == 0) 142 | { 143 | if (prev == NULL) /* this means the matching element is the first one */ 144 | { 145 | newelem->next = copy(list); 146 | *list = *newelem; 147 | return 1; 148 | } 149 | 150 | newelem->next = prev->next; 151 | prev->next = newelem; 152 | return 1; 153 | } 154 | prev = list; 155 | } 156 | 157 | return 0; 158 | } 159 | 160 | /* insert_after: finds the element with name=name in list and inserts newelem after it */ 161 | int insert_after(Nameval *list, char *name, Nameval *newelem) 162 | { 163 | for (; list 
!= NULL; list = list->next) 164 | { 165 | if (strcmp(list->name, name) == 0) 166 | { 167 | newelem->next = list->next; 168 | list->next = newelem; 169 | return 1; 170 | } 171 | } 172 | return 0; 173 | } 174 | 175 | void incounter(Nameval *listp, void *arg) 176 | { 177 | /* listp is unused */ 178 | int *ip = (int *)arg; 179 | (*ip)++; 180 | } 181 | 182 | void printnv(Nameval *p, void *arg) 183 | { 184 | char *fmt = (char *)arg; 185 | printf(fmt, p->name, p->value); 186 | } -------------------------------------------------------------------------------- /chapter-2/2.7-lists/reverse.c: -------------------------------------------------------------------------------- 1 | #include "lists.c" 2 | 3 | Nameval *reverse_iter(Nameval *list); 4 | Nameval *reverse_rec(Nameval *list); 5 | struct Nameval *recursiveReverseLL(struct Nameval *first); 6 | 7 | int main() 8 | { 9 | Nameval *list = NULL; 10 | 11 | list = addfront(list, newitem("Ivan", 2)); 12 | list = addfront(list, newitem("Gosho", 3)); 13 | list = addend(list, newitem("Petran", 4)); 14 | 15 | printf("printing list\n"); 16 | apply(list, printnv, "%s: %d, "); 17 | printf("\n"); 18 | 19 | list = reverse_iter(list); 20 | 21 | printf("printing list\n"); 22 | apply(list, printnv, "%s: %d, "); 23 | printf("\n"); 24 | 25 | list = reverse_rec(list); 26 | 27 | printf("printing list\n"); 28 | apply(list, printnv, "%s: %d, "); 29 | printf("\n"); 30 | 31 | return 0; 32 | } 33 | 34 | Nameval *reverse_iter(Nameval *list) 35 | { 36 | Nameval *prev = NULL; 37 | Nameval *next = NULL; 38 | Nameval *curr = list; 39 | 40 | while (curr != NULL) 41 | { 42 | next = curr->next; 43 | curr->next = prev; 44 | prev = curr; 45 | curr = next; 46 | } 47 | return prev; 48 | } 49 | 50 | Nameval *reverse_rec(Nameval *list) 51 | { 52 | if (list == NULL) 53 | return NULL; 54 | 55 | if (list->next == NULL) 56 | return list; 57 | 58 | Nameval *new = reverse_rec(list->next); 59 | list->next->next = list; 60 | list->next = NULL; 61 | return new; 62 | } 
-------------------------------------------------------------------------------- /chapter-2/2.7-lists/tests.c: -------------------------------------------------------------------------------- 1 | #include "lists.c" 2 | 3 | char *names[] = { 4 | "Georgi", 5 | "Pesho", 6 | "Bai Tosho"}; 7 | 8 | int test(int (*fn)(Nameval *), char *description); 9 | int assert_list(Nameval *list, char **expected_names, int expected_names_size); 10 | 11 | int test_addend(Nameval *list); 12 | int test_addfront(Nameval *list); 13 | int test_insert_after(Nameval *list); 14 | int test_insert_after_noexist(Nameval *list); 15 | int test_insert_after_noexist(Nameval *list); 16 | int test_insert_before(Nameval *list); 17 | int test_insert_before_noexist(Nameval *list); 18 | int test_merge(Nameval *list); 19 | int test_merge_null(Nameval *list); 20 | int test_delitem(Nameval *list); 21 | int test_delitem_noexist(Nameval *list); 22 | int test_lookup(Nameval *list); 23 | int test_lookup_noexist(Nameval *list); 24 | int test_copy(Nameval *list); 25 | int test_apply_incounter(Nameval *list); 26 | 27 | int main() 28 | { 29 | int res = 0; 30 | res += test(test_addend, "Adding element to the end of the list."); 31 | res += test(test_addfront, "Adding element to the front of the list."); 32 | res += test(test_insert_after, "Adding element after a certain element."); 33 | res += test(test_insert_after_noexist, "Adding element after a certain element that does not exist."); 34 | res += test(test_insert_before, "Adding element before a certain element."); 35 | res += test(test_insert_before_noexist, "Adding element before a certain element that does not exist."); 36 | res += test(test_merge, "Merging two lists."); 37 | res += test(test_merge_null, "Merging two lists, when one of the them is NULL."); 38 | res += test(test_delitem, "Deleting an item from the list."); 39 | res += test(test_delitem_noexist, "Deleting a non-existent item from the list."); 40 | res += test(test_lookup, "Looking up an item from 
the list."); 41 | res += test(test_lookup_noexist, "Looking up a non-existing item from the list."); 42 | res += test(test_copy, "Copy a list."); 43 | res += test(test_apply_incounter, "Count the element in a list, via the functions `apply` and `incounter`"); 44 | 45 | if (res == 0) 46 | printf(" ✅✅✅ All the test were executed succesfully. ✅✅✅\n"); 47 | else 48 | printf(" ❌❌❌There were %d failures❌❌❌\n", res); 49 | 50 | return res; 51 | } 52 | 53 | int test_addend(Nameval *list) 54 | { 55 | char *expected_names[] = { 56 | "Georgi", 57 | "Pesho", 58 | "Bai Tosho", 59 | "Gosho" /* this is inserted at the end */ 60 | }; 61 | 62 | list = addend(list, newitem("Gosho", 0)); 63 | 64 | return assert_list(list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 65 | } 66 | 67 | int test_addfront(Nameval *list) 68 | { 69 | char *expected_names[] = { 70 | "Gosho", /* this is inserted at the front */ 71 | "Georgi", 72 | "Pesho", 73 | "Bai Tosho", 74 | }; 75 | 76 | list = addfront(list, newitem("Gosho", 0)); 77 | 78 | return assert_list(list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 79 | } 80 | 81 | int test_insert_after(Nameval *list) 82 | { 83 | char *expected_names[] = { 84 | "Georgi", 85 | "Pesho", 86 | "Gosho", // inserted after Pesho in the test 87 | "Bai Tosho"}; 88 | 89 | int res = insert_after(list, "Pesho", newitem("Gosho", 0)); 90 | if (res != 1) 91 | { 92 | printf("[ERROR] Got (%d) response code from `insert_after`, expected 1\n", res); 93 | return 1; 94 | } 95 | 96 | return assert_list(list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 97 | } 98 | 99 | int test_insert_after_noexist(Nameval *list) 100 | { 101 | int res = insert_after(list, "NO_EXIST", newitem("who cares", 0)); 102 | if (res != 0) 103 | { 104 | printf("[ERROR] Got (%d) response code from `insert_after` when element is not present, expected 0\n", res); 105 | return 1; 106 | } 107 | 108 | Nameval *el = lookup(list, "who cares"); 109 
| if (el != NULL) 110 | { 111 | printf("[ERROR] Found element \"who cares\" in the list, expected to not be inserted.\n"); 112 | return 1; 113 | } 114 | 115 | return assert_list(list, names, sizeof(names) / sizeof(names[0])); 116 | } 117 | 118 | int test_insert_before(Nameval *list) 119 | { 120 | char *expected_names[] = { 121 | "Georgi", 122 | "Gosho", // inserted before Pesho in the test 123 | "Pesho", 124 | "Bai Tosho"}; 125 | 126 | int res = insert_before(list, "Pesho", newitem("Gosho", 0)); 127 | if (res != 1) 128 | { 129 | printf("[ERROR] Got (%d) error code from `insert_after`, expected 1\n", res); 130 | return 1; 131 | } 132 | 133 | return assert_list(list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 134 | } 135 | 136 | int test_insert_before_noexist(Nameval *list) 137 | { 138 | int res = insert_before(list, "NO_EXIST", newitem("who cares", 0)); 139 | if (res != 0) 140 | { 141 | printf("[ERROR] Got (%d) response code from `insert_before` when element is not present, expected 0\n", res); 142 | return 1; 143 | } 144 | 145 | Nameval *el = lookup(list, "who cares"); 146 | if (el != NULL) 147 | { 148 | printf("[ERROR] Found element \"who cares\" in the list, expected to not be inserted.\n"); 149 | return 1; 150 | } 151 | 152 | return assert_list(list, names, sizeof(names) / sizeof(names[0])); 153 | } 154 | 155 | int test_merge(Nameval *list) 156 | { 157 | char *expected_names[] = { 158 | "Georgi", 159 | "Pesho", 160 | "Bai Tosho", 161 | "Bai Tosho", 162 | "Pesho", 163 | "Georgi", 164 | }; 165 | 166 | Nameval *second_list = NULL; 167 | second_list = addfront(second_list, newitem(names[0], 0)); 168 | second_list = addfront(second_list, newitem(names[1], 0)); 169 | second_list = addfront(second_list, newitem(names[2], 0)); 170 | 171 | Nameval *merged_list = merge(list, second_list); 172 | 173 | return assert_list(merged_list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 174 | } 175 | 176 | int test_merge_null(Nameval 
*list) 177 | { 178 | Nameval *merged = merge(list, NULL); 179 | 180 | return assert_list(merged, names, sizeof(names) / sizeof(names[0])); 181 | } 182 | 183 | int test_delitem(Nameval *list) 184 | { 185 | char *expected_names[] = { 186 | "Georgi", 187 | /* "Pesho", - THIS IS THE DELETED ITEM*/ 188 | "Bai Tosho"}; 189 | 190 | int res = delitem(list, "Pesho"); 191 | if (res != 1) 192 | { 193 | printf("[ERROR] Got (%d) response code from delitem, expected 1 when item is deleted", res); 194 | return 1; 195 | } 196 | 197 | return assert_list(list, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 198 | } 199 | 200 | int test_delitem_noexist(Nameval *list) 201 | { 202 | int res = delitem(list, "NOT EXIST"); 203 | if (res != 0) 204 | { 205 | printf("[ERROR] Got (%d) response code from delitem, expected 0 when item is not deleted", res); 206 | return 1; 207 | } 208 | 209 | return assert_list(list, names, sizeof(names) / sizeof(names[0])); 210 | } 211 | 212 | int test_lookup(Nameval *list) 213 | { 214 | Nameval *res = lookup(list, "Pesho"); 215 | if (strcmp(res->name, "Pesho") != 0) 216 | { 217 | printf("[ERROR] Got (%s) for lookup(list, \"Pesho\", expected \"Pesho\"\n", res->name); 218 | return 1; 219 | } 220 | 221 | res = lookup(list, "Gosho"); 222 | if (res != NULL) 223 | { 224 | printf("[ERROR] Got (%s) for lookup(list, \"Gosho\", expected NULL for non-existing element", res->name); 225 | return 1; 226 | } 227 | 228 | return 0; 229 | } 230 | 231 | int test_lookup_noexist(Nameval *list) 232 | { 233 | Nameval *res = lookup(list, "Gosho"); 234 | if (res != NULL) 235 | { 236 | printf("[ERROR] Got (%s) for lookup(list, \"Gosho\", expected NULL for non-existing element", res->name); 237 | return 1; 238 | } 239 | 240 | return 0; 241 | } 242 | 243 | int test_copy(Nameval *list) 244 | { 245 | Nameval *res = copy(list); 246 | 247 | return assert_list(res, names, sizeof(names) / sizeof(names[0])); 248 | } 249 | 250 | int test_apply_incounter(Nameval *list) 251 | 
{ 252 | int i = 0; 253 | apply(list, incounter, &i); 254 | 255 | if (i != 3) 256 | { 257 | printf("[ERROR] Got (%d) for `i` in `apply(list, incounter, &i)`, expected to be 3\n", i); 258 | return 1; 259 | } 260 | 261 | return 0; 262 | } 263 | 264 | int assert_list(Nameval *list, char **expected_names, int expected_names_size) 265 | { 266 | for (int i = 0; i < expected_names_size; i++, list = list->next) 267 | { 268 | if (list == NULL) 269 | { 270 | printf("[ERROR] list is NULL, expected more elements\n"); 271 | return 1; 272 | } 273 | if (strcmp(list->name, expected_names[i]) != 0) 274 | { 275 | printf("[ERROR] Got (%s) for name, expected (%s), i=%d\n", list->name, expected_names[i], i); 276 | return 1; 277 | } 278 | } 279 | 280 | return 0; 281 | } 282 | 283 | int test(int (*fn)(Nameval *), char *description) 284 | { 285 | Nameval *list = NULL; 286 | list = addend(list, newitem(names[0], 0)); 287 | list = addend(list, newitem(names[1], 0)); 288 | list = addend(list, newitem(names[2], 0)); 289 | 290 | int res = fn(list); 291 | if (res == 0) 292 | printf(" ✅ %s\n", description); 293 | else 294 | printf(" ❌ %s\n", description); 295 | 296 | freeall(list); 297 | 298 | return res; 299 | } -------------------------------------------------------------------------------- /chapter-2/2.8-trees/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## Chapter 2.8 Trees 4 | 5 | ### Exercise 2.11 6 | 7 | Compare the performance of `lookup` and `nrlookup`. How expensive is recursion compared to iteration? 8 | 9 | _Answer:_ 10 | Solution can be found at [`rec_vs_iter.c`](rec_vs_iter.c). 11 | 12 | A simple test with 15 elements showed that the iterative function is always faster. 13 | Output was something in the borders of: 14 | 15 | ```text 16 | Iterative function is faster. 
17 | Iterative function took 0 s, 236 ns 18 | Recursive function took 0 s, 316 ns 19 | ``` 20 | 21 | This means that with this amount of data, the recursive function took ~34% longer than the iterative one. 22 | However, these results should not be taken as conclusive, because the amount of data we test with is really small, 23 | and I suspect that timing a single call with `clock_gettime` is not accurate enough either. 24 | 25 | ### Exercise 2.12 26 | 27 | Use in-order traversal to create a sort routine. What time complexity does it have? 28 | Under what conditions might it behave poorly? How does its performance compare to our quicksort and a library version? 29 | 30 | _Answer:_ The implementation can be found at [`sort.c`](sort.c). 31 | The in-order traversal visits every node exactly once, so printing the sorted elements is _O(n)_. Building the tree costs 32 | _O(log n)_ per insertion on average, so sorting _n_ elements this way is _O(n log n)_ overall. 33 | The performance depends on the shape of the tree. If the input is already sorted (or reverse sorted), the unbalanced tree 34 | degenerates into a linked list and the sort becomes _O(n²)_. 35 | TODO: run a proper performance test against quicksort and the library sort. 36 | 37 | ### Exercise 2.13 38 | 39 | Devise and implement a set of tests for verifying that the tree routines are correct. 40 | 41 | _Answer:_ The solution can be found at [`tests.c`](tests.c). I went with the same approach as in Exercise 2-10.
42 | -------------------------------------------------------------------------------- /chapter-2/2.8-trees/rec_vs_iter.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | #include "trees.c" 6 | 7 | typedef struct 8 | { 9 | __darwin_time_t tv_sec; 10 | long tv_nsec; 11 | } timespec; 12 | 13 | timespec *time_lookup(Nameval *treep, char *name); 14 | timespec *time_nrlookup(Nameval *treep, char *name); 15 | 16 | int main() 17 | { 18 | Nameval *treep = NULL; 19 | treep = insert(treep, newleaf("Item", 0)); 20 | treep = insert(treep, newleaf("Item2", 0)); 21 | treep = insert(treep, newleaf("Item3", 0)); 22 | treep = insert(treep, newleaf("Item4", 0)); 23 | treep = insert(treep, newleaf("Item5", 0)); 24 | treep = insert(treep, newleaf("Item6", 0)); 25 | treep = insert(treep, newleaf("Item7", 0)); 26 | treep = insert(treep, newleaf("Item8", 0)); 27 | treep = insert(treep, newleaf("Item9", 0)); 28 | treep = insert(treep, newleaf("Item10", 0)); 29 | treep = insert(treep, newleaf("Item11", 0)); 30 | treep = insert(treep, newleaf("Item12", 0)); 31 | treep = insert(treep, newleaf("Item13", 0)); 32 | treep = insert(treep, newleaf("Item14", 0)); 33 | 34 | char *name = "Item14"; 35 | 36 | timespec *rec_time = time_lookup(treep, name); 37 | timespec *iter_time = time_nrlookup(treep, name); 38 | 39 | if (rec_time->tv_sec == 0 && iter_time->tv_sec == 0 && rec_time->tv_nsec == 0 && iter_time->tv_nsec == 0) 40 | { 41 | printf("Cannot measure the execution time precisely enough."); 42 | return 1; 43 | } 44 | 45 | if (rec_time->tv_sec == iter_time->tv_sec) 46 | { 47 | if (rec_time->tv_nsec > iter_time->tv_nsec) 48 | { 49 | printf("Iterative function is faster.\n"); 50 | } 51 | else if (rec_time->tv_nsec < iter_time->tv_nsec) 52 | { 53 | printf("Recursive function is faster.\n"); 54 | } 55 | else 56 | { 57 | printf("Both function ran for the same time.\n"); 58 | } 59 | } 60 | else if (rec_time->tv_sec > 
iter_time->tv_sec) 61 | { 62 | printf("Iterative function is faster.\n"); 63 | } 64 | else if (rec_time->tv_nsec < iter_time->tv_nsec) 65 | { 66 | printf("Recursive function is faster.\n"); 67 | } 68 | 69 | printf("Iterative function took %lu s, %lu ns\n", iter_time->tv_sec, iter_time->tv_nsec); 70 | printf("Recursive function took %lu s, %lu ns\n", rec_time->tv_sec, rec_time->tv_nsec); 71 | 72 | return 0; 73 | } 74 | 75 | timespec *time_lookup(Nameval *treep, char *name) 76 | { 77 | struct timespec start, end; 78 | clock_gettime(CLOCK_MONOTONIC_RAW, &start); 79 | lookup(treep, name); 80 | clock_gettime(CLOCK_MONOTONIC_RAW, &end); 81 | 82 | timespec *tt = (timespec *)malloc(sizeof(timespec)); 83 | tt->tv_sec = (end.tv_sec - start.tv_sec); 84 | tt->tv_nsec = (end.tv_nsec - start.tv_nsec); 85 | 86 | return tt; 87 | } 88 | 89 | timespec *time_nrlookup(Nameval *treep, char *name) 90 | { 91 | struct timespec start, end; 92 | clock_gettime(CLOCK_MONOTONIC_RAW, &start); 93 | nrlookup(treep, name); 94 | clock_gettime(CLOCK_MONOTONIC_RAW, &end); 95 | 96 | timespec *tt = (timespec *)malloc(sizeof(timespec)); 97 | tt->tv_sec = (end.tv_sec - start.tv_sec); 98 | tt->tv_nsec = (end.tv_nsec - start.tv_nsec); 99 | 100 | return tt; 101 | } -------------------------------------------------------------------------------- /chapter-2/2.8-trees/sort.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include "trees.c" 4 | 5 | void print_sorted(Nameval *treep); 6 | 7 | Nameval *elems[100]; 8 | 9 | int main() 10 | { 11 | Nameval *treep = NULL; 12 | treep = insert(treep, newleaf("Item2", 0)); 13 | treep = insert(treep, newleaf("Item", 0)); 14 | treep = insert(treep, newleaf("Item4", 0)); 15 | treep = insert(treep, newleaf("Item3", 0)); 16 | treep = insert(treep, newleaf("Item5", 0)); 17 | 18 | print_sorted(treep); 19 | printf("\n"); 20 | } 21 | 22 | void print_sorted(Nameval *treep) 23 | { 24 | if (treep == NULL) 25 | return; 26 | 27 | 
print_sorted(treep->right); 28 | printf("%s, ", treep->name); 29 | print_sorted(treep->left); 30 | } -------------------------------------------------------------------------------- /chapter-2/2.8-trees/tests.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include "trees.c" 4 | 5 | char *names[] = { 6 | "Georgi", 7 | "Pesho", 8 | "Bai Tosho"}; 9 | 10 | int test(int (*fn)(Nameval *), char *description); 11 | 12 | int test_insert(Nameval *treep); 13 | int test_lookup(Nameval *treep); 14 | int test_nrlookup(Nameval *treep); 15 | int test_count_inorder(Nameval *treep); 16 | int test_count_preorder(Nameval *treep); 17 | int test_count_postorder(Nameval *treep); 18 | 19 | int main() 20 | { 21 | int res = 0; 22 | res += test(test_insert, "Insert an element into a tree."); 23 | res += test(test_lookup, "Look-up an element from the tree."); 24 | res += test(test_nrlookup, "Look-up an element from the tree (non-recursive)."); 25 | res += test(test_count_inorder, "Count the elements in the tree, walking it in order."); 26 | res += test(test_count_preorder, "Count the elements in the tree, walking it pre order."); 27 | res += test(test_count_postorder, "Count the elements in the tree, walking it post order."); 28 | 29 | if (res == 0) 30 | printf(" ✅✅✅ All the test were executed succesfully. 
✅✅✅\n"); 31 | else 32 | printf(" ❌❌❌There were %d failures❌❌❌\n", res); 33 | 34 | return res; 35 | } 36 | 37 | int test(int (*fn)(Nameval *), char *description) 38 | { 39 | Nameval *treep = NULL; 40 | treep = insert(treep, newleaf(names[0], 0)); 41 | treep = insert(treep, newleaf(names[1], 0)); 42 | treep = insert(treep, newleaf(names[2], 0)); 43 | 44 | int res = fn(treep); 45 | if (res == 0) 46 | printf(" ✅ %s\n", description); 47 | else 48 | printf(" ❌ %s\n", description); 49 | 50 | return res; 51 | } 52 | 53 | void counttree(Nameval *treep, void *arg) 54 | { 55 | int *ip = (int *)arg; 56 | (*ip)++; 57 | } 58 | 59 | int assert_tree(Nameval *treep, char **expected_names, int N) 60 | { 61 | int i = 0; 62 | applyinorder(treep, counttree, &i); 63 | 64 | if (i != N) 65 | { 66 | printf("[ERROR] Got length (%d) after insertion, expected %d", i, N); 67 | return 1; 68 | } 69 | 70 | for (int i = 0; i < N; i++) 71 | { 72 | Nameval *el = lookup(treep, expected_names[i]); 73 | if (el == NULL) 74 | { 75 | printf("[ERROR] Got NULL for lookup(treep, %s), expected element to be present", expected_names[i]); 76 | return 1; 77 | } 78 | } 79 | 80 | return 0; 81 | } 82 | 83 | int test_insert(Nameval *treep) 84 | { 85 | char *expected_names[] = { 86 | "Georgi", 87 | "Pesho", 88 | "Bai Tosho", 89 | "Gosho" /* "Gosho" is to be inserted. 
*/ 90 | }; 91 | 92 | treep = insert(treep, newleaf("Gosho", 0)); 93 | 94 | return assert_tree(treep, expected_names, sizeof(expected_names) / sizeof(expected_names[0])); 95 | } 96 | 97 | int _test_lookup(Nameval *treep, Nameval *(*lookup_fn)(Nameval *treep, char *name)) 98 | { 99 | Nameval *new = newleaf("Todor", 2); 100 | treep = insert(treep, new); 101 | 102 | Nameval *looked_up = lookup_fn(treep, "Todor"); 103 | if (looked_up == NULL) 104 | { 105 | printf("[ERROR] Got NULL for lookup(treep, \"Todor\"), expected element to be present\n"); 106 | return 1; 107 | } 108 | 109 | if (strcmp(looked_up->name, "Todor") != 0) 110 | { 111 | printf("[ERROR] Got (%s) for lookup(treep, \"Todor\")->name, expected \"Todor\"\n", looked_up->name); 112 | return 1; 113 | } 114 | if (looked_up->value != 2) 115 | { 116 | printf("[ERROR] Got (%d) for lookup(treep, \"Todor\")->value, expected (2)\n", looked_up->value); 117 | return 1; 118 | } 119 | 120 | return 0; 121 | } 122 | 123 | int test_lookup(Nameval *treep) 124 | { 125 | return _test_lookup(treep, lookup); 126 | } 127 | 128 | int test_nrlookup(Nameval *treep) 129 | { 130 | return _test_lookup(treep, nrlookup); 131 | } 132 | 133 | int _test_count(Nameval *treep, void (*walk_fn)(Nameval *treep, void (*fn)(Nameval *, void *), void *arg)) 134 | { 135 | treep = insert(treep, newleaf("one more", 0)); 136 | treep = insert(treep, newleaf("two more", 0)); 137 | treep = insert(treep, newleaf("three more", 0)); 138 | 139 | int c = 0; 140 | walk_fn(treep, counttree, &c); 141 | 142 | if (c != 6) 143 | { 144 | printf("[ERROR] Got (%d) for tree count, expected 6\n", c); 145 | return 1; /* report the failure instead of falling through to success */ 146 | } 147 | return 0; 148 | } 149 | 150 | int test_count_inorder(Nameval *treep) 151 | { 152 | return _test_count(treep, applyinorder); 153 | } 154 | 155 | int test_count_preorder(Nameval *treep) 156 | { 157 | return _test_count(treep, applypreorder); 158 | } 159 | 160 | int test_count_postorder(Nameval *treep) 161 | { 162 | return _test_count(treep, applypostorder); 163 | }
164 | -------------------------------------------------------------------------------- /chapter-2/2.8-trees/trees.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | typedef struct Nameval Nameval; 6 | struct Nameval 7 | { 8 | char *name; 9 | int value; 10 | Nameval *left; /* lesser */ 11 | Nameval *right; /* greater */ 12 | }; 13 | 14 | Nameval *newleaf(char *name, int value); 15 | void printtree(Nameval *treep, void *arg); 16 | Nameval *insert(Nameval *treep, Nameval *newp); 17 | Nameval *lookup(Nameval *treep, char *name); 18 | Nameval *nrlookup(Nameval *treep, char *name); 19 | void applyinorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg); 20 | void applypreorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg); 21 | void applypostorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg); 22 | 23 | void printtree(Nameval *treep, void *arg) 24 | { 25 | printf((char *)arg, treep->name, treep->value); 26 | } 27 | 28 | Nameval *newleaf(char *name, int value) 29 | { 30 | Nameval *n = (Nameval *)malloc(sizeof(Nameval)); 31 | 32 | n->name = name; 33 | n->value = value; 34 | n->left = NULL; 35 | n->right = NULL; 36 | 37 | return n; 38 | } 39 | /* insert: insert newp in treep and return treep */ 40 | Nameval *insert(Nameval *treep, Nameval *newp) 41 | { 42 | if (treep == NULL) 43 | return newp; 44 | 45 | int cmp = strcmp(treep->name, newp->name); 46 | if (cmp == 0) 47 | { 48 | printf("value (%s) already present at treep.", newp->name); 49 | return treep; 50 | } 51 | if (cmp < 0) 52 | treep->left = insert(treep->left, newp); 53 | else 54 | treep->right = insert(treep->right, newp); 55 | return treep; 56 | } 57 | 58 | /* lookup: looks up name in treep and return the node that contains it*/ 59 | Nameval *lookup(Nameval *treep, char *name) 60 | { 61 | if (treep == NULL) 62 | return NULL; 63 | 64 | int cmp = strcmp(treep->name, name); 65 | if (cmp == 0) 66 | return 
treep; 67 | if (cmp < 0) 68 | return lookup(treep->left, name); 69 | else 70 | return lookup(treep->right, name); 71 | } 72 | 73 | /* nrlookup: non-recursive implementation of lookup */ 74 | Nameval *nrlookup(Nameval *treep, char *name) 75 | { 76 | while (treep != NULL) 77 | { 78 | int cmp = strcmp(treep->name, name); 79 | if (cmp == 0) 80 | return treep; 81 | if (cmp < 0) 82 | treep = treep->left; 83 | else 84 | treep = treep->right; 85 | } 86 | 87 | return NULL; 88 | } 89 | 90 | /* applyinorder: apply fn(arg) in order to all the elements in treep */ 91 | void applyinorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg) 92 | { 93 | if (treep == NULL) 94 | return; 95 | 96 | applyinorder(treep->left, fn, arg); 97 | fn(treep, arg); 98 | applyinorder(treep->right, fn, arg); 99 | } 100 | 101 | /* applypostorder: apply fn(arg) post order to all the elements in treep */ 102 | void applypostorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg) 103 | { 104 | if (treep == NULL) 105 | return; 106 | 107 | applypostorder(treep->left, fn, arg); 108 | applypostorder(treep->right, fn, arg); 109 | fn(treep, arg); 110 | } 111 | 112 | /* appkypostorder: apply fn(arg) pre-order to all the elements in treep*/ 113 | void applypreorder(Nameval *treep, void (*fn)(Nameval *, void *), void *arg) 114 | { 115 | if (treep == NULL) 116 | return; 117 | 118 | fn(treep, arg); 119 | applypreorder(treep->left, fn, arg); 120 | applypreorder(treep->right, fn, arg); 121 | } -------------------------------------------------------------------------------- /chapter-2/2.9-hash-tables/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data Structures 2 | 3 | ## Section 2.9 Hash Tables 4 | 5 | ### Exercise 2.14 6 | 7 | Our hash function is an excellent general-purpose hash for strings. 8 | Nonetheless, peculiar data might cause poor behaviour. Contruct a data set that causes 9 | our hash function to perform badly. 
Is it easier to find a bad set for different values 10 | of `NHASH`? 11 | 12 | _Answer_: My experiment was to try out URLs, since they were mentioned in the book as possibly problematic input. 13 | I tried with 8 URLs. In the beginning, when `NHASH` was way bigger than the number of inputs (500 > 8), there 14 | were no collisions. However, when I set `NHASH` to something small, collisions started popping up. 15 | For `NHASH=8` (the same as the number of inputs) there were 4 collisions. Even for `NHASH=16` (twice the number of inputs) there were 2 collisions, which was not good. So my guess is that input of that sort would be 16 | problematic for this algorithm. And, of course, the smaller `NHASH` is, the higher the chance of collisions. 17 | 18 | ### Exercise 2.15 19 | 20 | Write a function to access the successive elements of the hash table in sorted order. 21 | 22 | _Skipping this one, because I am not sure what exactly should be done here._ 23 | 24 | ### Exercise 2.16 25 | 26 | Change `lookup` so that if the average list length becomes more than `x`, the array is grown automatically 27 | by a factor of `y` and the hash table is rebuilt. 28 | 29 | _Answer:_ You can find the solution at [`lookup.c`](lookup.c). This one was tricky as there were some gotchas 30 | around the resizing of the map. In the end I managed to do it in the following way: 31 | 32 | - if the average length is bigger than the desired one, create an array of all the elements in the map 33 | - set all of the elements of the map to NULL 34 | - resize the map (`realloc`) 35 | - recalculate the hash of all the elements in the array and insert them into the map 36 | That is a lot of iterations, but I am not sure if I can come up with a more elegant solution. 37 | 38 | ### Exercise 2.17 39 | 40 | Design a hash function for storing the coordinates of points in 2-dimensions.
How easily does your function 41 | adapt to changes in the type of the coordinates, for example from integer to floating point or 42 | from Cartesian to polar coordinates, or to changes from 2 to higher dimensions? 43 | 44 | _Answer:_ You can find the solution at [`coordinates.c`](coordinates.c). 45 | The hash function for coordinates takes into account four things: 46 | 47 | - the `x` part of the coordinates 48 | - the `y` part of the coordinates 49 | - the sum of `x` and `y` 50 | - the absolute difference of `x` and `y` 51 | Each of the four is multiplied by a different prime number and the products are added together to compute the final hash. 52 | 53 | I decided that having the four factors is more collision-safe than just the `x` and `y`, because 54 | this way, if two combinations of `x` and `y` satisfy `(31 * x1) + (37 * y1) == (31 * x2) + (37 * y2)`, 55 | taking into account the sum and the difference makes it less likely that they end up with the same hash. 56 | 57 | The current implementation does not care whether `x` or `y` are going to be `int`, `float`, `double` or something else. They can be anything that can be multiplied by `int`, although the result is converted to `int`, so points that differ only slightly in their fractional parts may hash to the same value. 58 | 59 | As for Cartesian to polar - if the coordinates are numbers, the hash function will deal with them just fine. 60 | 61 | Finally, if we were to add a third coordinate, `z`, then the `hash` function would have to be enhanced to take it into account as well.
62 | -------------------------------------------------------------------------------- /chapter-2/2.9-hash-tables/coordinates.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | typedef struct 5 | { 6 | float x; 7 | float y; 8 | } Coordinates; 9 | 10 | int hash(Coordinates *coordinates, int size); 11 | 12 | int main() 13 | { 14 | Coordinates *c = malloc(sizeof(Coordinates)); 15 | if (c == NULL) 16 | return 1; 17 | 18 | c->x = 5; 19 | c->y = 10; 20 | 21 | printf("%d\n", hash(c, 12)); 22 | 23 | c->x = 15; 24 | c->y = 20; 25 | 26 | printf("%d\n", hash(c, 12)); 27 | 28 | c->x = 50; 29 | c->y = 100; 30 | 31 | printf("%d\n", hash(c, 12)); 32 | return 0; 33 | } 34 | 35 | int hash(Coordinates *coordinates, int size) 36 | { 37 | int diff = coordinates->x - coordinates->y; 38 | if (diff < 0) 39 | diff = -diff; 40 | int sum = coordinates->x + coordinates->y; 41 | int h = (31 * coordinates->x) + (37 * coordinates->y) + (41 * diff) + (47 * sum); 42 | return h % size; 43 | } -------------------------------------------------------------------------------- /chapter-2/2.9-hash-tables/hash_func.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include "hash_tables.c" 4 | 5 | int main() 6 | { 7 | char *urls[] = { 8 | "http://localhost", 9 | "http://localhost:8000", 10 | "http://localhost:8001", 11 | "http://localhost:8000/docs", 12 | "http://localhost:8001/docs", 13 | "https://localhost", 14 | "http://test.local/docs", 15 | "https://test.local/docs", 16 | }; 17 | 18 | for (int i = 0; i < sizeof(urls)/sizeof(urls[0]); i++) 19 | { 20 | printf("%d\n", hash(urls[i])); 21 | } 22 | } -------------------------------------------------------------------------------- /chapter-2/2.9-hash-tables/hash_tables.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | enum 6 | { 7 | NHASH = 16, 8 | MULTIPLIER 
= 31 9 | }; 10 | 11 | typedef struct Nameval Nameval; 12 | struct Nameval 13 | { 14 | char *name; 15 | int value; 16 | Nameval *next; 17 | }; 18 | 19 | Nameval *symtab[NHASH]; 20 | 21 | void print_table(); 22 | unsigned int hash(char *str); 23 | Nameval *lookup(char *name, int value, int create); 24 | 25 | unsigned int hash(char *str) 26 | { 27 | unsigned int h = 0; 28 | unsigned char *p; 29 | 30 | for (p = (unsigned char *)str; *p != '\0'; p++) 31 | h = MULTIPLIER * h + *p; 32 | return h % NHASH; 33 | } 34 | 35 | /* lookup: find name in symtab, with optional create*/ 36 | Nameval *lookup(char *name, int value, int create) 37 | { 38 | Nameval *sym; 39 | int h = hash(name); 40 | for (sym = symtab[h]; sym != NULL; sym = sym->next) 41 | if (strcmp(sym->name, name) == 0) 42 | return sym; 43 | 44 | if (create) 45 | { 46 | sym = (Nameval *)malloc(sizeof(Nameval)); 47 | if (sym == NULL) 48 | return NULL; 49 | sym->name = name; 50 | sym->value = value; 51 | sym->next = symtab[h]; 52 | symtab[h] = sym; 53 | } 54 | return sym; 55 | } 56 | 57 | void print_table() 58 | { 59 | for (int i = 0; i < NHASH; i++) 60 | { 61 | Nameval *sym = symtab[i]; 62 | if (sym == NULL) 63 | continue; /* don't print the bucket if it does not have anything in it */ 64 | 65 | printf("symtab[%d] = ", i); 66 | while (sym != NULL) 67 | { 68 | printf("(%s: %d) -> ", sym->name, sym->value); 69 | sym = sym->next; 70 | } 71 | printf("NULL\n"); 72 | } 73 | } -------------------------------------------------------------------------------- /chapter-2/2.9-hash-tables/lookup.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | #include "hash_tables.c" 5 | 6 | int count(Nameval **map, int size); 7 | void print_table_s(Nameval **table, int size, int print_null); 8 | unsigned int hash_s(char *str); 9 | void resize_map(Nameval **map, int factor); 10 | void insert(Nameval **map, char *name, int value, int hash); 11 | Nameval *lookup_rear(Nameval **arr, 
char *name, int value, int create, int x, int y); 12 | 13 | int size = 10; 14 | int main() 15 | { 16 | Nameval **arr = malloc(sizeof(Nameval) * size); 17 | 18 | for (int i = 0; i < 42; i++) 19 | { 20 | char *name = (char *)malloc(sizeof(char) * 15); 21 | sprintf(name, "Name %d", i); 22 | lookup_rear(arr, name, 12, 1, 2, 2); 23 | } 24 | 25 | print_table_s(arr, size, 1); 26 | } 27 | 28 | /* lookup: find name in map, with optional create. 29 | If the avarage array length is more than x, grow the table by y and rebuild it */ 30 | Nameval *lookup_rear(Nameval **map, char *name, int value, int create, int x, int y) 31 | { 32 | /* search for name in map */ 33 | Nameval *sym; 34 | int h = hash(name); 35 | for (sym = map[h]; sym != NULL; sym = sym->next) 36 | if (strcmp(sym->name, name) == 0) 37 | return sym; 38 | 39 | /* if create flag is set and name not in map, add name, value to map */ 40 | if (create) 41 | { 42 | insert(map, name, value, h); 43 | } 44 | 45 | float av_length = count(map, size) / (float)size; 46 | 47 | /* resize tha map if average length more than x */ 48 | if (av_length > x) 49 | { 50 | int j = 0; 51 | Nameval **new_arr = malloc(sizeof(Nameval) * size); 52 | for (int i = 0; i < size; i++) 53 | for (Nameval *a = map[i]; a != NULL; a = a->next) 54 | new_arr[j++] = a; 55 | 56 | for (int i = 0; i < size; i++) 57 | map[i] = NULL; 58 | 59 | int old_size = size; 60 | size = size * y; 61 | map = realloc(map, sizeof(Nameval) * size); 62 | for (int i = 0; i < old_size; i++) 63 | { 64 | int h = hash_s(new_arr[i]->name); 65 | insert(map, new_arr[i]->name, new_arr[i]->value, h); 66 | } 67 | } 68 | 69 | return sym; 70 | } 71 | 72 | unsigned int hash_s(char *str) 73 | { 74 | unsigned int h = 0; 75 | unsigned char *p; 76 | 77 | for (p = (unsigned char *)str; *p != '\0'; p++) 78 | h = MULTIPLIER * h + *p; 79 | return h % size; 80 | } 81 | 82 | int count(Nameval **map, int size) 83 | { 84 | int sum = 0; 85 | for (int i = 0; i < size; i++) 86 | for (Nameval *arr = map[i]; 
arr != NULL; arr = arr->next) 87 | sum++; 88 | return sum; 89 | } 90 | 91 | void insert(Nameval **map, char *name, int value, int hash) 92 | { 93 | Nameval *sym = (Nameval *)malloc(sizeof(Nameval)); 94 | if (sym == NULL) 95 | return; 96 | sym->name = name; 97 | sym->value = value; 98 | sym->next = map[hash]; 99 | map[hash] = sym; 100 | } 101 | 102 | void print_table_s(Nameval **table, int size, int print_null) 103 | { 104 | for (int i = 0; i < size; i++) 105 | { 106 | Nameval *sym = table[i]; 107 | if (sym == NULL && !print_null) 108 | continue; /* don't print the bucket if it does not have anything in it */ 109 | printf("table[%d] = ", i); 110 | while (sym != NULL) 111 | { 112 | printf("(%s: %d) -> ", sym->name, sym->value); 113 | sym = sym->next; 114 | } 115 | printf("NULL\n"); 116 | } 117 | } -------------------------------------------------------------------------------- /chapter-2/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 2: Algorithms and Data structures 2 | 3 | ## Table of Contents 4 | 5 | - [2.1 Searching](2.1-searching) 6 | - [2.2 Sorting](2.2-sorting) 7 | - [2.3 Libraries](2.3-libraries) 8 | - [2.4 A Java Quicksort](2.4-java-quicksort) 9 | - [2.5 O-Notation](2.5-o-notation) 10 | - [2.6 Growing Arrays](2.6-growing-arrays) 11 | - [2.7 Lists](2.7-lists) 12 | - [2.8 Trees](2.8-trees) 13 | - [2.9 Hash Tables](2.9-hash-tables) 14 | 15 | ## Supplementary Reading 16 | 17 | - _Algorithms_ books by Bob Sedgewicks 18 | - _Algotithms in C++_ by Bob Sedgewicks 19 | - _The Art of Computer Programming_ by Don Knuth 20 | - _Design and Validation of Computer Protocols_ by Gerard Holzmann 21 | - _Software - Practice and Experience_ by Jon Bentley and Doug McIlroy 22 | -------------------------------------------------------------------------------- /chapter-3/3.1-the-markov-chain-algorithm/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and 
Implementation 2 | 3 | ## 3.1 The Markov Chain Algorithm 4 | 5 | **Summary:** We are going to use the [Markov Chain Algorithm](https://en.wikipedia.org/wiki/Markov_chain) to write a program that generates text, based on an input text. 6 | The Markov Chain Algorithm can be summarised in the following steps: 7 | 8 | ```text 9 | set w1 and w2 to the first two words in the text 10 | print w1 and w2 11 | loop: 12 | randomly choose w3, one of the successors of w1 w2 in the text 13 | print w3 14 | replace w1 and w2 with w2 and w3 15 | repeat loop 16 | ``` 17 | -------------------------------------------------------------------------------- /chapter-3/3.2-data-structure-alternatives/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## 3.2 Data Structure Alternatives 4 | 5 | **Summary:** While choosing data structures for our program we need to take into account a few things: 6 | 7 | - which structure fits naturally for the work that must be done 8 | - how large an input we are expecting 9 | - how fast we want our program to run 10 | 11 | Taking all these things into account we have chosen to go with a hash table. 12 | The prefix will be the key, and the suffix will be the value. 13 | Each prefix is a fixed-size sequence of words (we have decided to go with 2). 14 | Each suffix is a list of words. 15 | For now we will represent the words as strings. 16 | -------------------------------------------------------------------------------- /chapter-3/3.3-building-the-data-structure-in-c/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## 3.3 Building the Data Structure in C 4 | 5 | **Summary:** Let's write some [code](structs.c)!
6 | -------------------------------------------------------------------------------- /chapter-3/3.3-building-the-data-structure-in-c/structs.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | enum 6 | { 7 | NPREF = 2, /* number of prefix words */ 8 | NHASH = 4093, /* size of state hash table array */ 9 | MAXGEN = 10000, /* maximum words generated */ 10 | MULTIPLIER = 31, /* prime number to be used for hash computations */ 11 | }; 12 | 13 | char *NONWORD = "\n"; /* to be used as terminating sequence flag */ 14 | 15 | typedef struct State State; 16 | typedef struct Suffix Suffix; 17 | struct State /* prefix + suffix list*/ 18 | { 19 | char *pref[NPREF]; /* prefix words */ 20 | Suffix *suf; /* list of suffixes */ 21 | State *next; /* next in hash table */ 22 | }; 23 | State *statetab[NHASH]; 24 | 25 | struct Suffix /* list of suffixes */ 26 | { 27 | char *word; /* suffix */ 28 | Suffix *next; /* next in list of suffixes */ 29 | }; 30 | 31 | /* hash: computer hash value for array of NPREF strings */ 32 | unsigned int hash(char *s[NPREF]) 33 | { 34 | unsigned int h = 0; 35 | 36 | for (int i = 0; i < NPREF; i++) 37 | for (unsigned char *p = (unsigned char *)s[i]; *p != '\0'; p++) 38 | h = MULTIPLIER * h + *p; 39 | return h % NHASH; 40 | } 41 | 42 | State *lookup(char *prefix[NPREF], int create) 43 | { 44 | int i; 45 | int h = hash(prefix); 46 | 47 | for (State *sp = statetab[h]; sp != NULL; sp = sp->next) 48 | { 49 | for (i = 0; i < NPREF; i++) 50 | if (strcmp(prefix[i], sp->pref[i]) != 0) 51 | break; 52 | // TODO: is that check needed, maybe we can return directly here 53 | if (i == NPREF) /* found it */ 54 | return sp; 55 | } 56 | 57 | if (create) 58 | { 59 | State *sp = (State *)malloc(sizeof(State)); 60 | if (sp == NULL) /* not enough memory to allocate for *sp */ 61 | return NULL; 62 | for (int i = 0; i < NPREF; i++) 63 | sp->pref[i] = prefix[i]; 64 | sp->suf = NULL; 65 | sp->next = 
statetab[h]; 66 | statetab[h] = sp; 67 | return sp; 68 | } 69 | 70 | /* not found and create not set */ 71 | return NULL; 72 | } 73 | 74 | /* addsuffix: add to state */ 75 | void addsuffix(State *sp, char *suffix) 76 | { 77 | Suffix *suff = (Suffix *)malloc(sizeof(Suffix)); 78 | if (suff == NULL) 79 | return; 80 | suff->word = suffix; 81 | suff->next = sp->suf; 82 | sp->suf = suff; 83 | } 84 | 85 | /* add: add suffix to suffix list, update prefix */ 86 | void add(char *prefix[NPREF], char *suffix) 87 | { 88 | State *sp = lookup(prefix, 1); 89 | addsuffix(sp, suffix); 90 | memmove(prefix, prefix + 1, (NPREF - 1) * sizeof(prefix[0])); 91 | prefix[NPREF - 1] = suffix; 92 | } 93 | 94 | /* build: read input, build prefix table*/ 95 | void build(char *prefix[NPREF], FILE *f) 96 | { 97 | char buf[100], fmt[10]; 98 | 99 | /* a call to fscanf will read until whitespace, which may overflow buffer. 100 | that is why we are building a format string that is equal to '%99s', 101 | so that we tell fscanf to stop after the 99th byte (leaving one byte for '\0')*/ 102 | sprintf(fmt, "%%%lus", (unsigned long)(sizeof(buf) - 1)); 103 | while (fscanf(f, fmt, buf) != EOF) 104 | add(prefix, strdup(buf)); 105 | } 106 | -------------------------------------------------------------------------------- /chapter-3/3.4-generating-output/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## 3.4 Generating Output 4 | 5 | **Summary:** Let's write some more [code](output.c)! 6 | 7 | ### Exercise 3-1 8 | 9 | The algorithm for selecting a random item from a list of unknown length depends on having a good random number generator. 10 | Design and carry out experiments to determine how well the method works in practice. 11 | 12 | _Answer:_ Running the program multiple times always produced the same output, although there are prefixes with more than 13 | one suffix.
This means that either the random function is not fully random, or the algorithm is broken. 14 | 15 | After adding a seed based on the current time in [this commit](https://github.com/asankov/the-practice-of-programming/commit/bf10c68f853ae3997fd445dc169443297a707fb8), the results started to vary, which 16 | means that the problem was the unseeded random number generator: without a call to `srand`, `rand` returns the same sequence on every run. 17 | 18 | ### Exercise 3-2 19 | 20 | If each input word is stored in a second hash table, the text is only stored once, which should save space. 21 | Measure some documents to estimate how much. This organization would allow us to compare pointers rather than strings 22 | in the hash chains for prefixes, which should run faster. Implement this version and measure the change in speed 23 | and memory consumption. 24 | 25 | _Answer:_ This change should make the program more memory efficient, because we would be storing every string just once. 26 | _TODO: add implementation_ 27 | 28 | ### Exercise 3-3 29 | 30 | Remove the statements that place sentinel `NONWORD`s at the beginning and end of the data, and modify `generate` 31 | so it starts and stops properly without them. Make sure it produces correct output for input with 0, 1, 2, 3 and 4 words. 32 | Compare the implementation to the version using sentinels.
33 | 34 | _Answer:_ _TODO: add answer and implementation_ 35 | -------------------------------------------------------------------------------- /chapter-3/3.4-generating-output/output.c: -------------------------------------------------------------------------------- 1 | #include <time.h> 2 | 3 | #include "../3.3-building-the-data-structure-in-c/structs.c" 4 | 5 | void generate(int nwords) 6 | { 7 | 8 | char *prefix[NPREF], *w; 9 | 10 | for (int i = 0; i < NPREF; i++) /* reset initial prefix */ 11 | prefix[i] = NONWORD; 12 | 13 | for (int i = 0; i < nwords; i++) 14 | { 15 | State *sp = lookup(prefix, 0); 16 | int nmatch = 0; 17 | for (Suffix *suf = sp->suf; suf != NULL; suf = suf->next) 18 | if (rand() % ++nmatch == 0) /* probability: 1/nmatch */ 19 | w = suf->word; 20 | if (strcmp(w, NONWORD) == 0) 21 | break; 22 | printf("%s ", w); 23 | memmove(prefix, prefix + 1, (NPREF - 1) * sizeof(prefix[0])); 24 | prefix[NPREF - 1] = w; 25 | } 26 | } 27 | 28 | void printt() 29 | { 30 | printf("Input prefix \t\t\t Suffix words\n"); 31 | for (int i = 0; i < NHASH; i++) 32 | { 33 | State *st = statetab[i]; 34 | 35 | for (State *st = statetab[i]; st != NULL; st = st->next) 36 | { 37 | if (strcmp(st->pref[0], "\n") == 0 || strcmp(st->pref[1], "\n") == 0) 38 | continue; 39 | printf("%s %s \t\t\t", st->pref[0], st->pref[1]); 40 | for (Suffix *s = st->suf; s != NULL; s = s->next) 41 | printf("%s ", s->word); 42 | printf("\n"); 43 | } 44 | } 45 | } 46 | 47 | int main(void) 48 | { 49 | char *prefix[NPREF]; 50 | 51 | for (int i = 0; i < NPREF; i++) 52 | prefix[i] = NONWORD; 53 | 54 | long seed = time(NULL); 55 | 56 | srand(seed); 57 | 58 | FILE *f = fopen("text.txt", "r"); 59 | if (f == NULL) 60 | return 1; 61 | build(prefix, f); 62 | fclose(f); 63 | add(prefix, NONWORD); 64 | // printt(); 65 | generate(MAXGEN); 66 | return 0; 67 | } 68 | -------------------------------------------------------------------------------- /chapter-3/3.4-generating-output/text.txt:
-------------------------------------------------------------------------------- 1 | Show your flowcharts and conceal your tables and I will be mystified. Show your tables and your flowcharts will be obvious. -------------------------------------------------------------------------------- /chapter-3/3.5-java/Markov.java: -------------------------------------------------------------------------------- 1 | package main; 2 | 3 | import java.io.BufferedReader; 4 | import java.io.IOException; 5 | import java.io.InputStream; 6 | import java.io.InputStreamReader; 7 | import java.io.Reader; 8 | import java.io.StreamTokenizer; 9 | import java.util.HashMap; 10 | import java.util.Hashtable; 11 | import java.util.Map; 12 | import java.util.Random; 13 | import java.util.Vector; 14 | import java.util.stream.Collectors; 15 | 16 | public class Markov { 17 | 18 | private static final Integer MAX_WORDS = 10_000; 19 | 20 | public static void main(String[] args) throws IOException { 21 | Chain chain = new Chain(); 22 | chain.build(System.in); 23 | chain.generate(MAX_WORDS); 24 | } 25 | } 26 | 27 | class Chain { 28 | private static final Integer PREFIX_SIZE = 2; 29 | private static final String NON_WORD = "\n"; 30 | 31 | private Map<Prefix, Vector<String>> statetab = new HashMap<>(); 32 | private Prefix prefix = Prefix.from(PREFIX_SIZE, NON_WORD); 33 | private Random rand = new Random(); 34 | 35 | public void build(InputStream in) throws IOException { 36 | Reader r = new BufferedReader(new InputStreamReader(in)); 37 | StreamTokenizer st = new StreamTokenizer(r); 38 | 39 | st.resetSyntax(); 40 | st.wordChars(0, Character.MAX_VALUE); 41 | st.whitespaceChars(0, ' '); 42 | 43 | while (st.nextToken() != StreamTokenizer.TT_EOF) 44 | this.add(st.sval); 45 | this.add(NON_WORD); 46 | } 47 | 48 | public void generate(int words) { 49 | this.prefix = Prefix.from(PREFIX_SIZE, NON_WORD); 50 | 51 | for (int i = 0; i < words; i++) { 52 | Vector<String> s = statetab.get(prefix); 53 | Integer r = Math.abs(rand.nextInt() % s.size());
54 | String suf = s.elementAt(r); 55 | if (suf.equals(NON_WORD)) 56 | break; 57 | System.out.print(suf + " "); 58 | prefix.pref[0] = prefix.pref[1]; 59 | prefix.pref[1] = suf; 60 | } 61 | } 62 | 63 | public void add(String word) { 64 | Vector<String> suf = statetab.get(prefix); 65 | if (suf == null) { 66 | suf = new Vector<>(); 67 | statetab.put(prefix.clone(), suf); 68 | } 69 | suf.addElement(word); 70 | prefix.pref[0] = prefix.pref[1]; 71 | prefix.pref[1] = word; 72 | } 73 | } 74 | 75 | class Prefix { 76 | 77 | private static final int MULTIPLIER = 31; 78 | 79 | public String[] pref; 80 | 81 | public static Prefix from(Integer size, String value) { 82 | Prefix p = new Prefix(); 83 | p.pref = new String[size]; 84 | for (int i = 0; i < size; i++) 85 | p.pref[i] = value; 86 | return p; 87 | } 88 | 89 | public Prefix clone() { 90 | Prefix n = new Prefix(); 91 | n.pref = this.pref.clone(); 92 | return n; 93 | } 94 | 95 | public int hashCode() { 96 | int h = 0; 97 | for (int i = 0; i < pref.length; i++) 98 | h = MULTIPLIER * h + pref[i].hashCode(); 99 | return h; 100 | } 101 | 102 | public boolean equals(Object o) { 103 | Prefix p = (Prefix) o; 104 | for (int i = 0; i < pref.length; i++) 105 | if (!this.pref[i].equals(p.pref[i])) 106 | return false; 107 | return true; 108 | } 109 | } -------------------------------------------------------------------------------- /chapter-3/3.5-java/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Section 3.5 Java 4 | 5 | **Summary:** Let's rewrite this stuff in [Java](Markov.java)! 6 | 7 | ### Exercise 3-4 8 | 9 | Revise the Java version of `markov` to use an array instead of a `Vector` for the prefix in the `State` class. 10 | 11 | _Answer:_ An array makes more sense than `Vector`, because the size is fixed and known when the object is instantiated. 12 | Therefore, we don't need a data structure that can grow and shrink dynamically.
13 | 14 | Changes applied in [this commit](https://github.com/asankov/the-practice-of-programming/commit/a1530955650425780da796e8d04a42ceacdf275c). 15 | 16 | **BONUS:** Refactor the Java code to use up-to-date structures and practices: 17 | 18 | - Generics - `Map<Prefix, Vector<String>>` instead of `Hashtable` 19 | - Remove use of deprecated constructor of `StreamTokenizer` 20 | - Static factory method, instead of constructors with parameters that differ from the fields of the class 21 | 22 | Changes applied in [this commit](https://github.com/asankov/the-practice-of-programming/commit/793994dae973f3d4d9a14224fb511f9d6fe9de82). 23 | -------------------------------------------------------------------------------- /chapter-3/3.6-c++/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Section 3.6 C++ 4 | 5 | **Summary:** Let's rewrite this stuff in [C++](markov.cpp)! 6 | 7 | ### Exercise 3-5 8 | 9 | The great strength of the STL is the ease with which one can experiment with different data structures. 10 | Modify the C++ version of Markov to use various structures to represent the prefix, suffix list, and state table. 11 | How does performance change for the different structures? 12 | 13 | _Answer:_ Changes applied in this [commit](https://github.com/asankov/the-practice-of-programming/commit/d15f071648e617437256b644f4649299bca332d0). 14 | 15 | The diff is very small, since we only reference these types once or twice. 16 | TODO: measure performance between the two versions. 17 | 18 | ### Exercise 3-6 19 | 20 | Write a C++ version that uses only classes and the `string` data type, but no other advanced library facilities. 21 | Compare it in style and speed to the STL versions. 22 | 23 | _Answer:_ I approached this incrementally, doing it type by type.
24 | The changes are part of the following commits: 25 | 26 | - `Suffixes` - `std::vector` to `class Suffixes` that uses a linked list under the hood - [`#bd84f3e`](https://github.com/asankov/the-practice-of-programming/commit/bd84f3e6112069e25f56e798d08384ea3d5aa50b) 27 | - `State` - `std::map` to `class StateMap` that uses a hash map under the hood - [`#e84ba3d`](https://github.com/asankov/the-practice-of-programming/commit/e84ba3dd3f3f8fa7e9f831475b3e5499a129da47) 28 | - `Prefix` - `std::deque` to `class Prefix` that uses a simple array under the hood - [`#57f3f73`](https://github.com/asankov/the-practice-of-programming/commit/57f3f73a70cf9700a7559c694c8c7010afaca19d) 29 | 30 | The first one was easy; the other two took more time, because there was more work that needed to be done for them. 31 | The solution is really close to the C one, the only difference being that the details are encapsulated inside the classes. 32 | But they are still there and the developers need to care for them. 33 | In terms of the actual implementation that uses these structs and classes - not much changed, because I kept the contracts more or less the same. 34 | 35 | Although it was fun implementing this stuff and hitting all the obstacles, in a real-world situation I would rarely choose the custom implementations over the library ones.
36 | -------------------------------------------------------------------------------- /chapter-3/3.6-c++/markov.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | 8 | #include "structs.cpp" 9 | 10 | StateMap statetab; 11 | 12 | enum 13 | { 14 | NPREF = 2, 15 | MAXGEN = 10000, 16 | }; 17 | const char *NONWORD = "\n"; 18 | 19 | void generate(int nwords); 20 | void build(Prefix &prefix, std::istream &in); 21 | void add(Prefix &prefix, const std::string &s); 22 | 23 | int main(void) 24 | { 25 | Prefix prefix; 26 | 27 | for (int i = 0; i < NPREF; i++) 28 | add(prefix, NONWORD); 29 | build(prefix, std::cin); 30 | add(prefix, NONWORD); 31 | generate(MAXGEN); 32 | 33 | return 0; 34 | } 35 | 36 | void build(Prefix &prefix, std::istream &in) 37 | { 38 | std::string buf; 39 | 40 | while (in >> buf) 41 | add(prefix, buf); 42 | } 43 | 44 | void add(Prefix &prefix, const std::string &s) 45 | { 46 | if (prefix.size() == NPREF) 47 | { 48 | statetab[prefix]->push(s); 49 | prefix.pop_front(); 50 | } 51 | prefix.push_back(s); 52 | } 53 | 54 | void generate(int nwords) 55 | { 56 | Prefix prefix; 57 | for (int i = 0; i < NPREF; i++) 58 | add(prefix, NONWORD); 59 | 60 | srand(time(NULL)); 61 | for (int i = 0; i < nwords; i++) 62 | { 63 | Suffixes *suf = statetab[prefix]; 64 | const std::string w = (*suf)[rand() % suf->size()]; 65 | 66 | if (w == NONWORD) 67 | break; 68 | std::cout << w << " "; 69 | prefix.pop_front(); 70 | prefix.push_back(w); 71 | } 72 | } -------------------------------------------------------------------------------- /chapter-3/3.6-c++/structs.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | 6 | enum 7 | { 8 | NHASH = 4098, 9 | }; 10 | 11 | std::hash<std::string> hasher; 12 | 13 | class Suffixes 14 | { 15 | private: 16 | class Suffix 17 | { 18 | public: 19 | std::string
val; 20 | Suffix *next; 21 | }; 22 | 23 | Suffix *suffixes; 24 | int _size; 25 | 26 | public: 27 | int size() 28 | { 29 | return _size; 30 | } 31 | std::string operator[](int index) 32 | { 33 | int c = 0; 34 | for (Suffix *s = suffixes; s != NULL; s = s->next) 35 | if (index == c++) 36 | return s->val; 37 | 38 | return std::string(); /* index out of range */ 39 | } 40 | void push(std::string val) 41 | { 42 | _size++; 43 | Suffix *s = new Suffix(); 44 | s->val = val; 45 | s->next = suffixes; 46 | suffixes = s; 47 | } 48 | }; 49 | 50 | class Prefix 51 | { 52 | public: 53 | std::string values[2]; 54 | int _size; 55 | 56 | int size() 57 | { 58 | return _size; 59 | } 60 | std::string at(int index) 61 | { 62 | return values[index]; 63 | } 64 | void push_back(std::string val) 65 | { 66 | values[_size++] = val; 67 | } 68 | void pop_front() 69 | { 70 | if (_size == 2) 71 | { 72 | values[0] = values[1]; 73 | values[1] = std::string(); 74 | } 75 | else if (_size == 1) 76 | { 77 | values[0] = std::string(); 78 | } 79 | else if (_size == 0) 80 | { 81 | return; 82 | } 83 | _size--; 84 | } 85 | 86 | int hash() 87 | { 88 | unsigned int h = 0; 89 | 90 | for (int i = 0; i < size(); i++) 91 | { 92 | h += hasher(at(i)); 93 | } 94 | 95 | return h % NHASH; 96 | } 97 | 98 | bool operator==(Prefix other) 99 | { 100 | if (size() != other.size()) 101 | return false; 102 | 103 | for (int i = 0; i < size(); i++) 104 | if (at(i).compare(other.at(i)) != 0) 105 | return false; 106 | 107 | return true; 108 | } 109 | 110 | Prefix() 111 | { 112 | _size = 0; 113 | values[0] = std::string(); 114 | values[1] = std::string(); 115 | } 116 | }; 117 | 118 | class StateMap 119 | { 120 | private: 121 | class StateMapEntry 122 | { 123 | public: 124 | Prefix *prefix; 125 | Suffixes *suffixes; 126 | StateMapEntry *next; 127 | }; 128 | 129 | public: 130 | StateMapEntry *entries[NHASH]; 131 | 132 | Suffixes *operator[](Prefix p) 133 | { 134 | int hash = p.hash(); 135 | for (StateMapEntry *e = entries[hash]; e != NULL; e = e->next) 136 | { 137 |
if (p == *e->prefix) 138 | { 139 | return e->suffixes; 140 | } 141 | } 142 | 143 | //create 144 | StateMapEntry *new_entry = new StateMapEntry(); 145 | new_entry->prefix = new Prefix(); 146 | for (int i = 0; i < p.size(); i++) 147 | new_entry->prefix->push_back(p.at(i)); 148 | new_entry->suffixes = new Suffixes(); 149 | new_entry->next = entries[hash]; 150 | entries[hash] = new_entry; 151 | 152 | return entries[hash]->suffixes; 153 | } 154 | }; 155 | -------------------------------------------------------------------------------- /chapter-3/3.7-awk-and-perl/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Section 3.7 Awk and Perl 4 | 5 | **Summary:** Let's rewrite this stuff in [Awk](markov.awk) and [Perl](markov.pl)! 6 | 7 | ### Exercise 3-7 8 | 9 | Modify the Awk and Perl versions to handle prefixes of any length. 10 | Experiment to determine what effect this change has on performance. 11 | 12 | _Answer:_ I am going to defer this exercise until I am more familiar with Awk and Perl.
13 | -------------------------------------------------------------------------------- /chapter-3/3.7-awk-and-perl/markov.awk: -------------------------------------------------------------------------------- 1 | # markov.awk: markov chain algorithm for 2-word prefixes 2 | BEGIN { MAXGEN = 10000; NONWORD = "\n"; w1 = w2 = NONWORD } 3 | { for (i = 1; i <= NF; i++) { # read all words 4 | statetab[w1, w2, ++nsuffix[w1, w2]] = $i 5 | w1 = w2 6 | w2 = $i 7 | } 8 | } 9 | END { 10 | statetab[w1, w2, ++nsuffix[w1, w2]] = NONWORD # add tail 11 | w1 = w2 = NONWORD 12 | for (i = 0; i < MAXGEN; i++) { # generate 13 | r = int(rand()*nsuffix[w1, w2]) + 1 # nsuffix >= 1 14 | p = statetab[w1, w2, r] 15 | if (p == NONWORD) 16 | exit 17 | print p 18 | w1 = w2 19 | w2 = p 20 | } 21 | } 22 | -------------------------------------------------------------------------------- /chapter-3/3.7-awk-and-perl/markov.pl: -------------------------------------------------------------------------------- 1 | # markov.pl: markov chain algorithm for 2-word prefixes 2 | 3 | $MAXGEN = 10000; 4 | $NONWORD = "\n"; 5 | $w1 = $w2 = $NONWORD; # initial state 6 | while (<>) { # read each line of input 7 | foreach (split) { 8 | push(@{$statetab{$w1}{$w2}}, $_); 9 | ($w1, $w2) = ($w2, $_); # multiple assignment 10 | } 11 | } 12 | push(@{$statetab{$w1}{$w2}}, $NONWORD); # add tail 13 | $w1 = $w2 = $NONWORD; 14 | for ($i = 0; $i < $MAXGEN; $i++) { 15 | $suf = $statetab{$w1}{$w2}; # array reference 16 | $r = int(rand @$suf); # @$suf is number of elems 17 | exit if (($t = $suf->[$r]) eq $NONWORD); 18 | print "$t "; 19 | ($w1, $w2) = ($w2, $t); # advance chain 20 | } -------------------------------------------------------------------------------- /chapter-3/3.8-performance/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Section 3.8 Performance 4 | 5 | **Summary:** The authors have run performance tests on all the
implementations, and the bottom line is that the C one is the fastest by far. Next is Perl and Java is the slowest. 6 | -------------------------------------------------------------------------------- /chapter-3/3.9-lessons/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Section 3.9 Lessons 4 | 5 | **Summary:** The first step in designing a program is choosing the data and the data structures around it. These foundations 6 | can then be implemented in different ways with different languages, but they will more or less be the same. 7 | Using libraries and high-level programming languages can help us speed up development, but comes at the cost of giving up 8 | complete control of what your program does, and not knowing what happens under the hood. This is a trade-off that needs to be considered. 9 | Finally, writing production-ready code involves iteration and experimentation. 10 | 11 | ### Exercise 3-8 12 | 13 | We have seen versions of the Markov program in a wide variety of languages, including Scheme, Tcl, Prolog, Python, Generic Java, ML and Haskell; 14 | each presents its own challenges and advantages. Implement the program in your favourite language and compare its general flavor and performance. 15 | 16 | _Answer:_ Finally some [Go](go/main/main.go)! 🎉 17 | 18 | Go is a C-family language, so the solution looks very much like the initial C++ solution that used `std::vector`, `std::queue`, etc. For the key of the map, I used the array of strings, not the wrapping `Prefix` struct: 19 | Go accepts any comparable type as a map key, and the fixed-size array of strings is the simplest one here. 20 | 21 | The other parts of the solution were pretty straightforward and similar to what we have already done for the other languages.
22 | -------------------------------------------------------------------------------- /chapter-3/3.9-lessons/go/go.mod: -------------------------------------------------------------------------------- 1 | module github.com/asankov/the-practice-of-programming/chapter-3 2 | 3 | go 1.21.3 4 | -------------------------------------------------------------------------------- /chapter-3/3.9-lessons/go/main/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "bufio" 5 | "fmt" 6 | "io" 7 | "math/rand" 8 | "os" 9 | "time" 10 | ) 11 | 12 | // Suffixes is the data struct that wraps the slice of strings 13 | type Suffixes []string 14 | 15 | // Prefix is the data struct that wraps an array of prefixSize strings 16 | type Prefix struct { 17 | vals [prefixSize]string 18 | size int 19 | } 20 | 21 | func (p *Prefix) push(val string) { 22 | p.vals[p.size] = val 23 | p.size++ 24 | } 25 | 26 | func (p *Prefix) pop() { 27 | if p.size == 2 { 28 | p.vals[0] = p.vals[1] 29 | p.vals[1] = "" 30 | } else if p.size == 1 { 31 | p.vals[0] = "" 32 | } 33 | p.size-- 34 | } 35 | 36 | const ( 37 | prefixSize = 2 38 | ) 39 | 40 | var ( 41 | nonWord = "\n" 42 | maxGeneratedWords = 10_000 43 | statemap = make(map[[prefixSize]string]Suffixes) 44 | ) 45 | 46 | func main() { 47 | var prefix Prefix 48 | 49 | for i := 0; i < prefixSize; i++ { 50 | add(&prefix, nonWord) 51 | } 52 | build(&prefix, os.Stdin) 53 | add(&prefix, nonWord) 54 | generate(maxGeneratedWords) 55 | } 56 | 57 | func add(p *Prefix, s string) { 58 | if p.size == prefixSize { 59 | statemap[p.vals] = append(statemap[p.vals], s) 60 | p.pop() 61 | } 62 | p.push(s) 63 | } 64 | 65 | func build(p *Prefix, in io.Reader) { 66 | scanner := bufio.NewScanner(in) 67 | scanner.Split(bufio.ScanWords) 68 | 69 | for scanner.Scan() { 70 | w := scanner.Text() 71 | add(p, w) 72 | } 73 | } 74 | 75 | func generate(words int) { 76 | var prefix Prefix 77 | 78 | for i := 0; i <
prefixSize; i++ { 79 | add(&prefix, nonWord) 80 | } 81 | 82 | s := rand.NewSource(time.Now().Unix()) 83 | random := rand.New(s) 84 | for i := 0; i < words; i++ { 85 | suf := statemap[prefix.vals] 86 | r := random.Intn(len(suf)) 87 | w := suf[r%len(suf)] 88 | 89 | if w == nonWord { 90 | break 91 | } 92 | fmt.Printf("%s ", w) 93 | prefix.pop() 94 | prefix.push(w) 95 | } 96 | } 97 | -------------------------------------------------------------------------------- /chapter-3/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 3: Design and Implementation 2 | 3 | ## Chapter summary 4 | 5 | Defining the data structures is a big part of designing a program. Once that is laid out, and if laid out well, 6 | the code will start falling into place and the algorithms will be fairly obvious. 7 | 8 | In this chapter we will implement the [Markov Chain Algorithm](https://en.wikipedia.org/wiki/Markov_chain) in a variety of languages, to assess the differences and 9 | similarities between them.
10 | 11 | ## Table of Contents 12 | 13 | - [3.1 The Markov Chain Algorithm](3.1-the-markov-chain-algorithm) 14 | - [3.2 Data Structure Alternatives](3.2-data-structure-alternatives) 15 | - [3.3 Building the Data Structure in C](3.3-building-the-data-structure-in-c) 16 | - [3.4 Generating Output](3.4-generating-output) 17 | - [3.5 Java](3.5-java) 18 | - [3.6 C++](3.6-c++) 19 | - [3.7 Awk and Perl](3.7-awk-and-perl) 20 | - [3.8 Performance](3.8-performance) 21 | - [3.9 Lessons](3.9-lessons) 22 | 23 | ## Supplementary Reading 24 | 25 | - _Generic Programming and the STL_ by Matthew Austern 26 | - _The C++ Programming Language_ by Bjarne Stroustrup 27 | - _The Java Programming Language_ by Ken Arnold and James Gosling 28 | - _Programming Perl_ by Larry Wall, Tom Christiansen and Randal Schwartz 29 | - _Design Patterns: Elements of Reusable Object-Oriented Software_ by Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides 30 | - _Computer Recreations_, Scientific American Magazine (article) 31 | -------------------------------------------------------------------------------- /chapter-4/4.1-comma-separated-values/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.1 Comma-Separated Values 4 | 5 | **Summary:** Comma-separated values (CSV) is a well-known text format for representing tabular data. 6 | In this chapter we will build a library to read CSV data and convert it into an internal representation. 7 | 8 | The chapter includes a Tcl script that downloads data from `quote.yahoo.com`, but the link is outdated, 9 | so instead of looking for a new one, I just hardcoded some data in a file.
10 | 11 | Sample CSV data file: [data.csv](data.csv) 12 | -------------------------------------------------------------------------------- /chapter-4/4.1-comma-separated-values/data.csv: -------------------------------------------------------------------------------- 1 | "LU",86.375,11/5/1998,1:01PM,-0.125,86,86.375,85.0625,2888600, 2 | "T",96.375,11/5/1999,1:02PM,-0.225,86,86.375,85.0625,1888600, 3 | "MSFT",105.270,11/5/2001,2:03PM,0.125,81,81.375,85.0625,2000600, -------------------------------------------------------------------------------- /chapter-4/4.2-a-prototype-library/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.2 A Prototype Library 4 | 5 | **Summary:** In this section we will build a prototype, _not-production-ready_ library in C. 6 | -------------------------------------------------------------------------------- /chapter-4/4.2-a-prototype-library/csv.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <string.h> 3 | 4 | char buf[200]; /* input line buffer */ 5 | char *field[20]; /* fields */ 6 | 7 | /* unquote: remove leading and trailing quote */ 8 | char *unquote(char *p) 9 | { 10 | if (p[0] == '"') 11 | { 12 | if (p[strlen(p) - 1] == '"') 13 | p[strlen(p) - 1] = '\0'; 14 | p++; 15 | } 16 | return p; 17 | } 18 | 19 | /* csvgetline: read and parse line, return field count */ 20 | /* sample input: "LU",86.25, "11/4/1998","2:19PM",+4.0625 */ 21 | int csvgetline(FILE *fn) 22 | { 23 | char *p; 24 | 25 | if (fgets(buf, sizeof(buf), fn) == NULL) 26 | return -1; 27 | int nfield = 0; 28 | for (char *q = buf; (p = strtok(q, ",\n\r")) != NULL; q = NULL) 29 | field[nfield++] = unquote(p); 30 | return nfield; 31 | } 32 | 33 | int main(void) 34 | { 35 | int nf; 36 | 37 | while ((nf = csvgetline(stdin)) != -1) 38 | for (int i = 0; i < nf; i++) 39 | printf("field[%d] = '%s'\n", i, field[i]); 40 | return 0; 41 | }
-------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.3 A Library For Others 4 | 5 | **Summary:** In this section we will build on the knowledge we gained writing the prototype and will build a full-fledged 6 | library to be used by others. 7 | 8 | The library consists of a [header file](csv.h), where the interface lives, 9 | and a [source file](csv.c), where the implementation lives. The consumers of the library will include the header file 10 | and link against the compiled implementation. 11 | 12 | ## Exercise 4-1 13 | 14 | There are several degrees of laziness for field-splitting: among the possibilities are to split all 15 | at once but only when some field is requested, to split only the field requested, or to split up to 16 | the field requested. Enumerate possibilities, assess their potential difficulty and benefits, 17 | then write and measure their speeds. 18 | 19 | _Answer:_ Possibilities: 20 | 21 | - split all at once but only when some field is requested 22 | - this is easy. The only thing that changes in the code is the place where `split` is called. 23 | This is going to be the first invocation of `csvfield` 24 | - split only the field requested 25 | - here we need to count the delimiters and split only after counting up to N delimiters. 26 | - split up to the field requested 27 | - this will be easier than the second, because we split the fields one by one and keep track of how many we have split so far. 28 | 29 | TODO: implementations #5 30 | 31 | ## Exercise 4-2 32 | 33 | Add a facility so separators can be changed 34 | (a) to an arbitrary class of characters; 35 | (b) to different separators for different fields; 36 | (c) to a regular expression (see Chapter 9). 37 | What should the interface look like?
38 | 39 | _Answer:_ For each of these the interface would change to something like 40 | 41 | ```c 42 | char *csvgetline(FILE *f, char *separator); 43 | ``` 44 | 45 | _Implementation:_ See this [commit](https://github.com/asankov/the-practice-of-programming/commit/a17b9876cd918e988d383993ee0a4003d958da4b). 46 | 47 | ## Exercise 4-3 48 | 49 | We chose to use the static initialization provided by C as the basis of a one-time switch: 50 | if a pointer is NULL on entry, initialization is performed. Another possibility is to require the user to call an explicit 51 | initialization function, which could include suggested initial sizes for arrays. 52 | Implement a version that combines the best of both. What is the role of `reset` in your implementation? 53 | 54 | _Answer:_ The best of both worlds would be to give the user the possibility to initialize the library with initial sizes, 55 | but if the user did not, we would still use the default ones (start from 1, grow whenever necessary). 56 | 57 | _Implementation:_ See this [commit](https://github.com/asankov/the-practice-of-programming/commit/e6b9cb4315c5b184011fe6a80eb44c3d6699f690). 58 | 59 | ## Exercise 4-4 60 | 61 | Design and implement a library for creating CSV-formatted data. The simplest version might take an array of strings 62 | and print them with quotes and commas. A more sophisticated version might use a format string analogous to `printf`. 63 | Look at Chapter 9 for some suggestions on notation. 64 | 65 | _Answer:_ I built more of a prototype than a real library, but it is what it is. 66 | 67 | The interface consists of only one function: 68 | 69 | ```c 70 | void generate_csv(FILE *fn) 71 | ``` 72 | 73 | This reads from `fn` until `EOF` and prints a word every time it encounters `" "`. It has no notion of new lines or multiple columns, but hey, it's just a prototype. 74 | 75 | For the interface see [`csv_generator.h`](csv_generator.h). 76 | 77 | For the implementation see [`csv_generator.c`](csv_generator.c).
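The "simplest version" that Exercise 4-4 mentions, taking an array of strings and printing them with quotes and commas, could be sketched as follows. The names `csvputfield` and `csvputline` are hypothetical and are not part of `csv_generator.c`. The sketch assumes that an embedded quote is written as two quotes, which is the convention `advquoted` in `csv.c` reads back.

```c
#include <stdio.h>

/* csvputfield: write one field wrapped in quotes, doubling any embedded
   quote. Returns 0 on success, EOF on a write error.
   Hypothetical helper, for illustration only. */
static int csvputfield(FILE *f, const char *s)
{
    if (putc('"', f) == EOF)
        return EOF;
    for (; *s != '\0'; s++)
    {
        if (*s == '"' && putc('"', f) == EOF) /* " becomes "" */
            return EOF;
        if (putc(*s, f) == EOF)
            return EOF;
    }
    return putc('"', f) == EOF ? EOF : 0;
}

/* csvputline: write nfield fields as one comma-separated line.
   Returns 0 on success, EOF on a write error. */
int csvputline(FILE *f, const char *field[], int nfield)
{
    for (int i = 0; i < nfield; i++)
    {
        if (i > 0 && putc(',', f) == EOF)
            return EOF;
        if (csvputfield(f, field[i]) == EOF)
            return EOF;
    }
    return putc('\n', f) == EOF ? EOF : 0;
}
```

Quoting every field is more than the format strictly requires, but it keeps the writer trivial and means commas inside a field never need special handling.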
78 | -------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/csv.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | #include <string.h> 4 | 5 | enum 6 | { 7 | NOMEM = -2 /* out of memory signal */ 8 | }; 9 | 10 | static char *line = NULL; /* input chars */ 11 | static char *sline = NULL; /* line copy used by split */ 12 | static int maxline = 0; /* size of line[] and sline[] */ 13 | static char **field = NULL; /* field pointers */ 14 | static int maxfield = 0; /* size of field[] */ 15 | static int nfield = 0; /* number of fields in field[] */ 16 | 17 | static int _initial_maxline = 0; /* initial value of maxline configured by the consumer, or 0 by default */ 18 | static int _initial_maxfield = 0; /* initial value of maxfield configured by the consumer, or 0 by default */ 19 | 20 | void init(int initmaxfield, int initmaxline) /* initialize the library with proper starting values */ 21 | { 22 | maxline = _initial_maxline = initmaxline; 23 | maxfield = _initial_maxfield = initmaxfield; 24 | } 25 | /* reset: set variables back to starting values */ 26 | static void reset(void) 27 | { 28 | free(line); 29 | free(sline); 30 | free(field); 31 | line = NULL; 32 | sline = NULL; 33 | field = NULL; 34 | maxline = _initial_maxline; 35 | maxfield = _initial_maxfield; 36 | nfield = 0; 37 | } 38 | 39 | /* endofline: check for and consume \r, \n, \r\n, or EOF */ 40 | static int endofline(FILE *fn, int c) 41 | { 42 | int eol = (c == '\r' || c == '\n'); 43 | 44 | if (c == '\r') 45 | { 46 | c = getc(fn); 47 | if (c != '\n' && c != EOF) 48 | ungetc(c, fn); /* read too far; put c back */ 49 | } 50 | return eol; 51 | } 52 | 53 | /* advquoted: quoted field: return pointer to next separator*/ 54 | static char *advquoted(char *p, char *separator) 55 | { 56 | int i, j; 57 | 58 | for (i = j = 0; p[j] != '\0'; i++, j++) 59 | { 60 | if (p[j] == '"' && p[++j] != '"') 61 | { 62 | int
k = strcspn(p + j, separator); 63 | memmove(p + i, p + j, k); 64 | i += k; 65 | j += k; 66 | break; 67 | } 68 | p[i] = p[j]; 69 | } 70 | p[i] = '\0'; 71 | return p + j; 72 | } 73 | 74 | /* csvfield: return pointer to n-th field */ 75 | char *csvfield(int n) 76 | { 77 | if (n < 0 || n >= nfield) 78 | return NULL; 79 | return field[n]; 80 | } 81 | 82 | /* csvnfield: return number of fields */ 83 | int csvnfield(void) 84 | { 85 | return nfield; 86 | } 87 | 88 | /* split: split line into fields */ 89 | static int split(char *separator) 90 | { 91 | char *sepp; /* pointer to the separator that ends the current field */ 92 | int sepc; /* the character found at sepp */ 93 | 94 | nfield = 0; 95 | if (line[0] == '\0') 96 | return 0; 97 | strcpy(sline, line); 98 | char *p = sline; 99 | 100 | do 101 | { 102 | if (nfield >= maxfield) 103 | { 104 | maxfield *= 2; /* double the size of field[] */ 105 | char **newf = (char **)realloc(field, maxfield * sizeof(field[0])); 106 | if (newf == NULL) 107 | return NOMEM; 108 | field = newf; 109 | } 110 | if (*p == '"') 111 | sepp = advquoted(++p, separator); /* skip initial quote */ 112 | else 113 | sepp = p + strcspn(p, separator); /* stop at the next separator */ 114 | sepc = sepp[0]; 115 | sepp[0] = '\0'; /* terminate field */ 116 | field[nfield++] = p; 117 | p = sepp + 1; 118 | } while (sepc != '\0'); /* a separator was found, so another field follows */ 119 | 120 | return nfield; 121 | } 122 | 123 | /* csvgetline: get one line, grow as needed */ 124 | /* sample input: "LU",86.25,"11/4/1998","2:19PM",+4.0625 */ 125 | char *csvgetline(FILE *fn, char *separator) 126 | { 127 | int c, i; 128 | char *newl, *news; 129 | 130 | if (line == NULL) 131 | { 132 | if (maxline < 1 || maxfield < 1) maxline = maxfield = 1; /* allocate on first call; keep the sizes from init() if set */ 133 | line = (char *)malloc(maxline); 134 | sline = (char *)malloc(maxline); 135 | field = (char **)malloc(maxfield * sizeof(field[0])); 136 | if (line == NULL || sline == NULL || field == NULL) 137 | { 138 | reset(); 139 | return NULL; /* out of memory */ 140 | } 141 | } 142 | for (i = 0; (c = getc(fn)) != EOF && !endofline(fn, c); i++) 143 | { 144 | if (i >= maxline - 1) /* grow
line */ 145 | { 146 | maxline *= 2; 147 | newl = (char *)realloc(line, maxline); 148 | news = (char *)realloc(sline, maxline); 149 | if (newl == NULL || news == NULL) 150 | { 151 | reset(); 152 | return NULL; /* out of memory */ 153 | } 154 | line = newl; 155 | sline = news; 156 | } 157 | line[i] = c; 158 | } 159 | line[i] = '\0'; 160 | if (split(separator) == NOMEM) 161 | { 162 | reset(); 163 | return NULL; /* out of memory */ 164 | } 165 | return (c == EOF && i == 0) ? NULL : line; 166 | } 167 | 168 | int main(void) 169 | { 170 | char *line; 171 | 172 | init(10, 10); 173 | 174 | while ((line = csvgetline(stdin, ",")) != NULL) 175 | { 176 | printf("line = '%s'\n", line); 177 | for (int i = 0; i < csvnfield(); i++) 178 | printf("field[%d] = '%s'\n", i, csvfield(i)); 179 | } 180 | 181 | return 0; 182 | } -------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/csv.h: -------------------------------------------------------------------------------- 1 | /* csv.h: interface for csv library */ 2 | 3 | #include <stdio.h> 4 | 5 | extern char *csvgetline(FILE *f, char *separator); /* read next input line */ 6 | extern char *csvfield(int n); /* return field n */ 7 | extern int csvnfield(void); /* return number of fields */ 8 | extern void init(int maxfield, int maxline); /* initialize the library with proper starting values */ -------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/csv_generator.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | #include <stdlib.h> 3 | 4 | void generate_csv(FILE *fn) 5 | { 6 | int i = 0, size = 100; /* i: number of characters stored in word */ 7 | char *word = NULL; 8 | 9 | while (1) 10 | { 11 | int c = fgetc(fn); 12 | 13 | if (word == NULL) 14 | { 15 | word = (char *)malloc(size * sizeof(char)); 16 | } 17 | if (i >= size - 1) /* grow before word[i] would run past the buffer */ 18 | { 19 | size *= 2; 20 | word = (char *)realloc(word, size * sizeof(char)); 21 | } 22 | 23 | if (c == ' ') 24 | { 25 |
printf("%.*s, ", i, word); /* print exactly i characters; word is not '\0'-terminated */ 26 | for (int j = 0; j < i; j++) 27 | word[j] = ' '; 28 | i = 0; 29 | } 30 | else if (c == EOF) 31 | { 32 | printf("%.*s\n", i, word); 33 | break; 34 | } 35 | else 36 | { 37 | word[i] = c; 38 | i++; 39 | } 40 | } 41 | } 42 | 43 | int main(void) 44 | { 45 | generate_csv(stdin); 46 | 47 | return 0; 48 | } -------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/csv_generator.h: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | 3 | void generate_csv(FILE *fn); 4 | -------------------------------------------------------------------------------- /chapter-4/4.3-a-library-for-others/values.txt: -------------------------------------------------------------------------------- 1 | val1 val2 val3 2 | -------------------------------------------------------------------------------- /chapter-4/4.4-a-c++-implementation/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.4 A C++ Implementation 4 | 5 | **Summary:** In this section we will implement the same library in C++. 6 | 7 | See the implementation in [`csv.cpp`](csv.cpp) 8 | 9 | ### Exercise 4-5 10 | 11 | Enhance the C++ implementation to overload subscripting with `operator[]` so that fields can be accessed as `csv[i]`. 12 | 13 | _Answer:_ Implemented directly in [`csv.cpp`](csv.cpp). See this [commit](https://github.com/asankov/the-practice-of-programming/commit/59a3bd15e0af3090a963603217589258a8c45b5a) 14 | 15 | ### Exercise 4-6 16 | 17 | Write a Java version of the CSV library, then compare the three implementations for clarity, robustness and speed. 18 | 19 | _Answer:_ TODO: implementation 20 | 21 | ### Exercise 4-7 22 | 23 | Repackage the C++ version of the CSV code as an STL iterator.
24 | 25 | _Answer:_ TODO: implementation 26 | 27 | ### Exercise 4-8 28 | 29 | The C++ version permits multiple independent `Csv` instances to operate concurrently without interfering, 30 | a benefit of encapsulating all the state in an object that can be instantiated multiple times. 31 | Modify the C version to achieve the same effect by replacing the global data structures with structures that are 32 | allocated and initialized by an explicit `csvnew` function. 33 | 34 | _Answer:_ TODO: implementation 35 | -------------------------------------------------------------------------------- /chapter-4/4.4-a-c++-implementation/csv.cpp: -------------------------------------------------------------------------------- 1 | 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | 9 | // read and parse comma-separated values 10 | // sample input: "LU",86.25,"11/4/1998","2:19PM",+4.0625 11 | class Csv 12 | { 13 | public: 14 | Csv(std::istream &fin = std::cin, std::string sep = ",") : fin(fin), fieldsep(sep) {} 15 | 16 | int getline(std::string &); 17 | int getnfield() const { return nfield; } 18 | std::string operator[](int n); 19 | 20 | private: 21 | std::istream &fin; // input file pointer 22 | std::string line; // input line 23 | std::vector field; // field strings 24 | int nfield; // number of fields 25 | std::string fieldsep; // separator characters 26 | 27 | int split(); 28 | int endofline(char); 29 | int advplain(const std::string &line, std::string &fld, int); 30 | int advquoted(const std::string &line, std::string &fld, int); 31 | }; 32 | 33 | std::string Csv::operator[](int n) 34 | { 35 | if (n < 0 || n >= nfield) 36 | return ""; 37 | return field[n]; 38 | }; 39 | 40 | // getline: get one line, grow as needed 41 | int Csv::getline(std::string &str) 42 | { 43 | char c; 44 | 45 | for (line = ""; fin.get(c) && !endofline(c);) 46 | line += c; 47 | split(); 48 | str = line; 49 | return !fin.eof(); 50 | } 51 | 52 | // endofline: check for and 
consume \r, \n, \r\n, or EOF 53 | int Csv::endofline(char c) 54 | { 55 | int eol = (c == '\r' || c == '\n'); 56 | if (c == '\r') 57 | { 58 | fin.get(c); 59 | if (!fin.eof() && c != '\n') 60 | fin.putback(c); // read too far 61 | } 62 | return eol; 63 | } 64 | 65 | // split: split line into fields 66 | int Csv::split() 67 | { 68 | std::string fld; 69 | int i = 0, j; 70 | 71 | nfield = 0; 72 | if (line.length() == 0) 73 | return 0; 74 | 75 | do 76 | { 77 | if (i < line.length() && line[i] == '"') 78 | j = advquoted(line, fld, ++i); // skip quote 79 | else 80 | j = advplain(line, fld, i); 81 | 82 | if (nfield >= field.size()) 83 | field.push_back(fld); 84 | else 85 | field[nfield] = fld; 86 | nfield++; 87 | i = j + 1; 88 | } while (j < line.length()); 89 | 90 | return nfield; 91 | } 92 | 93 | // advquoted: quoted field: return index of next separator 94 | int Csv::advquoted(const std::string &s, std::string &fld, int i) 95 | { 96 | int j; 97 | 98 | fld = ""; 99 | for (j = i; j < s.length(); j++) 100 | { 101 | if (s[j] == '"' && s[++j] != '"') 102 | { 103 | int k = s.find_first_of(fieldsep, j); 104 | if (k > s.length()) // no separator found 105 | k = s.length(); 106 | for (k -= j; k-- > 0;) 107 | fld += s[j++]; 108 | break; 109 | } 110 | fld += s[j]; 111 | } 112 | return j; 113 | } 114 | 115 | // advplain: unquoted filed; return index of next separator 116 | int Csv::advplain(const std::string &s, std::string &fld, int i) 117 | { 118 | int j = s.find_first_of(fieldsep, i); // look for separator 119 | if (j > s.length()) // none found 120 | j = s.length(); 121 | fld = std::string(s, i, j - i); 122 | return j; 123 | } 124 | 125 | // Csvtest main: test Csv class 126 | int main(void) 127 | { 128 | std::string line; 129 | Csv csv; 130 | 131 | while (csv.getline(line) != 0) 132 | { 133 | std::cout << "line = " << line << "\n"; 134 | for (int i = 0; i < csv.getnfield(); i++) 135 | std::cout << "field[" << i << "] = '" << csv[i] << "'\n"; 136 | } 137 | return 0; 138 | } 
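The `Csv` class above keeps all parser state inside the object, which is the property Exercise 4-8 asks the C version to match. The following is only a minimal sketch of how that could look in C, not the repository's answer: the names `Csv`, `csvnew`, `csvsplit`, `csvfield` and `csvfree` are hypothetical, and quoted fields are deliberately left out to keep it short.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical handle: holds the state that csv.c keeps in static globals. */
typedef struct Csv {
    char  *sline;    /* private copy of the current line, cut into fields */
    char **field;    /* pointers into sline */
    int    maxfield; /* allocated size of field[] */
    int    nfield;   /* number of fields in field[] */
} Csv;

/* csvnew: allocate and initialize an independent parser; NULL if out of memory */
Csv *csvnew(void)
{
    Csv *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    c->sline = NULL;
    c->field = NULL;
    c->maxfield = 0;
    c->nfield = 0;
    return c;
}

/* csvfree: release everything owned by the handle */
void csvfree(Csv *c)
{
    if (c == NULL)
        return;
    free(c->sline);
    free(c->field);
    free(c);
}

/* csvsplit: split line on any character in sep; return the number of fields,
   or -1 if out of memory. Quoted fields are not handled in this sketch. */
int csvsplit(Csv *c, const char *line, const char *sep)
{
    char *p;
    char *copy = malloc(strlen(line) + 1);

    if (copy == NULL)
        return -1;
    strcpy(copy, line);
    free(c->sline);              /* drop the previous line, if any */
    c->sline = copy;
    c->nfield = 0;
    if (copy[0] == '\0')
        return 0;

    for (p = copy;;) {
        if (c->nfield >= c->maxfield) {  /* grow field[] by doubling */
            int newmax = c->maxfield ? 2 * c->maxfield : 4;
            char **newf = realloc(c->field, newmax * sizeof(char *));
            if (newf == NULL)
                return -1;
            c->field = newf;
            c->maxfield = newmax;
        }
        c->field[c->nfield++] = p;
        p += strcspn(p, sep);    /* advance to the next separator or to the end */
        if (*p == '\0')
            break;
        *p++ = '\0';             /* terminate this field, step past the separator */
    }
    return c->nfield;
}

/* csvfield: return the n-th field of this handle, or NULL if out of range */
char *csvfield(Csv *c, int n)
{
    if (n < 0 || n >= c->nfield)
        return NULL;
    return c->field[n];
}
```

Because every function takes the handle as its first argument, two parsers created with `csvnew()` can split different lines with different separators without touching each other's fields, and each one is released with its own `csvfree()` call.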
-------------------------------------------------------------------------------- /chapter-4/4.5-interface-principles/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.5 Interface Principles 4 | 5 | **Summary:** In this section we discussed the role of interfaces. They are the contract between the library developers and consumers. 6 | Some basic rules need to be followed by library authors to keep the library clean, to the point and easy to use: 7 | 8 | - Hide implementation details 9 | - Choose a small orthogonal set of primitives 10 | - Don't reach behind the user's back 11 | - Do the same thing the same way everywhere 12 | -------------------------------------------------------------------------------- /chapter-4/4.6-resource-management/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.6 Resource Management 4 | 5 | **Summary:** Getting resource management right is vital when it comes to designing interfaces and libraries. The questions that arise are of the sort: 6 | 7 | - who owns the memory used by both the program and the library? 8 | - who is responsible for cleaning up unused data? 9 | - who is responsible for handling data races? 10 | 11 | These problems are exposed in some of the C library functions like `strtok`. In OOP languages like C++ and Java, objects are a good mechanism 12 | for solving them, because they provide encapsulation. C++ has destructors, in which you can define cleanup logic that runs when an instance is no longer needed. 13 | Java, on the other hand, has a garbage collector, which reclaims unused memory for you. 14 | 15 | These problems multiply in a multi-threaded environment. There, it is vital that we don't use global variables, and that our code shares nothing 16 | except the interfaces.
Other mechanisms that help are mutexes, locks and synchronized blocks. 17 | -------------------------------------------------------------------------------- /chapter-4/4.7-abort-retry-fail/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.7 Abort, Retry, Fail? 4 | 5 | **Summary:** Handling errors is an important part of building a library or writing a program. 6 | Functions should have a way to signal to the consumer that something went wrong. 7 | This could be done via the return value of the function, or an exception (if exceptions are available in the language of choice). 8 | A good rule of thumb is to detect errors at a low level, and handle them at a high one. This means 9 | that libraries should not try to handle errors themselves, but rather return them to the consumer, 10 | so that the consumer can choose the best way to handle them. 11 | Also, exceptions should be used for really exceptional situations, and not for control flow. 12 | -------------------------------------------------------------------------------- /chapter-4/4.8-user-interfaces/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Section 4.8 User Interfaces 4 | 5 | **Summary:** In this section we discuss user interfaces. 6 | They follow more or less the same rules as library interfaces: 7 | 8 | When an error occurs, the output should reveal as much relevant information as possible, 9 | without overwhelming the user. 10 | It should point the user to where the error is and to the right way to use the software. 11 | -------------------------------------------------------------------------------- /chapter-4/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 4: Interfaces 2 | 3 | ## Chapter summary 4 | 5 | Building a program is one thing; writing code that is going to be consumed by others is another.
6 | In this chapter we will take a look at designing libraries and the problems that have to be solved along the way: 7 | 8 | - Interfaces - defining a meaningful interface that exposes enough functionality, but at the same time hides the internals well 9 | - Information hiding - what is public and what is private? 10 | - Resource management - who is responsible for the resources shared with the library? 11 | - Error handling - how do you handle errors, how do you recover from one and how do you return them to your consumer? 12 | 13 | ## Table of Contents 14 | 15 | - [4.1 Comma-Separated Values](4.1-comma-separated-values) 16 | - [4.2 A Prototype Library](4.2-a-prototype-library) 17 | - [4.3 A Library For Others](4.3-a-library-for-others) 18 | - [4.4 A C++ Implementation](4.4-a-c++-implementation) 19 | - [4.5 Interface Principles](4.5-interface-principles) 20 | - [4.6 Resource Management](4.6-resource-management) 21 | - [4.7 Abort, Retry, Fail?](4.7-abort-retry-fail) 22 | - [4.8 User Interfaces](4.8-user-interfaces) 23 | 24 | ## Supplementary Reading 25 | 26 | - _The Mythical Man-Month_ by Frederick P. Brooks, Jr. 27 | - _Large-Scale C++ Software Design_ by John Lakos 28 | - _C Interfaces and Implementations_ by David Hanson 29 | - _Rapid Development_ by Steve McConnell 30 | - _Designing Visual Interfaces: Communication Oriented Techniques_ by Kevin Mullet and Darrell Sano 31 | - _Designing the User Interface: Strategies for Effective Human-Computer Interaction_ by Ben Shneiderman 32 | - _About Face: The Essentials of User Interface Design_ by Alan Cooper 33 | - _User Interface Design_ by Harold Thimbleby 34 | -------------------------------------------------------------------------------- /chapter-5/5.1-debuggers/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.1 Debuggers 4 | 5 | **Summary:** Debuggers are a powerful tool for debugging. Most languages come with one.
6 | However, debuggers are system and setup dependent. Sometimes, it may be more useful to put a print statement somewhere 7 | or think hard about why the code could be broken. Blind probing with a debugger is not likely to be productive. 8 | Also, if you put log statements in the code, they stay in the code, but debugging sessions are transient, and when they are over, 9 | all of the information that was collected is lost. 10 | -------------------------------------------------------------------------------- /chapter-5/5.2-good-clues-easy-bugs/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.2 Good Clues, Easy Bugs 4 | 5 | **Summary:** Debugging is like solving a murder mystery in which you are also the killer. 6 | Many bugs are easy to solve, if you know where to look for the right clues to their root cause. 7 | Some ways to find those clues: 8 | 9 | - **Look for familiar patterns** - look for familiar patterns of code, where you've already made the same mistake 10 | - **Examine the most recent changes** - compare the program with the previous version. If you can't reproduce the problem in the older version, then it was introduced recently and you only need to look at your latest changes. 11 | - **Don't make the same mistake twice** - when you've fixed a bug, think of whether there is a place in which you may have made the same mistake. If yes, go there and fix it 12 | - **Debug it now, not later** - when you encounter a problem, debug it sooner, rather than later 13 | - **Get a stack trace** - a good feature of debuggers is that they show you the stack trace: the chain of function calls that led to the error. 14 | - **Read before typing** - resist the urge to start changing stuff in the code and see if the problem goes away. Instead, take a good look at the code, print it if you must, or take a break from it.
Doing any of those things has a better chance of resolving the problem than changing random stuff. 15 | - **Explain your code to someone else** - basically rubber-duck debugging 16 | -------------------------------------------------------------------------------- /chapter-5/5.3-no-clues-hard-bugs/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.3 No Clues, Hard Bugs 4 | 5 | **Summary:** If you have no idea why a bug is happening, life gets tough. 6 | This section lists some techniques for making your life easier by showing you how to make your bugs discoverable. 7 | Such techniques are: 8 | 9 | - **Make the bug reproducible** - make it so that the bug appears every time. Research the input and conditions that trigger the bug and use them 10 | when debugging. 11 | - **Divide and conquer** - when you find that input, boil it down to the smallest possible input that triggers the bug. 12 | See what part of your program interacts with this input and focus on that. Also, focus on the latest changes of the program. 13 | - **Study the numerology of failures** - sometimes a pattern in the numbers involved in a failure points to its source. Watch for numbers like 1024, 2048, etc. 14 | - **Display output to localize your search** - if you are not sure what is happening where, or which parts of the code you are reaching and which not, 15 | it's ok to put debug messages like: `got here` or `can't get here` and scan the output of the program for them. 16 | Also, be consistent in the way you print values like pointers and learn their format to get maximum information from this output. 17 | - **Write self-checking code** - write code that checks some conditions of your program and does something based on that, e.g. print something, exit, etc. Also, find a way to keep this code as part of your program for the next time such a bug occurs.
You may comment out this code, or hide it under a debugging flag if you don't want it to be executed every time. 18 | - **Write a log file** - stream your output to a log file, and later search there for patterns, reasons for failure, etc. Also, be sure to flush your buffers on exit, so that no output is lost. 19 | - **Draw a picture** - visualize your data as a picture, graph, or chart to get a visual output of your program. 20 | - **Use tools** - use all the tools at your disposal to the maximum. Tools like `grep`, `diff`, VCS, shell scripts, etc. 21 | Also, write small programs to validate your assumptions about the programming language you use. 22 | - **Keep records** - during long debugging sessions take notes to keep track of what you have tried. This way knowledge is saved for the time a similar bug appears. Also, you will be sure what you have tried and what not. 23 | -------------------------------------------------------------------------------- /chapter-5/5.4-last-resorts/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.4 Last Resorts 4 | 5 | **Summary:** When none of the techniques from the last two sections work, it's time for the last resorts. 6 | 7 | Bugs can be as simple as a misconception about operator precedence, a typo, or a wrong parameter order. 8 | 9 | It's not a good option to blame the compiler or the CPU at first, but sometimes it is indeed their fault. 10 | 11 | Another nasty cause of problems is memory leaks. They cannot be reproduced every time, and make your program fail in mysterious ways. 12 | -------------------------------------------------------------------------------- /chapter-5/5.5-non-reproducible-bugs/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.5 Non-reproducible Bugs 4 | 5 | **Summary:** Sometimes a bug is just hard to reproduce.
6 | 7 | Not being able to reproduce the bug every time is nasty, but the fact itself gives you useful information - 8 | the fault is probably not in the algorithm, but in something external to the program. 9 | 10 | It could be an uninitialized variable that picks up random values, or a rogue random seed. In that case, remove everything random from the program and make sure to initialize all variables to known values. 11 | 12 | If adding debugging code removes the bug, this probably means that something is wrong with memory allocation, and the additional code changes the memory layout enough to hide it. 13 | 14 | When the program works for one person, but not for another, debugging is hard, as you need to put yourself in the other person's shoes. 15 | The reason is probably a difference between the environments - files present, environment variables, etc. 16 | 17 | ### Exercise 5-1 18 | 19 | Write a version of `malloc` and `free` that can be used for debugging storage-management problems. 20 | One approach is to check the entire workspace on each call of `malloc` and `free`; 21 | another is to write logging information that can be processed by another program. 22 | Either way, add markers to the beginning and end of each allocated block to detect overruns at either end. 23 | 24 | _Answer:_ TODO 25 | -------------------------------------------------------------------------------- /chapter-5/5.6-debugging-tools/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.6 Debugging Tools 4 | 5 | **Summary:** Debuggers are not the only debugging tools. 6 | One such tool is a program that reads a compiled program and prints the readable text it contains.
7 | This way, if we are getting an error message but don't know where it is coming from, we can do: 8 | 9 | ```console 10 | strings *.exe *.dll | grep 'some error message' 11 | ``` 12 | 13 | For the implementation, see [`strings.c`](strings.c) 14 | 15 | ### Exercise 5-2 16 | 17 | The `strings` program prints strings with `MINLEN` or more characters, which sometimes produces more output than is useful. 18 | Provide `strings` with an optional argument to define the minimum string length. 19 | 20 | _Answer:_ Changes applied to [`strings.c`](strings.c) in [this commit](https://github.com/asankov/the-practice-of-programming/commit/4a99559bcb586c94036bc97107e21a9112c314b9). 21 | 22 | ### Exercise 5-3 23 | 24 | Write `vis`, which copies input to output, except that it displays non-printable bytes like backspaces, control characters, 25 | and non-ASCII characters as `\Xhh` where `hh` is the hexadecimal representation of the non-printable byte. 26 | By contrast with `strings`, `vis` is more useful for examining inputs that contain only a few non-printable characters. 27 | 28 | _Answer:_ See [`vis.c`](vis.c) added in [this commit](https://github.com/asankov/the-practice-of-programming/commit/f81c7edac11980493af906a5a9bdb1900bf9a8cf). 29 | 30 | ### Exercise 5-4 31 | 32 | What does `vis` produce if the input is `\XOA`? How could you make output of `vis` unambiguous? 33 | 34 | _Answer:_ 35 | 36 | ```console 37 | $ gcc vis.c -o executable && ./executable xoa.txt 38 | 39 | $ gcc vis.c -o executable && ./executable xoa.txt --minlen 1 40 | xoa.txt:\XOA 41 | ``` 42 | 43 | By default it produces nothing, because the input (4 characters) is shorter than the default minimum length of 6. If we set the minimum length to 1 (or to anything not larger than the length of the input), then it outputs the text. 44 | 45 | ### Exercise 5-5 46 | 47 | Extend `vis` to process a sequence of files, fold long lines at any desired column, and remove non-printable characters entirely. 48 | What other features might be consistent with the role of the program?
49 | 50 | _Answer:_ 51 | 52 | - process a sequence of files: See the changes to [`vis.c`](vis.c) added in [this commit](https://github.com/asankov/the-practice-of-programming/commit/4cbb1e96ef3805c376d777f2d4c3f005087f2f6c). 53 | -------------------------------------------------------------------------------- /chapter-5/5.6-debugging-tools/strings.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | #define BUFSIZE 10000 6 | 7 | int min_len = 6; 8 | 9 | void strings(char *name, FILE *fn); 10 | 11 | int main(int argc, char *argv[]) 12 | { 13 | int i; 14 | FILE *fn; 15 | 16 | if (argc == 1 || argc == 3) { 17 | printf("usage: strings filename [--minlen minlen]"); 18 | return -1; 19 | } 20 | 21 | if (argc == 4) { 22 | min_len = atoi(argv[3]); 23 | } 24 | 25 | char *filename = argv[1]; 26 | if ((fn = fopen(filename, "rb")) == NULL) { 27 | printf("can't open %s", filename); 28 | return -1; 29 | } 30 | strings(filename, fn); 31 | fclose(fn); 32 | 33 | return 0; 34 | } 35 | 36 | void strings(char *name, FILE *fn) 37 | { 38 | int c, i; 39 | char buf[BUFSIZE]; 40 | 41 | do { 42 | for (i = 0; (c=getc(fn)) != EOF;) 43 | { 44 | if (!isprint(c)) 45 | break; 46 | 47 | buf[i++] = c; 48 | if (i >= BUFSIZE) 49 | break; 50 | } 51 | if (i >= min_len) 52 | printf("%s:%.*s\n", name, i, buf); 53 | } while (c != EOF); 54 | } 55 | -------------------------------------------------------------------------------- /chapter-5/5.6-debugging-tools/vis.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | #define BUFSIZE 10000 6 | 7 | int min_len = 6; 8 | 9 | void vis(char *name, FILE *fn); 10 | 11 | int main(int argc, char *argv[]) 12 | { 13 | FILE *fn; 14 | int num_files = argc - 1; 15 | 16 | if (argc == 1 || argc == 3) { 17 | printf("usage: strings filename [--minlen minlen]"); 18 | return -1; 19 | } 20 | 21 | if (argc >= 4) { 22 | min_len = 
atoi(argv[argc - 1]); 23 | num_files -= 2; // the last two args are "--minlen" and its value 24 | } 25 | 26 | for (int i = 1; i <= num_files; i++) { 27 | char *filename = argv[i]; 28 | if ((fn = fopen(filename, "rb")) == NULL) { 29 | printf("can't open %s", filename); 30 | return -1; 31 | } 32 | vis(filename, fn); 33 | fclose(fn); 34 | } 35 | 36 | 37 | return 0; 38 | } 39 | 40 | void vis(char *name, FILE *fn) 41 | { 42 | int c, i; 43 | char buf[BUFSIZE + 5]; // extra room for one \Xhh escape plus the '\0' written by sprintf 44 | 45 | do { 46 | for (i = 0; (c=getc(fn)) != EOF;) 47 | { 48 | if (!isprint(c)) { 49 | sprintf(&buf[i++], "\\"); 50 | sprintf(&buf[i++], "X"); 51 | char s[3]; 52 | sprintf(s, "%02x", c); 53 | sprintf(&buf[i++], "%c", s[0]); 54 | sprintf(&buf[i++], "%c", s[1]); 55 | break; 56 | } 57 | 58 | buf[i++] = c; 59 | if (i >= BUFSIZE) 60 | break; 61 | } 62 | if (i >= min_len) 63 | printf("%s:%.*s\n", name, i, buf); 64 | } while (c != EOF); 65 | } 66 | -------------------------------------------------------------------------------- /chapter-5/5.6-debugging-tools/xoa.txt: -------------------------------------------------------------------------------- 1 | \XOA -------------------------------------------------------------------------------- /chapter-5/5.7-other-peoples-bugs/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.7 Other People's Bugs 4 | 5 | **Summary:** When working on software built by other people (which is most often the case), you will inevitably have to fix bugs that other people introduced. 6 | In that case, every lesson introduced so far applies, along with a few more. 7 | 8 | It is important to understand the structure of the program and how things work. 9 | Tools like `grep` and IDEs help with this. 10 | It is also important to understand the history of the code.
Places where changes were made frequently are often a sign that the code there is poorly understood or that the requirements have changed often, both of which could be a cause for buggy code. 11 | 12 | When tracking errors in a program you don't have the source for, it's important to remember a few things: 13 | 14 | - make sure you are testing with the latest version, because the bug you have found may be fixed in a newer version 15 | - find an easy way to reproduce it, so that you can point that out to the maintainers 16 | - make sure that the problem is real, and that the program is indeed not meant to behave that way 17 | -------------------------------------------------------------------------------- /chapter-5/5.8-summary/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Section 5.8 Summary 4 | 5 | Whether we like debugging or not, we will have to do it regularly. That is why it's important to write good code with fewer bugs. 6 | 7 | Once we see a bug, the first thing to do is think how it could have happened and where it could have come from. 8 | The next steps are to put in a few debugging statements and find where exactly the bug is - **divide and conquer**. 9 | 10 | Other aids are explaining our code to someone else (rubber duck debugging), using debugging tools and stepping through our program. 11 | 12 | Finally, know yourself and the errors you make. When you have found and fixed the bug, think of similar places where the same bug may exist. 13 | -------------------------------------------------------------------------------- /chapter-5/README.md: -------------------------------------------------------------------------------- 1 | # Chapter 5: Debugging 2 | 3 | ## Chapter summary 4 | 5 | Bugs are inevitable, no matter the language or the language features that try to prevent them. 6 | Good programmers learn from their mistakes, and from fixing bugs.
7 | Debugging is hard and there are many debugging techniques - from using a debugger to relying on language features. 8 | 9 | ## Table of Contents 10 | 11 | - [5.1 Debuggers](5.1-debuggers) 12 | - [5.2 Good Clues, Easy Bugs](5.2-good-clues-easy-bugs) 13 | - [5.3 No Clues, Hard Bugs](5.3-no-clues-hard-bugs) 14 | - [5.4 Last Resorts](5.4-last-resorts) 15 | - [5.5 Non-reproducible Bugs](5.5-non-reproducible-bugs) 16 | - [5.6 Debugging Tools](5.6-debugging-tools) 17 | - [5.7 Other People's Bugs](5.7-other-peoples-bugs) 18 | - [5.8 Summary](5.8-summary) 19 | 20 | ## Supplementary Reading 21 | 22 | - _Writing Solid Code_ by Steve Maguire 23 | - _Code Complete_ by Steve McConnell 24 | -------------------------------------------------------------------------------- /chapter-6/6.1-test-as-you-write-the-code/6-1-a.c: -------------------------------------------------------------------------------- 1 | #include <stdio.h> 2 | 3 | int test(int in, int exp); 4 | int factorial(int n); 5 | 6 | int main() 7 | { 8 | int res = 0; 9 | res += test(0, 1); 10 | res += test(-1, 0); 11 | res += test(1, 1); 12 | 13 | if (res != 0) 14 | { 15 | printf("❌ %d tests failed\n", res); 16 | } 17 | else 18 | { 19 | printf("✅ all tests passed\n"); 20 | } 21 | 22 | return res; 23 | } 24 | 25 | int test(int in, int exp) 26 | { 27 | int actual = factorial(in); 28 | if (actual != exp) 29 | return 1; 30 | return 0; 31 | } 32 | 33 | // 6-1.a This is supposed to compute factorials 34 | int factorial(int n) 35 | { 36 | // return 0 for negative values of n 37 | if (n < 0) 38 | { 39 | return 0; 40 | } 41 | // return 1 for 0! and 1!
    if (n < 2)
    {
        return 1;
    }

    int fac = 1;
    // multiply by n, n-1, ..., 2; `while (n--)` would finish by multiplying by 0
    while (n > 1)
    {
        printf("before fac *= n, fac = %d, n = %d\n", fac, n);
        fac *= n;
        printf("after fac *= n, fac = %d, n = %d\n", fac, n);
        n--;
    }
    printf("return fac, fac = %d\n", fac);
    return fac;
}

--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/6-1-b.c:
--------------------------------------------------------------------------------
#include <stdio.h>

void print(char *s);

int main()
{
    printf("Test 1:\n");
    print("hello");

    printf("Test 2:\n");
    char s[] = {'\0'}; // an empty string still needs its terminator
    print(s);

    printf("Test 3:\n");
    print("\0");

    printf("Test 4:\n");
    print(NULL);

    return 0;
}

// 6-1.b This is supposed to print the characters of a string one per line.
void print(char *s)
{
    if (s == NULL)
        return;

    int i = 0;
    while (s[i] != '\0')
    {
        putchar(s[i++]);
        putchar('\n');
    }
}
--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/6-1-c.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int test(char *src);
void strcpy2(char *dest, char *src);

int main()
{
    int res = 0;
    res += test("");
    res += test(NULL);
    res += test("something");

    if (res != 0)
    {
        printf("❌ %d tests failed\n", res);
    }
    else
    {
        printf("✅ all tests passed\n");
    }

    return res;
}

int test(char *src)
{
    if (src == NULL)
        return 0;

    char *n = (char *)malloc(100 * sizeof(char));
    strcpy2(n, src);
    return strcmp(n, src) != 0;
}

void strcpy2(char *dest, char *src)
{
    // write the terminator first; the loop below copies only the characters
    dest[strlen(src)] = '\0';
    for (int i = 0; src[i] != '\0'; i++)
        dest[i] =
            src[i];
}
--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/6-1-d.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void strncpy2(char *t, char *s, int n);
int test(char *src, char *expected, int n);

int main()
{
    int res = 0;
    res += test("something", "somet", 5);
    res += test("something", "something", 9);
    res += test("something", "something", 100);
    res += test("something", "", 0);
    res += test("something", "", -1);
    res += test("", "", 10);
    res += test(NULL, "", 1);

    if (res != 0)
    {
        printf("❌ %d tests failed\n", res);
    }
    else
    {
        printf("✅ all tests passed\n");
    }

    return res;
}

int test(char *src, char *expected, int n)
{
    char *nn = (char *)malloc(100 * sizeof(char));
    strncpy2(nn, src, n);
    return strcmp(nn, expected) != 0;
}

// copies at most n characters from s to t and always terminates t
void strncpy2(char *t, char *s, int n)
{
    if (t == NULL)
        return;
    if (s == NULL)
    {
        *t = '\0'; // treat a NULL source as an empty string
        return;
    }
    while (n > 0 && *s != '\0')
    {
        *t = *s;
        t++;
        s++;
        n--;
    }
    *t = '\0'; // without this the comparison in test() reads uninitialized memory
}
--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/6-1-e.c:
--------------------------------------------------------------------------------
#include <stdio.h>

void compare(int i, int j);

int main()
{
    compare(1, 2);
    compare(1, 0);
    compare(1, 1);
}

void compare(int i, int j)
{
    if (i > j)
        printf("%d is greater than %d.\n", i, j);
    else if (i < j)
        printf("%d is smaller than %d.\n", i, j);
    else
        printf("%d is equal to %d.\n", i, j);
}
--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/6-1-f.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <ctype.h>

void character(char c);

int main() {
    character('A');
    character('Z');
    character('L');
    character('M');

    character('a');
    character('z');
    character('l');
    character('m');
}

void character(char c)
{
    printf("%c - ", c);
    c = toupper((unsigned char)c); // toupper expects an unsigned char value or EOF
    if (c >= 'A' && c <= 'Z') {
        if (c <= 'L')
            printf("first half of alphabet\n");
        else
            printf("second half of alphabet\n");
    }
}
--------------------------------------------------------------------------------
/chapter-6/6.1-test-as-you-write-the-code/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.1 Test as You Write the Code

**Summary:** The earlier a problem is found, the better.
If we test our code as we are writing it, we may catch some bugs early, and the code has been checked at least once before it has even been compiled.
Finding and fixing bugs while writing the code saves us from troubleshooting and fixing them later in an already deployed system.

### Test code at its boundaries

One approach is to test small pieces of code at their boundary conditions.
This includes `for` or `while` loops, conditional statements, etc.
Very often the bug lies at a boundary condition: when the input is empty, full, has exactly one element, etc.
If our code works for the boundary inputs it will probably work for the normal ones as well.
Boundary condition checking is effective for finding off-by-one errors. It becomes second nature with time and practice.
It helps eliminate some bugs, but not all of them.

### Test pre- and post-conditions

Verify that the code works for inputs that do not make sense.
Basically, this means validating your inputs and returning a sensible value (`0`, `[]`, `NULL`, etc.), even if the inputs are nonsensical.
Bogus inputs should not be ignored, as they may lead to ugly crashes down the road.

### Use assertions

Use assertions to validate your inputs.

C/C++ provide an assertion facility in `<assert.h>` that lets you do:

```c
assert(n > 0);
```

If the assertion fails, the program aborts with a message that names the failed expression, the source file and the line number.
Assertions on a function's inputs draw attention to inconsistencies between the caller and the callee, which helps us identify who is at fault.
However, because an assertion aborts the program, it is to be used only in extreme situations in which recovering is impossible.

### Program defensively

Check for conditions that can't or shouldn't happen, but might, because of an error somewhere else.

### Check error returns

Always check the errors returned from functions.
For example, `fprintf` or `fwrite` report an error through their return value if the write fails, e.g. because the disk is full or another serious problem occurred.

### Exercise 6-1

Check out these examples at their boundaries, then fix them as necessary according to the principles of style in Chapter 1 and the advice in this chapter.

- **6-1.a** See [`6-1-a.c`](6-1-a.c).
  We see that with faulty input (0, -1, etc.) we get multiple iterations before the value is returned.
  This is because we don't check if the input is valid.
  Solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/f349b9aec39b0dda02e75ab3b36541e3930a05a2)
- **6-1.b** See [`6-1-b.c`](6-1-b.c).
  The original solution results in the following when running `6-1-b.c`:

  ```text
  Test 1:
  h
  e
  l
  l
  o
  Test 2:

  Test 3:

  Test 4:
  [1] 31362 segmentation fault ./executable
  ```

  We get a segmentation fault when `NULL` is passed to the `print` function, because we never check whether the passed value is valid. Also, a newline is printed even though it is not part of the input (Test 3) - this is because the original uses a `do-while` loop instead of a `while` loop.
  Solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/076d8f06f381e551f516eebab7563fc36d540619)

- **6-1.c** This is meant to copy a string from source to destination.
  See [`6-1-c.c`](6-1-c.c). The original solution results in the following when running `6-1-c.c`:

  ```text
  [1] 40929 segmentation fault ./executable
  ```

  This is because we don't check if the input is `NULL`.
  Solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/d2ddb1169f251ccbfcfe92ff3dfedf6c0549685b)

- **6-1.d** Another string copy, which attempts to copy `n` characters from `s` to `t`.
  See [`6-1-d.c`](6-1-d.c).
  The original solution produces this output when running `6-1-d.c`:

  ```text
  [1] 42710 segmentation fault ./executable
  ```

  This is because the function does not check whether the input is `NULL`.
  Solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/5824a5424f72c2479d65a8d8304d404437d1937d)

- **6-1.e** A numerical comparison.
  See [`6-1-e.c`](6-1-e.c).
  Running the original solution results in:

  ```text
  1 is smaller than 2.
  1 is greater than 0.
  1 is smaller than 1.
  ```

  This is because we don't handle the equality of the two numbers.
  After applying the solution:

  ```text
  1 is smaller than 2.
  1 is greater than 0.
  1 is equal to 1.
  ```

  See the solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/853b52f3998d978d1ff1fc524a85c82490bed629)

- **6-1.f** A character class test.
  See [`6-1-f.c`](6-1-f.c).
  Running the original solution resulted in:

  ```text
  A - first half of alphabet
  Z - second half of alphabet
  L - first half of alphabet
  M - second half of alphabet
  a - z - l - m - %
  ```

  which means the program works fine for upper-case letters, but not for lower-case ones.
  The solution resulted in:

  ```text
  A - first half of alphabet
  Z - second half of alphabet
  L - first half of alphabet
  M - second half of alphabet
  a - first half of alphabet
  z - second half of alphabet
  l - first half of alphabet
  m - second half of alphabet
  ```

  See the solution in this [commit](https://github.com/asankov/the-practice-of-programming/commit/e464dd266dc63d0928374b91bae5bb7b515aeada)

### Exercise 6-2

As we are writing this book in late 1998, [the Year 2000 problem](https://en.wikipedia.org/wiki/Year_2000_problem) looms as perhaps the biggest boundary condition problem ever.

- **6-2.a** What dates would you use to check whether a system is likely to work in the year 2000?
  Supposing the tests are expensive to perform, in what order would you do your tests after trying January 1, 2000 itself?

  _Answer:_ The most obvious first choice is `January 1, 2000`.
  After that, `January 2, 2000` is another good choice.
  Later dates should be offset in a sensible way, so that they land on the next boundary condition, for example the point where the number of bits used to store the date overflows.

- **6-2.b** How would you test the standard function `ctime`, which returns a string representation of the date in this form:

  ```text
  Fri Dec 31 23:58:27 EST 1999\n\0
  ```

  Suppose your program calls `ctime`.
  How would you write your code to defend against a flawed implementation?

  _Answer:_ The function can be tested by pattern/regex matching. E.g. we can strip away the parts we don't care about and validate that we have a proper year and a proper time (hour not bigger than 23, minutes and seconds not bigger than 59, etc.). We can defend our program against a faulty implementation by checking the result of the function. The simplest check is a `NULL` check. A more thorough check applies the same validations used for testing.

- **6-2.c** Describe how you would test a calendar program that prints output like this:

  ```text
      January 2000
   S  M Tu  W Th  F  S
                     1
   2  3  4  5  6  7  8
   9 10 11 12 13 14 15
  16 17 18 19 20 21 22
  23 24 25 26 27 28 29
  30 31
  ```

  _Answer:_ Firstly, we can validate that the month name is a valid one by comparing it to a list of all valid month names.
  Secondly, we can validate the row that lists the days of the week, because it is always the same.
  Finally, we can validate that all the numbers are in order.

- **6-2.d** What other time boundaries can you think of in systems that you use, and how would you test to see whether they are handled correctly?

  _Answer:_ This is context specific, but one way of validating the time boundaries is to have a clear definition of what those boundaries are, which mistakes can happen and which must never happen, and to have tests for those cases.

--------------------------------------------------------------------------------
/chapter-6/6.2-systematic-testing/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.2 Systematic Testing

**Summary:** It's important to test systematically so you know what you are testing and what to expect.

### Test incrementally

Test as you write the code. Don't wait until everything is written and then test it all at once. That is much harder and more time consuming.
Instead, write small testable parts, test them in isolation, and when you combine them, test that they work together as well.

### Test simple parts first

The incremental approach also applies to how we test features. We should first test the easy and most common parts.
Only when we are sure these are working should we go on to the harder and less used parts of our program.

If we want to test a program that does binary search in an array of integers, this is the order of the test cases we have to execute:

- search an array with no elements
- search an array with one element and a trial value that is
  - less than the single entry of the array
  - equal to the single entry
  - greater than the single entry
- search an array with two elements and trial values that
  - check all five possible combinations
- check behaviour with duplicate elements in the array and trial values
  - less than the value in the array
  - equal to the value
  - greater than the value
- search an array with three elements as with two elements
- search an array with four elements as with two and three

These cases can be executed manually, but it is better to write a test scaffold that runs the tests for us.
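
The test order above can be sketched as a small scaffold. This is only an illustration under assumptions: `binsearch`, `check` and `run_tests` are hypothetical names that do not come from the repository, and the binary search shown is a generic one rather than the author's implementation.

```c
#include <stdio.h>

/* binsearch: return an index of x in sorted v[0..n-1], or -1 if absent.
   Hypothetical function under test; not taken from the repository. */
int binsearch(int x, const int v[], int n)
{
    int low = 0, high = n - 1;

    while (low <= high) {
        int mid = low + (high - low) / 2;
        if (v[mid] < x)
            low = mid + 1;
        else if (v[mid] > x)
            high = mid - 1;
        else
            return mid;
    }
    return -1;
}

/* check: run one case; returns 1 on failure and 0 on success */
static int check(int x, const int v[], int n, int found)
{
    int i = binsearch(x, v, n);
    int ok = found ? (i >= 0 && i < n && v[i] == x) : (i == -1);

    if (!ok)
        printf("FAIL: x=%d n=%d returned %d\n", x, n, i);
    return !ok;
}

/* run_tests: boundary cases for 0 to 4 elements; returns the number of failures */
int run_tests(void)
{
    int v[] = {10, 20, 30, 40};
    int dup[] = {20, 20};
    int failures = 0;
    int n, i;

    failures += check(10, v, 0, 0);                   /* empty array */
    for (n = 1; n <= 4; n++) {
        failures += check(v[0] - 1, v, n, 0);         /* below every element */
        failures += check(v[n - 1] + 1, v, n, 0);     /* above every element */
        for (i = 0; i < n; i++) {
            failures += check(v[i], v, n, 1);         /* equal to element i */
            if (i > 0)
                failures += check(v[i] - 5, v, n, 0); /* between two neighbours */
        }
    }
    failures += check(19, dup, 2, 0);                 /* duplicates: less */
    failures += check(20, dup, 2, 1);                 /* duplicates: equal */
    failures += check(21, dup, 2, 0);                 /* duplicates: greater */
    return failures;
}
```

A `main` that returns `run_tests()` would stay silent on success and exit with the number of failed cases, which makes the scaffold easy to call from a script.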

### Know what output to expect

It is important to know what the expected results of the tests are, otherwise we are wasting our time.
This is context specific and depends on what we are testing, but generally:

- we can have sample output to compare to the actual one
- we can test the inverse (encryption/decryption)
- we can validate that the output is within known boundaries

### Verify conservation properties

Conservation properties should be verified within a program.
For example, if we are testing a hash map, the number of insertions minus the number of deletions should be equal to the
number of elements in the hash map.

### Compare independent implementations

A good way of testing a program is comparing its output to that of another, independent implementation of the same program.
If the outputs differ, then at least one of them is wrong. If they are the same, there is a good chance both are right.

### Measure test coverage

One goal of testing is that every statement of our program has been executed at least once.
That is why it is important to measure test coverage and to make sure that this coverage is as high as possible.
It is hard to achieve 100% coverage, because there are always some "can't be reached" statements, and because by just varying the inputs it's hard to get to all parts of our program.

### Exercise 6-3

Describe how you would test `freq`.

_Answer:_ `freq` is a program that outputs the number of times each character is found in a file.
See [`freq.c`](freq.c).
I would start by defining test files and the expected output for each of them.
Then I would run the program on the files and compare the actual output to the expected one.
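
The approach above can also be applied at the function level. The sketch below assumes that the counting loop of `freq` is pulled out into a helper; `count_bytes` and `test_counts` are hypothetical names that do not exist in `freq.c`. Besides comparing one expected count, it checks a conservation property: the counts must add up to the number of input bytes.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* count_bytes: tally how often each byte value occurs in buf[0..n-1].
   Hypothetical helper that mirrors the counting loop of freq. */
void count_bytes(const unsigned char *buf, size_t n,
                 unsigned long count[UCHAR_MAX + 1])
{
    size_t i;

    memset(count, 0, (UCHAR_MAX + 1) * sizeof count[0]);
    for (i = 0; i < n; i++)
        count[buf[i]]++;
}

/* test_counts: returns 1 if byte c occurs `expected` times in buf
   and the conservation property holds, otherwise 0 */
int test_counts(const unsigned char *buf, size_t n,
                unsigned char c, unsigned long expected)
{
    unsigned long count[UCHAR_MAX + 1];
    unsigned long total = 0;
    int i;

    count_bytes(buf, n, count);
    for (i = 0; i <= UCHAR_MAX; i++)
        total += count[i];

    /* conservation property: every input byte is counted exactly once */
    return total == n && count[c] == expected;
}
```

Useful inputs follow the boundary advice from Section 6.1: an empty buffer, a buffer with a single byte, and a buffer that contains a `'\0'` byte in the middle.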

### Exercise 6-4

Design and implement a version of `freq` that measures the frequencies of other types of data values, such as 32-bit integers or floating-point numbers.
Can you make one version of the program handle a variety of types elegantly?

_Answer:_ The design of the program will be similar to the one we have now. The difference would be in reading the input.
When doing so we would need to handle the case where the char is part of a number and read until a delimiter is found.
Then parse the number and increment its count in the counter store.
TODO: implementation

--------------------------------------------------------------------------------
/chapter-6/6.2-systematic-testing/freq.c:
--------------------------------------------------------------------------------
#include <stdio.h>
#include <ctype.h>
#include <limits.h>

unsigned long count[UCHAR_MAX + 1];

/* freq main: display byte frequency counts */
int main(void)
{
    int c;

    while ((c = getchar()) != EOF)
        count[c]++;

    for (c = 0; c <= UCHAR_MAX; c++)
    {
        if (count[c] != 0)
            printf("%.2x %c %lu\n", c, isprint(c) ? c : '-', count[c]);
    }

    return 0;
}

--------------------------------------------------------------------------------
/chapter-6/6.3-test-automation/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.3 Test Automation

**Summary**: Executing tests manually is a tedious job.
That's why we need automation - tests that run programmatically and check our program for errors.
Automated tests should be easy to run. The easier they are to run, the more often you will run them.

### Automate regression testing

Often, when fixing a bug, we test whether the bug has been fixed, but not whether we have broken something else.
The most popular type of testing is regression testing - checking whether all functionality in the new version works as
it did in the old version.

When running such a test suite we assume the old version worked correctly.
This is not always the case, hence we need to
check regularly that the test suite itself is right.

A good practice is for tests to produce output only when they fail and to be silent otherwise.
This, however, does not fit every situation, so we must apply it with caution.

### Create self-contained tests

The best tests know their own input and expected output without depending on anything external.

It is even possible to create our own DSL (Domain Specific Language) that describes tests.

For example, this is such a language for testing `awk` programs:

```awk
try {if ($1 == 1) print "yes"; else print "no"}
1       yes
1.0     yes
1E0     yes
0.1E1   yes
10E-1   yes
01      yes
+1      yes
10E-2   no
10      no
```

It runs what is after `try` with the input from the first column and expects the output in the second column.

When new functionality is added, new test cases should be added as well. Test cases should not be deleted.
If a bug is found, first a failing test case that reproduces the bug should be written, and then the bug should be fixed.
From then on, the test will tell us if this bug ever occurs again.

### Exercise 6-5

Design a test suite for `printf`, using as many mechanical aids as possible.

--------------------------------------------------------------------------------
/chapter-6/6.4-test-scaffolds/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.4 Test Scaffolds

**Summary:** Sometimes testing the whole program is not what you want.
Sometimes you want to test a small, isolated part of the program.
Often this boils down to calling a function with a given input and checking the returned result.
To do that, we need a proper scaffold and a testing matrix - given input, expected output.

### Exercise 6-6

Create the test scaffold for `memset` along the lines that we indicated.

### Exercise 6-7

Create tests for the rest of the `mem...` family.

### Exercise 6-8

Specify a testing regime for numerical routines like `sqrt`, `sin`, and so on, as found in `math.h`.
What values make sense? What independent checks can be performed?

### Exercise 6-9

Define mechanisms for testing the functions of the C `str...` family, like `strcmp`.
Some of these functions, especially tokenizers like `strtok` and `strcspn`, are significantly more complicated than the `mem...` family,
so more sophisticated tests will be called for.

--------------------------------------------------------------------------------
/chapter-6/6.5-stress-tests/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.5 Stress Tests

**Summary:** Programs need to be tested with inputs that are considered "not possible", because a computer does not know what is possible and what is not, and can feed any input to your program.
For example, if we are working with files, we always need to test our program with empty files.
If we are working with text files, we need to test our program with binary files and vice versa.

Many problems arise because programs don't handle wrong inputs well.
This can cause crashes, but also overflows and potential security problems.

The Ariane 5 rocket exploded on its maiden flight in June 1996, because the navigation code was inherited from Ariane 4.
Ariane 5 was faster, so it provided bigger values to some of the variables in this code.
Shortly after launch, the software tried to convert a 64-bit floating point number to a 16-bit integer, which generated an overflow.
The software detected the overflow, but the error handling was not correct.
The software just decided to shut down the entire subsystem, which led to the rocket going off course and exploding.

### Exercise 6-10

Try to create a file that will crash your favourite text editor, compiler or other program.

_Answer_: I think that a big enough file should be able to crash any editor.

--------------------------------------------------------------------------------
/chapter-6/6.6-tips-for-testing/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.6 Tips for Testing

There are a lot of tricks that make testing easier.

For example, if we are allocating array memory, we can allocate less during testing, instead of trying to produce large amounts of test data.

Have a way to reproduce random tests, e.g. store the seed.

Don't write new code until you've fixed all known bugs. The new code can be affected by the bugs.

--------------------------------------------------------------------------------
/chapter-6/6.7-who-does-the-testing/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.7 Who Does the Testing?

Testing should be done by the implementer.

However, the implementer knows the code and has assumptions which could lead to missing valid test cases (white-box testing).

Black-box testing (not knowing the code) is also a good approach, because it approaches the testing process without any assumptions.

User testing is also important, because the user is the one who uses the software.

--------------------------------------------------------------------------------
/chapter-6/6.8-testing-the-markov-program/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.8 Testing the Markov Program

In this section we apply the techniques we have learned to test the Markov program from Chapter 3.

--------------------------------------------------------------------------------
/chapter-6/6.9-summary/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Section 6.9 Summary

The most important thing about testing is _to do it_.

Test automation is good, because machines don't get tired or fool themselves into thinking the code works when it does not.

--------------------------------------------------------------------------------
/chapter-6/README.md:
--------------------------------------------------------------------------------
# Chapter 6: Testing

## Chapter summary

Debugging is the process of troubleshooting bugs. Testing is the process of trying to break the program.
Testing can demonstrate the presence of bugs, but not their absence.
A good way to write bug-free code is to generate the code. If we do that from a higher-level language or a specification, and the specification and the generator are correct, so will be the code.

## Table of contents

- [6.1 Test as You Write the Code](6.1-test-as-you-write-the-code)
- [6.2 Systematic Testing](6.2-systematic-testing)
- [6.3 Test Automation](6.3-test-automation)
- [6.4 Test Scaffolds](6.4-test-scaffolds)
- [6.5 Stress Tests](6.5-stress-tests)
- [6.6 Tips for Testing](6.6-tips-for-testing)
- [6.7 Who Does The Testing](6.7-who-does-the-testing)
- [6.8 Testing the Markov Program](6.8-testing-the-markov-program)
- [6.9 Summary](6.9-summary)

## Supplementary Reading

- _Software - Practice and Experience_ by Jon Bentley and Doug McIlroy
- _Programming Pearls_ by Jon Bentley
- _More Programming Pearls_ by Jon Bentley

--------------------------------------------------------------------------------
/chapter-7/7-2/main.go:
--------------------------------------------------------------------------------
package main

import (
	"fmt"
	"time"
)

// Timed runs do and returns how long it took.
func Timed(do func()) time.Duration {
	start := time.Now()
	do()

	return time.Since(start)
}

func main() {
	duration := Timed(func() {
		fmt.Println("Inside Timed, sleeping for 5 seconds")

		time.Sleep(5 * time.Second)
	})

	fmt.Printf("Execution took %s\n", duration.String())
}

--------------------------------------------------------------------------------
/chapter-7/README.md:
--------------------------------------------------------------------------------
# Chapter 7: Performance

## Chapter Summary

In the past, performance was important, because computers were lacking memory and compute power.

Nowadays, performance is not that important in most cases.

We should only be optimizing parts of the code that we know are worth optimizing, like bottlenecks.

## 7.1 A Bottleneck

This section tells the story of a bottleneck in an email gateway - the spam filter.

It tells how the problem was investigated, profiled and finally optimized.

### Exercise 7-1

A table that maps a single character to the set of patterns that begin with this character gives an order of magnitude improvement.
Implement a version of `isspam` that uses two characters as the index.
How much improvement does that lead to?
These are special cases of a data structure called a _trie_.
Most such data structures are based on trading space for time.

## 7.2 Timing and Profiling

### Automated timing measurements

Most systems have commands that time the execution of a program.
On Unix this command is `time`.

### Use a profiler

Profilers are important, because they show how much time was spent in each part of the program (for example, how much time is spent in each function and how many times a function is called).

They can be crucial to understanding where the bottleneck is and what the culprit is.

### Concentrate on the hot spots

Optimize the slowest parts of the program first.

### Draw a picture

Some profilers can generate a visual overview of the performance of the program.
That can be used to easily understand the profile, or to compare the profiles of two versions of a program.

### Exercise 7-2

Whether or not your system has a `time` command, use `clock` or `getTime` to write a timing facility for your own use.
Compare its times to a wall clock.
How does other activity on the machine affect the timings?

_Answer:_ [Here](./7-2/main.go).

### Exercise 7-3

In the first profile, `strchr` was called 48,350,000 times and `strncmp` only 46,180,000.
Explain the difference.

## 7.3 Strategies for Speed

Some strategies for what to do when you need to optimize:

### Use a better algorithm or data structure

This can have a huge performance benefit, if we have initially chosen the wrong algorithm or data structure.

Sometimes changing the algorithm or the data structure means trading memory for speed.

### Enable compiler optimizations

Compilers can do some optimizations on behalf of the programmer to make the code run faster.

### Tune the code

Change the code in a way that makes it more efficient.

### Don't optimize what doesn't matter

Don't optimize parts of the code that are not bottlenecks, or that are not used enough for their speed to matter.

## 7.4 Tuning the Code

Some strategies for how to change the code to be more efficient:

### Collect common subexpressions

For example:

```c
sqrt(dx*dx + dy*dy) + (sqrt(dx*dx + dy*dy) > 0 ? ...)
```

can become:

```c
sqrtr = sqrt(dx*dx + dy*dy);
sqrtr + (sqrtr > 0 ? ...)
```

This removes one computation.

Another example:

```c
for (i = 0; i < nstarting[c]; i++) {...}
```

becomes:

```c
n = nstarting[c];
for (i = 0; i < n; i++) {...}
```

This reduces the number of times we look up `nstarting[c]` to one, instead of once per loop iteration.

### Replace expensive operations by cheap ones

If we have a function that is too expensive, we can look for a way to rewrite it or replace it with something else.

### Unroll or eliminate loops

Loops add overhead to the code.
We can look for ways to avoid them.

For example:

```c
for (i = 0; i < 3; i++)
    a[i] = b[i] + c[i];
```

can become:

```c
a[0] = b[0] + c[0];
a[1] = b[1] + c[1];
a[2] = b[2] + c[2];
```

### Cache frequently-used values

Caching can improve performance, because it replaces computation with a lookup.

When we call an expensive function we can store the computed result in a cache, and if we call it again with the same value we can get the result from the cache instead of recomputing it.

This consumes more memory, but it is more computation-efficient.

### Write a special-purpose allocator

Sometimes allocations can slow down the program.

We can write our own allocator that performs multiple allocations at once and caches the memory.
When we call our allocator again, it returns already allocated memory, instead of making another allocation.

This again trades memory for speed.

### Buffer input and output

Batch IO operations, instead of performing them right away.

### Handle special cases separately

Have different logic for special cases (e.g. very large computations or very large memory allocations).

### Precompute results

Similar to caching frequently-used values, we can compute in advance the results that we know there is a high chance we will use.

For example, if we write a `sin` or `cos` function, we can precompute the results for `0` to `360` degrees instead of calculating them each time.

### Use approximate values

If we can get away with less precision, we can approximate arbitrary inputs by known ones.

For example, we can do this for the `sin` and `cos` functions, where we round the input to the nearest of the known 0-360 values.
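
The precomputation idea from this section can be sketched as follows. To keep the example integer-only, it swaps the `sin`/`cos` table for a table of bit counts per byte; the names `bits_in_byte`, `init_bit_table` and `count_bits` are made up for this illustration. A sine table indexed by whole degrees would follow the same pattern: fill the table once, then answer every call with a lookup.

```c
#include <limits.h>

static unsigned char bits_in_byte[UCHAR_MAX + 1]; /* precomputed results */
static int table_ready = 0;

/* init_bit_table: compute the number of set bits of every byte value once */
void init_bit_table(void)
{
    int i;

    bits_in_byte[0] = 0;
    for (i = 1; i <= UCHAR_MAX; i++)
        bits_in_byte[i] = (unsigned char)((i & 1) + bits_in_byte[i / 2]);
    table_ready = 1;
}

/* count_bits: number of set bits in x, one table lookup per byte of x
   instead of one loop iteration per bit */
int count_bits(unsigned long x)
{
    int n = 0;

    if (!table_ready)
        init_bit_table();
    while (x != 0) {
        n += bits_in_byte[x & UCHAR_MAX];
        x >>= CHAR_BIT;
    }
    return n;
}
```

The table costs `UCHAR_MAX + 1` bytes of memory, which is the usual trade: a little space for less computation per call.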

### Rewrite in a lower-level language

Languages like C and C++ are usually more efficient than Java and Python.

### Exercise 7-4

One way to make a function like `memset` run faster is to have it write in word-sized chunks instead of byte-sized; this is likely to match the hardware better and might reduce the loop overhead by a factor of four or eight.
The downside is that there are now a variety of end effects to deal with if the target is not aligned on a word boundary and if the length is not a multiple of the word size.
Write a version of `memset` that does this optimization.
Compare its performance to the existing library version and to a straightforward byte-at-a-time loop.

### Exercise 7-5

Write a memory allocator `smalloc` for C strings that uses a special-purpose allocator for small strings but calls `malloc` directly for large ones.
You will need to define a struct to represent the strings in either case.
How do you decide where to switch from calling `smalloc` to `malloc`?

## 7.5 Space Efficiency

Some strategies for how to be more space-efficient:

### Save space by using the smallest possible data type

For example, replacing a `double` with a `float`.

### Don't store what you can easily recompute

By not storing values that can be easily recomputed, we save space at the cost of more computation.

## 7.6 Estimation

Estimate the cost of the operations you do in the code.

### Exercise 7-6

Create a set of tests for estimating the costs of basic operations for computers and compilers near you, and investigate similarities and differences in performance.

### Exercise 7-7

Create a cost model for higher-level operations in C++. Among the features that might be included are construction, copying, and deletion of class objects; member function calls; virtual functions; inline functions; the iostream library; the STL. This exercise is open-ended, so concentrate on a small set of representative operations.

### Exercise 7-8

Repeat the previous exercise for Java.

## 7.7 Summary

Performance optimizations should only be done when it is clear that performance improvements are needed.

Choosing the right algorithm and data structure is the most important factor in how well the code performs.
Most of the time, just doing that will be all that's needed for the code to perform well.

For special cases, there are more optimization strategies.

Sometimes we care about compute time and trade memory for it; sometimes it is the other way around, and we care about memory efficiency and trade compute time for it.

## Supplementary Reading

- _Software - Practice and Experience_ by Jon Bentley and Doug McIlroy
- _Programming Pearls_ by Jon Bentley
- _More Programming Pearls_ by Jon Bentley
- _Inner Loops_ by Rick Booth
- _Computer Organization and Design: The Hardware/Software Interface_ by John Hennessy and David Patterson

--------------------------------------------------------------------------------
/chapter-8/README.md:
--------------------------------------------------------------------------------
# Chapter 8: Portability

Portability is the ability of a program to run in different environments.

This is desirable, because we never know how and where our program is going to be run.

This chapter describes a few important things to consider when thinking about portability.

## 8.1 Language

Most languages (like C and C++) define a standard.
If we comply with that standard, we can expect that our program will be able to run in many places, be compiled by different compilers, etc.

However, that is not always true.
Sometimes the standards are ambiguous, or just not respected by the compilers.

For example, the ANSI C standard does not fix the exact sizes of the basic data types, so we should not assume, for instance, that a `char` is always exactly 8 bits.

## 8.2 Headers and Libraries

Headers and libraries are an extension of the programming language.

When using a language, we can usually assume that its standard library will be available, but there might be rare cases where that is not true.
In these cases, it's better to leave the problem to the consumers of our code than to put a huge effort into mitigating it on our side.

In header files we can use preprocessor directives to control when some code is included and when it is not, but we should not overcomplicate this, as it can become unmaintainable.

## 8.3 Program Organization

This section discusses different approaches to avoiding `#ifdef`-hell, i.e. too many environment-dependent conditional statements in the code.

Approaches include using only features available on all systems **(intersection)**, or using all features available on any platform **(union)** and having complex installation logic/headers that include different pieces of code based on the environment.

### Exercise 8-1

Investigate how your compiler handles code contained within a conditional block like

```c
const int DEBUG = 0;
/* or enum { DEBUG = 0 }; */
/* or final boolean DEBUG = false; */

if (DEBUG) {
    // some code
}
```

Under what circumstances does it check syntax?
When does it generate code?
If you have access to more than one compiler, how do the results compare?

## 8.4 Isolation

A good way to organize our programs is to keep environment-specific code in separate files (e.g. `unix.c`, `windows.c`, `mac.c`) and to have a central place where we control the inclusion of these files based on environment properties.

We can define an interface that will be used by our code, and depending on the system we can include the appropriate file that implements this interface for the given system.

## 8.5 Data Exchange

Text is a good medium for data exchange, because it's portable.

However, even with text there are sometimes portability issues, like encoding or different line endings (LF vs CRLF).

### Exercise 8-2

Write a program to remove spurious carriage returns from a file.
Write a second program to add them by replacing each newline with a carriage return and newline.
How would you test these programs?

## 8.6 Byte Order

Sometimes we want to use binary data.

A big issue that comes up with binary data is the way different machines order the bytes of multi-byte values.

Some use big-endian and some use little-endian.

For example, this sequence of bytes, which represents a 4-byte integer:

| Byte 1 | Byte 2 | Byte 3 | Byte 4 |
| --- | --- | --- | --- |
| 11 | 22 | 33 | 44 |

is interpreted as `0x11223344` on a big-endian machine, and as `0x44332211` on a little-endian machine.

To see byte order in action, try this program:

```c
#include <stdio.h>

/* byte order: display bytes of a long */
int main(void) {
    unsigned long x;
    unsigned char *p;
    int i;

    /* 11 22 33 44 => big-endian */
    /* 44 33 22 11 => little-endian */
    /* x = 0x1122334455667788UL for 64-bit long */

    x = 0x11223344UL;
    p = (unsigned char *) &x;
    for (i = 0; i < (int) sizeof(long); i++)
        printf("%x ", *p++);
    printf("\n");
    return 0;
}
```

On a 32-bit big-endian machine the output is:

```text
11 22 33 44
```

but on a little-endian machine it is:

```text
44 33 22 11
```

and on the PDP-11 (a vintage 16-bit machine still used in embedded systems) it is:

```text
22 11 44 33
```

This problem can be mitigated if:

- the two sides that exchange data explicitly agree on the order of the bytes transmitted
- or we use a fixed byte order for the data exchange (this requires reading and writing the data byte by byte, which may be slower)

Higher-level languages like Java don't have this problem, because they hide the byte order completely.

## 8.7 Portability and Upgrade

One portability issue arises when we upgrade the software: we might introduce a feature that is not portable and does not work on all supported systems.

Another issue is **backwards-compatibility** - we need to make sure that the changes we add do not break the existing use cases of our program.

## 8.8 Internationalization

A program can be used by many people, using many languages and data formats.
We should not assume a given encoding (for example, ASCII) or a given date format (for example, MM/DD/YYYY).

The Unicode character set tries to mitigate this problem by supporting all languages in a single character set.
However, Unicode characters do not fit in a single byte (the original Unicode encoding used 16 bits per character).
Hence, we have the little-endian/big-endian problem again.
This is mitigated by byte-oriented encodings like UTF-8, which is also backwards-compatible with ASCII (because it encodes ASCII characters as the same single bytes).

## 8.9 Summary

Writing portable code is a good thing to strive for, but it is not always easy to do so.

There are two approaches to portable code - **union** and **intersection**.
**Union** means using all the features available on each platform, with the help of things like conditional compilation.
**Intersection** means using only features available on all platforms, and extracting all platform-specific code into separate files.

In the long run, the benefits of the intersection approach outweigh the drawbacks.

## Supplementary Reading

- _The C Programming Language_ by Brian Kernighan and Dennis Ritchie
- _C: A Reference Manual_ by Sam Harbison and Guy Steele
- _The Java Language Specification_ by James Gosling, Bill Joy and Guy Steele
- _Advanced Programming in the Unix Environment_ by Rich Stevens
- "On holy wars and a plea for peace" by Danny Cohen (article)

--------------------------------------------------------------------------------
/chapter-9/README.md:
--------------------------------------------------------------------------------
# Chapter 9: Notation

Notation is the way we express things.

It could be a natural language, a programming language or a domain-specific language (DSL).
This chapter talks about the importance of choosing the right "notation" for your problem.

## 9.1 Formatting Data

The way we format data depends on the notation, so choosing the right notation is important for solving the problem efficiently.

The format specifiers of the `printf` function in C are a notation themselves, and they give us an easy way to express intention (print a char, print a number, print a number with 2 decimal places, etc.).

### Exercise 9-1

Modify `pack` and `unpack` to transmit signed values correctly, even between machines with different sizes for `short` and `long`. How should you modify the format strings to specify a signed data item? How can you test the code to check, for example, that it correctly transfers a -1 from a computer with 32-bit longs to one with 64-bit longs?

### Exercise 9-2

Extend `pack` and `unpack` to handle strings; one possibility is to include the length of the string in the format string. Extend them to handle repeated items with a count. How does this interact with the encoding of strings?

### Exercise 9-3

The table of function pointers in the C program above is at the heart of C++'s virtual function mechanism. Rewrite `pack` and `unpack` and `receive` in C++ to take advantage of this notational convenience.

### Exercise 9-4

Write a command-line version of `printf` that prints its second and subsequent arguments in the format given by its first argument. Some shells already provide this as a built-in.

### Exercise 9-5

Write a function that implements the format specifications found in spreadsheet programs or in Java's `DecimalFormat` class, which display numbers according to patterns that indicate mandatory and optional digits, location of decimal points and commas, and so on.
To illustrate, the format

```text
##,##0.00
```

specifies a number with two decimal places, at least one digit to the left of the decimal point, a comma after the thousands digit, and blank-filling up to the ten-thousands place.
It would represent 12345.67 as 12,345.67 and .4 as \_\_\_\_0.40 (using underscores to stand for blanks).
For a full specification, look at the definition of `DecimalFormat` or a spreadsheet program.

## 9.2 Regular Expressions

Regular expressions are another powerful notation that allows us to express patterns of text that we want to match.

### Exercise 9-6

How does the performance of `match` compare to `strstr` when searching for plain text?

### Exercise 9-7

Write a non-recursive version of `matchhere` and compare its performance to the recursive version.

### Exercise 9-8

Add some options to `grep`. Popular ones include `-v` to invert the sense of the match, `-i` to do case-insensitive matching of alphabetics, and `-n` to include line numbers in the output. How should the line numbers be printed? Should they be printed on the same line as the matching text?

### Exercise 9-9

Add the `+` (one or more) and `?` (zero or one) operators to `match`. The pattern `a+bb?` matches one or more a's followed by one or two b's.

### Exercise 9-10

The current implementation of `match` turns off the special meaning of `^` and `$` if they don't begin or end the expression, and of `*` if it doesn't immediately follow a literal character or a period. A more conventional design is to quote a metacharacter by preceding it with a backslash. Fix `match` to handle backslashes this way.

### Exercise 9-11

Add character classes to `match`. Character classes specify a match for any one of the characters in the brackets. They can be made more convenient by adding ranges, for example `[a-z]` to match any lower-case letter, and inverting the sense, for example `[^0-9]` to match any character except a digit.

### Exercise 9-12

Change `match` to use the leftmost-longest version of `matchstar`, and modify it to return the character positions of the beginning and end of the matched text.
Use that to build a program `gres` that is like `grep` but prints every input line after substituting new text for text that matches the pattern, as in `gres 'homoiousian' 'homoousian' mission.stmt`.

## 9.3 Programmable Tools

By combining notations and programming tools we can create other tools and workflows that do specialised jobs for us.

## 9.4 Interpreters, Compilers and Virtual Machines

The notation of a machine is the machine language.
This means that whatever notation we choose to write our program in, it has to be transformed into machine language before the computer can execute it.

That can happen in a few ways:

- via a compiler - a tool that compiles our program, i.e. turns it into an executable
- via an interpreter - a tool that interprets our program line by line
- via a virtual machine - an intermediary level between the machine and our code that executes our program

All of the approaches have pros and cons, which we need to consider before picking the one appropriate for our job.

Usually, compiled code is faster than interpreted code, but compilation takes time, and a compiled program runs only on the CPU architecture and OS for which it was compiled.
If we want to run it on another CPU/OS we need to recompile it.

Interpreters allow us to run the same code on every machine that has the interpreter, but interpreted code can be slower than compiled code.

Virtual machines (like the Java Virtual Machine) are a good middle ground between the two, but they add additional overhead to the system.

## 9.5 Programs that Write Programs

We can write programs that write programs (this is actually what compilers are).

Or we can create our own DSL and write a program that writes code based on instructions written in this DSL.

### Exercise 9-15

One of the old chestnuts of computing is to write a program that when executed will reproduce itself exactly, in source form.
This is a neat special case of a program that writes a program.
Give it a try in some of your favorite languages.

## 9.6 Using Macros to Generate Code

We need to be careful not to overuse macros in C, but they can sometimes be useful for generating code.

For example, this macro runs a piece of code `n` times and prints the elapsed time.

```c
/* t0 (clock_t), i and n must be declared by the surrounding code */
#define LOOP(CODE) { \
    t0 = clock(); \
    for (i = 0; i < n; i++) { CODE; } \
    printf("%7ld", (long) (clock() - t0)); \
}
```

and can be used like this:

```c
LOOP(f1 = f2)
LOOP(f1 = f2 + f3)
LOOP(f1 = f2 - f3)
```

### Exercise 9-16

Exercise 7-7 involved writing a program to measure the cost of various operations in C++.
Use the ideas of this section to create another version of the program.

### Exercise 9-17

Exercise 7-8 involved doing a cost model for Java, which has no macro capability. Solve the problem by writing another program, in whatever language (or languages) you choose, that writes the Java version and automates the timing runs.

## 9.7 Compilation on the Fly

Also called just-in-time (JIT) compilation.

This can be useful and fast, because it saves compilation time by compiling only the parts that are certain to run.
It can also apply optimizations based on what is known at run time, like omitting division-by-zero checks when it is certain that the divisor is not 0.

### Exercise 9-18

The on-the-fly compiler generates faster code if it can replace expressions that contain only constants, such as `max(3*3, 4/2)`, by their value.
Once it has recognized such an expression, how should it compute its value?

### Exercise 9-19

How would you test an on-the-fly compiler?

## Supplementary Reading

- _The Unix Programming Environment_ by Brian Kernighan and Rob Pike
- _TEX: The Program_ by Don Knuth
- _A Retargetable C Compiler: Design and Implementation_ by Chris Fraser and David Hanson
- _The Java Virtual Machine Specification_ by Tim Lindholm and Frank Yellin
- _Communications of the ACM_
- _Mastering Regular Expressions_ by Jeffrey E.F. Friedl
- _Hardware/Software Tradeoffs for Bitmap Graphics on the Blit_ in _Software - Practice and Experience_ (magazine)

--------------------------------------------------------------------------------