for more details.)\n",
54 | "\n",
55 | "| Offset | Size (in bytes) | Description |\n",
56 | "|--------|------------------------------|--------------------------------------------------------------------|\n",
57 | "| 0 | 1 | first byte of signature, must be 0x42 (the ASCII character 'B') |\n",
58 | "| 1 | 1 | second byte of signature, must be 0x4D (the ASCII character 'M') |\n",
59 | "| 2 | 4 | size of the BMP file in bytes (unreliable, ignored) |\n",
60 | "| 6 | 2 | Must be zero |\n",
61 | "| 8 | 2 | Must be zero |\n",
62 | "| 10 | 4 | Must be the value 54 (i.e. 0x00000036) |\n",
63 | "| 14 | 4 | Must be the value 40 (i.e. 0x00000028) |\n",
64 | "| 18 | 4 | *Width* (image width in pixels, as signed integer) |\n",
65 | "| 22 | 4 | *Height* (image height in pixels, as signed integer) |\n",
66 | "| 26 | 2 | Must be one |\n",
67 | "| 28 | 2 | Number of bits per pixel (must be 32) |\n",
68 | "| 30 | 4 | Compression type (must be 0 = no compression) |\n",
69 | "| 34 | 4 | Size of image data in bytes (must be 4\\**Width*\\**Height*) |\n",
70 | "| 38 | 4 | unreliable (ignored) |\n",
71 | "| 42 | 4 | unreliable (ignored) |\n",
72 | "| 46 | 4 | Must be zero |\n",
73 | "| 50 | 4 | Must be zero |\n",
74 | "| 54 | 4\\**Width*\\**Height* | Pixel data, laid out in rows |\n",
75 | "\n",
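"The table can also be read as a checklist. The following minimal Python sketch is not part of the original exercise: the helper name `is_valid_header` is hypothetical, fields are assumed to be little-endian as in standard BMP files, and the image-size field is assumed to be unsigned. It checks only the 54-byte header, not the pixel data.\n",
"\n",
"```python\n",
"import struct\n",
"\n",
"# Little-endian layout of the 54-byte header, one format code per table row\n",
"# (the two signature bytes share the single '2s' code).\n",
"HEADER = struct.Struct('<2sIHHIIiiHHIIIIII')\n",
"\n",
"\n",
"def is_valid_header(header):\n",
"    # Hypothetical helper: apply every 'must be' rule from the table.\n",
"    if len(header) != HEADER.size:\n",
"        return False\n",
"    (sig, _file_size, res1, res2, offset, info_size, width, height,\n",
"     planes, bpp, compression, image_size, _ign1, _ign2, zero1, zero2,\n",
"     ) = HEADER.unpack(header)\n",
"    return (sig == b'BM' and res1 == 0 and res2 == 0\n",
"            and offset == 54 and info_size == 40\n",
"            and planes == 1 and bpp == 32 and compression == 0\n",
"            and image_size == 4 * width * height\n",
"            and zero1 == 0 and zero2 == 0)\n",
"\n",
"\n",
"# A 2x2 image: 4 * 2 * 2 = 16 bytes of pixel data would follow this header.\n",
"good = HEADER.pack(b'BM', 0, 0, 0, 54, 40, 2, 2, 1, 32, 0, 16, 0, 0, 0, 0)\n",
"bad = b'X' + good[1:]  # corrupt the first signature byte\n",
"print(is_valid_header(good), is_valid_header(bad))\n",
"```\n",
"\n",
"The fields marked as ignored in the table are unpacked but never checked, which mirrors the assumption that parsing code does not rely on them.\n",
"\n",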
76 | "The first byte (offset 0) of a valid BMP file is the character 'B'; the\n",
77 | "second byte (offset 1) is the character 'M'. The 3rd to 6th bytes\n",
78 | "(offsets 2 to 5 inclusive) indicate the total length of the BMP file but\n",
79 | "are unreliable in practice and so let us assume that they are ignored by\n",
80 | "all BMP parsing code. The 7th and 8th bytes (offsets 6 and 7) are\n",
81 | "interpreted as a 2-byte integer that must be zero, i.e. each of these\n",
82 | "bytes must be zero. The same is true for the 9th and 10th bytes (offsets\n",
83 | "8 and 9), and so on."
84 | ]
85 | },
86 | {
87 | "cell_type": "markdown",
88 | "id": "cfb0a548",
89 | "metadata": {
90 | "id": "59164571"
91 | },
92 | "source": [
93 | "## Your Tasks\n",
94 | "\n",
95 | "\n",
96 | "### Question 1\n",
97 | "Imagine you are choosing a value for each of the fields in the table\n",
98 | "above *in order*, i.e. you first choose a value for the first byte of\n",
99 | "the file, then choose a value for the second byte of the file, then for\n",
100 | "the following 4 bytes, and so on. For each field, identify the total number\n",
101 | "of valid values there are to choose from, assuming you have already\n",
102 | "chosen values for all fields that have come before.\n",
103 | "\n",
104 | "### Question 2\n",
105 | "The BMP header (i.e. everything excluding the pixel data) as described\n",
106 | "above has a fixed length of 54 bytes. Using the answer from the previous\n",
107 | "question or otherwise, what is the probability that a (uniformly)\n",
108 | "randomly generated string of 54 bytes is a valid BMP header?\n",
109 | "\n",
110 | "### Question 3\n",
111 | "Suppose you have a valid 54-byte header and you mutate an arbitrary\n",
112 | "(uniformly randomly chosen) byte in the header to a new value (different\n",
113 | "from its original value). What is the probability of producing a valid\n",
114 | "header?\n",
115 | "\n",
116 | "### Question 4\n",
117 | "Imagine you had to write a fuzzer to fuzz some BMP processing code that\n",
118 | "processed BMP files of the format described above. If you had to choose\n",
119 | "between generating completely random inputs vs. using random mutation on\n",
120 | "existing BMP files, which strategy would you choose?\n"
121 | ]
122 | }
123 | ],
124 | "metadata": {
125 | "colab": {
126 | "include_colab_link": true,
127 | "name": "SWEN90006_Tutorial_06.ipynb",
128 | "provenance": []
129 | },
130 | "kernelspec": {
131 | "display_name": "Java",
132 | "language": "java",
133 | "name": "java"
134 | },
135 | "language_info": {
136 | "codemirror_mode": "java",
137 | "file_extension": ".jshell",
138 | "mimetype": "text/x-java-source",
139 | "name": "Java",
140 | "pygments_lexer": "java",
141 | "version": "10.0.2+13"
142 | }
143 | },
144 | "nbformat": 4,
145 | "nbformat_minor": 5
146 | }
147 |
--------------------------------------------------------------------------------
/SWEN90006_Tutorial_8.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "a2f8b49d",
6 | "metadata": {
7 | "id": "9f6077ed"
8 | },
9 | "source": [
10 | "# SWEN90006 Tutorial 8"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "id": "849ea99f",
16 | "metadata": {
17 | "id": "97e7daf0"
18 | },
19 | "source": [
20 | "## Introduction\n",
21 | "\n",
22 | "The aim of this tutorial is for you to familiarise yourself with code coverage-guided fuzzing tools like AFL. In the class exercise last week (Week 8), we used generation-based/model-based black-box fuzzing tools like Peach fuzzer to generate inputs to trigger the faults in two versions of the read_and_process program. In this tutorial, you will use AFL to do the same task. "
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "id": "9a3f3550",
28 | "metadata": {
29 | "id": "46680c7b"
30 | },
31 | "source": [
32 | "## Working with programs\n",
33 | "\n",
34 | "Follow the instructions at https://github.com/SWEN90006-2021/security-testing to set up a Docker image and Docker container.\n",
35 | "\n",
36 | "### First program: read_and_process_v1.c\n",
37 | "\n",
38 | "To trigger the fault, called FAULT-1, in this simpler version of read_and_process, AFL needs to generate an input file that satisfies the following conditions:\n",
39 | "-\tThe file should adhere to the structure of the expected file format. Recall that a valid file starts with a 4-byte \"signature\", followed by a list of chunks and each chunk has three parts: i) a chunk type stored in 4 bytes, ii) a 4-byte data length, and iii) the chunk data.\n",
40 | "-\tThe list of chunks should have at least one data chunk of the \"BOOM\" type.\n",
41 | "\n",
42 | "### Your tasks\n",
43 | "\n",
44 | "Compile read_and_process_v1.c with afl-clang-fast so that AFL can be used to fuzz test the generated binary with code coverage feedback enabled.\n",
45 | "\n",
46 | "```bash\n",
47 | "afl-clang-fast -g -o read_and_process_v1 read_and_process_v1.c\n",
48 | "```\n",
49 | "Run AFL to fuzz test the program in four settings. If your computer has enough CPU cores (>= 4 cores), you can start four Docker containers and run the four settings at the same time to speed up the experiments. You can stop AFL by pressing \"Ctrl + C\". You can also set a timeout for each command by using the timeout command (see https://linuxize.com/post/timeout-command-in-linux/).\n",
50 | "\n",
51 | "#### Setting-1\n",
52 | "Use only a random file, containing a number like \"1234\", as seed input.\n",
53 | "\n",
54 | "```bash\n",
55 | "mkdir corpus-random\n",
56 | "echo \"1234\" > corpus-random/random_file\n",
57 | "afl-fuzz -d -i corpus-random -o out-setting-1 ./read_and_process_v1 @@\n",
58 | "```\n",
59 | "\n",
60 | "If you want to run afl-fuzz for 5 minutes, you can use this command:\n",
61 | "\n",
62 | "```bash\n",
63 | "timeout 5m afl-fuzz -d -i corpus-random -o out-setting-1 ./read_and_process_v1 @@\n",
64 | "```\n",
65 | "#### Setting-2\n",
66 | "Use the same seed corpus as in Setting-1, together with a so-called fuzzing dictionary. A fuzzing dictionary contains predefined tokens that can be randomly inserted into the generated input.\n",
67 | "\n",
68 | "Create a file named fuzz.dict that contains three tokens as below. Inside the Docker container, you can use simple text editors like vim or nano to create this file.\n",
69 | "\n",
70 | "```\n",
71 | "\"ABCD\"\n",
72 | "\"BOOM\"\n",
73 | "\"CHUK\"\n",
74 | "```\n",
75 | "\n",
76 | "And run the following fuzzing command. The \"-x\" option is used to specify a fuzzing dictionary.\n",
77 | "\n",
78 | "```bash\n",
79 | "afl-fuzz -d -x fuzz.dict -i corpus-random -o out-setting-2 ./read_and_process_v1 @@\n",
80 | "```\n",
81 | "\n",
82 | "#### Setting-3\n",
83 | "Use a seed corpus containing a valid input file that adheres to the specified format. The \"printf\" command is used to generate a valid file that has a single chunk of type \"CHUK\" and the chunk's data consists of four bytes. Note that all numbers are stored in little-endian order so \"\\x04\\x00\\x00\\x00\" is 0x04 in hexadecimal and 4 in decimal.\n",
84 | "\n",
85 | "```bash\n",
86 | "mkdir corpus-valid\n",
87 | "printf \"ABCDCHUK\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00\" > corpus-valid/valid_file\n",
88 | "afl-fuzz -d -i corpus-valid -o out-setting-3 ./read_and_process_v1 @@\n",
89 | "```\n",
90 | "\n",
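"The seed file can also be built programmatically, which makes the little-endian length field explicit. This is a minimal Python sketch under the format described above; the helper name `make_chunk` is hypothetical and not part of the tutorial's tooling.\n",
"\n",
"```python\n",
"import struct\n",
"\n",
"\n",
"def make_chunk(chunk_type, data):\n",
"    # 4-byte type, then the data length as a 4-byte little-endian integer,\n",
"    # then the data itself.\n",
"    return chunk_type + struct.pack('<I', len(data)) + data\n",
"\n",
"\n",
"# Same bytes as the printf command: the signature, then one CHUK chunk\n",
"# whose data is four zero bytes.\n",
"seed = b'ABCD' + make_chunk(b'CHUK', bytes(4))\n",
"print(seed)\n",
"```\n",
"\n",
"Writing `seed` to corpus-valid/valid_file in binary mode should give the same file as the printf command.\n",
"\n",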
91 | "#### Setting-4\n",
92 | "Use the same seed corpus as in Setting-3 together with the fuzzing dictionary from Setting-2.\n",
93 | "\n",
94 | "```bash\n",
95 | "afl-fuzz -d -x fuzz.dict -i corpus-valid -o out-setting-4 ./read_and_process_v1 @@\n",
96 | "```\n",
97 | "\n",
98 | "Discuss the results in the four settings with a focus on the pros and cons of each setting. The crash-triggering inputs can be found in the out-setting-*/crashes folder, while normal inputs can be found in the out-setting-*/queue folder.\n",
99 | "\n",
100 | "### Second program: read_and_process_v2.c\n",
101 | "\n",
102 | "To trigger the fault in this version of read_and_process, in addition to FAULT-1's conditions, AFL needs to generate an input file that satisfies the following conditions:\n",
103 | "-\tThe data of the BOOM chunk contains two 4-byte integer numbers named x and y.\n",
104 | "-\tThe values of x and y must satisfy this constraint: (x + y > 283) && (x + y < 286)\n",
105 | "\n",
106 | "### Your tasks:\n",
107 | "\n",
108 | "Compile read_and_process_v2.c with afl-clang-fast.\n",
109 | "\n",
110 | "```bash\n",
111 | "afl-clang-fast -g -o read_and_process_v2 read_and_process_v2.c\n",
112 | "```\n",
113 | "\n",
114 | "#### Setting-5\n",
115 | "Use a seed corpus containing a valid input file that adheres to the specified format.\n",
116 | "\n",
117 | "```bash\n",
118 | "afl-fuzz -d -i corpus-valid -o out-setting-5 ./read_and_process_v2 @@\n",
119 | "```\n",
120 | "\n",
121 | "#### Setting-6\n",
122 | "Use the same seed corpus as in Setting-5 together with a fuzzing dictionary.\n",
123 | "\n",
124 | "```bash\n",
125 | "afl-fuzz -d -x fuzz.dict -i corpus-valid -o out-setting-6 ./read_and_process_v2 @@\n",
126 | "```\n",
127 | "\n",
128 | "Discuss the results in the two settings with a focus on the pros and cons of each setting. The crash-triggering inputs can be found in the out-setting-*/crashes folder, while normal inputs can be found in the out-setting-*/queue folder."
129 | ]
130 | },
131 | {
132 | "cell_type": "code",
133 | "execution_count": null,
134 | "id": "2ef001ee",
135 | "metadata": {},
136 | "outputs": [],
137 | "source": []
138 | }
139 | ],
140 | "metadata": {
141 | "colab": {
142 | "include_colab_link": true,
143 | "name": "SWEN90006_Tutorial_06.ipynb",
144 | "provenance": []
145 | },
146 | "kernelspec": {
147 | "display_name": "Java",
148 | "language": "java",
149 | "name": "java"
150 | },
151 | "language_info": {
152 | "codemirror_mode": "java",
153 | "file_extension": ".jshell",
154 | "mimetype": "text/x-java-source",
155 | "name": "Java",
156 | "pygments_lexer": "java",
157 | "version": "10.0.2+13"
158 | }
159 | },
160 | "nbformat": 4,
161 | "nbformat_minor": 5
162 | }
163 |
--------------------------------------------------------------------------------
/SWEN90006_Tutorial_8_2024.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "a2f8b49d",
6 | "metadata": {
7 | "id": "9f6077ed"
8 | },
9 | "source": [
10 | "# SWEN90006 Tutorial 8"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "id": "849ea99f",
16 | "metadata": {
17 | "id": "97e7daf0"
18 | },
19 | "source": [
20 | "## Introduction\n",
21 | "\n",
22 | "The aim of this tutorial is for you to familiarise yourself with generation-based and mutation-based fuzzing. In today's session, we will first reflect a bit more on mutation-based fuzzing using last week's example. Then we will use generation-based/model-based black-box fuzzing tools like Peach fuzzer to generate inputs to trigger the faults in two versions of the read_and_process program."
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "id": "9a3f3550",
28 | "metadata": {
29 | "id": "46680c7b"
30 | },
31 | "source": [
32 | "## Mutation-based fuzzing (cont'd)\n",
33 | "\n",
34 | "Please revisit the BMP header exercise, review the concept of mutation-based fuzzing, and answer Question 3 and Question 4 from last week again.\n",
35 | "\n",
36 | "### Question 3\n",
37 | "Suppose you have a valid 54-byte header and you mutate an arbitrary\n",
38 | "(uniformly randomly chosen) byte in the header to a new value (different\n",
39 | "from its original value). What is the probability of producing a valid\n",
40 | "header?\n",
41 | "\n",
42 | "### Question 4\n",
43 | "Imagine you had to write a fuzzer to fuzz some BMP processing code that\n",
44 | "can process BMP files of the format described above. If you had to choose\n",
45 | "between generating completely random inputs vs. performing random mutation \n",
46 | "on existing (valid) BMP files, which strategy would you choose?\n"
47 | ]
48 | },
49 | {
50 | "cell_type": "markdown",
51 | "id": "2ef001ee",
52 | "metadata": {},
53 | "source": [
54 | "\n",
55 | "## Generation-based fuzzing\n",
56 | "\n",
57 | "\n",
58 | "### Building a Docker Container\n",
59 | "\n",
60 | "We provide a Docker container that has all the tools introduced in the lecture. This container will be used to help you understand fuzzing better, reproduce the demos from the lecture, and carry out fuzzing experiments in tutorials for the next few weeks.\n",
61 | "\n",
62 | "Follow the instructions at https://github.com/SWEN90006-2021/security-testing to set up a Docker image and Docker container.\n",
63 | "\n",
64 | "\n",
65 | "### Week 8 in-class exercise\n",
66 | "\n",
67 | "This part of the instructions is the same as `Week 8 in-class exercise: generation-based fuzzing` on Canvas.\n",
68 | "\n",
69 | "We will look at two exercises in which you are asked to apply these fuzzing techniques to discover the faults in two versions of a program named `read_and_process.c` (stored in `read_and_process.zip`). This program mimics some functionalities of media processing libraries like LibPNG. The program takes a file as input and the file is expected to adhere to a specific format. A valid file starts with a 4-byte \"signature\". After that, the file contains a list of chunks and each chunk has three parts: i) a chunk type stored in 4 bytes, ii) a 4-byte data length, and iii) the chunk data.\n",
70 | "\n",
71 | "The image below, taken from the lecture slides, illustrates the file format:\n",
72 | "\n",
73 | "\n",
74 | "\n",
75 | "### Question 1: Manual analysis\n",
76 | "\n",
77 | "What is the fault in [read_and_process_v1.c](https://github.com/SWEN90006-2021/security-testing/blob/main/read_and_process_v1.c)? What are the conditions to trigger the fault?\n",
78 | "\n",
79 | "### Question 2: Write an input model for the given file format \n",
80 | "\n",
81 | "Create a new file named `input_model.xml`.\n",
82 | "\n",
83 | "Hint: the input model is also on the lecture slides\n",
84 | "\n",
85 | "### Question 3: Generation-based fuzzing\n",
86 | "Use the input model with generation-based fuzzing (`generation_fuzzer.sh`) to automatically generate an input that triggers the fault. You will need to update the fuzzer scripts to capture SIGABRT (return code = 134) instead of SIGSEGV (return code = 139).\n",
87 | "\n",
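"The two return codes mentioned above follow the common shell convention of reporting 128 plus the signal number when a process is terminated by a signal. A small Python sketch, assuming a Linux system where the abort signal is number 6 and the segmentation-fault signal is number 11 (the function name `shell_exit_code` is hypothetical):\n",
"\n",
"```python\n",
"import signal\n",
"\n",
"\n",
"def shell_exit_code(sig):\n",
"    # Shells report 128 + N when a child process is terminated by signal N.\n",
"    return 128 + int(sig)\n",
"\n",
"\n",
"print(shell_exit_code(signal.SIGABRT), shell_exit_code(signal.SIGSEGV))\n",
"```\n",
"\n",
"This is why a script that currently compares the return code against 139 needs to compare against 134 to detect an abort.\n",
"\n",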
88 | "\n",
89 | "\n",
90 | "```shell\n",
91 | "# First, compile the buggy program read_and_process_v1.c\n",
92 | "cd $WORKDIR\n",
93 | "gcc -o read_and_process_v1 read_and_process_v1.c\n",
94 | "\n",
95 | "# Next, run the generation-based fuzzer to fuzz the read_and_process_v1 program\n",
96 | "generation_fuzzer.sh ./read_and_process_v1 input_model.xml 20 results-no-seeds\n",
97 | "```\n",
98 | "\n",
99 | "\n",
100 | "### Question 4: Work on another program using all steps from questions 1 - 3\n",
101 | "What is the fault in [read_and_process_v2.c](https://github.com/SWEN90006-2021/security-testing/blob/main/read_and_process_v2.c)? \n",
102 | "What are the conditions to trigger that fault? \n",
103 | "Does the input model written in Question 2 help to discover that fault? If not, discuss ideas for fuzz testing this program.\n"
104 | ]
105 | }
106 | ],
107 | "metadata": {
108 | "colab": {
109 | "include_colab_link": true,
110 | "name": "SWEN90006_Tutorial_06.ipynb",
111 | "provenance": []
112 | },
113 | "kernelspec": {
114 | "display_name": "Java",
115 | "language": "java",
116 | "name": "java"
117 | },
118 | "language_info": {
119 | "codemirror_mode": "java",
120 | "file_extension": ".jshell",
121 | "mimetype": "text/x-java-source",
122 | "name": "Java",
123 | "pygments_lexer": "java",
124 | "version": "10.0.2+13"
125 | }
126 | },
127 | "nbformat": 4,
128 | "nbformat_minor": 5
129 | }
130 |
--------------------------------------------------------------------------------
/SWEN90006_Tutorial_9.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "3adcba9a",
6 | "metadata": {
7 | "id": "9f6077ed"
8 | },
9 | "source": [
10 | "# SWEN90006 Tutorial 9"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "id": "960f7aae",
16 | "metadata": {
17 | "id": "97e7daf0"
18 | },
19 | "source": [
20 | "## Introduction\n",
21 | "\n",
22 | "The aim of this tutorial is to give you more experience with symbolic\n",
23 | "execution and how it can be used to verify software."
24 | ]
25 | },
26 | {
27 | "cell_type": "markdown",
28 | "id": "e9c224d8",
29 | "metadata": {
30 | "id": "46680c7b"
31 | },
32 | "source": [
33 | "## Symbolic Execution\n",
34 | "\n",
35 | "Recall from the lecture that the aim of symbolic execution is to, in\n",
36 | "effect, execute *multiple* inputs at the same time -- possibly an\n",
37 | "infinite number. It does this by using symbolic values to represent\n",
38 | "inputs, instead of concrete values.\n",
39 | "\n",
40 | "A complete symbolic execution of a program produces a *symbolic\n",
41 | "execution tree*. Each path through the tree represents a single path\n",
42 | "through the program, including all inputs on that path. Each node along\n",
43 | "a path represents the symbolic values of variables at that point in the\n",
44 | "execution.\n",
45 | "\n",
46 | "\n",
47 | "## Tasks and Questions\n",
48 | "\n",
49 | "### Question 1\n",
50 | "\n",
51 | "Figure C.3 shows an implementation of a program that\n",
52 | "returns the minimum of two integers. Draw the complete symbolic\n",
53 | "execution tree for this program. In your execution tree, use the\n",
54 | "variables `X_0` and `Y_0` to represent the initial symbolic values of\n",
55 | "`x` and `y` respectively.\n",
56 | "\n",
57 | "At the return statement at line 10, assume the existence of a symbolic\n",
58 | "variable `RET`, to which the return value is assigned when a return\n",
59 | "statement is executed.\n",
60 | "\n",
61 | "```c\n",
62 | " 1. int min(int x, int y)\n",
63 | " 2. {\n",
64 | " 3. int minimum = 0;\n",
65 | " 4. if (x > y) {\n",
66 | " 5. minimum = y;\n",
67 | " 6. }\n",
68 | " 7. else if (y > x) {\n",
69 | " 8. minimum = x;\n",
70 | " 9. }\n",
71 | " 10. return minimum;\n",
72 | " 11. }\n",
73 | "```\n",
74 | "\n",
75 | "Figure C.3: An implementation of the Min function\n",
76 | "\n",
77 | "### Question 2\n",
78 | "Does the program in Figure C.3 ensure the postcondition that the number\n",
79 | "returned is always the minimum; i.e. that `RET` is the minimum number of\n",
80 | "`X_0` and `Y_0`?"
81 | ]
82 | },
83 | {
84 | "cell_type": "markdown",
85 | "id": "c45d2585",
86 | "metadata": {},
87 | "source": []
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": null,
92 | "id": "63604442",
93 | "metadata": {},
94 | "outputs": [],
95 | "source": []
96 | }
97 | ],
98 | "metadata": {
99 | "colab": {
100 | "include_colab_link": true,
101 | "name": "SWEN90006_Tutorial_06.ipynb",
102 | "provenance": []
103 | },
104 | "kernelspec": {
105 | "display_name": "Java",
106 | "language": "java",
107 | "name": "java"
108 | },
109 | "language_info": {
110 | "codemirror_mode": "java",
111 | "file_extension": ".jshell",
112 | "mimetype": "text/x-java-source",
113 | "name": "Java",
114 | "pygments_lexer": "java",
115 | "version": "10.0.2+13"
116 | }
117 | },
118 | "nbformat": 4,
119 | "nbformat_minor": 5
120 | }
121 |
--------------------------------------------------------------------------------
/SWEN90006_Tutorial_9_2024.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "3adcba9a",
6 | "metadata": {
7 | "id": "9f6077ed"
8 | },
9 | "source": [
10 | "# SWEN90006 Tutorial 9"
11 | ]
12 | },
13 | {
14 | "cell_type": "markdown",
15 | "id": "960f7aae",
16 | "metadata": {
17 | "id": "97e7daf0"
18 | },
19 | "source": [
20 | "## Introduction\n",
21 | "\n",
22 | "The aim of this tutorial is for you to familiarise yourself with code coverage-guided fuzzing tools like AFL. In the class exercise last week (Week 8), we used generation-based/model-based black-box fuzzing tools like Peach fuzzer to generate inputs to trigger the faults in two versions of the read_and_process program. In this tutorial, you will use AFL to do the same task. "
23 | ]
24 | },
25 | {
26 | "cell_type": "markdown",
27 | "id": "e9c224d8",
28 | "metadata": {
29 | "id": "46680c7b"
30 | },
31 | "source": [
32 | "## Working with programs\n",
33 | "\n",
34 | "We will use the same Docker image that we built in Tutorial 8. You can use this command to list all Docker containers, including stopped ones:\n",
35 | "\n",
36 | "```bash\n",
37 | "docker ps -a\n",
38 | "```\n",
39 | "\n",
40 | "You should see the container that you used in the Week 8 tutorial:\n",
41 | "\n",
42 | "| CONTAINER ID | IMAGE | COMMAND | CREATED | STATUS | PORTS | NAMES |\n",
43 | "|--------------|-----------------|-------------|-------------|-------------------------|-------|-----------------|\n",
44 | "| SOME_ID | swen90006 | \"/bin/bash\" | 7 days ago | Exited / Up | | SOME_NAME |\n",
45 | "\n",
46 | "Then you can start the docker container again without creating a new one by:\n",
47 | "\n",
48 | "```bash\n",
49 | "# start the container, if the status is not up:\n",
50 | "docker start SOME_ID\n",
51 | "# enter the container's shell:\n",
52 | "docker exec -it SOME_ID /bin/bash\n",
53 | "```\n",
54 | "\n",
55 | "If you have not built the Docker image yet, please follow the instructions at https://github.com/SWEN90006-2021/security-testing to set up a Docker image and Docker container.\n",
56 | "\n",
57 | "### First program: read_and_process_v1.c\n",
58 | "\n",
59 | "To trigger the fault, called FAULT-1, in this simpler version of read_and_process, AFL needs to generate an input file that satisfies the following conditions:\n",
60 | "-\tThe file should adhere to the structure of the expected file format. Recall that a valid file starts with a 4-byte \"signature\", followed by a list of chunks and each chunk has three parts: i) a chunk type stored in 4 bytes, ii) a 4-byte data length, and iii) the chunk data.\n",
61 | "-\tThe list of chunks should have at least one data chunk of the \"BOOM\" type.\n",
62 | "\n",
63 | "### Your tasks\n",
64 | "\n",
65 | "Compile read_and_process_v1.c with afl-clang-fast so that AFL can be used to fuzz test the generated binary with code coverage feedback enabled.\n",
66 | "\n",
67 | "```bash\n",
68 | "afl-clang-fast -g -o read_and_process_v1 read_and_process_v1.c\n",
69 | "```\n",
70 | "Run AFL to fuzz test the program in four settings. If your computer has enough CPU cores (>= 4 cores), you can start four Docker containers and run the four settings at the same time to speed up the experiments. You can stop AFL by pressing \"Ctrl + C\". You can also set a timeout for each command by using the timeout command (see https://linuxize.com/post/timeout-command-in-linux/).\n",
71 | "\n",
72 | "#### Setting-1\n",
73 | "Use only a random file, containing a number like \"1234\", as seed input.\n",
74 | "\n",
75 | "```bash\n",
76 | "mkdir corpus-random\n",
77 | "echo \"1234\" > corpus-random/random_file\n",
78 | "afl-fuzz -d -i corpus-random -o out-setting-1 ./read_and_process_v1 @@\n",
79 | "```\n",
80 | "\n",
81 | "If you want to run afl-fuzz for 5 minutes, you can use this command:\n",
82 | "\n",
83 | "```bash\n",
84 | "timeout 5m afl-fuzz -d -i corpus-random -o out-setting-1 ./read_and_process_v1 @@\n",
85 | "```\n",
86 | "#### Setting-2\n",
87 | "Use the same seed corpus as in Setting-1, together with a so-called fuzzing dictionary. A fuzzing dictionary contains predefined tokens that can be randomly inserted into the generated input.\n",
88 | "\n",
89 | "Create a file named fuzz.dict that contains three tokens as below. Inside the Docker container, you can use simple text editors like vim or nano to create this file.\n",
90 | "\n",
91 | "```\n",
92 | "\"ABCD\"\n",
93 | "\"BOOM\"\n",
94 | "\"CHUK\"\n",
95 | "```\n",
96 | "\n",
97 | "And run the following fuzzing command. The \"-x\" option is used to specify a fuzzing dictionary.\n",
98 | "\n",
99 | "```bash\n",
100 | "afl-fuzz -d -x fuzz.dict -i corpus-random -o out-setting-2 ./read_and_process_v1 @@\n",
101 | "```\n",
102 | "\n",
103 | "#### Setting-3\n",
104 | "Use a seed corpus containing a valid input file that adheres to the specified format. The \"printf\" command is used to generate a valid file that has a single chunk of type \"CHUK\" and the chunk's data consists of four bytes. Note that all numbers are stored in little-endian order so \"\\x04\\x00\\x00\\x00\" is 0x04 in hexadecimal and 4 in decimal.\n",
105 | "\n",
106 | "```bash\n",
107 | "mkdir corpus-valid\n",
108 | "printf \"ABCDCHUK\\x04\\x00\\x00\\x00\\x00\\x00\\x00\\x00\" > corpus-valid/valid_file\n",
109 | "afl-fuzz -d -i corpus-valid -o out-setting-3 ./read_and_process_v1 @@\n",
110 | "```\n",
111 | "\n",
112 | "#### Setting-4\n",
113 | "Use the same seed corpus as in Setting-3 together with the fuzzing dictionary from Setting-2.\n",
114 | "\n",
115 | "```bash\n",
116 | "afl-fuzz -d -x fuzz.dict -i corpus-valid -o out-setting-4 ./read_and_process_v1 @@\n",
117 | "```\n",
118 | "\n",
119 | "Discuss the results in the four settings with a focus on the pros and cons of each setting. The crash-triggering inputs can be found in the out-setting-*/crashes folder, while normal inputs can be found in the out-setting-*/queue folder.\n",
120 | "\n",
121 | "### Second program: read_and_process_v2.c\n",
122 | "\n",
123 | "To trigger the fault in this version of read_and_process, in addition to FAULT-1's conditions, AFL needs to generate an input file that satisfies the following conditions:\n",
124 | "-\tThe data of the BOOM chunk contains two 4-byte integer numbers named x and y.\n",
125 | "-\tThe values of x and y must satisfy this constraint: (x + y > 283) && (x + y < 286)\n",
126 | "\n",
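"To make the two conditions concrete, here is a hedged Python sketch that builds one candidate input by hand. It assumes the `ABCD` signature used in the seed file, and that x and y are stored back to back as little-endian 4-byte integers at the start of the chunk data; the helper name `make_chunk` is hypothetical. Whether this exact file triggers the fault depends on details of read_and_process_v2.c that are not shown here.\n",
"\n",
"```python\n",
"import struct\n",
"\n",
"\n",
"def make_chunk(chunk_type, data):\n",
"    # 4-byte type, 4-byte little-endian data length, then the data.\n",
"    return chunk_type + struct.pack('<I', len(data)) + data\n",
"\n",
"\n",
"x, y = 200, 84  # x + y = 284, which satisfies 283 < x + y < 286\n",
"candidate = b'ABCD' + make_chunk(b'BOOM', struct.pack('<ii', x, y))\n",
"print(candidate.hex())\n",
"```\n",
"\n",
"Only the sums 284 and 285 satisfy the constraint, which is why purely random mutation of the chunk data is unlikely to hit it quickly.\n",
"\n",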
127 | "### Your tasks:\n",
128 | "\n",
129 | "Compile read_and_process_v2.c with afl-clang-fast.\n",
130 | "\n",
131 | "```bash\n",
132 | "afl-clang-fast -g -o read_and_process_v2 read_and_process_v2.c\n",
133 | "```\n",
134 | "\n",
135 | "#### Setting-5\n",
136 | "Use a seed corpus containing a valid input file that adheres to the specified format.\n",
137 | "\n",
138 | "```bash\n",
139 | "afl-fuzz -d -i corpus-valid -o out-setting-5 ./read_and_process_v2 @@\n",
140 | "```\n",
141 | "\n",
142 | "#### Setting-6\n",
143 | "Use the same seed corpus as in Setting-5 together with a fuzzing dictionary.\n",
144 | "\n",
145 | "```bash\n",
146 | "afl-fuzz -d -x fuzz.dict -i corpus-valid -o out-setting-6 ./read_and_process_v2 @@\n",
147 | "```\n",
148 | "\n",
149 | "### Discussion\n",
150 | "\n",
151 | "Discuss the results in the two settings with a focus on the pros and cons of each setting. The crash-triggering inputs can be found in the out-setting-*/crashes folder, while normal inputs can be found in the out-setting-*/queue folder."
152 | ]
153 | },
154 | {
155 | "cell_type": "markdown",
156 | "id": "c45d2585",
157 | "metadata": {},
158 | "source": []
159 | },
160 | {
161 | "cell_type": "code",
162 | "execution_count": null,
163 | "id": "63604442",
164 | "metadata": {
165 | "vscode": {
166 | "languageId": "java"
167 | }
168 | },
169 | "outputs": [],
170 | "source": []
171 | }
172 | ],
173 | "metadata": {
174 | "colab": {
175 | "include_colab_link": true,
176 | "name": "SWEN90006_Tutorial_06.ipynb",
177 | "provenance": []
178 | },
179 | "kernelspec": {
180 | "display_name": "Java",
181 | "language": "java",
182 | "name": "java"
183 | },
184 | "language_info": {
185 | "codemirror_mode": "java",
186 | "file_extension": ".jshell",
187 | "mimetype": "text/x-java-source",
188 | "name": "Java",
189 | "pygments_lexer": "java",
190 | "version": "10.0.2+13"
191 | }
192 | },
193 | "nbformat": 4,
194 | "nbformat_minor": 5
195 | }
196 |
--------------------------------------------------------------------------------
/figures/CFG-Tutorial-3.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/CFG-Tutorial-3.png
--------------------------------------------------------------------------------
/figures/CFG-Tutorial-4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/CFG-Tutorial-4.png
--------------------------------------------------------------------------------
/figures/CFG-Tutorial-5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/CFG-Tutorial-5.png
--------------------------------------------------------------------------------
/figures/CFG1_tut4.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/CFG1_tut4.png
--------------------------------------------------------------------------------
/figures/Graph-State-Diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/Graph-State-Diagram.png
--------------------------------------------------------------------------------
/figures/Input_structure_tut8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/Input_structure_tut8.png
--------------------------------------------------------------------------------
/figures/LWIG-TTT.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/LWIG-TTT.png
--------------------------------------------------------------------------------
/figures/Log-Graph.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/Log-Graph.png
--------------------------------------------------------------------------------
/figures/min-execution-tree.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SWEN90006/tutorials/e978ad3a613bfaffc622a515fe06c91bbef85095/figures/min-execution-tree.png
--------------------------------------------------------------------------------