├── labs ├── 10 │ └── README.md ├── 11 │ ├── test.ac │ ├── compiler.l │ ├── compiler.y │ └── README.md ├── 12 │ ├── print_vul.py │ ├── main.c │ ├── printf-secure.c │ ├── bench.c │ ├── mem_test.c │ ├── fprintf.c │ ├── bench_mem_test.c │ ├── Makefile │ ├── vuln.c │ └── README.md ├── 04 │ ├── lex_examples │ │ ├── example0.l │ │ ├── compiler.l │ │ ├── example1.l │ │ ├── example2.l │ │ ├── test.l │ │ ├── Makefile │ │ ├── example3.l │ │ ├── example3_linuxzoo.l │ │ └── example_input_file.l │ ├── calculator_v1 │ │ ├── Makefile │ │ ├── calc.l │ │ └── calc.y │ ├── calculator_v2 │ │ ├── Makefile │ │ ├── simplecalc.l │ │ └── simplecalc.y │ ├── calculator_v2_linux_zoo │ │ ├── Makefile │ │ ├── simplecalc.l │ │ ├── simplecalc.y │ │ └── README.md │ ├── calculator_v3 │ │ ├── Makefile │ │ ├── simplecalc.l │ │ └── simplecalc.y │ ├── yacc_examples │ │ ├── Makefile │ │ ├── ambi_calculator.l │ │ └── ambi_calculator.y │ └── README.md ├── 01 │ ├── asm-analitics.sh │ ├── Makefile │ ├── simple_foo.c │ ├── checker.sh │ └── README.md ├── 03 │ ├── word_count │ │ ├── wc │ │ └── wc.l │ ├── code_generator.py │ └── README.md ├── 05 │ ├── bottom-up │ │ ├── Makefile │ │ └── shift-reduce.c │ ├── recursive_descent_parser_v2 │ │ ├── grammar.conf │ │ ├── Makefile │ │ ├── log │ │ └── README.md │ ├── recursive_descent_parser │ │ ├── README.md │ │ └── solution_v1.c │ ├── analyze_cfg │ │ ├── gramar_generator.py │ │ └── README.md │ └── recursive_descent_parser_v3 │ │ └── README.md ├── 02 │ ├── checker.sh │ ├── sample_ok.c │ ├── sample_w_errors.c │ ├── hello.c │ ├── aliasing.c │ ├── parse_tree.py │ └── README.md ├── 09 │ ├── example.c │ ├── Makefile │ └── loop-test.c ├── 06 │ ├── yacc │ │ └── README.md │ ├── ast │ │ └── README.md │ └── README.md ├── 08 │ └── README.md └── 07 │ └── README.md ├── dockerimage ├── Makefile └── Dockerfile ├── final_project ├── f1 │ ├── data.py │ └── README.md ├── detection_loops │ └── README.md ├── emojis_compiler │ └── README.md └── smart_compiler │ └── README.md ├── .gitignore 
├── LICENSE └── README.md /labs/04/lex_examples/example0.l: -------------------------------------------------------------------------------- 1 | %% 2 | .|\n ECHO; 3 | %% 4 | -------------------------------------------------------------------------------- /labs/11/test.ac: -------------------------------------------------------------------------------- 1 | /hola 2 | f b 3 | i a 4 | a = a + 3.2 5 | p a 6 | -------------------------------------------------------------------------------- /dockerimage/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | docker build -t vmrod25/ubuntu_compilers . 3 | -------------------------------------------------------------------------------- /labs/01/asm-analitics.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bash 2 | 3 | echo 'hi, this is done :)' 4 | 5 | echo 'mistake' 6 | -------------------------------------------------------------------------------- /labs/03/word_count/wc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/VictorRodriguez/compilers-lecture/HEAD/labs/03/word_count/wc -------------------------------------------------------------------------------- /labs/05/bottom-up/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | gcc shift-reduce.c -o shift-reduce 3 | 4 | clean: 5 | rm -rf shift-reduce 6 | -------------------------------------------------------------------------------- /labs/01/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | gcc simple_foo.c -o simple_foo 3 | test: 4 | ./simple_foo 5 | clean: 6 | rm -rf simple_foo 7 | -------------------------------------------------------------------------------- /labs/12/print_vul.py: -------------------------------------------------------------------------------- 1 | import struct 2 | 3 | 
addrs=0x08049ddd 4 | 5 | print(("a"*32) + (struct.pack("i",addrs))) 6 | 7 | 8 | -------------------------------------------------------------------------------- /final_project/f1/data.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | 3 | df = pd.read_csv('results.csv') 4 | print(df[(df['points'] > 10) & (df["laps"] <= 40)]) 5 | -------------------------------------------------------------------------------- /labs/02/checker.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bash 2 | 3 | patch -p1 < test.patch 4 | make 5 | ./my_compiler sample_ok.c 6 | ./my_compiler sample_w_errors.c 7 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser_v2/grammar.conf: -------------------------------------------------------------------------------- 1 | ::= 2 | ::= +|ε 3 | ::= 4 | ::= *|ε 5 | ::= ()|id 6 | -------------------------------------------------------------------------------- /labs/02/sample_ok.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int main(){ 4 | for (int x = 0; x < 10;x++){ 5 | printf("hi\n"); 6 | } 7 | return 0; 8 | } 9 | -------------------------------------------------------------------------------- /labs/04/lex_examples/compiler.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %% 6 | stop printf("Stop command received\n"); 7 | start printf("Start command received\n"); 8 | %% 9 | -------------------------------------------------------------------------------- /labs/04/lex_examples/example1.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %% 6 | stop printf("Stop command received\n"); 7 | start printf("Start command received\n"); 8 | %% 9 | 
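The two rules in `example1.l` above can be read as "when the input contains `stop` or `start`, print a message; copy everything else through unchanged" (lex echoes unmatched text by default). What follows is a minimal hand-written sketch of that behavior in plain C, not the code lex actually generates. The function name `scan_commands` and the output buffer are hypothetical, introduced only so the result can be inspected; the real scanner is the generated `yylex()`, which writes to standard output.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of the repository): writes into `out` what
 * the example1.l scanner would be expected to print for `input`.
 * At most out_size - 1 characters are stored, always NUL-terminated. */
void scan_commands(const char *input, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return;
    out[0] = '\0';

    while (*input != '\0') {
        int written;

        if (strncmp(input, "start", 5) == 0) {
            /* Rule: start -> print message, consume the 5 matched chars */
            written = snprintf(out + used, out_size - used,
                               "Start command received\n");
            input += 5;
        } else if (strncmp(input, "stop", 4) == 0) {
            /* Rule: stop -> print message, consume the 4 matched chars */
            written = snprintf(out + used, out_size - used,
                               "Stop command received\n");
            input += 4;
        } else {
            /* Default lex behavior: echo the unmatched character */
            written = snprintf(out + used, out_size - used, "%c", *input);
            input += 1;
        }

        /* Stop when the output buffer is full */
        if (written < 0 || (size_t)written >= out_size - used)
            break;
        used += (size_t)written;
    }
}
```

Calling `scan_commands("go start", buf, sizeof buf)` from a `main` leaves `go ` followed by the start message in `buf`, which mirrors how the generated scanner passes unmatched text through and replaces the keywords.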
-------------------------------------------------------------------------------- /labs/04/lex_examples/example2.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %% 6 | [0123456789]+ printf("NUMBER\n"); 7 | [a-zA-Z][a-zA-Z0-9]* printf("WORD\n"); 8 | %% 9 | -------------------------------------------------------------------------------- /labs/02/sample_w_errors.c: -------------------------------------------------------------------------------- 1 | #include 2 | // coment (this should not be detected 3 | int main(){ 4 | for (int x = 0; x < 10;x++) 5 | printf("hi\n); 6 | } 7 | return 0; 8 | } 9 | -------------------------------------------------------------------------------- /labs/12/main.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int foo(int i) { 4 | return i; 5 | } 6 | 7 | int main() { 8 | int ret; 9 | int i = 10; 10 | ret = foo(i); 11 | return ret; 12 | } 13 | -------------------------------------------------------------------------------- /dockerimage/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM ubuntu:latest 2 | RUN apt-get update 3 | RUN apt-get install -y bison flex vim git libbison-dev make gcc-multilib python3 4 | RUN git clone https://github.com/VictorRodriguez/compilers-lecture.git 5 | 6 | -------------------------------------------------------------------------------- /labs/04/calculator_v1/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | yacc -d calc.y 3 | lex calc.l 4 | gcc lex.yy.c y.tab.c -o calc -ll 5 | 6 | clean: 7 | rm -rf *.c 8 | rm -rf *.h 9 | rm -rf a.out 10 | rm -rf calc 11 | rm -rf *.gch 12 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser_v2/Makefile: -------------------------------------------------------------------------------- 
1 | all: 2 | gcc syntax_parser.c -o syntax_parser 3 | ./syntax_parser grammar.conf tokens.txt > log 4 | dot log -Tpng >result.png 5 | 6 | clean: 7 | rm -rf syntax_parser 8 | rm -rf result.png 9 | -------------------------------------------------------------------------------- /labs/02/hello.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | 4 | int foo(){ 5 | //this is my homework :) 6 | return 3; 7 | } 8 | 9 | int main(){ 10 | for (int x = 0; x < 10;x++){ 11 | printf("hi\n"); 12 | } 13 | return 0; 14 | } 15 | -------------------------------------------------------------------------------- /labs/12/printf-secure.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | void foo(char *s) { 5 | printf(s); 6 | } 7 | 8 | int main(){ 9 | char greeting[] = "Hello"; 10 | foo(greeting); 11 | return EXIT_SUCCESS; 12 | } 13 | -------------------------------------------------------------------------------- /labs/04/calculator_v2/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | yacc -d simplecalc.y 3 | lex simplecalc.l 4 | gcc y.tab.c lex.yy.c -ly -ll -o simplecalc 5 | 6 | clean: 7 | rm -rf simplecalc 8 | rm -rf lex.yy.c 9 | rm -rf y.tab.c 10 | rm -rf y.tab.h 11 | rm -rf y.tab.h.gch 12 | -------------------------------------------------------------------------------- /labs/04/calculator_v2_linux_zoo/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | yacc -d simplecalc.y 3 | lex simplecalc.l 4 | gcc y.tab.c lex.yy.c -o simplecalc 5 | 6 | clean: 7 | rm -rf simplecalc 8 | rm -rf lex.yy.c 9 | rm -rf y.tab.c 10 | rm -rf y.tab.h 11 | rm -rf y.tab.h.gch 12 | -------------------------------------------------------------------------------- /labs/04/calculator_v3/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | 
yacc -d simplecalc.y 3 | lex simplecalc.l 4 | gcc y.tab.c lex.yy.c -ly -ll -o simplecalc 5 | 6 | clean: 7 | rm -rf simplecalc 8 | rm -rf lex.yy.c 9 | rm -rf y.tab.c 10 | rm -rf y.tab.h 11 | rm -rf y.tab.h.gch 12 | -------------------------------------------------------------------------------- /labs/04/lex_examples/test.l: -------------------------------------------------------------------------------- 1 | %{ 2 | //https://regex101.com/r/hlv3Q4/1 3 | //egrep '^0\.0$|^[1-9][0-9]*\.[0-9]*[1-9]$|^[1-9][0-9]*\.0$' test.txt 4 | #include 5 | %} 6 | 7 | %% 8 | ^0\.0$|^[1-9][0-9]*\.[0-9]*[1-9]$|^[1-9][0-9]*\.0$ printf("VALID\n"); 9 | %% 10 | -------------------------------------------------------------------------------- /labs/04/yacc_examples/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | yacc -d ambi_calculator.y 3 | lex ambi_calculator.l 4 | gcc y.tab.c lex.yy.c -ly -ll -o ambi_calculator 5 | 6 | clean: 7 | rm -rf ambi_calculator 8 | rm -rf lex.yy.c 9 | rm -rf y.tab.c 10 | rm -rf y.tab.h 11 | rm -rf y.tab.h.gch 12 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser_v2/log: -------------------------------------------------------------------------------- 1 | graph G { 2 | S -- E; 3 | E -- T; 4 | E -- ED; 5 | T -- F; 6 | T -- TD; 7 | F -- (; 8 | F -- E; 9 | F -- ); 10 | E -- T; 11 | E -- ED; 12 | T -- F; 13 | T -- TD; 14 | F -- id; 15 | TD -- ε; 16 | ED -- ε; 17 | TD -- ε; 18 | ED -- ε; 19 | } -------------------------------------------------------------------------------- /labs/04/calculator_v2/simplecalc.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include "y.tab.h" 3 | extern int yylval; 4 | %} 5 | %% 6 | [0-9]+ { yylval = atoi(yytext); return NUMBER; } 7 | [a-z] { yylval = yytext[0]; return NAME; } 8 | [ \t] ; /* ignore whitespace */ 9 | \n return 0; /* logical EOF */ 10 | . 
return yytext[0]; 11 | %% 12 | -------------------------------------------------------------------------------- /labs/04/yacc_examples/ambi_calculator.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include "y.tab.h" 3 | extern int yylval; 4 | %} 5 | %% 6 | [0-9]+ { yylval = atoi(yytext); return NUMBER; } 7 | [a-z] { yylval = yytext[0]; return NAME; } 8 | [ \t] ; /* ignore whitespace */ 9 | \n return 0; /* logical EOF */ 10 | . return yytext[0]; 11 | %% 12 | -------------------------------------------------------------------------------- /labs/09/example.c: -------------------------------------------------------------------------------- 1 | // check the dot output in http://www.webgraphviz.com/ 2 | 3 | unsigned euclid(unsigned a, unsigned b) { 4 | while (a != b) 5 | if (a > b) 6 | a = a - b; 7 | else 8 | b = b - a; 9 | return a; 10 | } 11 | 12 | int main(){ 13 | int ret; 14 | ret = euclid(1,2); 15 | return ret; 16 | } 17 | -------------------------------------------------------------------------------- /labs/01/simple_foo.c: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | int array_test[256]; 4 | 5 | float foo(){ 6 | float a,c = 1; 7 | float b = 1; 8 | a = b + c * 60; 9 | return a; 10 | } 11 | 12 | int main(void) { 13 | printf("Hello World\n"); 14 | float result = foo(); 15 | printf("%f\n",result); 16 | return 0; 17 | } 18 | -------------------------------------------------------------------------------- /labs/09/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | gcc -fdump-tree-cfg-graph example.c 3 | gcc -O2 -ftree-vectorize loop-test.c -o loop-test -fopt-info-vec &> loop-build.log 4 | 5 | clean: 6 | @find . 
-type f -executable -exec sh -c "file -i '{}' | grep -q 'x-executable; charset=binary'" \; -print | xargs rm -f 7 | rm -rf a.out example.c.012t.cfg example.c.012t.cfg.dot 8 | 9 | -------------------------------------------------------------------------------- /labs/04/lex_examples/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | lex example0.l 3 | gcc lex.yy.c -o example0 -ll 4 | lex example1.l 5 | gcc lex.yy.c -o example1 -ll 6 | lex example2.l 7 | gcc lex.yy.c -o example2 -ll 8 | lex example3.l 9 | gcc lex.yy.c -o example3 -ll 10 | 11 | clean: 12 | rm -rf *.c 13 | rm -rf example0 14 | rm -rf example1 15 | rm -rf example2 16 | rm -rf example3 17 | -------------------------------------------------------------------------------- /labs/03/word_count/wc.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | unsigned long charCount = 0, wordCount = 0, lineCount = 0; 4 | %} 5 | 6 | word [^ \t\n]+ 7 | eol \n 8 | 9 | %% 10 | {word} {wordCount++;charCount +=yyleng;} 11 | {eol} {charCount++;lineCount++;} 12 | . {charCount++;} 13 | %% 14 | 15 | int main(){ 16 | yylex(); 17 | printf("%lu %lu %lu\n",lineCount,wordCount,charCount); 18 | return 0; 19 | } 20 | 21 | -------------------------------------------------------------------------------- /labs/04/calculator_v3/simplecalc.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include "y.tab.h" 3 | %} 4 | %% 5 | [0-9]+.[0-9]+ { yylval.dval = atof(yytext); return NUMBER; } 6 | [a-z]+ { yylval.sval = strdup(yytext); return STRING; } 7 | [ \t] ; /* ignore whitespace */ 8 | \n return 0; /* logical EOF */ 9 | . 
return yytext[0]; 10 | %% 11 | -------------------------------------------------------------------------------- /labs/04/calculator_v2_linux_zoo/simplecalc.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include "y.tab.h" 3 | extern int yylval; 4 | 5 | %} 6 | 7 | %option noyywrap 8 | 9 | 10 | %% 11 | [0-9]+ { yylval = atoi(yytext); return NUMBER; } 12 | [a-z] { yylval = yytext[0]; return NAME; } 13 | [ \t] ; /* ignore whitespace */ 14 | \n return 0; /* logical EOF */ 15 | . return yytext[0]; 16 | %% 17 | -------------------------------------------------------------------------------- /labs/12/bench.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #define MAX 100000000 5 | 6 | static struct timeval tm1; 7 | int a[256], b[256], c[256]; 8 | 9 | void foo(); 10 | 11 | int main(){ 12 | foo(); 13 | return 0; 14 | } 15 | 16 | void foo(){ 17 | int i,x; 18 | for (x=0; x 3 | #include 4 | 5 | int main(int argc, char **argv) { 6 | char buffer[5]; 7 | printf ("Buffer Contains: %s , Size Of Buffer is %ld\n", 8 | buffer,sizeof(buffer)); 9 | //strcpy(buffer,"deadbeef"); 10 | strcpy(buffer,argv[1]); 11 | printf ("Buffer Contains: %s , Size Of Buffer is %ld\n", 12 | buffer,sizeof(buffer)); 13 | } 14 | -------------------------------------------------------------------------------- /labs/04/lex_examples/example3.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %% 6 | [a-zA-Z][a-zA-Z0-9]* printf("WORD "); 7 | [a-zA-Z0-9\/.-]+ printf("FILENAME "); 8 | \" printf("QUOTE "); 9 | \{ printf("OBRACE "); 10 | \} printf("EBRACE "); 11 | ; printf("SEMICOLON "); 12 | \n printf("\n"); 13 | [ \t]+ /* ignore whitespace */; 14 | %% 15 | -------------------------------------------------------------------------------- /labs/09/loop-test.c: 
-------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #define MAX 1000000 5 | 6 | static struct timeval tm1; 7 | int a[256], b[256], c[256]; 8 | 9 | void foo(); 10 | 11 | int main(){ 12 | foo(); 13 | return 0; 14 | } 15 | 16 | __attribute__((target_clones("avx2","arch=atom","default"))) 17 | void foo(){ 18 | int i,x; 19 | for (x=0; x 2 | //Global Variable 3 | int G_Var; 4 | 5 | void AliasingFunction(int& InputVar) //InputVar becomes an alias fo G_Var 6 | { 7 | G_Var = InputVar + 1; //Uses Global Variable 8 | //Same effect as G_Var = G_Var + 1 9 | cout << InputVar << endl; 10 | cout << G_Var << endl; 11 | } 12 | 13 | void main() 14 | { 15 | G_Var = 2; 16 | /*calls SomeFunction with the global Variable as a parameter */ 17 | AliasingFunction(G_Var); 18 | } 19 | -------------------------------------------------------------------------------- /labs/01/checker.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/bash 2 | 3 | 4 | if [ -f "test.patch" ] 5 | then 6 | echo "patch exist" 7 | git apply --check test.patch 8 | if [ $? 
-eq 0 ] 9 | then 10 | echo "Patch apply clean" 11 | patch -p1 < test.patch 12 | fi 13 | 14 | fi 15 | 16 | make 17 | objdump -d ./simple_foo > log 18 | if [ -f asm-analytics.sh ] 19 | then 20 | bash asm-analytics.sh log 21 | exit 0 22 | fi 23 | if [ -f asm-analytics.py ] 24 | then 25 | python asm-analytics.py log 26 | exit 0 27 | fi 28 | 29 | -------------------------------------------------------------------------------- /labs/04/calculator_v2/simplecalc.y: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | int yylex(); 4 | void yyerror(const char *s); 5 | %} 6 | 7 | 8 | %token NAME NUMBER 9 | %% 10 | 11 | statement: NAME '=' expression {printf("%c = %d\n", $1,$3);} 12 | | expression {printf("= %d\n", $1);} 13 | ; 14 | expression: expression '+' NUMBER { $$ = $1 + $3;} 15 | | expression '-' NUMBER { $$ = $1 - $3;} 16 | | expression '*' NUMBER { $$ = $1 * $3;} 17 | | '(' expression ')' {$$ = $2;} 18 | | NUMBER {$$ = $1;} 19 | ; 20 | 21 | 22 | -------------------------------------------------------------------------------- /labs/12/fprintf.c: -------------------------------------------------------------------------------- 1 | 2 | // Compile with gcc fprintf.c -m32 -fanalyzer and see how 3 | // Wanalyzer-unsafe-call-within-signal-handler works 4 | // Important it has to be with signal, if only printf(buffer) 5 | // it will not be detected by compiler 6 | 7 | #include 8 | #include 9 | #include 10 | 11 | static void handler(int signum, char buffer[100]){ 12 | printf(buffer); 13 | } 14 | 15 | int main(int argc, char** argv) { 16 | char buffer[100]; 17 | strncpy(buffer, argv[1], 100); 18 | signal(SIGINT, handler); 19 | return 0; 20 | } 21 | -------------------------------------------------------------------------------- /labs/04/lex_examples/example3_linuxzoo.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %option noyywrap 6 | 7 | %% 8 | 
[a-zA-Z][a-zA-Z0-9]* { printf("WORD "); } 9 | [a-zA-Z0-9\/.-]+ { printf("FILENAME "); } 10 | \" { printf("QUOTE "); } 11 | \{ { printf("OBRACE "); } 12 | \} { printf("EBRACE "); } 13 | ; { printf("SEMICOLON "); } 14 | \n { printf("\n"); } 15 | [ \t]+ /* ignore whitespace */ 16 | %% 17 | 18 | int main() { 19 | yylex(); 20 | return 0; 21 | } 22 | -------------------------------------------------------------------------------- /labs/04/lex_examples/example_input_file.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | %} 4 | 5 | %% 6 | [0123456789]+ printf("NUMBER\n"); 7 | [a-zA-Z][a-zA-Z0-9]* printf("WORD\n"); 8 | %% 9 | 10 | int main(int argc, char **argv) { 11 | FILE *fd; 12 | 13 | if (argc == 2) 14 | { 15 | if (!(fd = fopen(argv[1], "r"))) 16 | { 17 | perror("Error: "); 18 | return (-1); 19 | } 20 | yyset_in(fd); 21 | yylex(); 22 | fclose(fd); 23 | } 24 | else 25 | printf("Usage: a.out filename\n"); 26 | return (0); 27 | } 28 | 29 | -------------------------------------------------------------------------------- /labs/04/calculator_v3/simplecalc.y: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | int yylex(); 4 | void yyerror(const char *s); 5 | %} 6 | 7 | %union { 8 | double dval; 9 | char *sval; 10 | } 11 | 12 | 13 | %token NUMBER 14 | %token STRING 15 | 16 | %type expression 17 | %% 18 | 19 | statement: STRING '=' expression {printf("%s = %f\n", $1,$3);} 20 | ; 21 | expression: expression '+' NUMBER { $$ = $1 + $3;} 22 | | expression '-' NUMBER { $$ = $1 - $3;} 23 | | expression '*' NUMBER { $$ = $1 * $3;} 24 | | '(' expression ')' {$$ = $2;} 25 | | NUMBER {$$ = $1;} 26 | ; 27 | -------------------------------------------------------------------------------- /labs/12/bench_mem_test.c: -------------------------------------------------------------------------------- 1 | // fortify_test.c 2 | #include 3 | #include 4 | #define MAX 100000000 5 | 6 | char 
buffer[5]; 7 | 8 | void foo(char *value){ 9 | strcpy(buffer,value); 10 | } 11 | 12 | int main(int argc, char **argv) { 13 | printf ("Buffer Contains: %s , Size Of Buffer is %ld\n", 14 | buffer,sizeof(buffer)); 15 | int i,x; 16 | for (x=0; x 3 | int yylex(); 4 | void yyerror(const char *s); 5 | %} 6 | 7 | 8 | %token NAME NUMBER 9 | 10 | %left '*''/' 11 | %left '+''-' 12 | 13 | %% 14 | 15 | statement: NAME '=' expression ';' {printf("%c = %d\n", $1,$3);} 16 | | expression {printf("= %d\n", $1);} 17 | ; 18 | expression: expression '+' expression { $$ = $1 + $3;} 19 | | expression '*' expression { $$ = $1 * $3;} 20 | | expression '/' expression { $$ = $1 / $3;} 21 | | expression '-' expression { $$ = $1 - $3;} 22 | | NUMBER {$$ = $1;} 23 | ; 24 | 25 | 26 | -------------------------------------------------------------------------------- /labs/04/calculator_v1/calc.l: -------------------------------------------------------------------------------- 1 | %{ 2 | 3 | #include 4 | #include "y.tab.h" 5 | int c; 6 | extern int yylval; 7 | %} 8 | %% 9 | " " ; 10 | i { 11 | yylval = atoi(yytext); 12 | return(IDCL); 13 | } 14 | p { 15 | yylval = atoi(yytext); 16 | return(PRINT); 17 | } 18 | [a-h]|[j-o]|[q-z] { 19 | c = yytext[0]; 20 | yylval = c - 'a'; 21 | return(LETTER); 22 | } 23 | [0-9] { 24 | c = yytext[0]; 25 | yylval = c - '0'; 26 | return(DIGIT); 27 | } 28 | [^a-z0-9\b] { 29 | c = yytext[0]; 30 | return(c); 31 | } 32 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Prerequisites 2 | *.d 3 | 4 | # Object files 5 | *.o 6 | *.ko 7 | *.obj 8 | *.elf 9 | 10 | # Linker output 11 | *.ilk 12 | *.map 13 | *.exp 14 | 15 | # Precompiled Headers 16 | *.gch 17 | *.pch 18 | 19 | # Libraries 20 | *.lib 21 | *.a 22 | *.la 23 | *.lo 24 | 25 | # Shared objects (inc. 
Windows DLLs) 26 | *.dll 27 | *.so 28 | *.so.* 29 | *.dylib 30 | 31 | # Executables 32 | *.exe 33 | *.out 34 | *.app 35 | *.i*86 36 | *.x86_64 37 | *.hex 38 | 39 | # Debug files 40 | *.dSYM/ 41 | *.su 42 | *.idb 43 | *.pdb 44 | 45 | # Kernel Module Compile Results 46 | *.mod* 47 | *.cmd 48 | .tmp_versions/ 49 | modules.order 50 | Module.symvers 51 | Mkfile.old 52 | dkms.conf 53 | 54 | y.tab.c 55 | y.tab.h 56 | lex.yy.c 57 | 58 | *.patch 59 | -------------------------------------------------------------------------------- /labs/04/calculator_v2_linux_zoo/simplecalc.y: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | int yylex(); 4 | void yyerror (char const *s) { 5 | fprintf (stderr, "%s\n", s); 6 | } 7 | %} 8 | 9 | 10 | %token NAME NUMBER 11 | %% 12 | 13 | statement: NAME '=' expression {printf("%c = %d\n", $1,$3);} 14 | | expression {printf("= %d\n", $1);} 15 | ; 16 | expression: expression '+' NUMBER { $$ = $1 + $3;} 17 | | expression '-' NUMBER { $$ = $1 - $3;} 18 | | expression '*' NUMBER { $$ = $1 * $3;} 19 | | '(' expression ')' {$$ = $2;} 20 | | NUMBER {$$ = $1;} 21 | ; 22 | 23 | %% 24 | 25 | int main(){ 26 | yyparse(); 27 | return 0; 28 | } 29 | -------------------------------------------------------------------------------- /labs/11/compiler.l: -------------------------------------------------------------------------------- 1 | %{ 2 | #include "y.tab.h" 3 | %} 4 | %% 5 | [ \t] ; /* ignore whitespace */ 6 | \n return 0; 7 | [abcdeghjklmnoqrstuvwxyz] { yylval.sval = strdup(yytext); return ID; } 8 | \/\/.* { yylval.sval = strdup(yytext); return COMMENT; } 9 | p { yylval.sval = strdup(yytext); return PRINT; } 10 | f { yylval.sval = strdup(yytext); return TYPE; } 11 | i { yylval.sval = strdup(yytext); return TYPE; } 12 | [0123456789]+ { yylval.dval = atoi(yytext); return INTEGER; } 13 | [0123456789]+.[0123456789]+ { yylval.dval = atof(yytext); return FLOAT; }; 14 | . 
return yytext[0]; 15 | %% 16 | -------------------------------------------------------------------------------- /labs/12/Makefile: -------------------------------------------------------------------------------- 1 | all: 2 | gcc main.c -o main 3 | gcc bench.c -fstack-protector-all -o bench-fstack-protection-all 4 | gcc bench.c -fstack-protector-strong -o bench-fstack-protection-strong 5 | gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 mem_test.c -o mem_test 6 | gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 vuln.c -o vuln 7 | gcc -Wall -g -O2 bench_mem_test.c -o bench_mem_test 8 | gcc -D_FORTIFY_SOURCE=2 -Wall -g -O2 bench_mem_test.c -o bench_mem_test-forty-2 9 | gcc -Wall -Wextra -Wformat-security printf-secure.c -o printf-secure 10 | 11 | rop: 12 | export CFLAGS="" 13 | export CXXFLAGS="" 14 | export FCFLAGS="" 15 | export FFLAGS="" 16 | gcc -m32 -O0 vuln.c -o vuln -fno-stack-protector --static -fno-pic 17 | 18 | clean: 19 | rm -rf main 20 | rm -rf bench-fstack-protection-all 21 | rm -rf bench-fstack-protection-strong 22 | rm -rf vuln 23 | rm -rf mem_test 24 | rm -rf bench_mem_test 25 | rm -rf bench_mem_test-forty-2 26 | rm -rf printf-secure 27 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser/README.md: -------------------------------------------------------------------------------- 1 | # Lab 05 v2 instructions 2 | 3 | ## Objective 4 | 5 | Help the student understand how to create a recursive descent parser for a basic 6 | grammar. 7 | 8 | ## Requirements 9 | 10 | * Linux machine, either a VM or a bare metal host 11 | * `git send-email` installed and configured on your Linux machine 12 | 13 | ## Instructions 14 | 15 | Create a basic parser for the following grammar: 16 | 17 | ``` 18 | E --> i E' 19 | E' --> + i E' | e 20 | ``` 21 | 22 | The code can be written in any of the following languages: 23 | 24 | * C (recommended) 25 | * C++ 26 | * Python 27 | * Java 28 | 29 | ## Please send the mail using git send-email: 30 | 31 | ``` 32 | 
$ git add recursive_descent_parser.c Makefile 33 | $ git commit -s -m -homework-05 34 | $ git send-email -1 35 | 36 | ``` 37 | Test this first by sending the mail to your personal account; if you receive it, 38 | you can be sure I will receive it too. 39 | 40 | 41 | ## Time to do the homework: 42 | 43 | One week from the moment the mail is sent to students. 44 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Victor Rodriguez 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /labs/12/vuln.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | 4 | /* 5 | * IMPORTANT: To compile 32 bit binaries on 64 bit Linux version, you have to 6 | * Install libx32gcc development package and 32 bit GNU C Library 7 | * IN GCC 10 in CLR compile as: 8 | * gcc -m32 -O0 vuln.c -o vuln -fno-stack-protector --static -fno-pic 9 | * python2 -c 'import struct; print "l"*32+struct.pack("i",0x08049db5)' | ./vuln 10 | */ 11 | 12 | 13 | int modified = 0; 14 | 15 | void secretFunction_variable(){ 16 | if(modified != 0) { 17 | printf("you have changed the 'modified' variable\n"); 18 | } else { 19 | printf("Try again?\n"); 20 | } 21 | } 22 | 23 | void secretFunction_mul(int a, int b) 24 | { 25 | printf("The answer is %d\n", a*b); 26 | exit(0); 27 | } 28 | 29 | void secretFunction() 30 | { 31 | printf("Congratulations!\n"); 32 | printf("You have entered in the secret function!\n"); 33 | } 34 | 35 | void echo() 36 | { 37 | char buffer[20]; 38 | 39 | printf("Enter some text:\n"); 40 | scanf("%s", buffer); 41 | printf("You entered: %s\n", buffer); 42 | } 43 | 44 | int main() 45 | { 46 | echo(); 47 | 48 | return 0; 49 | } 50 | -------------------------------------------------------------------------------- /labs/11/compiler.y: -------------------------------------------------------------------------------- 1 | %{ 2 | #include 3 | int yylex(); 4 | void yyerror(const char *s); 5 | extern FILE* yyin; 6 | %} 7 | 8 | %union { 9 | double dval; 10 | char *sval; 11 | } 12 | 13 | 14 | %token FLOAT 15 | %token INTEGER 16 | %token TYPE 17 | %token ID 18 | %token PRINT 19 | %token COMMENT 20 | %% 21 | 22 | instruction: reserved 23 | | statement 24 | | COMMENT 25 | ; 26 | reserved: TYPE ID {printf("Valid expression declaration \n");} 27 | | PRINT ID {printf("Valid expression print \n");} 28 | ; 29 | statement: ID '=' expression {printf("Valid 
expression, assignment \n");} 30 | ; 31 | expression: number '+' number 32 | | number '-' number 33 | | number '*' number 34 | | number '/' number 35 | | number 36 | ; 37 | number: FLOAT 38 | | INTEGER 39 | | ID 40 | ; 41 | 42 | %% 43 | int main(int argc, char **argv) { 44 | FILE *fd; 45 | 46 | if (argc == 2) 47 | { 48 | if (!(fd = fopen(argv[1], "r"))) 49 | { 50 | perror("Error: "); 51 | return (-1); 52 | } 53 | yyin = fd; 54 | yyparse(); 55 | fclose(fd); 56 | } 57 | else 58 | printf("Usage: ./lex_analaizer filename\n"); 59 | return (0); 60 | } 61 | -------------------------------------------------------------------------------- /final_project/f1/README.md: -------------------------------------------------------------------------------- 1 | # Final project 2 | 3 | Please create a compiler that reads human sentences and generate pandas queries 4 | that search over the F1 files from https://www.kaggle.com/datasets/rohanrao/formula-1-world-championship-1950-2020 5 | 6 | ## Input of the query 7 | 8 | The way you generate your query is up to you, you can choose and define the grammar 9 | that you prefer it just has to be humanly readable and easy to understand, a good 10 | example is: 11 | 12 | ``` 13 | Search all the results in results.csv with more than 10 points and less than 40 laps 14 | ``` 15 | 16 | or 17 | 18 | ``` 19 | Retrieve all the results that meet the specified criteria: more than 10 points and less than 40 laps 20 | ``` 21 | This will go in a file input.txt 22 | 23 | Your team can decide how to define your grammar 24 | 25 | Once you have your grammar please use YACC and LEX to generate a compiler that must generate the next 2 lines 26 | 27 | 28 | ``` 29 | ./mycompiler input.txt 30 | 31 | ``` 32 | as a result it will print: 33 | 34 | ``` 35 | import pandas as pd 36 | df = pd.read_csv('results.csv') 37 | print(df[(df['points'] > 10) & (df["laps"] <= 40)]) 38 | ``` 39 | 40 | Where results.csv and the logic can change based on the sentence analyzed by your 
compiler 41 | 42 | 43 | Have fun! 44 | 45 | 46 | -------------------------------------------------------------------------------- /labs/02/parse_tree.py: -------------------------------------------------------------------------------- 1 | """ 2 | Taken from: 3 | http://interactivepython.org/runestone/static/pythonds/Trees/ParseTree.html 4 | pip install pythonds 5 | """ 6 | from pythonds.basic.stack import Stack 7 | from pythonds.trees.binaryTree import BinaryTree 8 | 9 | def buildParseTree(fpexp): 10 | fplist = fpexp.split() 11 | pStack = Stack() 12 | eTree = BinaryTree('') 13 | pStack.push(eTree) 14 | currentTree = eTree 15 | for i in fplist: 16 | if i == '(': 17 | currentTree.insertLeft('') 18 | pStack.push(currentTree) 19 | currentTree = currentTree.getLeftChild() 20 | elif i not in ['+', '-', '*', '/', ')']: 21 | currentTree.setRootVal(int(i)) 22 | parent = pStack.pop() 23 | currentTree = parent 24 | elif i in ['+', '-', '*', '/']: 25 | currentTree.setRootVal(i) 26 | currentTree.insertRight('') 27 | pStack.push(currentTree) 28 | currentTree = currentTree.getRightChild() 29 | elif i == ')': 30 | currentTree = pStack.pop() 31 | else: 32 | raise ValueError 33 | return eTree 34 | 35 | 36 | def main(): 37 | pt = buildParseTree("( ( 10 + 5 ) * 3 )") 38 | pt.postorder() 39 | 40 | if __name__ == "__main__": 41 | main() 42 | -------------------------------------------------------------------------------- /labs/06/yacc/README.md: -------------------------------------------------------------------------------- 1 | # Lab 06 v2 instructions 2 | 3 | ## Objective 4 | 5 | Design an RDP (recursive-descent-parser) for the next basic grammar: 6 | 7 | ``` 8 | S -> aBc 9 | B -> bc | b 10 | ``` 11 | 12 | # Requirements 13 | 14 | * Linux machine, either a VM or a bare metal host 15 | * GCC compiler (at least version 4.8) 16 | * YACC or BISON 17 | * git send mail server installed and configured on your Linux machine 18 | 19 | ## Instructions 20 | 21 | Create a program in YACC that 
taking as an example the grammar: 22 | 23 | ``` 24 | S -> aBc 25 | B -> bc | b 26 | ``` 27 | 28 | * Reads the token list from another file, tokens.txt 29 | * Prints which strings are valid and which ones are not 30 | 31 | Example of tokens: 32 | 33 | ``` 34 | abbbbbc 35 | abbbb 36 | abc 37 | abcbcbcc 38 | ac 39 | ``` 40 | 41 | ## Expected result: 42 | 43 | Your parser should: 44 | 45 | * Detect if each list of tokens has a syntax error 46 | 47 | 48 | ## Please send the mail as git send mail: 49 | 50 | ``` 51 | $ git add syntax_analaizer.c 52 | $ git commit -s -m -homework-06 53 | $ git send-email -1 54 | 55 | ``` 56 | Do some tests sending the mail to your personal account; if you get the mail, 57 | then you can be sure I will get the mail 58 | 59 | ## Time to do the homework: 60 | 61 | One week from the moment the mail is sent to students 62 | 63 | -------------------------------------------------------------------------------- /labs/11/README.md: -------------------------------------------------------------------------------- 1 | # CFG implemented in Lex and YACC 2 | 3 | 4 | Lex and yacc help you write programs that transform structured input. This includes 5 | an enormous range of applications, anything from a simple text search program that 6 | looks for patterns in its input file to a C compiler that transforms a source program into 7 | optimized object code. 8 | 9 | One of those applications is parsing context-free grammars (CFGs). CFGs are a more 10 | powerful method of describing languages. Such grammars can describe certain 11 | features that have a recursive structure, which makes them useful in a variety 12 | of applications.
13 | 14 | ## Activities 15 | 16 | Please create a compiler for the Advanced Calculator (ac) programming language: 17 | 18 | Types: In ac, there are only two data types: 19 | Integer: a sequence of decimal numerals 20 | Float: allows five fractional digits after the decimal point 21 | 22 | Keywords: In ac, there are three reserved keywords, each limited for simplicity 23 | to a single letter: 24 | f (declares a float variable) 25 | i (declares an integer variable) 26 | p (prints the value of a variable). 27 | 28 | Variables: The ac language offers only 23 possible variable names, drawn from 29 | the lowercase Roman alphabet and excluding the three reserved keywords f, i, 30 | and p. Variables must be declared prior to using them. 31 | 32 | 33 | 34 | 35 | -------------------------------------------------------------------------------- /labs/10/README.md: -------------------------------------------------------------------------------- 1 | # CFG implemented in Lex and YACC 2 | 3 | 4 | Lex and yacc help you write programs that transform structured input. This includes 5 | an enormous range of applications, anything from a simple text search program that 6 | looks for patterns in its input file to a C compiler that transforms a source program into 7 | optimized object code. 8 | 9 | One of those applications is parsing context-free grammars (CFGs). CFGs are a more 10 | powerful method of describing languages. Such grammars can describe certain 11 | features that have a recursive structure, which makes them useful in a variety 12 | of applications.
13 | 14 | ## Activities 15 | 16 | Create a program in LEX and YACC that analyise the CFG: 17 | 18 | ``` 19 | ⟨SENTENCE⟩ → ⟨NOUN-PHRASE⟩⟨VERB-PHRASE⟩ ⟨NOUN-PHRASE⟩ → ⟨CMPLX-NOUN⟩ | ⟨CMPLX-NOUN⟩⟨PREP-PHRASE⟩ 20 | ⟨VERB-PHRASE⟩ → ⟨CMPLX-VERB⟩ | ⟨CMPLX-VERB⟩⟨PREP-PHRASE⟩ ⟨PREP-PHRASE⟩ → ⟨PREP⟩⟨CMPLX-NOUN⟩ 21 | ⟨CMPLX-NOUN⟩ → ⟨ARTICLE⟩⟨NOUN⟩ ⟨CMPLX-VERB⟩ → ⟨VERB⟩ | ⟨VERB⟩⟨NOUN-PHRASE⟩ 22 | ⟨ARTICLE⟩ → a | the 23 | ⟨NOUN⟩ → boy | girl | flower 24 | ⟨VERB⟩ → touches | likes | sees ⟨PREP⟩ → with 25 | ``` 26 | 27 | Example of Input file : 28 | 29 | a boy sees 30 | the boy sees a flower 31 | a girl with a flower likes the boy 32 | a flower sees a flower 33 | 34 | Each of these strings has a derivation in grammar 35 | 36 | expected way to test yoru code: 37 | 38 | ``` 39 | ./analyzer test.txt 40 | PASS 41 | PASS 42 | PASS 43 | FAIL 44 | ``` 45 | 46 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser/solution_v1.c: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | 5 | //FIXME 6 | 7 | /*************************** 8 | Example: 9 | Grammar: 10 | E --> i E' 11 | E' --> + i E' | e 12 | ***************************/ 13 | 14 | char l; 15 | 16 | 17 | bool E_alpha(); 18 | bool E_alpha_1(); 19 | bool E_alpha_2(); 20 | 21 | void error(){ 22 | printf("Error\n"); 23 | exit(-1); 24 | } 25 | // Match function 26 | bool match(char t) { 27 | if (l == t) { 28 | l = getchar(); 29 | return true; 30 | } 31 | else 32 | error(); 33 | } 34 | 35 | // Definition of E' as per the given production 36 | bool E_alpha(){ 37 | if(E_alpha_1() || E_alpha_2()){ 38 | return true; 39 | }else{ 40 | error(); 41 | } 42 | } 43 | 44 | // Definition of E_1' as per the given production 45 | bool E_alpha_2() { 46 | if (l == '\n'){ 47 | return true; 48 | }else{ 49 | return false; 50 | } 51 | } 52 | 53 | // Definition of E_2' as per the given production 54 | bool E_alpha_1() { 55 | if (l 
== '+') { 56 | if(match('+') && match('i') && E_alpha()) 57 | return true; 58 | }else{ 59 | return false; 60 | } 61 | } 62 | 63 | // Definition of E, as per the given production 64 | bool E() { 65 | if (l == 'i') { 66 | if (match('i') && E_alpha()){ 67 | return true; 68 | } 69 | }else{ 70 | error(); 71 | } 72 | } 73 | 74 | int main() { 75 | 76 | do { 77 | l = getchar(); 78 | // E is a start symbol. 79 | E(); 80 | 81 | } while (l != '\n' && l != EOF); 82 | 83 | if (l == '\n') 84 | printf("Parsing Successful\n"); 85 | } 86 | -------------------------------------------------------------------------------- /labs/05/analyze_cfg/gramar_generator.py: -------------------------------------------------------------------------------- 1 | import string 2 | import random 3 | import argparse 4 | 5 | 6 | # example of gramar in BNF form 7 | # ::= "+" 8 | # | 9 | # ::= "*" 10 | # | 11 | # ::= "(" ")" 12 | # | 13 | # ::= integer 14 | 15 | def id_generator(size=10, chars=string.ascii_uppercase + string.digits): 16 | return ''.join(random.choice(chars) for _ in range(size)) 17 | 18 | operators = ["|",""] 19 | 20 | def get_expresions(start): 21 | none_term,first_section,operand,second_section = get_statements() 22 | if start: 23 | S = " ::= <%s> | <%s>" % (first_section,second_section) 24 | else: 25 | sections = [first_section,second_section] 26 | S = "<%s> ::= <%s> %s <%s>" % (random.choice(sections),\ 27 | random.choice(string.ascii_lowercase),\ 28 | operand,\ 29 | random.choice(string.ascii_lowercase)) 30 | return S 31 | 32 | def get_terminals(): 33 | terminal = "<%s> ::= %s" % (random.choice(string.ascii_lowercase),\ 34 | random.choice(string.ascii_lowercase)) 35 | return(terminal) 36 | 37 | 38 | def get_statements(): 39 | 40 | none_term = random.choice(string.ascii_lowercase) 41 | first_section = random.choice(string.ascii_lowercase) 42 | operand = random.choice(operators) 43 | second_section = random.choice(string.ascii_lowercase) 44 | return 
none_term,first_section,operand,second_section 45 | 46 | start = True 47 | print(get_expresions(start)) 48 | 49 | start = False 50 | for i in range(10): 51 | print(get_expresions(start)) 52 | print(get_terminals()) 53 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser_v2/README.md: -------------------------------------------------------------------------------- 1 | # Lab 05 v3 instructions 2 | 3 | ## Objective 4 | 5 | Design an algorithm that reads an LL(1) gramar and produces the corresponding AST. 6 | 7 | # Requirements 8 | 9 | * Linux machine, either a VM or a bare metal host 10 | * GCC compiler (at least version 4.8) 11 | * [DOT grapviz](http://www.graphviz.org) library 12 | * git send mail server installed and configured on your Linux machine 13 | 14 | ## Instructions 15 | 16 | Create a program in C that taking as an example the gramar in Figure 1: 17 | 18 | * Read the gramar from an external file [gramar.conf](gramar.conf) 19 | * Gramar will be in Backus-Naur Form (BNF) notation 20 | * Read the token list from another file tokens.txt 21 | * Assume that is an LL(1) gramar 22 | * Create the abstrct syntax tree as an struct in C code 23 | * Print the AST graph using [DOT grapviz](http://www.graphviz.org) library 24 | 25 | Figure 1 26 | 27 | Example of tokens.txt: 28 | 29 | ``` 30 | floatdcl id 31 | intdcl id 32 | id assign inum 33 | id assign id plus fnum 34 | print id 35 | 36 | ``` 37 | 38 | ## Expected result: 39 | 40 | 41 | The image can be generated as: 42 | ``` 43 | ./sytax_parser > log 44 | dot log -Tpng >result.png 45 | ``` 46 | 47 | 48 | ``` 49 | 50 | ``` 51 | 52 | ## Please send the mail as git send mail: 53 | 54 | ``` 55 | $ git add lex_analaizer.c 56 | $ git commit -s -m -homework-04 57 | $ git send-email -1 58 | 59 | ``` 60 | Do some tests sending the mail to your personal account, if you get the mail, 61 | then you can be sure I will get the mail 62 | 63 | ## Time to do the homework: 
64 | 65 | One week from the moment the mail is sent to students 66 | 67 | -------------------------------------------------------------------------------- /labs/05/recursive_descent_parser_v3/README.md: -------------------------------------------------------------------------------- 1 | # Lab 05 v3 instructions 2 | 3 | ## Objective 4 | 5 | Design an RDP (recursive-descent-parser) for the next basic grammar: 6 | 7 | ``` 8 | S -> aBc 9 | B -> bc | b 10 | ``` 11 | 12 | It should check an input string and generate the Syntax Tree using [DOT 13 | grapviz](http://www.graphviz.org) library. 14 | 15 | # Requirements 16 | 17 | * Linux machine, either a VM or a bare metal host 18 | * GCC compiler (at least version 4.8) 19 | * [DOT grapviz](http://www.graphviz.org) library installed 20 | * git send mail server installed and configured on your Linux machine 21 | 22 | ## Instructions 23 | 24 | Create a program in C that taking as an example the grammar: 25 | 26 | ``` 27 | S -> aBc 28 | B -> bc | b 29 | ``` 30 | 31 | * Read the token list from another file tokens.txt 32 | * Create the abstrct syntax tree as an struct in C code 33 | * Print the AST graph using [DOT grapviz](http://www.graphviz.org) library 34 | 35 | Example of tokens: 36 | 37 | ``` 38 | abbbbbc 39 | abbbb 40 | abc 41 | abcbcbcc 42 | ac 43 | ``` 44 | 45 | ## Expected result: 46 | 47 | Your sytax tree should: 48 | 49 | * Detect if each list of tokens has a sytanx error 50 | * If it does not have a syntax error, print the AST graph using [DOT 51 | grapviz](http://www.graphviz.org) library 52 | * Test the output in: [webgraphviz](http://www.webgraphviz.com/) 53 | 54 | 55 | 56 | ## Please send the mail as git send mail: 57 | 58 | ``` 59 | $ git add syntax_analaizer.c 60 | $ git commit -s -m -homework-04 61 | $ git send-email -1 62 | 63 | ``` 64 | Do some tests sending the mail to your personal account, if you get the mail, 65 | then you can be sure I will get the mail 66 | 67 | ## Time to do the homework: 68 | 69 | 
One week from the moment the mail is sent to students 70 | 71 | -------------------------------------------------------------------------------- /final_project/detection_loops/README.md: -------------------------------------------------------------------------------- 1 | # Detection of SIMD loops for optimization 2 | 3 | 4 | ## Introduction 5 | 6 | The rapid evolution of computer architectures has also led to an insatiable 7 | demand for new compiler technology. Almost all high-performance systems take 8 | advantage of the same two basic techniques: parallelism and memory hierarchies. 9 | Parallelism can be found at several levels: at the instruction level, where 10 | multiple operations are executed simultaneously, and at the processor level, 11 | where different threads of the same application are run on different 12 | processors. Memory hierarchies are a response to the basic limitation that we 13 | can build very fast storage or very large storage, but not storage that is both 14 | fast and large. 15 | 16 | ## Goal 17 | 18 | The goal of this final project is to detect one of the core targets of source code 19 | optimization: loops that execute single instruction, multiple data (SIMD) code. 20 | 21 | ## Design 22 | 23 | Given the following code: 24 | 25 | ```C 26 | 27 | float a[256] = {0}; 28 | float b[256] = {0}; 29 | float c[256] = {0}; 30 | 31 | int main(){ 32 | for(int x = 0; x < 10000000; x++) 33 | for (int i=0; i<256; i++){ 34 | c[i] = a[i] + b[i]; 35 | } 36 | } 37 | 38 | ``` 39 | 40 | Your project should be a binary that reads the C code and prints: 41 | 42 | ``` 43 | loop detected in line 8, candidate for SIMD 44 | ``` 45 | 46 | 47 | ## Report and presentation 48 | 49 | Presentation should be done in front of the team with a written [report] made in 50 | LaTeX.
51 | 52 | Teams have to deliver: 53 | 54 | * Printed report 55 | * Printed LaTeX code 56 | * Presentation sent to the professor 57 | 58 | Resources: 59 | * https://www.epaperpress.com/lexandyacc/intro.html 60 | 61 | [report](https://github.com/VictorRodriguez/operating-systems-lecture/blob/master/projects/report.tex) 62 | -------------------------------------------------------------------------------- /labs/05/analyze_cfg/README.md: -------------------------------------------------------------------------------- 1 | # Lab 05 v1 instructions 2 | 3 | ## Objective 4 | 5 | Make the student understand how to analyze a valid CFG 6 | 7 | # Requirements 8 | 9 | * Linux machine, either a VM or a bare metal host 10 | * git send mail server installed and configured on your Linux machine 11 | 12 | ## Instructions 13 | 14 | 15 | A grammar is reduced if each of its nonterminals and productions participates 16 | in the derivation of some string in the grammar’s language. Nonterminals that 17 | can be safely removed are called useless. 18 | 19 | ``` 20 | <S> ::= <A> | <B> 21 | 22 | <A> ::= a 23 | 24 | <B> ::= <B> b 25 | 26 | <C> ::= c 27 | ``` 28 | 29 | The above grammar contains two kinds of nonterminals that cannot participate in any derived string: 30 | 31 | • With S as the start symbol, the nonterminal C cannot appear in any phrase. 32 | • Any phrase that mentions B cannot be rewritten using the grammar’s rules to contain only terminals. 33 | 34 | Referring to Section 4.2.1 of the book Crafting a Compiler, 2nd edition: 35 | 36 | * Devise an algorithm to detect nonterminals that cannot be reached from a CFG’s goal symbol. 37 | * Devise an algorithm to detect nonterminals that cannot derive any terminal string in a CFG.
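
Both detections in the list above can be framed as small fixed-point computations. What follows is a minimal Python sketch, not the book's reference solution. The dict encoding of the grammar and the function names `unreachable` and `unproductive` are assumptions made for illustration. The four-rule sample grammar is also an assumption, reconstructed from the two observations above (C is never reached from S, and B never terminates).

```python
# Grammar encoding (an assumption for this sketch): nonterminal -> list of
# productions, each production a list of symbols. Any symbol that is a key of
# the dict is a nonterminal; every other symbol is treated as a terminal.
GRAMMAR = {
    "S": [["A"], ["B"]],
    "A": [["a"]],
    "B": [["B", "b"]],
    "C": [["c"]],
}


def unreachable(grammar, start):
    """Return the nonterminals that no derivation from `start` can mention."""
    seen = {start}
    worklist = [start]
    while worklist:
        head = worklist.pop()
        for production in grammar[head]:
            for symbol in production:
                if symbol in grammar and symbol not in seen:
                    seen.add(symbol)
                    worklist.append(symbol)
    return set(grammar) - seen


def unproductive(grammar):
    """Return the nonterminals that cannot derive a terminal-only string."""
    productive = set()
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for head, productions in grammar.items():
            if head in productive:
                continue
            for production in productions:
                # A production terminates if every symbol is a terminal or a
                # nonterminal already known to be productive.
                if all(s not in grammar or s in productive for s in production):
                    productive.add(head)
                    changed = True
                    break
    return set(grammar) - productive


if __name__ == "__main__":
    print("unreachable:", sorted(unreachable(GRAMMAR, "S")))
    print("unproductive:", sorted(unproductive(GRAMMAR)))
```

A full reduction would normally drop the unproductive nonterminals first and then recompute reachability, since removing a production can make other symbols unreachable.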
38 | 39 | The code could be in any of the following lenguages: 40 | 41 | * C ( recomended ) 42 | * C++ 43 | * python 44 | * Java 45 | 46 | ``` 47 | make 48 | ./gramar_analyser 49 | 50 | ``` 51 | 52 | ## Please send the mail as git send mail: 53 | 54 | ``` 55 | $ git add gramar_analyser.* Makefile 56 | $ git commit -s -m -homework-05 57 | $ git send-email -1 58 | 59 | ``` 60 | Do some tests sending the mail to your personal account, if you get the mail, 61 | then you can be sure I will get the mail 62 | 63 | 64 | ## Time to do the homework: 65 | 66 | One week from the moment the mail is sent to students 67 | 68 | -------------------------------------------------------------------------------- /labs/06/ast/README.md: -------------------------------------------------------------------------------- 1 | # Lab 06 v3 instructions 2 | 3 | ## Objective 4 | 5 | Make the student create an abstract syntax tree and generate Syntax-Directed 6 | Translation 7 | 8 | # Requirements 9 | 10 | * Linux machine, either a VM or a bare metal host 11 | * GCC compiler 12 | * git send mail server installed and configured on your Linux machine 13 | 14 | ## Instructions 15 | 16 | Consider the grammar: 17 | 18 | ``` 19 | E -> int | ( E ) | E + E | E - E | E * E 20 | ``` 21 | 22 | And the string 23 | 24 | 5 + (2 + 3) 25 | 26 | Create a C code that: 27 | 28 | * Do a top down parsing to check if a string is valid 29 | * Consider input of 1 digit (10 +2 is not valid) to make it more simple 30 | * Generate an abstract syntax tree and generate Syntax-Directed Translation 31 | * Produce the output: 32 | 33 | ```assembly 34 | add 2,3,acum \\ 2 + 3 and store result in acum register 35 | add &acum,5,acum \\ acum register + 5 and store result in acum register 36 | ``` 37 | Reuse the code created in lab 04 if possible 38 | 39 | ## How could it be tested: 40 | 41 | ``` 42 | 43 | make ( compile everything ) 44 | 45 | ./calculator 46 | 47 | ``` 48 | 49 | Example of CODEFILE: 50 | 51 | ``` 52 | 5 + (2 - 3) 53 
| 5 + (2 * (2+4)) 54 | (3+1) 55 | ``` 56 | 57 | Example of output: 58 | 59 | ```assembly 60 | sub 2,3,acum 61 | add &acum,5,acum 62 | 63 | add 2,4,acum 64 | mul &acum,2,acum 65 | add &acum,5,acum 66 | 67 | add 3,1,acum 68 | ``` 69 | 70 | ## Please send the mail as git send mail: 71 | 72 | ``` 73 | $ git add syntax-direct-translator.c Makefile 74 | $ git commit -s -m -homework-06 75 | $ git send-email -1 76 | 77 | ``` 78 | Do some tests sending the mail to your personal 79 | account, if you get the mail, then you can be sure I 80 | will get the mail 81 | 82 | 83 | ## Time to do the homework: 84 | 85 | One week from the moment the mail is sent to students 86 | 87 | ## References to use: 88 | 89 | * [SYNTAX-DIRECTED-TRANSLATION](http://pages.cs.wisc.edu/~fischer/cs536.s06/course.hold/html/NOTES/4.SYNTAX-DIRECTED-TRANSLATION.html) 90 | 91 | -------------------------------------------------------------------------------- /labs/04/README.md: -------------------------------------------------------------------------------- 1 | # Lab 04 instructions 2 | 3 | ## Objective 4 | 5 | Make the student understand the power of lex language making a C code that 6 | performs the lexical analysis of the ac src program 7 | 8 | # Requirements 9 | 10 | * Linux machine, either a VM or a bare metal host 11 | * GCC compiler (at least version 4.8) 12 | * lex compiler 13 | * Autotools 14 | * git send mail server installed and configured on your Linux machine 15 | 16 | ## Instructions 17 | 18 | Please generate a LEX code to parse the previous example of lab 03. 
19 | 20 | A valid ac program could be: 21 | 22 | ``` 23 | // basic code 24 | 25 | //float b 26 | f b 27 | 28 | // integer a 29 | i a 30 | 31 | // a = 5 32 | a = 5 33 | 34 | // b = a + 3.2 35 | b = a + 3.2 36 | 37 | //print 8.2 38 | p b 39 | ``` 40 | 41 | Your output should be 42 | 43 | ``` 44 | COMMENT 45 | COMMENT 46 | floatdcl id 47 | COMMENT 48 | intdcl id 49 | COMMENT 50 | id assign inum 51 | COMMENT 52 | id assign id plus fnum 53 | COMMENT 54 | print id 55 | ``` 56 | 57 | ## Expected result: 58 | 59 | * Code a lex_analaizer.l that fulfills the requirements 60 | * Generate random AC code with: 61 | 62 | ``` 63 | python3 code_generator.py > example.ac 64 | 65 | ``` 66 | 67 | * Compile your code with the Makefile and execute as follows: 68 | 69 | ``` 70 | ./lex_analaizer example.ac 71 | ``` 72 | 73 | 74 | ## Please send the mail as PR: 75 | 76 | ``` 77 | $ git add lex_analaizer.l 78 | $ git commit -s -m -homework-04 79 | ``` 80 | Do some tests sending the mail to your personal account; if you get the mail, 81 | then you can be sure I will get the mail 82 | 83 | ## Good links for Hints 84 | 85 | * [lextutorial](https://ds9a.nl/lex-yacc/cvs/lex-yacc-howto.html) 86 | * [lex & yacc Second 87 | Edition](https://www.amazon.com/lex-yacc-Doug-Brown/dp/1565920007) 88 | At the end of chapter 1 there is code very similar to what is requested in 89 | this homework; you just need to read chapter 1 of this book :) 90 | * [useoflexinc](https://www.quora.com/What-is-the-function-of-yylex-yyin-yyout-and-fclose-yyout-in-LEX) 91 | 92 | ## Time to do the homework: 93 | 94 | One week from the moment the mail is sent to students 95 | 96 | -------------------------------------------------------------------------------- /labs/01/README.md: -------------------------------------------------------------------------------- 1 | # Lab 01 instructions 2 | 3 | ## Objective 4 | 5 | Get students familiar with the compiler tools and some basic software 6 | developer skills, such as: 7 |
8 | * How to compile from the command line 9 | * How to read the object file in the command line 10 | * How to automate the analysis of ASM code 11 | 12 | # Requirements 13 | 14 | * Linux machine, either a VM or a baremetal host 15 | * GCC compiler (at least version 4.8) 16 | * Autotools 17 | * shell scripting 18 | * git send mail server installed and configured on your Linux machine 19 | 20 | ## Instructions 21 | 22 | * Inside this directory read carefully and understand what's inside the Makefile 23 | file 24 | * Compile using the Autotools make: 25 | 26 | ``` 27 | $ make 28 | ``` 29 | * Execute the binary and check that the binary actually produces the expected 30 | result 31 | * Analyze the object code with: 32 | 33 | ``` 34 | objdump -d ./simple_foo | less 35 | ``` 36 | * Make a script (bash or python, is free for you to decide ) called 37 | asm-analytics.sh, this script will have the next requirements 38 | 39 | * Count how many different instructions you have 40 | * Count how many times each instruction is used 41 | * Count how many functions the binary has 42 | * Print the virtual address of each function 43 | 44 | ## Expected result: 45 | 46 | ``` 47 | $ objdump -d ./simple_foo > log 48 | $ ./asm-analytics.sh log 49 | Hi, this is the output of the analysis: 50 | You have 7 kind of instructions in this object file: 51 | movq : Executed 7 times 52 | movss : Executed 3 times 53 | addss : Executed 2 times 54 | You have 2 functions: 55 | main : Located at 100000ef0 addr 56 | foo : Located at 100000ef0 addr 57 | ``` 58 | ## Please send the mail as git send mail: 59 | 60 | ``` 61 | $ git add asm-analytics.sh 62 | $ git commit -s -m -homework-01 63 | $ git send-email -1 64 | 65 | ``` 66 | Do some tests sending the mail to your personal account, if you get the mail, 67 | then you can be sure I will get the mail 68 | 69 | # Time to do the homework: 70 | 71 | one week from the moment the mail is sent to students 72 | 73 | 
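
The four counting requirements for asm-analytics.sh in Lab 01 can be prototyped with a short script. This is a hedged Python sketch rather than a model answer. It assumes the usual GNU `objdump -d` text layout: function headers of the form `address <name>:` and tab-separated instruction lines. The names `analyze` and `FUNC_RE` are made up for illustration.

```python
import re
import sys
from collections import Counter

# Assumed function-header shape in `objdump -d` output: "0000000000001129 <foo>:"
FUNC_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:$")


def analyze(listing):
    """Return (mnemonic counts, {function name: address}) for a disassembly."""
    counts = Counter()
    functions = {}
    for line in listing.splitlines():
        header = FUNC_RE.match(line.strip())
        if header:
            functions[header.group(2)] = header.group(1)
            continue
        # Assumed instruction layout: "  addr:<TAB>raw bytes<TAB>mnemonic operands".
        # Continuation lines that only carry extra bytes have fewer fields.
        fields = line.split("\t")
        if len(fields) >= 3 and fields[0].strip().endswith(":"):
            words = fields[2].split()
            if words:
                # Note: prefixes such as "rep" or "lock" are counted as the
                # mnemonic here; a fuller script may want to skip them.
                counts[words[0]] += 1
    return counts, functions


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as log:
        counts, functions = analyze(log.read())
    print("Hi, this is the output of the analysis:")
    print(f"You have {len(counts)} kind of instructions in this object file:")
    for mnemonic, times in counts.most_common():
        print(f"  {mnemonic} : Used {times} times")
    print(f"You have {len(functions)} functions:")
    for name, address in functions.items():
        print(f"  {name} : Located at {address} addr")
```

Saved under a hypothetical name such as `asm_analytics.py`, it would be run as `python3 asm_analytics.py log` on the file produced by `objdump -d ./simple_foo > log`.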
-------------------------------------------------------------------------------- /final_project/emojis_compiler/README.md: -------------------------------------------------------------------------------- 1 | # Emoji to Natural language compiler 2 | 3 | 4 | ## Introduction 5 | 6 | The days of English as a global language may be coming to an end—it might be 7 | replaced in the near future by icons of smiley faces, cats and hearts. While 8 | more than 1.5 billion people speak English, 3.2 billion use the Internet, three 9 | quarters of them through smartphones equipped with emojis. 10 | 11 | More than 90% of social networking users communicate through these symbols and 12 | more than 6 billion emojis are exchanged every day. This kind of message has 13 | become so common that the Oxford Dictionary named the “Face with Tears of Joy” 14 | emoji the Word of the Year 2015. 15 | 16 | Because of this the importance of a language compiler from Emoji langue to 17 | natural language might be necesary in the incoming world. 18 | 19 | ## Goal 20 | 21 | * Do a research of the [state of the art] for natural language processing of emoji 22 | * Create basic gramar for simple known sentences 23 | * Design lexical/syntax/semantic analyser 24 | * Implement in Lex and YACCT. 25 | 26 | ## Design and expected input/output 27 | 28 | Example of input: 29 | 30 | ``` 31 | :wave: how are :arrow_right::bust_in_silhouette:? 32 | ``` 33 | 34 | Should be translated to: 35 | ``` 36 | hi how are you? 37 | ``` 38 | or 39 | 40 | This is 🦋 🆒! I ❌️ ⏳️ 👀 what this thing 🥫 do! 41 | 42 | Should be translated to: 43 | ``` 44 | This is pretty cool! I can't wait to see what this thing can do! 45 | ``` 46 | 47 | Take as help the page: 48 | 49 | https://emojitranslate.com/ 50 | 51 | Other examples of inputs could be: 52 | 53 | * Hmm this is a rather interesting little app. 54 | * It's really sad that not enough puppies are adopted every year. 55 | * This thing totally sucks! Who's idea was this anyways?! 
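
Before writing the Lex and YACC grammar, it can help to prototype the token inventory as a lookup table. The sketch below is only an illustration in Python, not the required Lex/YACC implementation. The `PHRASES` table, the `SHORTCODE` pattern, and the function names are hypothetical, and the table holds only the two mappings implied by the first example above.

```python
import re

# Hypothetical phrase table: a tuple of consecutive shortcodes -> English text.
PHRASES = {
    (":wave:",): "hi",
    (":arrow_right:", ":bust_in_silhouette:"): "you",
}

# Assumed shortcode shape: lowercase letters, digits, "_", "+" or "-" between colons.
SHORTCODE = re.compile(r":[a-z0-9_+-]+:")


def tokenize(text):
    """Split text into shortcode tokens and the plain-text chunks between them."""
    tokens, pos = [], 0
    for match in SHORTCODE.finditer(text):
        if match.start() > pos:
            tokens.append(text[pos:match.start()])
        tokens.append(match.group())
        pos = match.end()
    if pos < len(text):
        tokens.append(text[pos:])
    return tokens


def translate(text):
    """Replace known shortcode sequences, trying the longest sequence first."""
    tokens = tokenize(text)
    longest = max(len(key) for key in PHRASES)
    out, i = [], 0
    while i < len(tokens):
        for n in range(longest, 0, -1):
            key = tuple(tokens[i:i + n])
            if key in PHRASES:
                out.append(PHRASES[key])
                i += n
                break
        else:
            # Unknown token: pass it through unchanged.
            out.append(tokens[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(translate(":wave: how are :arrow_right::bust_in_silhouette:?"))
```

The multi-token entry is the part a grammar would handle: in YACC it would become a rule that reduces two adjacent emoji tokens to a single pronoun.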
56 | 57 | ## Report and presentation 58 | 59 | Presentation should be done in front of the team with a writen report made in 60 | latex using this 61 | [template](https://github.com/VictorRodriguez/operating-systems-lecture/blob/master/projects/report.tex) 62 | 63 | Teams have to deliver: 64 | 65 | * Printed report 66 | * Printed LateX code 67 | * Send presentation to profesor 68 | 69 | Resources: 70 | * https://www.bbc.com/future/article/20151012-will-emoji-become-a-new-language 71 | 72 | 73 | [state of the art](https://www.bbc.com/future/article/20151012-will-emoji-become-a-new-language) 74 | -------------------------------------------------------------------------------- /labs/08/README.md: -------------------------------------------------------------------------------- 1 | # Lab 08 instructions 2 | 3 | ## Objective 4 | 5 | Understand the optimizations generated from the compiler toolchain 6 | 7 | ## Requirements 8 | 9 | * Linux machine, either a VM or a bare metal host 10 | * GCC compiler 11 | * git send mail server installed and configured on your Linux machine 12 | 13 | ## Instructions 14 | Most C compilers (including the GCC compilers) allow a user to examine the 15 | machine instructions generated for a given source program. Run the following 16 | program through such a C compiler and examine the instructions generated for 17 | the for loop. Next, recompile the program, enabling 18 | optimization[1](https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html) 19 | [3](https://blog.linuxplumbersconf.org/2016/ocw/system/presentations/3795/original/GCC.pdf) 20 | , and reexamine the instructions generated for the for loop. What improvements 21 | have been made? 22 | 23 | Assuming that the program spends all of its time in the for loop, estimate the 24 | speedup obtained. Write a suitable main C function that allocates and 25 | initializes a million-element array to pass to proc. 
Execute and time the 26 | unoptimized and optimized versions of the program and evaluate the accuracy 27 | of your estimate: 28 | 29 | 30 | ```C 31 | int proc(int a[]) { 32 | int sum = 0, i; 33 | for (i=0; i < 1000000; i++) 34 | sum += a[i]; 35 | return sum; 36 | } 37 | ``` 38 | 39 | ## How teacher is going to review: 40 | 41 | * Make a report following the next [overleaf template](https://github.com/VictorRodriguez/operating-systems-lecture/blob/master/projects/report.tex) 42 | 43 | * Sections that are necesary: 44 | * Title 45 | * Abstract 46 | * Introduction 47 | * Objective 48 | * Development 49 | * Results 50 | * Conclusion 51 | * References 52 | * Upload the template and put a link on the body of yoru commit 53 | * Send the patch of a filecreated on this directory 54 | * Patch must include: 55 | * optimizations.c 56 | * Makefile 57 | 58 | More background about compiler optimizations at [2](https://www.youtube.com/watch?v=jVYnT_onb70) 59 | 60 | 61 | ## Please send the mail as git send mail: 62 | 63 | ``` 64 | $ git add optimizations.c Makefile 65 | $ git commit -s -m -homework-06 66 | $ git send-email -1 67 | 68 | ``` 69 | Do some tests sending the mail to your personal 70 | account, if you get the mail, then you can be sure I 71 | will get the mail 72 | 73 | 74 | ## Time to do the homework: 75 | 76 | One week from the moment the mail is sent to students 77 | 78 | 79 | -------------------------------------------------------------------------------- /labs/03/code_generator.py: -------------------------------------------------------------------------------- 1 | import string 2 | import random 3 | import argparse 4 | 5 | def id_generator(size=10, chars=string.ascii_uppercase + string.digits): 6 | return ''.join(random.choice(chars) for _ in range(size)) 7 | 8 | opreators = ["+","-","*","/"] 9 | 10 | def get_comment_line(): 11 | comment_line = "//%s" % (id_generator()) 12 | return comment_line 13 | 14 | def get_float_line(): 15 | float_line = "f %s" % 
(random.choice(string.ascii_lowercase)) 16 | return float_line 17 | 18 | def get_integer_line(): 19 | integer_line = "i %s"% (random.choice(string.ascii_lowercase)) 20 | return integer_line 21 | 22 | def get_asigment_line(): 23 | asigment_line = "%s = %s" %(random.choice(string.ascii_lowercase),\ 24 | random.randint(0,100)) 25 | return asigment_line 26 | 27 | def get_asigment_line_2(): 28 | asigment_line_2 = "%s = %s %s %s" % \ 29 | (random.choice(string.ascii_lowercase),\ 30 | random.choice(string.ascii_lowercase),\ 31 | random.choice(opreators),\ 32 | random.randint(0,100)) 33 | return asigment_line_2 34 | 35 | def get_print_line(): 36 | print_line = "p %s" % (random.choice(string.ascii_lowercase)) 37 | return print_line 38 | 39 | 40 | parser = argparse.ArgumentParser(description='Generate random AC code') 41 | parser.add_argument('--stress', dest='stress', action='store_true',\ 42 | help='generate HUGE code to stress the lab') 43 | args = parser.parse_args() 44 | 45 | if args.stress: 46 | f= open("random_code.ac","w+") 47 | for x in range(0, 100000): 48 | comment_line = get_comment_line() 49 | float_line = get_float_line() 50 | integer_line = get_integer_line() 51 | asigment_line = get_asigment_line() 52 | asigment_line_2 = get_asigment_line_2() 53 | print_line = get_print_line() 54 | 55 | f.write(comment_line + "\n") 56 | f.write(float_line + "\n") 57 | f.write(integer_line + "\n") 58 | f.write(asigment_line+ "\n") 59 | f.write(asigment_line_2 + "\n") 60 | f.write(print_line + "\n") 61 | 62 | f.close() 63 | 64 | else: 65 | comment_line = get_comment_line() 66 | float_line = get_float_line() 67 | integer_line = get_integer_line() 68 | asigment_line = get_asigment_line() 69 | asigment_line_2 = get_asigment_line_2() 70 | print_line = get_print_line() 71 | 72 | print(comment_line) 73 | print(float_line) 74 | print(integer_line) 75 | print(asigment_line) 76 | print(asigment_line_2) 77 | print(print_line) 78 | 79 | 
-------------------------------------------------------------------------------- /labs/06/README.md: -------------------------------------------------------------------------------- 1 | # Lab 06 instructions 2 | 3 | ## Objective 4 | 5 | Have the student create a CFG for the previously created calculator 6 | 7 | ## Requirements 8 | 9 | * Linux machine, either a VM or a bare metal host 10 | * GCC compiler 11 | * git send-email installed and configured on your Linux machine 12 | 13 | ## Instructions 14 | Taking as input the following code: 15 | 16 | ``` 17 | $ cat FILE 18 | 19 | // basic code 20 | 21 | //float b 22 | f b 23 | 24 | // integer a 25 | i a 26 | 27 | // a = 5 28 | a = 5 29 | 30 | // b = a + 3.2 31 | b = a + 3.2 32 | 33 | //print 8.2 34 | p b 35 | ``` 36 | Reuse the code created in lab 04 to generate: 37 | 38 | ``` 39 | $ cat tokens.out 40 | 41 | floatdcl id 42 | intdcl id 43 | id assign inum 44 | id assign id plus fnum 45 | print id 46 | ``` 47 | IMPORTANT: the calculator should accept + - * / as operators 48 | 49 | 50 | Reuse your code from lab 05 to: 51 | 52 | * Detect nonterminals that cannot be reached from a CFG’s goal symbol. 53 | * Detect nonterminals that cannot derive any terminal string in a CFG.
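
Both checks are fixed-point computations over the productions. The block below is only an illustrative sketch in C, not a replacement for the lab 05 code: the `struct production` layout, the function names, and the convention that uppercase letters are nonterminals are all assumptions made for this example.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* One production: lhs -> rhs. In this sketch (an assumption, not the
 * lab format) uppercase letters are nonterminals and every other
 * character is a terminal. An empty rhs stands for epsilon. */
struct production {
    char lhs;
    const char *rhs;
};

/* Sets productive[X - 'A'] for every nonterminal X that can derive a
 * string made only of terminals. Repeats until nothing changes. */
void find_productive(const struct production *p, size_t n, bool productive[26])
{
    bool changed = true;

    for (int i = 0; i < 26; i++)
        productive[i] = false;

    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            bool ok = true;   /* true while every rhs symbol is usable */
            for (const char *s = p[i].rhs; *s != '\0'; s++)
                if (isupper((unsigned char)*s) && !productive[*s - 'A'])
                    ok = false;
            if (ok && !productive[p[i].lhs - 'A']) {
                productive[p[i].lhs - 'A'] = true;
                changed = true;
            }
        }
    }
}

/* Sets reachable[X - 'A'] for every nonterminal X that appears in some
 * derivation starting from the goal symbol. */
void find_reachable(const struct production *p, size_t n, char goal,
                    bool reachable[26])
{
    bool changed = true;

    for (int i = 0; i < 26; i++)
        reachable[i] = false;
    reachable[goal - 'A'] = true;

    while (changed) {
        changed = false;
        for (size_t i = 0; i < n; i++) {
            if (!reachable[p[i].lhs - 'A'])
                continue;     /* this rule cannot be used yet */
            for (const char *s = p[i].rhs; *s != '\0'; s++)
                if (isupper((unsigned char)*s) && !reachable[*s - 'A']) {
                    reachable[*s - 'A'] = true;
                    changed = true;
                }
        }
    }
}
```

For the grammar `S -> aA | b`, `A -> Aa`, `B -> c`, this sketch would flag `A` as unable to derive a terminal string and `B` as unreachable from `S`.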
54 | 55 | (reuse the same code, no need to rewrite it in C, just call it as a process) 56 | 57 | Once we know that the grammar proposed for the calculator has none of these 58 | errors, please write a C program for the Recursive Descent Parsing algorithm, following the example from: 59 | 60 | [Recursive Descent Parsing](https://www.youtube.com/watch?v=nv9J5Jb7IxM) 61 | 62 | 63 | ## How could it be tested: 64 | ``` 65 | 66 | make ( compile everything ) 67 | 68 | ./lexic_analyzer 69 | 70 | ``` 71 | This will generate the tokens.out 72 | 73 | Then use it to generate the parse tree: 74 | 75 | ``` 76 | ./syntax-calc tokens.out 77 | ``` 78 | And generate a CFG derivation tree in this format: 79 | 80 | ``` 81 | digraph D { 82 | 83 | A -> {B, C, D} -> {F} 84 | 85 | } 86 | 87 | ``` 88 | 89 | To view this tree, paste this code into: 90 | 91 | https://dreampuf.github.io/GraphvizOnline 92 | 93 | More info about DOT code: 94 | 95 | https://renenyffenegger.ch/notes/tools/Graphviz/examples/index 96 | 97 | 98 | 99 | ## Please send the mail as git send mail: 100 | 101 | ``` 102 | $ git add syntax-calc.c Makefile 103 | $ git commit -s -m -homework-06 104 | $ git send-email -1 105 | 106 | ``` 107 | Do some tests sending the mail to your personal 108 | account, if you get the mail, then you can be sure I 109 | will get the mail 110 | 111 | 112 | ## Time to do the homework: 113 | 114 | One week from the moment the mail is sent to students 115 | 116 | -------------------------------------------------------------------------------- /labs/04/calculator_v1/calc.y: -------------------------------------------------------------------------------- 1 | %{ 2 | #include <stdio.h> 3 | int regs[26]; 4 | int base; 5 | %} 6 | %start list 7 | %token DIGIT LETTER IDCL PRINT 8 | %left '|' 9 | %left '&' 10 | %left '+' '-' 11 | %left '*' '/' '%' 12 | %left UMINUS /*supplies precedence for unary minus */ 13 | %% /* beginning of rules section */ 14 | list: /*empty */ 15 | | 16 | list stat '\n' 17 | | 18 | list error '\n' 19 | { 20 | 
yyerrok; 21 | } 22 | ; 23 | stat: expr 24 | { 25 | printf("%d\n",$1); 26 | } 27 | | 28 | LETTER '=' expr 29 | { 30 | regs[$1] = $3; 31 | } 32 | | 33 | IDCL LETTER 34 | { 35 | regs[$2] = 0; 36 | } 37 | | 38 | PRINT LETTER 39 | { 40 | printf("%d\n",regs[$2]); 41 | } 42 | ; 43 | expr: '(' expr ')' 44 | { 45 | $$ = $2; 46 | } 47 | | 48 | expr '*' expr 49 | { 50 | $$ = $1 * $3; 51 | } 52 | | 53 | expr '/' expr 54 | { 55 | if ($3 == 0){ 56 | $$ = 0; 57 | yyerror("Division by 0"); 58 | }else 59 | $$ = $1 / $3; 60 | } 61 | | 62 | expr '%' expr 63 | { 64 | $$ = $1 % $3; 65 | } 66 | | 67 | expr '+' expr 68 | { 69 | $$ = $1 + $3; 70 | } 71 | | 72 | expr '-' expr 73 | { 74 | $$ = $1 - $3; 75 | } 76 | | 77 | expr '&' expr 78 | { 79 | $$ = $1 & $3; 80 | } 81 | | 82 | expr '|' expr 83 | { 84 | $$ = $1 | $3; 85 | } 86 | | 87 | '-' expr %prec UMINUS 88 | { 89 | $$ = -$2; 90 | } 91 | | 92 | LETTER 93 | { 94 | $$ = regs[$1]; 95 | } 96 | | 97 | number 98 | ; 99 | number: DIGIT 100 | { 101 | $$ = $1; 102 | base = ($1==0) ? 8 : 10; 103 | } | 104 | number DIGIT 105 | { 106 | $$ = base * $1 + $2; 107 | } 108 | ; 109 | %% 110 | main() 111 | { 112 | return(yyparse()); 113 | } 114 | yyerror(s) 115 | char *s; 116 | { 117 | fprintf(stderr, "%s\n",s); 118 | } 119 | yywrap() 120 | { 121 | return(1); 122 | } 123 | -------------------------------------------------------------------------------- /labs/02/README.md: -------------------------------------------------------------------------------- 1 | # Lab 02 instructions 2 | 3 | ## Objective 4 | 5 | 6 | The study of compilers is mainly a study of how we design the right 7 | mathematical models and choose the right algorithms while balancing the need 8 | for generality and power against simplicity and efficiency. Some of the most 9 | fundamental models are finite-state machines and regular expressions. This 10 | homework/lab makes the student get familiar with the first level of the 11 | compiler which is the lexical analysis. 
12 | 13 | The goal is to write a program to check a C program for rudimentary syntax 14 | errors like unbalanced parentheses, brackets, and braces. Don't forget about 15 | quotes, both single and double, escape sequences and comments. (This program 16 | is hard if you do it in full generality.) 17 | 18 | 19 | ## Instructions 20 | 21 | * Create a file named: 22 | ```my_compiler.c``` 23 | 24 | * Write a C program that, given a C source file, detects these errors: 25 | * unbalanced parentheses 26 | * unbalanced brackets 27 | * unbalanced braces 28 | * unterminated quotes (both single and double) 29 | * unterminated comments 30 | * Make a Makefile to build and clean (take as an example the one from lab 01) 31 | 32 | 33 | ## Expected result: 34 | 35 | Given a sample C code such as hello.c: 36 | 37 | ``` 38 | #include <stdio.h> 39 | 40 | int main(){ 41 | for (int x = 0; x < 10;x++){ 42 | printf("hi\n"); 43 | } 44 | return 0; 45 | } 46 | 47 | ``` 48 | 49 | Run as: 50 | 51 | ``` 52 | ./my_compiler hello.c 53 | There are no errors 54 | ``` 55 | 56 | Or modify it to insert some errors and comments, like hello-errors.c: 57 | 58 | ``` 59 | #include <stdio.h> 60 | // comment (this should not be detected 61 | int main(){ 62 | for (int x = 0; x < 10;x++) 63 | printf("hi\n); 64 | } 65 | return 0; 66 | } 67 | 68 | ``` 69 | 70 | ``` 71 | ./my_compiler hello-errors.c 72 | There is a missing { (optional : error in this line: 73 | There is a missing " (optional : error in this line: 74 | ``` 75 | 76 | If your code can detect these errors it is more than fine.
77 | 78 | 79 | ## Please send the mail as git send mail: 80 | 81 | ``` 82 | $ git add ./my_compiler.c Makefile 83 | $ git commit -s -m -homework-02 84 | $ git send-email -1 85 | ``` 86 | Do some tests sending the mail to your personal account, if you get the mail, 87 | then you can be sure I will get the mail 88 | 89 | ## Time to do the homework: 90 | 91 | 1 week from the moment the mail is sent to students 92 | 93 | ## How the teacher is going to test your homework 94 | 95 | * Copy your patch from the mail to test.patch 96 | * patch -p1 < test.patch 97 | * make 98 | * ./my_compiler hello.c ( should print no errors ) 99 | * ./my_compiler hello-errors.c ( should print the errors ) 100 | 101 | Test this flow yourself; you can generate your patch with: 102 | 103 | ``` 104 | git format-patch -1 105 | ``` 106 | The generated file should be like: 107 | ``` 108 | 0001--homework-02.patch 109 | ``` 110 | 111 | -------------------------------------------------------------------------------- /labs/05/bottom-up/shift-reduce.c: -------------------------------------------------------------------------------- 1 | //Including Libraries 2 | #include <stdio.h> 3 | #include <stdlib.h> 4 | #include <string.h> 5 | 6 | //Global Variables 7 | int z = 0, i = 0, j = 0, c = 0; 8 | 9 | // Modify array size to increase 10 | // length of string to be parsed 11 | char a[16], ac[20], stk[15], act[10]; 12 | 13 | // This function checks whether 14 | // the stack contains a production rule 15 | // that can be reduced. 
16 | // Rules can be E->2E2 , E->3E3 , E->4 17 | void check() 18 | { 19 | // Copying string to be printed as action 20 | strcpy(ac,"REDUCE TO E -> "); 21 | 22 | // c=length of input string 23 | for(z = 0; z < c; z++) 24 | { 25 | //checking for production rule E->4 26 | if(stk[z] == '4') 27 | { 28 | printf("%s4", ac); 29 | stk[z] = 'E'; 30 | stk[z + 1] = '\0'; 31 | 32 | //printing action 33 | printf("\n$%s\t%s$\t", stk, a); 34 | } 35 | } 36 | 37 | for(z = 0; z < c - 2; z++) 38 | { 39 | //checking for E->2E2 40 | if(stk[z] == '2' && stk[z + 1] == 'E' && 41 | stk[z + 2] == '2') 42 | { 43 | printf("%s2E2", ac); 44 | stk[z] = 'E'; 45 | stk[z + 1] = '\0'; 46 | stk[z + 2] = '\0'; 47 | printf("\n$%s\t%s$\t", stk, a); 48 | i = i - 2; 49 | } 50 | 51 | } 52 | 53 | for(z = 0; z < c - 2; z++) 54 | { 55 | //checking for E->3E3 56 | if(stk[z] == '3' && stk[z + 1] == 'E' && 57 | stk[z + 2] == '3') 58 | { 59 | printf("%s3E3", ac); 60 | stk[z]='E'; 61 | stk[z + 1]='\0'; 62 | stk[z + 2]='\0'; 63 | printf("\n$%s\t%s$\t", stk, a); 64 | i = i - 2; 65 | } 66 | } 67 | return ; //return to main 68 | } 69 | 70 | //Driver Function 71 | int main() 72 | { 73 | printf("GRAMMAR is -\nE->2E2 \nE->3E3 \nE->4\n"); 74 | 75 | // a is input string 76 | strcpy(a,"32423"); 77 | 78 | // c holds the length of a, as returned by strlen(a) 79 | c=strlen(a); 80 | 81 | // "SHIFT" is copied to act to be printed 82 | strcpy(act,"SHIFT"); 83 | 84 | // This will print the labels (column names) 85 | printf("\nstack \t input \t action"); 86 | 87 | // This will print the initial 88 | // values of stack and input 89 | printf("\n$\t%s$\t", a); 90 | 91 | // This will run up to the length of the input string 92 | for(i = 0; j < c; i++, j++) 93 | { 94 | // Printing action 95 | printf("%s", act); 96 | 97 | // Pushing into stack 98 | stk[i] = a[j]; 99 | stk[i + 1] = '\0'; 100 | 101 | // Moving the pointer 102 | a[j]=' '; 103 | 104 | // Printing action 105 | printf("\n$%s\t%s$\t", stk, a); 106 | 107 | // Call check function ..which will 108 | // check the stack whether its contain 
109 | // any production or not 110 | check(); 111 | } 112 | 113 | // Rechecking last time if contain 114 | // any valid production then it will 115 | // replace otherwise invalid 116 | check(); 117 | 118 | // if top of the stack is E(starting symbol) 119 | // then it will accept the input 120 | if(stk[0] == 'E' && stk[1] == '\0') 121 | printf("Accept\n"); 122 | else //else reject 123 | printf("Reject\n"); 124 | } 125 | // This code is contributed by Ritesh Aggarwal 126 | 127 | -------------------------------------------------------------------------------- /labs/04/calculator_v2_linux_zoo/README.md: -------------------------------------------------------------------------------- 1 | # Lex and Yacc: Simple Calculator Example 2 | 3 | ## Introduction 4 | Lex and Yacc are powerful tools used to generate lexical analyzers (scanners) and parsers. In this guide, we will create a simple calculator using Lex (Flex) and Yacc (Bison) that can evaluate expressions with addition, subtraction, multiplication, and variable assignment. 5 | 6 | ## Prerequisites 7 | Ensure you have `flex` and `bison` installed: 8 | 9 | ### On Linux/macOS: 10 | ```sh 11 | sudo apt install flex bison # Ubuntu/Debian 12 | sudo pacman -S flex bison # Arch Linux 13 | brew install flex bison # macOS (Homebrew) 14 | ``` 15 | 16 | ### On Windows: 17 | Use MinGW or Cygwin to install `flex` and `bison`. 18 | 19 | ## Understanding Lex and Yacc Internal Functions 20 | Lex and Yacc use several internal functions that play key roles in lexical analysis and parsing: 21 | 22 | - **`yyparse()`**: This function is automatically generated by Yacc (Bison). It drives the parsing process by calling `yylex()` to fetch tokens and applying grammar rules. 23 | - **`yylex()`**: This function is generated by Lex (Flex). It scans the input text and returns tokens to the parser. It is responsible for pattern matching based on the rules defined in the `.l` file. 
24 | - **`yyerror(const char *s)`**: This function is called by `yyparse()` whenever a syntax error occurs. It prints an error message, which helps when debugging incorrect input. 25 | 26 | ## Writing the Lex Scanner 27 | Create a file named `scanner.l` with the following content: 28 | 29 | ```c 30 | %{ 31 | #include "parser.tab.h" 32 | #include <stdlib.h> 33 | extern int yylval; 34 | %} 35 | 36 | %option noyywrap 37 | 38 | %% 39 | [0-9]+ { yylval = atoi(yytext); return NUMBER; } 40 | [a-z] { yylval = yytext[0]; return NAME; } 41 | [ \t] ; /* Ignore whitespace */ 42 | \n return 0; /* Logical EOF */ 43 | . return yytext[0]; 44 | %% 45 | ``` 46 | 47 | ## Writing the Yacc Parser 48 | Create a file named `parser.y`: 49 | 50 | ```c 51 | %{ 52 | #include <stdio.h> 53 | int yylex(); 54 | void yyerror (char const *s) { 55 | fprintf (stderr, "%s\n", s); 56 | } 57 | %} 58 | 59 | %token NAME NUMBER 60 | 61 | %% 62 | 63 | statement: NAME '=' expression { printf("%c = %d\n", $1, $3); } 64 | | expression { printf("= %d\n", $1); } 65 | ; 66 | 67 | expression: expression '+' NUMBER { $$ = $1 + $3; } 68 | | expression '-' NUMBER { $$ = $1 - $3; } 69 | | expression '*' NUMBER { $$ = $1 * $3; } 70 | | '(' expression ')' { $$ = $2; } 71 | | NUMBER { $$ = $1; } 72 | ; 73 | 74 | %% 75 | 76 | int main() { 77 | yyparse(); 78 | return 0; 79 | } 80 | ``` 81 | 82 | ## Compiling and Running the Calculator 83 | Run the following commands (`bison -d parser.y` writes `parser.tab.c` and the `parser.tab.h` header included by the scanner): 84 | 85 | ```sh 86 | flex scanner.l 87 | bison -d parser.y 88 | gcc lex.yy.c parser.tab.c -o calculator -lm 89 | ./calculator 90 | ``` 91 | 92 | ### Example Input/Output (one statement per run, since the scanner treats a newline as end of input): 93 | ``` 94 | x = 3 + 5 95 | x = 8 96 | 5 * 2 97 | = 10 98 | ``` 99 | 100 | ## Extending the Calculator 101 | - Add division (`/` operator) 102 | - Implement variable storage 103 | - Support floating-point numbers 104 | 105 | This guide provides a basic starting point for learning Lex and Yacc. Experiment and extend it to build more complex parsers!
106 | 107 | -------------------------------------------------------------------------------- /final_project/smart_compiler/README.md: -------------------------------------------------------------------------------- 1 | # Design of Compilers Final Project 2 | 3 | ## Design a compiler with a CFG autogenerated by a neural network 4 | 5 | ### Goal 6 | 7 | A compiler is software that converts a program written in a high level language 8 | (Source Language) to a low level language (Object/Target/Machine Language). The 9 | goal of this project is to generate a compiler using helper tools such as YACC 10 | and LEX but with the difference that the CFG used in the syntax analyser will 11 | not be 100% accurate. An AI system will be the one in charge of generating the CFG 12 | based on validated inputs. 13 | 14 | ### Design 15 | 16 | This is the design of a regular compiler: 17 | 18 | ``` 19 | 20 | source 21 | + 22 | +-----------+ | 23 | bas.y +-----> | | +----> y.tab.c + | 24 | | yacc | | v 25 | | | | 26 | +-----------+ +---> +---------+ +--------+ 27 | | gcc | +-->+compiler| 28 | +----> +---------+ +--+-----+ 29 | | | 30 | +-----------+ | | 31 | bas.l +-----> | | +----> lex.yy.c+ v 32 | | lex | compiled 33 | | | output 34 | +-----------+ 35 | 36 | ``` 37 | The change we are going to make is: 38 | ``` 39 | 40 | examples of 41 | valid/invalid 42 | source code 43 | + 44 | | 45 | | 46 | v 47 | +------+ +-------+ +--------+ +-----+---+ 48 | | CFG | +----->+ bas.y +----->+ yacc | +----->+compiler | 49 | +--+---+ +-------+ +--------+ +--+------+ 50 | ^ | 51 | | | 52 | | v 53 | | +--------+-----------+ 54 | | | Artificial | 55 | | | intelligence | 56 | +---------------------------------------+ system | 57 | +--------------------+ 58 | ``` 59 | 60 | ## Report and presentation 61 | 62 | Presentation should be done in front of the team with a written report made in 63 | LaTeX.
The template for it is in this 64 | [link](https://github.com/VictorRodriguez/operating-systems-lecture/blob/master/projects/report.tex) 65 | 66 | Teams have to deliver: 67 | 68 | * Printed report 69 | * Printed LateX code 70 | * Send presentation to profesor 71 | 72 | Resources: 73 | * https://www.epaperpress.com/lexandyacc/intro.html 74 | 75 | -------------------------------------------------------------------------------- /labs/07/README.md: -------------------------------------------------------------------------------- 1 | # Practice Exercise: Building a Simple Chatbot with `lex` and `yacc` 2 | 3 | #### Objective 4 | In this exercise, you will create a simple chatbot using `lex` (for lexical analysis) and `yacc` (for parsing). The chatbot will be able to respond to greetings, queries about the time, and farewells. This practice will help you understand how to use `lex` and `yacc` to build a basic interactive application. 5 | 6 | #### Prerequisites 7 | - Basic understanding of C programming. 8 | - Familiarity with `lex` and `yacc` tools. 9 | - Basic knowledge of lexical analysis and parsing. 10 | 11 | #### Instructions 12 | 13 | 1. **Setup Your Environment**: 14 | - Ensure you have `lex` (or `flex`) and `yacc` (or `bison`) installed on your system. 15 | - Create a working directory for this exercise. 16 | 17 | 2. **Create the Lex Specification**: 18 | - Create a file named `chatbot.l`. 19 | - Define patterns to match user inputs for greetings, farewells, and time queries. 20 | 21 | ```c 22 | %{ 23 | #include "y.tab.h" 24 | %} 25 | 26 | %% 27 | 28 | hello { return HELLO; } 29 | hi { return HELLO; } 30 | hey { return HELLO; } 31 | goodbye { return GOODBYE; } 32 | bye { return GOODBYE; } 33 | time { return TIME; } 34 | what[' ']is[' ']the[' ']time { return TIME; } 35 | what[' ']time[' ']is[' ']it { return TIME; } 36 | \n { return 0; } /* End of input on newline */ 37 | 38 | . 
{ return yytext[0]; } 39 | 40 | %% 41 | 42 | int yywrap() { 43 | return 1; 44 | } 45 | ``` 46 | 47 | 3. **Create the Yacc Specification**: 48 | - Create a file named `chatbot.y`. 49 | - Define grammar rules to handle different types of user inputs. 50 | 51 | ```c 52 | %{ 53 | #include <stdio.h> 54 | #include <time.h> 55 | 56 | void yyerror(const char *s); 57 | int yylex(void); 58 | %} 59 | 60 | %token HELLO GOODBYE TIME 61 | 62 | %% 63 | 64 | chatbot : greeting 65 | | farewell 66 | | query 67 | ; 68 | 69 | greeting : HELLO { printf("Chatbot: Hello! How can I help you today?\n"); } 70 | ; 71 | 72 | farewell : GOODBYE { printf("Chatbot: Goodbye! Have a great day!\n"); } 73 | ; 74 | 75 | query : TIME { 76 | time_t now = time(NULL); 77 | struct tm *local = localtime(&now); 78 | printf("Chatbot: The current time is %02d:%02d.\n", local->tm_hour, local->tm_min); 79 | } 80 | ; 81 | 82 | %% 83 | 84 | int main() { 85 | printf("Chatbot: Hi! You can greet me, ask for the time, or say goodbye.\n"); 86 | while (yyparse() == 0) { 87 | // Loop until end of input 88 | } 89 | return 0; 90 | } 91 | 92 | void yyerror(const char *s) { 93 | fprintf(stderr, "Chatbot: I didn't understand that.\n"); 94 | } 95 | ``` 96 | 97 | 4. **Compile the Lex and Yacc Files**: 98 | - Open a terminal in your working directory. 99 | - Run the following commands to compile the lex and yacc files: 100 | 101 | ```sh 102 | lex chatbot.l 103 | yacc -d chatbot.y 104 | cc lex.yy.c y.tab.c -o chatbot -ll -ly 105 | ``` 106 | 107 | 5. **Run the Chatbot**: 108 | - Execute the compiled chatbot program: 109 | 110 | ```sh 111 | ./chatbot 112 | ``` 113 | 114 | - Test the chatbot by typing various inputs: 115 | - Greetings like "hello", "hi", or "hey". 116 | - Time queries like "what is the time", "what time is it", or simply "time". 117 | - Farewells like "goodbye" or "bye". 118 | 119 | 6. **Extend the Chatbot**: 120 | - Add more patterns and responses to the chatbot.
Think of additional questions users might ask and how the chatbot should respond. For example: 121 | - Ask for the chatbot's name: "what is your name". 122 | - Inquire about the weather: "what is the weather". 123 | - Ask how the chatbot is doing: "how are you". 124 | 125 | 126 | #### Submission 127 | Create a pull request and submit the following files: 128 | - `chatbot.l` 129 | - `chatbot.y` 130 | 131 | Ensure your code is well-commented to explain your logic and any enhancements you made. 132 | 133 | #### Assessment 134 | You will be evaluated on: 135 | - Correctness of the lexical and grammar rules. 136 | - Functionality of the chatbot based on the provided and additional commands. 137 | - Code readability and comments. 138 | 139 | By completing this exercise, you will gain practical experience in using `lex` and `yacc` to build and extend a basic interactive application, laying the foundation for more complex projects in the future. 140 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # compilers-lecture 2 | 3 | This Git repository contains a list of workshops and labs for a compiler design 4 | lecture. The level of complexity is aimed at undergraduate computer science 5 | students at ITESM university 6 | 7 | The plan is designed to be delivered over a period of 4 to 6 months. It is 8 | adjustable in time and order; however the expectations and basic topics should 9 | keep the essence of the plan.
10 | 11 | ## Agenda: 12 | 13 | * [Introduction to compilers](https://drive.google.com/open?id=18-tj7JEHmfY9QH7tDWEB0FHgfZslgUd3FQmgZh4uMCQ) 14 | * Language Processors 15 | * [Evolution of programing languages](https://docs.google.com/presentation/d/1eyi0sNu1XZ498knSle2CwgrozegSqYfVOvCGOxuL5rc/edit?usp=sharing) 16 | * The Structure of a Compiler 17 | * Lexical Analysis 18 | * Syntax Analysis 19 | * Semantic Analysis 20 | * Intermediate Code Generation 21 | * Code Optimization 22 | * Code Generation 23 | * Symbol-Table Management 24 | * The Grouping of Phases into Passes 25 | * Compiler-Construction Tools 26 | * [Simple Syntax directed Translation](https://docs.google.com/presentation/d/1KE89YKtU4IDtK5locxnlciRHkWltg9VG_C42ORVR7WI/edit?usp=sharing) 27 | * Syntax definition 28 | * Definition of gramars 29 | * Parse Trees 30 | * Ambiguity 31 | * [Lexical Analyser and Regex](https://docs.google.com/presentation/d/1kpLPDliaGBJbckxPY2lRWv38FHG55jfVX6OWH-FSPeM/edit?usp=sharing) 32 | * Regex 33 | * DFA 34 | * LEX 35 | * YACC 36 | * [Grammars and Parsing](https://drive.google.com/open?id=1pUU1y9kDVrs9kkP_Zh1oC59G94Hi3FzSqehC9nmve0g) 37 | * Grammars in our lifes 38 | * Parse trees 39 | * Properties of CFGs 40 | * Reduced Gramars 41 | * Backus-Naur Form (BNF) notation 42 | * Parsers and Recognizers 43 | * [Top-Down Parsing](https://docs.google.com/presentation/d/1b9ecDphpIwD-gSvFawZQzXSg1U_HAel7CmucwWCAtI4/edit?usp=sharing) 44 | * LL(k) 45 | * Recursive-descent parsers 46 | * Eliminating Ambiguity 47 | * Elimination of left recursion 48 | * [Bottom-Up Parsing](https://drive.google.com/open?id=1WCBfCFD-7AuhNQYGEi1ZxJKsevSjsKCmL6kM0Uss5Mw) 49 | * LR(k) 50 | * Shift-reduce 51 | * Simple LR Parser algorithm 52 | * [Error Recovery](https://docs.google.com/presentation/d/1054xs2_vMLsILOO4l9YksCOTclGmcYHu94wOndmCaFA/edit?usp=sharing) 53 | * Panic mode 54 | * Statement mode 55 | * Error productions 56 | * Global correction 57 | * [Abstract Syntax 
Trees](https://docs.google.com/presentation/d/1NeO-SkZLcOQAyYevel_xV5zBk-CL5Pw7c7fvEz7XKmY/edit?usp=sharing) 58 | * Example of Abstract Syntax Tree 59 | * Semantic Actions 60 | * Semantic Actions dependencies 61 | * [Semantic Analysis](https://drive.google.com/open?id=1Tt-VbEa4nQLzoVRJnJ_sIgf4K-BWtsv74ygew1z13T8) 62 | * Scope resolution 63 | * Type checking 64 | * Array-bound checking 65 | * Semantic Errors 66 | * Attribute Grammar 67 | * Inherited attributes 68 | * [Symbol Table & Intermediate Code Generation](https://docs.google.com/presentation/d/1-EP0_CbRf-g9clIx7GcAgLCe-UEFv_WRpCmrfRoJXxY/edit?usp=sharing) 69 | * [Code optimization](https://drive.google.com/open?id=1sEwM70e70PzgdeEZ3g348ai_0xCntddm5W0nrFdxbrc) 70 | 71 | 72 | ## Extra topic Trainings ( in case you need them ) 73 | * Git training 74 | * https://drive.google.com/file/d/0B7iKrGdVkDhINERiQnppOU5IVVk/view?usp=sharing 75 | * Bash Trainings: 76 | * https://drive.google.com/file/d/0B7iKrGdVkDhILU9QRWllWmNKM2M/view?usp=sharing&resourcekey=0-kupyruZHY8ZyMM6sMHahPg 77 | * https://drive.google.com/file/d/0B7iKrGdVkDhIWGVhVzhtTlZjWGc/view?usp=sharing 78 | * https://drive.google.com/file/d/0B7iKrGdVkDhIRkVPSlNPdkdSS2c/view?usp=sharing 79 | * https://drive.google.com/file/d/0B7iKrGdVkDhIbkdKYWI1R19oMzQ/view?usp=sharing 80 | * GCC for performance Trainings: 81 | * https://drive.google.com/open?id=0B7iKrGdVkDhIUzZTVTduczJrQTg 82 | * GCC ROP attacks workshop: 83 | * https://github.com/VictorRodriguez/operating-systems-lecture/tree/master/labs/gcc/security 84 | * Makefile trainings: 85 | * [Tutorial 1](https://www.tutorialspoint.com/makefile/index.html) 86 | * [Tutorial 2](https://www.coursera.org/lecture/introduction-embedded-systems/6-make-18etg) 87 | * [Tutorial 3](https://github.com/lifeissweetgood/makefile-tutorial) 88 | * GDB tutorial: 89 | * [GDB 101](https://docs.google.com/presentation/d/1aaExMhw1xqWeX8uiUNKiuDD1ZWlIcm5v3wWCn1rlc-w/edit?usp=sharing) 90 | * CFG tool: 91 | * 
[Tool](https://web.stanford.edu/class/archive/cs/cs103/cs103.1156/tools/cfg/) 92 | ## Ponderation: 93 | 94 | * 35% First Term 95 | * 15% Weekly Quizzes 96 | * 5% Reading Summaries 97 | * 15% Labs 98 | 99 | * 35% Second Term 100 | * 15% Weekly Quizzes 101 | * 5% Reading Summaries 102 | * 15% Labs 103 | 104 | * 30% Final Exam ( if student decides to do final project it is = 10% and final 105 | exam 20 % ) 106 | 107 | ## Bibliography: 108 | 109 | ### Mian Book : 110 | * Crafting a Compiler, by Charles N. Fischer 111 | 112 | ### Other good books : 113 | 114 | * Introduction to computer theory Book by Daniel I. A. Cohen 115 | * Principles of Compiler Design Textbook by Alfred Aho and Jeffrey Ullman 116 | * Compilers: Principles, Techniques, and Tools; Alfred Aho, Jeffrey Ullman, Monica S. Lam, and Ravi Sethi 117 | * Languages and Machines: An Introduction to the Theory of Computer Science [3rd Edition](https://www.amazon.com/Languages-Machines-Introduction-Computer-Science/dp/0321322215) 118 | * Assembly Language Step-by-Step: Programming with Linux 3rd Edition by Jeff Duntemann 119 | -------------------------------------------------------------------------------- /labs/03/README.md: -------------------------------------------------------------------------------- 1 | # Lab 03 instructions 2 | 3 | ## Objective 4 | 5 | Make the student develop the next part of the compiler, the lexical analyser of 6 | the code that we will create : AC adding calculator 7 | 8 | # Requirements 9 | 10 | * Linux machine, either a VM or a baremetal host 11 | * GCC compiler (at least version 4.8) 12 | * Autotools 13 | * git send mail server installed and configured on your Linux machine 14 | 15 | ## Instructions 16 | 17 | Our language is called ac (for adding calculator). When compared with most 18 | programming languages, ac is relatively simple, yet it serves nicely as a study 19 | for examining the phases and data structures of a compiler. 
We first define ac 20 | informally: 21 | 22 | * Types: Most programming languages offer a significant number of predefined 23 | data types, with the ability to extend existing types or specify new data 24 | types. In ac, there are only two data types: integer and float. An integer 25 | type is a sequence of decimal numerals, as found in most programming 26 | languages. A float type allows five fractional digits after the decimal 27 | point. 28 | 29 | * Keywords Most programming languages have a number of reserved keywords, 30 | such as if and while, which would otherwise serve as variable names. In ac, 31 | there are three reserved keywords, each limited for simplicity to a single 32 | letter: f (declares a float variable), i (declares an integer variable), and 33 | p (prints the value of a variable). 34 | 35 | * Variables Some programming languages insist that a variable be declared by 36 | specifying the variable’s type prior to using the variable’s name. The ac 37 | language offers only 23 possible variable names, drawn from the lowercase 38 | Roman alphabet and excluding the three reserved keywords f, i, and p. 39 | Variables must be declared prior to using them. 40 | 41 | Most programming languages have rules that dictate circumstances under which a 42 | given type can be converted into another type. In some cases, such type 43 | conversion is handled automatically by the compiler, while other cases require 44 | explicit syntax (such as casts) to allow the type conversion. In ac, conversion 45 | from integer type to float type is accomplished automatically. Conversion in 46 | the other direction is not allowed under any circumstances. 47 | 48 | For the target of translation, we use the widely available program dc (for desk 49 | calculator), which is a stack-based calculator that uses reverse Polish 50 | notation (RPN). 
When an ac program is translated into a dc program, the 51 | resulting instructions must be acceptable to the dc program and must faithfully 52 | represent the operations specified in an ac program. 53 | 54 | The objective of this lab is the creation of a lexical analyzer for ac. 55 | 56 | A valid ac program could be: 57 | 58 | ``` 59 | // basic code 60 | 61 | //float b 62 | f b 63 | 64 | // integer a 65 | i a 66 | 67 | // a = 5 68 | a = 5 69 | 70 | // b = a + 3.2 71 | b = a + 3.2 72 | 73 | //print 8.2 74 | p b 75 | ``` 76 | 77 | Before validating whether the code has syntax errors with a CFG, it is necessary to 78 | translate the code to a string of tokens such as: 79 | 80 | ``` 81 | floatdcl id intdcl id 82 | id assign inum 83 | id assign id plus fnum 84 | print id 85 | ``` 86 | 87 | ## Expected result: 88 | 89 | * Code a lex file that fulfills the requirements 90 | * Code a Makefile for this code 91 | * Generate a random AC code with: 92 | 93 | ``` 94 | python3 code_generator.py > example.ac 95 | 96 | ``` 97 | 98 | * Compile and execute as follows: 99 | 100 | ``` 101 | lex lexic_analyzer.l (the lex code you generate in your homework) 102 | gcc lex.yy.c -o lexical_scan -lfl 103 | ./lexical_scan 104 | ``` 105 | 106 | This should generate a file similar to the following: lex.out 107 | 108 | ``` 109 | floatdcl id intdcl id 110 | id assign inum 111 | id assign id plus fnum 112 | print id 113 | ``` 114 | 115 | 116 | ## Please send the mail as git send mail: 117 | 118 | ``` 119 | $ git add lexic_analyzer.l 120 | $ git commit -s -m -homework-03 121 | $ git send-email -1 122 | 123 | ``` 124 | Do some tests sending the mail to your personal account, if you get the mail, 125 | then you can be sure I will get the mail 126 | 127 | ## Performance test (Extra work, not mandatory for the lab) 128 | 129 | The code generator has an option to generate stress examples: 130 | 131 | ``` 132 | python3 code_generator.py --stress 133 | ``` 134 | 135 | This will generate a huge random AC program 136 | 

Try your solution with this option and check how much time it needs for the
lex part of the compiler. Consider that this is just the first part of the
compiler's work; other parts will be necessary in the future.

Do you think that we could make it faster?

Why is performance important?

Suppose we want to implement a very fast compiler that can compile a program in
a few seconds. We will use 30,000 lines per minute (500 lines per second) as
our goal. (Compilers such as Turbo C++ achieve such speeds.) If an average line
contains 20 characters, the compiler must scan 10,000 characters per second. On
a processor that executes 10,000,000 instructions per second, even if we did
nothing but scanning, we would have only 1,000 instructions per input character
to spend.

However, because scanning is not the only thing a compiler does, 250
instructions per character is more realistic. This is a rather tight budget,
considering that even a simple assignment takes several instructions on a
typical processor.

Although faster processors are common these days and 30,000
lines per minute is an ambitious speed, clearly a poorly coded scanner can
dramatically impact a compiler’s performance.

## Time to do the homework:

One week from the moment the mail is sent to students

--------------------------------------------------------------------------------
/labs/12/README.md:
--------------------------------------------------------------------------------
# Security flags in GCC and their impact on performance

Stack buffer overflows are a longstanding problem for C programs that leads to
all manner of ills, many of which are security vulnerabilities. The biggest
problems have typically been with string buffers on the stack coupled with bad
or missing length tests.
A programmer who mistakenly leaves open the
possibility of overrunning a buffer on a function's stack may be allowing
attackers to overwrite the return pointer pushed onto the stack earlier. Since
the attackers may be able to control what gets written, they can control where
the function returns (based on https://lwn.net/Articles/584225/).

GCC, like many compilers, offers features to help detect and prevent these
vulnerabilities.

This basic document describes the following security flags:

```
Stack execution protection:             LDFLAGS="-z noexecstack"
Data relocation and protection (RELRO): LDFLAGS="-z relro -z now"
Stack-based Buffer Overrun Detection:   CFLAGS="-fstack-protector-strong"
                                        if using GCC 4.9 or newer,
                                        otherwise CFLAGS="-fstack-protector"
Fortify source:                         CFLAGS="-O2 -D_FORTIFY_SOURCE=2"
Format string vulnerabilities:          CFLAGS="-Wformat -Wformat-security"
```


## GCC Stack Protection Mechanisms (CFLAGS="-fstack-protector-strong")

Stack-based Buffer Overrun Detection: CFLAGS="-fstack-protector-strong"

This flag emits extra code to check for buffer overflows, such as stack
smashing attacks. This is done by adding a guard variable (canary) to functions
with vulnerable objects. The basic idea behind stack protection is to push
a "canary" (a randomly chosen integer) on the stack just after the function
return pointer has been pushed. The canary value is then checked before the
function returns; if it has changed, the program will abort. Generally, stack
buffer overflow (aka "stack smashing") attacks will have to change the value of
the canary as they write beyond the end of the buffer before they can get to
the return pointer. Since the value of the canary is unknown to the attacker,
it cannot be replaced by the attack.
Thus, the stack protection allows the
program to abort when that happens rather than return to wherever the attacker
wanted it to go.

Putting stack protection into every function is both overkill and may hurt
performance, so one of the GCC options chooses a subset of functions to
protect.

* The existing -fstack-protector-all option will protect all functions.
* The -fstack-protector option chooses any function that declares a character
  array of eight bytes or more in length on its stack.
* The -fstack-protector-strong option has been developed to broaden the scope
  of the stack protection without extending it to every function in the
  program.

Example code for -fstack-protector-strong:

```
int foo(int i) {
    return i;
}

int main() {
    int ret;
    int i = 10;
    ret = foo(i);
    return ret;
}
```

If we compile with

```
gcc main.c -o main -fstack-protector-all
```

we force all functions to have the protection enabled, and the foo function
then generates the following code:

```
0000000000401112 <foo>:
  401112:  48 83 ec 18             sub    $0x18,%rsp
  401116:  ba 28 00 00 00          mov    $0x28,%edx
  40111b:  64 48 8b 0a             mov    %fs:(%rdx),%rcx
  40111f:  48 89 4c 24 08          mov    %rcx,0x8(%rsp)
  401124:  31 c9                   xor    %ecx,%ecx
  401126:  48 8b 74 24 08          mov    0x8(%rsp),%rsi
  40112b:  64 48 33 32             xor    %fs:(%rdx),%rsi
  40112f:  75 07                   jne    401138
  401131:  89 f8                   mov    %edi,%eax
  401133:  48 83 c4 18             add    $0x18,%rsp
  401137:  c3                      retq
  401138:  e8 f3 fe ff ff          callq  401030 <__stack_chk_fail@plt>
```
If we change to -fstack-protector-strong, the foo function is not affected:


```
00000000004010f2 <foo>:
  4010f2:  89 f8                   mov    %edi,%eax
  4010f4:  c3                      retq

00000000004010f5 <main>:
  4010f5:  b8 0a 00 00 00          mov    $0xa,%eax
  4010fa:  c3                      retq
  4010fb:  0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
```
Using -fstack-protector-strong is recommended, because protecting every
function could affect performance. The next question is: how much could this
flag affect performance?

Here is a simple benchmark code:

```
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#define MAX 100000000

static struct timeval tm1;
int a[256], b[256], c[256];

void foo();

int main(){
    foo();
    return 0;
}

void foo(){
    int i,x;
    for (x=0; x<MAX; x++){
        /* Loop body reconstructed from the disassembly below: one array
           receives the element-wise sum of the other two. */
        for (i=0; i<256; i++){
            a[i] = b[i] + c[i];
        }
    }
}
```

with -fstack-protector-all

```
0000000000401112 <foo>:
  401112:  48 83 ec 18             sub    $0x18,%rsp
  401116:  64 48 8b 04 25 28 00    mov    %fs:0x28,%rax
  40111d:  00 00
  40111f:  48 89 44 24 08          mov    %rax,0x8(%rsp)
  401124:  31 c0                   xor    %eax,%eax
  401126:  b9 40 42 0f 00          mov    $0xf4240,%ecx
  40112b:  eb 04                   jmp    401131
  40112d:  ff c9                   dec    %ecx
  40112f:  74 25                   je     401156
  401131:  b8 00 00 00 00          mov    $0x0,%eax
  401136:  8b 90 60 44 40 00       mov    0x404460(%rax),%edx
  40113c:  03 90 60 40 40 00       add    0x404060(%rax),%edx
  401142:  89 90 60 48 40 00       mov    %edx,0x404860(%rax)
  401148:  48 83 c0 04             add    $0x4,%rax
  40114c:  48 3d 00 04 00 00       cmp    $0x400,%rax
  401152:  75 e2                   jne    401136
  401154:  eb d7                   jmp    40112d
  401156:  48 8b 44 24 08          mov    0x8(%rsp),%rax
  40115b:  64 48 33 04 25 28 00    xor    %fs:0x28,%rax
  401162:  00 00
  401164:  75 05                   jne    40116b
  401166:  48 83 c4 18             add    $0x18,%rsp
  40116a:  c3                      retq
  40116b:  e8 c0 fe ff ff          callq  401030 <__stack_chk_fail@plt>
```

with -fstack-protector-strong

```
00000000004010f2 <foo>:
  4010f2:  b9 40 42 0f 00          mov    $0xf4240,%ecx
  4010f7:  eb 04                   jmp    4010fd
  4010f9:  ff c9                   dec    %ecx
  4010fb:  74 25                   je     401122
  4010fd:  b8 00 00 00 00          mov    $0x0,%eax
  401102:  8b 90 60 44 40 00       mov    0x404460(%rax),%edx
  401108:  03 90 60 40 40 00       add    0x404060(%rax),%edx
  40110e:  89 90 60 48 40 00       mov    %edx,0x404860(%rax)
  401114:  48 83 c0 04             add    $0x4,%rax
  401118:  48 3d 00 04 00 00       cmp    $0x400,%rax
  40111e:  75 e2                   jne    401102
  401120:  eb d7                   jmp    4010f9
  401122:  c3                      retq
```

The difference in terms of performance is:

```
$ perf stat ./bench-fstack-protection-strong
Executed 3 times:
154,054,541,632 instructions:u # 2.84 insn per cycle
154,059,056,179 instructions:u # 2.84 insn per cycle
154,013,291,687 instructions:u # 2.84 insn per cycle

Mean (Average): 154042296499.33 instructions
Sample Standard Deviation: 14560845.185207 instructions # 2.84 insn per cycle
```

```
$ perf stat ./bench-fstack-protection-all
Executed 3 times:
153,987,502,402 instructions:u # 2.84 insn per cycle
153,995,330,286 instructions:u # 2.84 insn per cycle
154,004,792,847 instructions:u # 2.84 insn per cycle

Mean (Average): 153995875178.33 instructions
Sample Standard Deviation: 4998751.6046743 instructions # 2.84 insn per cycle
```


The delta in performance (instructions) of this example when using the flag:
* -fstack-protector-all (forcing stack protection in all functions, to simulate the worst-case scenario)
against:
* -fstack-protector-strong (fewer functions are affected)
is

```
abs( 154042296499.33 - 153995875178.33 ) = 46421321 instructions = ~0.03 % of degradation
```

In these runs the -all build did not execute more instructions than the
-strong build, so for this benchmark the cost of the canary check is at the
level of measurement noise.

## Fortify source (CFLAGS="-O2 -D_FORTIFY_SOURCE=2")

The FORTIFY_SOURCE macro provides lightweight support for detecting buffer
overflows in various functions that perform operations on memory and strings.
Not all types of buffer overflows can be detected with this macro, but it does
provide an extra level of validation for some functions that are potentially a
source of buffer overflow flaws. (based on
https://access.redhat.com/blogs/766093/posts/1976213)

FORTIFY_SOURCE provides buffer overflow checks for the following functions:

```
memcpy, mempcpy, memmove, memset, strcpy, stpcpy, strncpy, strcat,
strncat, sprintf, vsprintf, snprintf, vsnprintf, gets.
```
[reference = http://man7.org/linux/man-pages/man7/feature_test_macros.7.html]


For example, the next code:

```
#include <stdio.h>

void secretFunction()
{
    printf("Congratulations!\n");
    printf("You have entered in the secret function!\n");
}

void echo()
{
    char buffer[20];

    printf("Enter some text:\n");
    scanf("%s", buffer);
    printf("You entered: %s\n", buffer);
}

int main()
{
    echo();
    return 0;
}
```

is a good example of a vulnerability. By following some simple steps in

https://github.com/VictorRodriguez/operating-systems-lecture/tree/master/labs/gcc/security

we can easily reach secretFunction() through a buffer overflow.

When we compile with -D_FORTIFY_SOURCE=1 we get:

```
vuln.c: In function ‘echo’:
vuln.c:18:5: warning: ignoring return value of ‘scanf’, declared with attribute warn_unused_result [-Wunused-result]
     scanf("%s", buffer);
     ^~~~~~~~~~~~~~~~~~~
```

As we can see, the compiler (GCC 8.1) flags the scanf call. Note that this
warning is about the ignored return value; it does not check how much data
scanf writes into buffer.

Compiling with:

```
gcc -Wall -g -O2 vuln.c -o vuln
```

generates no issue or warning.

Take another example, with a buffer overflow that is easier to handle:

```
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    char buffer[5];
    printf ("Buffer Contains: %s , Size Of Buffer is %ld\n",
            buffer,sizeof(buffer));
    strcpy(buffer,"deadbeef");
    printf ("Buffer Contains: %s , Size Of Buffer is %ld\n",
            buffer,sizeof(buffer));
}
```

With -D_FORTIFY_SOURCE=1 we get:

```
$ gcc -D_FORTIFY_SOURCE=1 -Wall -g -O2 mem_test.c -o mem_test
In file included from /usr/include/string.h:494,
                 from mem_test.c:3:
In function ‘strcpy’,
    inlined from ‘main’ at mem_test.c:9:1:
/usr/include/bits/string_fortified.h:90:10: warning: ‘__builtin___memcpy_chk’ forming offset [6, 9] is out of the bounds [0, 5] of object ‘buffer’ with type ‘char[5]’ [-Warray-bounds]
   return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
mem_test.c: In function ‘main’:
mem_test.c:6:6: note: ‘buffer’ declared here
 char buffer[5];
      ^~~~~~
```

The compiler returns a warning because it correctly detects the buffer overflow
in the buffer variable.

If we change strcpy(buffer,"deadbeef") to strcpy(buffer,argv[1]),
-D_FORTIFY_SOURCE=1 will not warn about anything at compile time, because the
compiler does not know the length of the string to copy into buffer or whether
it will generate a buffer overflow.
However, with -D_FORTIFY_SOURCE=2 the generated
code performs the check at run time:

```
$ gcc -D_FORTIFY_SOURCE=2 -Wall -g -O2 mem_test.c -o mem_test
$ ./mem_test aaaaaaaaaaaaaa
Buffer Contains: , Size Of Buffer is 5
*** buffer overflow detected ***: ./mem_test terminated
Aborted (core dumped)
```



Performance can be measured with the next code:

```
#include <stdio.h>
#include <string.h>
#define MAX 100000000

char buffer[5];

void foo(char *value){
    strcpy(buffer,value);
}

int main(int argc, char **argv) {
    printf ("Buffer Contains: %s , Size Of Buffer is %ld\n",
            buffer,sizeof(buffer));
    int i,x;
    for (x=0; x<MAX; x++){
        /* ... the loop body and the rest of main() are missing from the source ... */
    }
}
```

With -D_FORTIFY_SOURCE=2, foo() ends up calling __strcpy_chk, which checks for
a potential buffer overflow:

```
00000000004011a0 <foo>:
  4011a0:  48 89 fe                mov    %rdi,%rsi
  4011a3:  ba 05 00 00 00          mov    $0x5,%edx
  4011a8:  bf 39 40 40 00          mov    $0x404039,%edi
  4011ad:  e9 7e fe ff ff          jmpq   401030 <__strcpy_chk@plt>
  4011b2:  66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
  4011b9:  00 00 00
  4011bc:  0f 1f 40 00             nopl   0x0(%rax)
```

The difference in terms of performance is:

```
gcc -D_FORTIFY_SOURCE=2 -Wall -g -O2 bench_mem_test.c -o bench_mem_test-forty-2

perf stat ./bench_mem_test-forty-2 aaa

1,485,272,192,918 instructions:u # 2.31 insn per cycle
1,485,453,650,633 instructions:u # 2.31 insn per cycle
1,485,491,542,369 instructions:u # 2.31 insn per cycle

Mean (Average): 1485405795306.7 instructions
Sample Standard Deviation: 67690828.052631 instructions # 2.31 insn per cycle
```


```
gcc -Wall -g -O2 bench_mem_test.c -o bench_mem_test

perf stat ./bench_mem_test aaa

1,485,386,835,017 instructions:u # 2.3 insn per cycle
1,485,194,110,041 instructions:u # 2.3 insn per cycle
1,485,184,126,261 instructions:u # 2.3 insn per cycle

Mean (Average): 1485255023773 instructions
Sample Standard Deviation: 65968608.694825 instructions # 2.3 insn per cycle
```

The delta in performance (instructions) of this example when using this flag:

* -D_FORTIFY_SOURCE=2

is

```
abs( 1485405795306.7 - 1485255023773 ) = 150771533.7 instructions = ~0.01 % of degradation
```

## Format string vulnerabilities (CFLAGS="-Wformat -Wformat-security")

-Wformat checks calls to printf, scanf, etc., to make sure that the arguments
supplied have types appropriate to the format string specified, and that the
conversions specified in the format string make sense. This includes standard
functions, and others specified by format attributes (see Function Attributes),
in the printf, scanf, strftime and strfmon (an X/Open extension, not in the C
standard) families (or other target-specific families).

If -Wformat is specified, -Wformat-security also warns about uses of format
functions that represent possible security problems. At present, this warns
about calls to printf and scanf functions where the format string is not a
string literal and there are no format arguments, as in printf (foo);. This may
be a security hole if the format string came from untrusted input and contains
‘%n’. (This is currently a subset of what -Wformat-nonliteral warns about, but
in future warnings may be added to -Wformat-security that are not included in
-Wformat-nonliteral.)
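
To make the targeted pattern concrete, here is a toy, line-based approximation in Python of what the warning looks for: a printf call whose only argument is a bare identifier. It is only an illustration under stated assumptions. GCC works on the parsed program, not on regular expressions, and the names `BARE_PRINTF` and `find_bare_printf` are hypothetical.

```python
import re

# Matches printf(<identifier>): a single bare-identifier argument and nothing else.
# The \b keeps fprintf, sprintf, etc. from matching.
BARE_PRINTF = re.compile(r"\bprintf\s*\(\s*([A-Za-z_]\w*)\s*\)")


def find_bare_printf(source):
    """Return (line_number, variable_name) for each printf(var) call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in BARE_PRINTF.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


if __name__ == "__main__":
    sample = 'void foo(char *s) {\n    printf(s);\n    printf("%s", s);\n}\n'
    print(find_bare_printf(sample))
```

A call such as printf("%s", s) is not reported, because the format string is a literal; that form is also the usual fix for the pattern that is reported.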

[-Wformat and -Wformat-security descriptions taken from https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html]

Example of code that -Wformat-security detects:

```
#include <stdio.h>
#include <stdlib.h>

void foo(char *s) {
    printf(s);
}

int main(){
    char greeting[] = "Hello";
    foo(greeting);
    return EXIT_SUCCESS;
}

$ gcc -Wall -Wextra -Wformat-security printf-secure.c
printf-secure.c: In function ‘foo’:
printf-secure.c:5:5: warning: format not a string literal and no format arguments [-Wformat-security]
     printf(s);
     ^~~~~~
```

There is no performance penalty, since the generated code is not affected; only
a warning is emitted at compile time.

## Data relocation and protection (RELRO): LDFLAGS="-z relro -z now"

A dynamically linked ELF binary uses a look-up table called the Global Offset
Table (GOT) to dynamically resolve functions that are located in shared
libraries. There are several steps in the middle to make this possible:

* First, the function call actually points to the Procedure Linkage Table
  (PLT), which exists in the .plt section of the binary.

* The .plt section contains x86 instructions that point directly to the GOT,
  which lives in the .got.plt section.

* The .got.plt section contains binary data. The GOT contains pointers back to
  the PLT or to the location of the dynamically linked function.

By default, the GOT is populated dynamically while the program is running. The
first time a function is called, the GOT contains a pointer back to the PLT,
where the linker is called to find the actual location of the function in
question (this is the part we’re not going into detail about). The location
found is then written to the GOT. The second time a function is called, the GOT
contains the known location of the function.
This is called “lazy binding.”
Since we know that the GOT lives in a predefined place and is writable, all
that is needed is a bug that lets an attacker write four bytes anywhere.

To prevent this kind of exploitation technique, we can tell the linker to
resolve all dynamically linked functions at the beginning of execution and make
the GOT read-only. For this case the toolchain provides two flags:

* -Wl,-z,now : disables lazy binding.
* -Wl,-z,relro : makes segments read-only after relocation.

The performance impact of RELocation Read-Only depends on the number of times
the library is called, so it is hard to write a micro-benchmark for it.

[source https://developers.redhat.com/blog/2018/03/21/compiler-and-linker-flags-gcc/]
[source https://medium.com/@HockeyInJune/relro-relocation-read-only-c8d0933faef3]

## Stack execution protection: LDFLAGS="-z noexecstack"

Buffer overflow exploits often put some code in a program's data area or stack, and then jump to it. If all writable addresses are non-executable, such an attack is prevented.

By default, gcc will mark the stack non-executable, unless an executable stack is needed for function trampolines. The gcc marking can be overridden via the -z execstack or -z noexecstack linker options passed on the gcc command line.
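
The marking is recorded in the binary as the PT_GNU_STACK program header, so it can also be inspected programmatically. The sketch below reads that header with Python's `struct` module. It is a minimal illustration that assumes a little-endian ELF64 file, and the function name `stack_is_executable` is hypothetical.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type that carries the stack marking
PF_X = 0x1                 # "execute" permission bit in p_flags


def stack_is_executable(elf_bytes):
    """Return True/False for the PT_GNU_STACK execute bit, or None if absent.

    Only little-endian ELF64 files are handled in this sketch.
    """
    if elf_bytes[:4] != b"\x7fELF" or elf_bytes[4] != 2 or elf_bytes[5] != 1:
        raise ValueError("not a little-endian ELF64 file")

    # ELF64 header fields: e_phoff at 0x20, e_phentsize at 0x36, e_phnum at 0x38.
    e_phoff = struct.unpack_from("<Q", elf_bytes, 0x20)[0]
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf_bytes, 0x36)

    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        # Each ELF64 program header starts with p_type and p_flags (4 bytes each).
        p_type, p_flags = struct.unpack_from("<II", elf_bytes, offset)
        if p_type == PT_GNU_STACK:
            return bool(p_flags & PF_X)
    return None
```

Reading a binary with `open(path, "rb").read()` and passing the bytes in returns True for an executable stack, False for a non-executable one, and None when the header is absent.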

The marking can be seen in the following example:

```
$ gcc main.c -o main-execstack -z execstack
$ objdump -p main-execstack | grep -i -A1 stack
main-execstack:     file format elf64-x86-64

--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rwx
```

The stack is marked with flags for read, write and execute (flags rwx).

If we compile with -z noexecstack, the stack shows:

```
$ gcc main.c -o main-noexecstack -z noexecstack
$ objdump -p main-noexecstack | grep -i -A1 stack
main-noexecstack:     file format elf64-x86-64

--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
```

with no executable flag (flags rw-).

[source https://www.win.tue.nl/~aeb/linux/hh/protection.html]

The same information can be read with the readelf -l tool, looking for PT_GNU_STACK. PT_GNU_STACK is an ELF program header entry that indicates whether the binary has an executable stack.


```
$ readelf -l ./main-noexecstack | grep -i -A1 stack
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
```

Performance is not affected in this case, since the compiler already marks the stack as non-executable by default.
--------------------------------------------------------------------------------