├── Docs ├── 代码生成.md ├── 总结感想.md ├── 申优文档.md ├── 编译器设计文档 18231047王肇凯.md ├── 词法分析.md ├── 语法分析.md └── 错误处理.md ├── Error.cpp ├── Error.h ├── Grammar.cpp ├── Grammar.h ├── Lexer.cpp ├── Lexer.h ├── MipsGenerator.cpp ├── MipsGenerator.h ├── PseudoCode.cpp ├── PseudoCode.h ├── README.md ├── SymTable.cpp ├── SymTable.h ├── error.txt ├── main.cpp ├── mips.txt ├── output.txt ├── testfile.txt ├── utils.cpp ├── utils.h ├── 理论课复习.md └── 编译复习.assets ├── image-20201218171246280.png ├── image-20201218171400954.png ├── image-20201218171457550.png ├── image-20201218171836337.png ├── image-20201218172026311.png ├── image-20201218172244030.png ├── image-20201218172325093.png ├── image-20201218172522000.png ├── image-20201218172739904.png ├── image-20201218190028035.png ├── image-20201218191319672.png ├── image-20201218191757062.png ├── image-20201218192949362.png ├── image-20201218193000041.png ├── image-20201218193407568.png ├── image-20201218193423032.png ├── image-20201218193431134.png ├── image-20201218193442511.png ├── image-20201218194615594.png ├── image-20201218194802577.png ├── image-20201218213638153.png ├── image-20201218213653036.png ├── image-20201218213842335.png ├── image-20201218214617169.png ├── image-20201218215236369.png ├── image-20201218215744752.png ├── image-20201218221426852.png ├── image-20201218223508003.png ├── image-20201218223518758.png ├── image-20201218223755686.png ├── image-20201218231224384.png ├── image-20201218231428059.png ├── image-20201218232652191.png ├── image-20201218232936168.png ├── image-20201218233350071.png ├── image-20201218235603775.png ├── image-20201219000218823.png ├── image-20201219000237993.png ├── image-20201219000544158.png ├── image-20201219000551450.png ├── image-20201219024920861.png ├── image-20201219025057751.png ├── image-20201219025119781.png ├── image-20201219025400821.png ├── image-20201219025629886.png ├── image-20201219030553694.png ├── image-20201219132342904.png ├── image-20201219132645535.png ├── 
image-20201219133304008.png ├── image-20201219145150654.png ├── image-20201219145432922.png ├── image-20201219150410470.png ├── image-20201219150633664.png ├── image-20201219163528891.png ├── image-20201219163703622.png ├── image-20201219164223017.png ├── image-20201219164554022.png ├── image-20201219164637499.png ├── image-20201219172904468.png ├── image-20201219172934173.png ├── image-20201219173036638.png ├── image-20201219173208674.png ├── image-20201219173228439.png ├── image-20201219173252271.png ├── image-20201219174124518.png ├── image-20201219183747445.png ├── image-20201219185021620.png ├── image-20201219185340471.png ├── image-20201219185447743.png ├── image-20201219190127485.png ├── image-20201219191139115.png ├── image-20201219191323265.png ├── image-20201219191605715.png ├── image-20201219191730110.png ├── image-20201219192209848.png ├── image-20201219193800340.png ├── image-20201219194516616.png ├── image-20201219194548072.png ├── image-20201219194749266.png ├── image-20201219195348919.png ├── image-20201219195531303.png ├── image-20201219195924384.png ├── image-20201219195941673.png ├── image-20201219195953750.png ├── image-20201220205917778.png ├── image-20201220211842025.png ├── image-20201220212048559.png ├── image-20201220212201248.png ├── image-20201220212214729.png ├── image-20201220212223103.png ├── image-20201220213018574.png ├── image-20201220213359773.png ├── image-20201220213408759.png ├── image-20201220213422417.png ├── image-20201220213451847.png ├── image-20201220213500740.png ├── image-20201220225611291.png ├── image-20201220225654409.png ├── image-20201220225814036.png ├── image-20201220234813574.png ├── image-20201221000655194.png ├── image-20201221004641803.png ├── image-20201221005619335.png ├── image-20201221005841526.png ├── image-20201221010257570.png ├── image-20201221010521124.png ├── image-20201221010616923.png ├── image-20201221011537195.png ├── image-20201221012129929.png ├── image-20201221012140350.png ├── 
image-20201221012409669.png ├── image-20201221012410679.png ├── image-20201221012420398.png ├── image-20201221020740988.png ├── image-20201221021325389.png ├── image-20201221021601571.png ├── image-20201221022024457.png ├── image-20201221022403232.png ├── image-20201221141410119.png ├── image-20201221142503522.png ├── image-20201221142504459.png ├── image-20201221142506694.png └── image-20201221142509301.png

/Docs/代码生成.md:
--------------------------------------------------------------------------------
18231047 王肇凯

## Code Generation

### Initial Design

#### Intermediate Code

The intermediate code uses quadruples of the form (operator, operand 1, operand 2, result), stored in a global static variable. Semantic actions added to each existing parsing subroutine emit the intermediate code. This assignment involves seven operators: addition, subtraction, multiplication, division, read, write, and assignment.

For example, for `E = A op B op C op D` (where every `op` has the same precedence, such as multiplication/division or addition/subtraction), the translation grammar produces:

```
op, A, B, #T1
op, #T1, C, #T2
op, #T2, D, #T3
:=, E, #T3
```

In addition, on entering a function the quadruple `FUNC void main` is emitted to mark the scope.

The operators are:

```
#define OP_PRINT        "PRINT"
#define OP_SCANF        "SCANF"
#define OP_ASSIGN       ":="
#define OP_ADD          "+"
#define OP_SUB          "-"
#define OP_MUL          "*"
#define OP_DIV          "/"
#define OP_FUNC         "FUNC"
#define OP_END_FUNC     "END_FUNC"

#define OP_ARR_LOAD     "ARR_LOAD"
#define OP_ARR_SAVE     "ARR_SAVE"
#define OP_LABEL        "LABEL"
#define OP_JUMP_IF      "JUMP_IF"
#define OP_JUMP_UNCOND  "JUMP"

#define OP_PREPARE_CALL "PREPARE_CALL"
#define OP_CALL         "CALL"
#define OP_PUSH_PARA    "PUSH_PARA"
#define OP_RETURN       "RETURN"
```

#### MIPS Code

MIPS code is generated after parsing finishes, from the intermediate code and with the help of the symbol table. Strings are stored in the data segment with `.asciiz`, and global variables are stored above `$gp`. Local variables and temporaries (intermediate results of quadruples) are kept in the s and t registers respectively; when no register is free, they are placed below `$sp`.

For assignment and arithmetic, the generator distinguishes cases by whether each operand of the quadruple is in memory, in a register, or a constant.

```
/* For a = b:
 *   a in register, b in register:  move a, b
 *   a in register, b in memory:    lw a, b
 *   a in register, b constant:     li a, b
 *
 *   a in memory, b in register:    sw b, a
 *   a in memory, b in memory:      lw reg, b    sw reg, a
 *   a in memory, b constant:       li reg, b    sw reg, a
 */
```

```
/*
 * For a = b + c:
 *   a, b, c all in registers/constants:                  add a, b, c
 *   a, b in registers/constants, c in memory (or vice versa):  lw reg2, c    add a, b, reg2
 *   a in register, b and c in memory:                    lw reg1, b    lw reg2, c    add a, reg1, reg2
 *   a in memory, b and c in registers/constants:         add reg1, b, c    sw reg1, a
 *   a, b in memory, c in register/constant (or vice versa):    lw reg1, b    add reg1, reg1, c    sw reg1, a
 *   a, b, c all in memory:                               lw reg1, b    lw reg2, c    add reg1, reg1, reg2    sw reg1, a
 */
```

### Implementation and Refinement

The intermediate code classes are implemented as follows:

```c++
class PseudoCode {
public:
    string op;
    string num1;
    string num2;
    string result;

    PseudoCode(string op, string n1, string n2, string r) :
            op(std::move(op)), num1(std::move(n1)), num2(std::move(n2)), result(std::move(r)) {};

    PseudoCode() = default;

    string to_str() const;
};

class PseudoCodeList {
public:
    static vector<PseudoCode> codes;
    static int code_index;
    static vector<string> strcons;
    static int strcon_index;

    static string add(const string &op, const string &n1, const string &n2, const string &r);
    static void refactor();
    static void show();
    static void save_to_file(const string &out_path);
};
```

During semantic analysis, the static method `PseudoCodeList::add` is called to emit quadruples.

The MIPS generator class keeps a variable recording the current function scope, which makes symbol table lookups easier. It also tracks which t and s registers are in use, to support allocation.

Translation largely follows the design above. For example, an assignment `a = b` is handled as follows:

```c++
bool a_in_reg = in_reg(code.num1);
bool b_in_reg = in_reg(code.num2);
string a = symbol_to_addr(code.num1);
string b = symbol_to_addr(code.num2);
string reg = "$k0";  // scratch register for memory-to-memory moves

if (a_in_reg) {
    if (b_in_reg) {
        generate("move", a, b);
    } else if (is_const(code.num2)) {
        generate("li", a, b);
    } else {
        generate("lw", a, b);
    }
} else {
    if (b_in_reg) {
        generate("sw", b, a);
    } else if (is_const(code.num2)) {
        generate("li", reg, b);
        generate("sw", reg, a);
    } else {
        generate("lw", reg, b);
        generate("sw", reg,
                 a);
    }
}
```

To make debugging easier, a comment is emitted each time one intermediate instruction is translated, stating the purpose of the instructions that follow.

```assembly
# === #T170 = #T169 * num2 ===
mul $t2, $t1, $s0
```

The s registers are released on entering a new function; a t register is released the first time it is read.

/Docs/总结感想.md:
--------------------------------------------------------------------------------
18231047王肇凯

## Summary and Reflections

Compilers is the last of the three undergraduate systems courses. Overall it gave us a clearer picture of how computers work at the low level and exposed us to the design ideas and methods of modern compilers. The lab in particular, through hands-on practice, helped me improve in the following areas:

* C++ development and object-oriented design
* How a compiler is implemented
* Engineering skills
* Architecture design for large systems
* Solving problems independently
* Time management

The pressure was especially high during the optimization stage, partly because it fell close to the end of term and the deadlines of many other courses. After each optimization I had to rerun all the test data to debug, so every optimization was comparable to a full assignment. In the end I achieved a good result in the speed competition, so the effort paid off and my engineering skills improved noticeably. Besides learning compiler techniques, this was also the first time I wrote a complete project of about five thousand lines, which counts as real software engineering practice. I hope the course keeps getting better.

## Course Suggestions

* Provide some tutorials, as the other two course projects do. Although the compiler theory and lab courses are the most closely aligned, tutorials giving early guidance on overall design would probably prevent a lot of refactoring. They could cover:
  * Overall architecture (single-pass vs. multi-pass, high cohesion and low coupling)
  * Debugging methods
  * Concrete implementations of the optimizations
* Encourage participation in the discussion forum. It is less active than those of OO and Computer Organization. A forum lets students share experience, but in this course it mostly consists of TAs answering questions, with little interaction among students (the OO forum had threads on overall design ideas, for example). Modest bonus points for discussion could encourage participation.
* Minor points
  * Speed ranking: please rank by the best score rather than the last submission
  * The judging queue takes too long
  * The test programs often have bugs; please review them more carefully

/Docs/申优文档.md:
--------------------------------------------------------------------------------
# Excellence Application Essay

18231047王肇凯

Below are some notes on how to complete the course project, including the problems I ran into and how I solved them.

### Unfamiliarity with C++

Solutions:

Learn C++ syntax from the Runoob (菜鸟教程) tutorials. Since we learned Java, another object-oriented language, in the OO course in the second semester of sophomore year, a good starting point is the differences between the two languages.

A good development environment also helps you learn while coding. I strongly recommend CLion (JetBrains is the best!). I started with it because Visual Studio on macOS cannot build C++, and the experience turned out to be very good. Especially when you are unfamiliar with language features, its live warnings (Clang-Tidy checks) and suggestions teach you efficient, safe, and concise idioms, which is a good way into a new language. For example: a function parameter that is not modified can be declared `const string &`, and a value that is used only once can be passed on with `std::move` to avoid a copy.
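The two idioms mentioned above can be illustrated with a minimal sketch. The `Token` struct and `count_upper` function here are hypothetical names chosen for illustration, not code from this repository: a read-only parameter is taken by `const std::string &`, while constructor parameters that are stored exactly once are taken by value and handed over with `std::move`.

```cpp
#include <string>
#include <utility>

// Hypothetical example type, not part of the compiler's source.
struct Token {
    std::string type;
    std::string str;

    // Parameters are taken by value and moved into the members,
    // so an rvalue argument is never copied.
    Token(std::string t, std::string s)
        : type(std::move(t)), str(std::move(s)) {}
};

// The argument is only read, so a const reference avoids a copy.
int count_upper(const std::string &s) {
    int n = 0;
    for (char c : s) {
        if (c >= 'A' && c <= 'Z') ++n;
    }
    return n;
}
```

The same pattern appears in the `PseudoCode` constructor shown in the code generation notes, which moves each of its string parameters into a member.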
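### Debugging Is Hard

In terms of test data, this course is actually easier to debug than the two earlier systems courses, Computer Organization and OS. The course staff provide a large test suite with high coverage, so the blind debugging common in those courses rarely occurs.

What makes this course harder is the longer pipeline. In earlier courses the flow was: compile your source into a target program -> the target program reads input and produces output -> compare the output. Here it is: compile your compiler's source -> your compiler compiles the test program into target code (passing through intermediate code on the way) -> Mars runs the target code, reading input and producing output -> compare the output.

So when a bug shows up, you first have to trace the wrong output to the faulty MIPS code, then check whether the generated MIPS is wrong, then check whether the intermediate code is wrong. The whole process is time-consuming.

Some approaches worth trying:

* Automate testing: write a script that compiles and invokes Mars from the command line
* Differential testing: get a program from a classmate who has already passed (AC) and compare outputs as a black-box test (the comparison step can also go into the testing script)
* Post in the discussion forum: when everyone shares the problems they met and how they solved them, a healthy (non-cutthroat) learning atmosphere can form
* Emphasize unit testing and similar methods, for example get the basic features working without bugs before optimizing
* Craft test data: especially when optimizing. After adding function inlining, for example, write test cases with function calls on purpose; this saves time compared with running large black-box tests and then locating the problem
* Hand-written debugging aids: `panic`, `assert`, a global DEBUG switch, and so on

### Unfamiliarity with How Compilers Work

Solutions:

Take the theory course seriously. In this course, theory and lab are closely linked: lexical and syntax analysis can be implemented directly in the way the textbook code does it, and the later optimization algorithms follow the lecture content closely. So both lectures and homework deserve real attention.

### No Idea Where to Start with the Overall Design, Frequent Rewrites

Solutions:

Discuss the design with others before writing any code, so that design flaws are not discovered only at implementation time and force a rewrite.

You can also consult design documents from previous years when designing the overall architecture.

Some suggestions on design and implementation:

* Decouple: design for multiple passes, source program =lexical analysis=> tokens =syntax and semantic analysis=> intermediate code =target code generation=> target code
* Object orientation: put the OO course to use and stop writing everything in `main`, which is bad for readability, debugging, and extensibility
* Encapsulate: this is really the main idea of object orientation here (a compiler has little use for inheritance or polymorphism). Wrapping common operations in functions has these benefits:
  * Saves time: much less code to write
  * Prevents mistakes: many bugs come from copy-pasted code that was not updated; with a wrapper, changing a parameter is much more direct
  * Eases debugging: commenting out one call removes a whole piece of behavior, and it is easy to add `cout` or other debug statements
* Use the STL: one advantage of C++ over C is its rich data structures, which are far better than static arrays

### Messy Versions: Changes Pile Up Until the Code Is a Mess

Solutions:

Use git for version control. Create a private GitHub repository and push to it for backup. New ideas can be developed on a new branch, and if things go wrong, `git reset --hard` returns to an earlier version.

/Docs/编译器设计文档 18231047王肇凯.md:
--------------------------------------------------------------------------------
# Compiler Design Document

18231047 王肇凯

[TOC]

# Features

A compiler based on the C0 grammar. Its main function is to translate source code (a single file) into MIPS assembly. It reports errors found during checking, down to the exact line and column. It also applies several optimizations to reduce the run-time cost of the target code. In addition, it can output the intermediate code before and after optimization and print the syntax tree.

### Grammar

```
<加法运算符> ::= +|-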
<乘法运算符> ::= *|/
<关系运算符> ::= <|<=|>|>=|!=|==
<字母> ::= _|a|...|z|A|...|Z
<数字> ::= 0|1|...|9
<字符> ::= '<加法运算符>'|'<乘法运算符>'|'<字母>'|'<数字>'
<字符串> ::= "{ASCII characters with decimal codes 32, 33, 35-126}"

<程序> ::= [<常量说明>][<变量说明>]{<有返回值函数定义>|<无返回值函数定义>}<主函数>
<常量说明> ::= const<常量定义>;{ const<常量定义>;}
<常量定义> ::= int<标识符>=<整数>{,<标识符>=<整数>}|char<标识符>=<字符>{,<标识符>=<字符>}
<无符号整数> ::= <数字>{<数字>}
<整数> ::= [+|-]<无符号整数>
<标识符> ::= <字母>{<字母>|<数字>}
<声明头部> ::= int<标识符> |char<标识符>
<常量> ::= <整数>|<字符>

<变量说明> ::= <变量定义>;{<变量定义>;}
<变量定义> ::= <变量定义无初始化>|<变量定义及初始化>
<变量定义无初始化> ::= <类型标识符>(<标识符>|<标识符>'['<无符号整数>']'|<标识符>'['<无符号整数>']''['<无符号整数>']'){,(<标识符>|<标识符>'['<无符号整数>']'|<标识符>'['<无符号整数>']''['<无符号整数>']' )}
<变量定义及初始化> ::= <类型标识符><标识符>=<常量>|<类型标识符><标识符>'['<无符号整数>']'='{'<常量>{,<常量>}'}'|<类型标识符><标识符>'['<无符号整数>']''['<无符号整数>']'='{''{'<常量>{,<常量>}'}'{, '{'<常量>{,<常量>}'}'}'}'
<类型标识符> ::= int | char
<有返回值函数定义> ::= <声明头部>'('<参数表>')' '{'<复合语句>'}'
<无返回值函数定义> ::= void<标识符>'('<参数表>')''{'<复合语句>'}'
<复合语句> ::= [<常量说明>][<变量说明>]<语句列>
<参数表> ::= <类型标识符><标识符>{,<类型标识符><标识符>}| <空>
<主函数> ::= void main'('')' '{'<复合语句>'}'

<表达式> ::= [+|-]<项>{<加法运算符><项>}
<项> ::= <因子>{<乘法运算符><因子>}
<因子> ::= <标识符>|<标识符>'['<表达式>']'|<标识符>'['<表达式>']''['<表达式>']'|'('<表达式>')'|<整数>|<字符>|<有返回值函数调用语句>

<语句> ::= <循环语句>|<条件语句>| <有返回值函数调用语句>; |<无返回值函数调用语句>;|<赋值语句>;|<读语句>;|<写语句>;|<情况语句>|<空>;|<返回语句>; | '{'<语句列>'}'
<赋值语句> ::= <标识符>=<表达式>|<标识符>'['<表达式>']'=<表达式>|<标识符>'['<表达式>']''['<表达式>']' =<表达式>
<条件语句> ::= if '('<条件>')'<语句>[else<语句>]
<条件> ::= <表达式><关系运算符><表达式>
<循环语句> ::= while '('<条件>')'<语句>| for'('<标识符>=<表达式>;<条件>;<标识符>=<标识符>(+|-)<步长>')'<语句>
<步长> ::= <无符号整数>
<情况语句> ::= switch '('<表达式>')' '{'<情况表><缺省>'}'
<情况表> ::= <情况子语句>{<情况子语句>}
<情况子语句> ::= case<常量>:<语句>
<缺省> ::= default :<语句>

<有返回值函数调用语句> ::= <标识符>'('<值参数表>')'
<无返回值函数调用语句> ::= <标识符>'('<值参数表>')'
<值参数表> ::= <表达式>{,<表达式>}|<空>
<语句列> ::= {<语句>} /* test programs must cover both empty and non-empty statement lists */
<读语句> ::= scanf '('<标识符>')'
<写语句> ::= printf '(' <字符串>,<表达式> ')'| printf '('<字符串> ')'| printf '('<表达式>')'
<返回语句> ::= return['('<表达式>')']
```

### Usage

Place the source file as `testfile.txt` in the same directory as the compiler and run the compiler. The following outputs are produced in that directory:

* `output.txt`: syntax analysis result (`Token` types and contents, grammatical components)
* `error.txt`: error messages
* `pseudo_code_old.txt`: intermediate code before optimization
* `pseudo_code.txt`: intermediate code after optimization
* `mips.txt`: MIPS assembly code

### Example

* testfile.txt

```c
void main(){
    printf("Hello world!");
}
```

* output.txt

```
VOIDTK void
MAINTK main
LPARENT (
RPARENT )
LBRACE {
PRINTFTK printf
LPARENT (
STRCON Hello world!
<字符串>
RPARENT )
<写语句>
SEMICN ;
<语句>
<语句列>
<复合语句>
RBRACE }
<主函数>
<程序>
```

* pseudo_code.txt

```assembly
str0: Hello world!\n
=========FUNC void main=========
PRINT 0 strcon
RETURN
=========END_FUNC void main=========
```

* mips.txt

```assembly
.data
    str__0: .asciiz "Hello world!\n"
    newline__: .asciiz "\n"
.text

# === =========FUNC void main========= ===
addi $sp, $sp, -100
j main
main:

# === PRINT 0 strcon ===
la $a0, str__0
li $v0, 4
syscall

# === RETURN ===
li $v0, 10
syscall

# === =========END_FUNC void main========= ===
```

# Overall Architecture

The compiler consists of five parts: lexical analysis, syntax analysis, error handling, code generation, and speed optimization. It works in multiple passes.

* Pass 1: scan the source program for syntax analysis (with lexical analysis as a subroutine of the parser), semantic analysis, intermediate code generation, and error handling
* Passes 2 to n: scan the intermediate code and apply various optimizations with the help of the symbol table
* Pass n+1: scan the intermediate code and generate target code with the help of the symbol table

```c++
#include "Grammar.h"
#include "MipsGenerator.h"

using namespace std;

int main() {
    cout << ":::::::::::::::::::::::::::::::::::::::::::::::::::::" << endl;
    cout << ":: ::" << endl;
    cout << ":: wzk's compiler V1.0 ::" << endl;
    cout << ":: ::" << endl;
    cout << ":::::::::::::::::::::::::::::::::::::::::::::::::::::" << endl;

    // Syntax analysis, semantic analysis, error handling
    Grammar grammar("testfile.txt");
    grammar.analyze();
    grammar.save_to_file("output.txt");
    Errors::save_to_file("error.txt");


    // Intermediate code optimization
    PseudoCodeList::refactor();
    PseudoCodeList::save_to_file("pseudo_code_old.txt");
    PseudoCodeList::remove_redundant_assign();
    PseudoCodeList::const_broadcast();
    PseudoCodeList::remove_redundant_tmp();
    PseudoCodeList::inline_function();
    PseudoCodeList::const_broadcast();
    PseudoCodeList::save_to_file("pseudo_code.txt");

    // Target code generation
    MipsGenerator mips;
    mips.optimize_muldiv = true;
    mips.optimize_assign_reg = true;
    mips.translate();
    mips.save_to_file("mips.txt");

    Errors::terminate();

    return 0;
}
```

# Design and Implementation

## Lexical Analysis

### Initial Design

The core is a single function, `get_token()`, which produces a token and writes it to a file. The main function reads all characters of the input file and calls this function in a loop to consume characters and produce tokens until the end of the file.

Internally, `get_token()` follows the textbook design: it skips whitespace, then tries in turn to read an identifier (checking whether it is a reserved word), an integer, a single-character delimiter, and a two-character delimiter. If one of these matches, the token's text and category are stored in `token` and `symbol` and written to the file; otherwise an exception is raised.

### Implementation and Refinement

To make later extension easier, lexical analysis is written as a class (`TokenAnalyze` in the first version, `Lexer` in the final code shown below). Its `analyze()` method runs the main lexical analysis, and former globals such as `token` and `symbol` become member variables of the class.

```c++
class Lexer {
public:
    char ch{};
    string token;
    string symbol;
    string source;
    int pos = 0;
    int line_num = 1;
    int col_num = 1;
    bool replace_mode;

    Token analyze();
    Token get_token();
    int read_char();
    void retract();
    static string special(char);
    static string reserved(string);
    explicit Lexer(const string& in_path, bool replace);
};
```

The file content is first read into the member variable `source`, and the integer `pos` records the index of the character being read. This makes retracting easy: it only takes `pos--`. Accordingly, each call to `read_char()` only needs to set `ch=source[pos]`, increment `pos`, and update the line number `line_num` and column number `col_num` (see below).

`reserved()` and `special()` each maintain a `map` recording the reserved words and the unambiguous delimiters, respectively. "Ambiguous" here refers to delimiters such as `>` and `>=`, which need one more character of lookahead to tell apart and are handled as special cases.

For `INTCON`, the value is recorded in the member variable `int_v`.

Lexical errors fall into the following categories:

* Unmatched quote: the end of the file is reached without finding the closing double or single quote
* Too many characters: more than one character between two single quotes
* Unknown character: a character that satisfies none of the recognition conditions

For better error messages, the line and column of the character being read are tracked and printed when an error is reported. A hint is printed as well, stating which of the categories above the error belongs to.

The output of lexical analysis is a sequence of `Token` instances. Each instance stores a word's category, content, line number, column number, and related information.

```c++
class Token {
public:
    string type;
    string str;
    string original_str;
    int v_int = -1;
    char v_char = 'E';
    int line{};
    int column{};
    int pos{};

    Token(string t, string s, int l, int c, int p);
};
```

## Syntax Analysis

### Initial Design

Parsing is top-down, using recursive descent. One token is read before a subroutine is called; inside, each subroutine analyzes one nonterminal by reading tokens and recursively calling other subroutines, and writes the grammatical component to a file.

Analysis of the grammar shows that it has no left recursion. To avoid backtracking, lookahead is used: when several alternatives conflict, 1 to 3 tokens are read ahead to determine the unique choice, the parser retracts to the token before the lookahead, and then the subroutine for that choice is called.

When a terminal read from the input differs from what is expected, an error is raised directly (since there is no backtracking) and saved in the error table.

Note that `<有返回值函数调用>` and `<无返回值函数调用>` have exactly the same syntax and must be told apart semantically. A simple symbol table is therefore built (to be completed during semantic analysis): when a function is defined, a mapping "function identifier -> has return value or not" is added, and when a call needs to be classified, the identifier is looked up to choose between the returning and non-returning call.

### Implementation and Refinement

The `Grammar` class performs syntax analysis. The lexer is a member variable of the parser, so the parser can call its `analyze()` method to produce new `Token`s as the main parsing process needs them.

Former globals such as the symbol table `SymTable` become member variables of the class.

Since the parser must return to the position before a lookahead, pending output is stored line by line in `output_str`, and the output line of the last token is removed on retraction.

The member variables `tk`, `sym`, and `pos` hold the current token, its lexical category, and its position among all tokens.

The methods `next_sym()`, `retract()`, `error()`, and `output()` read a token, retract after a lookahead, record an error, and output a grammatical component, respectively.

Each recursive subroutine is a method of the class.

```c++
class Grammar {
public:
    GrammarMode mode;
    Lexer lexer;
    vector<string> output_str;
    vector<Token> cur_lex_results;
    int pos = 0;

    Token tk;
    string sym = "";
    int local_addr = LOCAL_ADDR_INIT;

    void error(const string &expected);
    int next_sym(bool);
    void retract();
    int analyze();

    explicit Grammar(const string &in_path, GrammarMode mode);

    void output(const string &name);
    void save_to_file(const string &out_path);

    void Program();
    void ConstDeclare();
    void ConstDef();
    void UnsignedInt();
    string Int();
    string Identifier();
    pair Const();
    void VariableDeclare();
    void VariableDef();
    void TypeIdentifier();
    void SharedFuncDefHead();
    void SharedFuncDefBody();
    void RetFuncDef();
    void NonRetFuncDef();
    void CompoundStmt();
    void ParaList();
    void Main();
    pair Expr();
    pair Item();
    pair Factor();
    void Stmt();
    void AssignStmt();
    void ConditionStmt();
    pair Condition();
    void LoopStmt();
    void CaseStmt();
    void SharedFuncCall();
    void RetFuncCall();
    void NonRetFuncCall();
    void StmtList();
    void ReadStmt();
    void WriteStmt();
    void ReturnStmt();


    void add_node(const string &name);
    void add_leaf();
    void tree_backward();
    void dfs_show(const TreeNode &, int);
    void show_tree();
};
```

`analyze()` only needs to open and close the output file stream, read the first token, and call the `<程序>` subroutine.

Basic nonterminals such as `<数字>` and `<标识符>` are already checked during lexical analysis, so they need no subroutine.

Before a recursive subroutine is called, `next_sym()` must be used to read one token. The subroutine then chooses among the alternatives on the right-hand side by their first symbols (using lookahead where necessary). For a nonterminal it calls the corresponding subroutine; for a terminal it checks whether the token matches what is expected and reports an error if not.

## Error Handling

### Initial Design

An error class stores the error type, line number, column number, and an extra hint. All error objects are collected in an array and written to a file.

To detect semantic errors, a symbol table is needed to manage variables at every level and to store information about them (such as variable dimensions and function return types). When an identifier is read, it is added or looked up, and an error is reported on a type mismatch. The type of each expression must also be computed, to decide whether it is `char` or `int`.
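The add and lookup checks described above could look roughly like the sketch below. All names here (`Scope`, `declare`, `lookup`) are hypothetical and only illustrate the idea; the repository's `SymTable` stores far more information. The return codes `'b'` (redefinition) and `'c'` (undefined) follow the error codes used in this document.

```cpp
#include <map>
#include <string>

enum class DataType { integer, character };

// Hypothetical two-level scope: locals of the current function plus globals.
struct Scope {
    std::map<std::string, DataType> global, local;

    // Returns 'b' (redefinition) if the name already exists in the
    // current function's scope, or 0 on success.
    char declare(const std::string &name, DataType t) {
        if (local.count(name)) return 'b';
        local[name] = t;
        return 0;
    }

    // Looks in the local scope first, then in the global one.
    // Returns 'c' (undefined) if the name is found in neither.
    char lookup(const std::string &name, DataType &out) const {
        auto it = local.find(name);
        if (it == local.end()) {
            it = global.find(name);
            if (it == global.end()) return 'c';
        }
        out = it->second;
        return 0;
    }
};
```

A parser built this way would call `declare` in the definition subroutines and `lookup` wherever an identifier is used, passing any nonzero code on to the error list.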
The main error-handling steps are all carried out during syntax analysis. The parser calls static methods of the symbol table class and the error class to add symbol table entries, add errors, and so on. When parsing ends, the errors are formatted and written to a file.

### Implementation and Refinement

#### Error Classes

Error categories are defined as macros and serve as error codes.

```c
#define ERR_LEXER          'a'
#define ERR_REDEFINED      'b'
#define ERR_UNDEFINED      'c'
#define ERR_PARA_COUNT     'd'
#define ERR_PARA_TYPE      'e'
#define ERR_CONDITION_TYPE 'f'
#define ERR_NONRET_FUNC    'g'
#define ERR_RET_FUNC       'h'
#define ERR_INDEX_CHAR     'i'
#define ERR_CONST_ASSIGN   'j'
#define ERR_SEMICOL        'k'
#define ERR_RPARENT        'l'
#define ERR_RBRACK         'm'
#define ERR_ARRAY_INIT     'n'
#define ERR_CONST_TYPE     'o'
#define ERR_SWITCH_DEFAULT 'p'
```

The `Error` class stores the information about a single error. The `Errors` class stores all error objects in a static member variable and provides a static method that writes them to a file.

```c++
class Error {
public:
    string msg;
    int line{};
    int column{};
    int eid;
    char err_code{};
    string rich_msg;
};
class Errors {
public:
    static vector<Error> errors;
    static void add(const string &s, int line, int col, int id);
    static void add(const string &s, int id);
    static void save_to_file(const string &out_path);
};
```

Sample error output:

```
Error in line 52, column 21: Para count mismatch (EID: d)
Error in line 53, column 22: Para type mismatch (EID: e)
Error in line 55, column 21: Para type mismatch (EID: e)
```

#### Symbol Table

The `SymTableItem` and `SymTable` classes represent a symbol table entry and the whole symbol table, respectively. They provide methods to add an entry, look up the table, and push or pop a level of the stack-style table.

```c++
enum STIType {
    invalid_sti,
    constant,
    var,
    tmp,
    para,
    func
};

enum DataType {
    invalid_dt,
    integer,
    character,
    void_ret

};

class SymTableItem {
public:
    string name;
    STIType stiType;
    DataType dataType;
    int dim = 0;
    vector> paras;
    int addr{};
    int size{};
    int dim1_size{};
    int dim2_size{};
    string const_value;

    SymTableItem(string name, STIType stiType1, DataType dataType1, int addr);
    string to_str() const;
};

class SymTable {
public:
    static vector<SymTableItem> global;
    static map<string, vector<SymTableItem>> local;
    static unsigned int max_name_length;

    static void add(const string &func, const Token &tk, STIType stiType, DataType dataType, int addr, int dim1, int dim2);
    static void add_const(const string &func, const Token &tk, DataType dataType, string const_value);
    static int add_func(const Token &tk, DataType dataType, vector> paras);
    static SymTableItem search(const string &func, const string &str);
    static void show();
    static void reset();
};
```

In the initial design the symbol table was stack-based. Later, when generating target code from the intermediate code, it turned out that keeping the symbol table of every function makes lookups easier. In the final design the symbol table consists of a `global` member that stores global variables and functions, and a `local` member that maps each function name to that function's variables and parameters. Adding an entry requires specifying the scope (inside a particular function, or global).

The symbol table is displayed as follows:

```
==============================
NAME             KIND  TYPE  DIM
------------------------------
func_switch_ch   func  int   0
func_switch_int  func  int   0
------------------------------
c                para  char  0
tmp              var   int   0
==============================
```

Calls to `error()` are added to the parser to report errors at the appropriate places, skip ahead to recover, and store the error information in the static member of the `Errors` class. After the whole program has been analyzed, all errors are written to a file.

## Code Generation

### Initial Design

Code generation covers the most ground, spanning four textbook chapters: storage allocation, intermediate code formats, syntax-directed translation, and semantic analysis. The design splits it into two stages:

* Source code -> intermediate code: semantic analysis inside the parsing subroutines emits quadruples
* Intermediate code -> target code: scan the intermediate code, consulting the symbol table for each variable's data type, dimensions, and so on, and generate target code

#### Intermediate Code

The intermediate code uses quadruples of the form (operator, operand 1, operand 2, result), stored in a global static variable. Semantic actions added to each existing parsing subroutine emit the intermediate code. This assignment involves seven operators: addition, subtraction, multiplication, division, read, write, and assignment.

For example, for `E = A op B op C op D` (where every `op` has the same precedence, such as multiplication/division or addition/subtraction), the translation grammar produces:

```
op, A, B, #T1
op, #T1, C, #T2
op, #T2, D, #T3
:=, E, #T3
```

In addition, the scope is marked on function entry by emitting the quadruple `FUNC
void main`.

The operators are:

```
#define OP_PRINT        "PRINT"
#define OP_SCANF        "SCANF"
#define OP_ASSIGN       ":="
#define OP_ADD          "+"
#define OP_SUB          "-"
#define OP_MUL          "*"
#define OP_DIV          "/"
#define OP_FUNC         "FUNC"
#define OP_END_FUNC     "END_FUNC"

#define OP_ARR_LOAD     "ARR_LOAD"
#define OP_ARR_SAVE     "ARR_SAVE"
#define OP_LABEL        "LABEL"
#define OP_JUMP_IF      "JUMP_IF"
#define OP_JUMP_UNCOND  "JUMP"

#define OP_PREPARE_CALL "PREPARE_CALL"
#define OP_CALL         "CALL"
#define OP_PUSH_PARA    "PUSH_PARA"
#define OP_RETURN       "RETURN"
```

#### MIPS Code

MIPS code is generated after parsing finishes, from the intermediate code and with the help of the symbol table. Strings are stored in the data segment with `.asciiz`, global variables are stored above `$gp`, and all other variables are kept in memory (later optimized to use registers).

For assignment and arithmetic, the generator distinguishes cases by whether each operand of the quadruple is in memory, in a register, or a constant.

```
/* For a = b:
 *   a in register, b in register:  move a, b
 *   a in register, b in memory:    lw a, b
 *   a in register, b constant:     li a, b
 *
 *   a in memory, b in register:    sw b, a
 *   a in memory, b in memory:      lw reg, b    sw reg, a
 *   a in memory, b constant:       li reg, b    sw reg, a
 */
```

```
/* For a = b + c:
 *   a, b, c all in registers/constants:                  add a, b, c
 *   a, b in registers/constants, c in memory (or vice versa):  lw reg2, c    add a, b, reg2
 *   a in register, b and c in memory:                    lw reg1, b    lw reg2, c    add a, reg1, reg2
 *   a in memory, b and c in registers/constants:         add reg1, b, c    sw reg1, a
 *   a, b in memory, c in register/constant (or vice versa):    lw reg1, b    add reg1, reg1, c    sw reg1, a
 *   a, b, c all in memory:                               lw reg1, b    lw reg2, c    add reg1, reg1, reg2    sw reg1, a
 */
```

### Implementation and Refinement

The intermediate code classes are implemented as follows:

```c++
class PseudoCode {
public:
    string op;
    string num1;
    string num2;
    string result;

    PseudoCode(string op, string n1, string n2, string r);
};

class PseudoCodeList {
public:
    static vector<PseudoCode> codes;
    static int code_index;
    static vector<string> strcons;
    static int strcon_index;

    static string add(const string &op, const string &n1, const string &n2, const
                      string &r);
    static void refactor();
    static void show();
    static void save_to_file(const string &out_path);
};
```

During semantic analysis, the static method `PseudoCodeList::add` is called to emit quadruples.

The MIPS generator class keeps a variable recording the current function scope, which makes symbol table lookups easier.

Translation largely follows the design above. For example, an assignment `a = b` is handled as follows:

```c++
bool a_in_reg = in_reg(code.num1);
bool b_in_reg = in_reg(code.num2);
string a = symbol_to_addr(code.num1);
string b = symbol_to_addr(code.num2);
string reg = "$k0";  // scratch register for memory-to-memory moves

if (a_in_reg) {
    if (b_in_reg) {
        generate("move", a, b);
    } else if (is_const(code.num2)) {
        generate("li", a, b);
    } else {
        generate("lw", a, b);
    }
} else {
    if (b_in_reg) {
        generate("sw", b, a);
    } else if (is_const(code.num2)) {
        generate("li", reg, b);
        generate("sw", reg, a);
    } else {
        generate("lw", reg, b);
        generate("sw", reg, a);
    }
}
```

To make debugging easier, a comment is emitted each time one intermediate instruction is translated, stating the purpose of the instructions that follow.

```assembly
# === #T170 = #T169 * num2 ===
mul $t2, $t1, $s0
```

## Speed Optimization

The optimizations are:

* Function inlining
* Local peephole optimization
* Constant propagation
* Register allocation
* Loop jump optimization
* Multiplication and division optimization

### Function Inlining

At first I spent a long time trying to inline at the source level (replacing a function call with a compound statement), but source code takes too many forms for that to be practical. I then switched to inlining on the intermediate code, in the following steps:

* First scan of the intermediate code: record the intermediate code of every function for later substitution
* Second scan of the intermediate code: at each function call, decide whether to expand it (recursive functions and functions longer than 20 instructions are not expanded). If it is expanded, record the actual arguments and walk through the callee's intermediate code, doing the following:
  * Replace formal parameters with the actual arguments; if an argument is an expression (a temporary), turn it into a local variable and add it to the symbol table
  * Rename the other variables to avoid name clashes between scopes. The naming scheme is `_i_name`, where `i` is the inlining count; the renamed variables are added to the caller's symbol table
  * Handle label names in a similar way
  * Turn each return statement into an assignment to a return-value variable followed by a jump to the end of the call

Cost saved: mainly Memory (saving context across a call) and Jump (jumping to the function and returning)

### Peephole Optimization

A few small but effective optimizations, for example:

* Two consecutive assignments through a temporary, `#T1=A+B` and `#T2=#T1`, can be merged
* Consecutive constant operations of the same kind, `A=B/5` and `C=A/6`, can be merged
* Adding, subtracting, or multiplying by 0 and multiplying or dividing by 1 can be reduced to assignments

Cost saved: Memory (fewer temporaries, so more t registers are available for allocation) and ALU
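The algebraic rules in the last bullet above (adding or subtracting 0, multiplying or dividing by 1, multiplying by 0) can be sketched as a single pass over the quadruples. This is an illustrative sketch only: the `Quad` struct and `simplify` function are assumed names, operands are plain strings as in `PseudoCode`, and the actual pass in `PseudoCode.cpp` may work differently.

```cpp
#include <string>
#include <vector>

// Assumed quadruple layout, mirroring (op, num1, num2, result).
struct Quad {
    std::string op, num1, num2, result;
};

// Rewrites trivially reducible arithmetic into ":=" assignments.
// An assignment is encoded as {":=", dest, source, ""}, following the
// ":=, E, #T3" form used for intermediate code in this document.
void simplify(std::vector<Quad> &codes) {
    for (Quad &q : codes) {
        const bool add_sub = (q.op == "+" || q.op == "-");
        const bool mul_div = (q.op == "*" || q.op == "/");
        if ((add_sub && q.num2 == "0") || (mul_div && q.num2 == "1")) {
            q = {":=", q.result, q.num1, ""};   // x+0, x-0, x*1, x/1 -> x
        } else if ((q.op == "+" && q.num1 == "0") ||
                   (q.op == "*" && q.num1 == "1")) {
            q = {":=", q.result, q.num2, ""};   // 0+x, 1*x -> x
        } else if (q.op == "*" && (q.num1 == "0" || q.num2 == "0")) {
            q = {":=", q.result, "0", ""};      // x*0, 0*x -> 0
        }
    }
}
```

Note that `0 - x` and `x / 0` are deliberately left untouched in this sketch, since neither reduces to a plain copy.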
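### Constant Propagation

An arithmetic operation whose result is a temporary and whose value can be computed at compile time can be removed outright, with only its value recorded (in the implementation, the value is stored in the symbol table).

If an operand of an arithmetic operation is a temporary whose constant value is known, it can be replaced by that value directly.

In this way, the array element assignment `a[1][2]=d` (with 8 columns) shrinks from `#T1=2`, `#T2 = 1 * 8`, `#T3 = #T1 + #T2`, `a[#T3]=d` to `a[10]=d`.

Cost saved: Memory (fewer temporaries, so more t registers are available for allocation) and ALU

### Register Allocation

Local variables and temporaries (intermediate results of quadruples) are kept in the s and t registers respectively; when no register is free, they are placed below `$sp`. The generator tracks which t and s registers are in use, to support allocation. The s registers are released on entering a new function; a t register is released the first time it is read.

On a function call, the registers currently in use are saved to the stack at `$sp` and restored from memory when the call returns.

Cost saved: Memory

### Loop Jump Optimization

For `while (a>b) a++;`

Before: `label1: jump_if a<=b label2`, `a=a+1`, `jump label1`

After: `jump_if a<=b label2`, `label1: a=a+1`, `jump_if a>b label1`

Each iteration of the loop body thus executes one fewer jump, which greatly reduces the jump cost.

Cost saved: Jump

### Multiplication and Division Optimization

For a multiplication or division, the generator checks whether an operand is a constant power of two; if so, a shift can replace the operation. Note that for a negative dividend, `div` and `sra` behave differently: the former truncates toward zero, while the latter rounds toward negative infinity. The approach taken is to test the sign of the dividend and, if it is negative, negate the dividend first, shift, and then negate the result.

* `b=a*8` can be translated to `sll $s1, $s0, 3`
* `b=a/8` can be translated to:
  * `bgez $s0, label1`
  * `subu $t0, $zero, $s0`
  * `sra $s1, $t0, 3`
  * `subu $s1, $zero, $s1`
  * `j label2`
  * `label1: sra $s1, $s0, 3`
  * `label2:`

Since the penalty for multiplication and division is several times that of ALU operations and jumps, this optimization reduces the cost substantially.

In addition, Mars treats `div $t1, $t2, $t3` as a pseudo-instruction and expands it into four instructions: `bne $t3, $zero, label1`, `break`, `label1: div $t2, $t3`, `mflo $t1`. The first two check whether the divisor is 0 and can be dropped as long as the source program is correct. Division is therefore simplified to `div $t2, $t3`, `mflo $t1`.

Cost saved: Div and Mult

/Docs/词法分析.md:
--------------------------------------------------------------------------------
18231047 王肇凯

## Lexical Analysis

### Initial Design

The core is a single function, `get_token()`, which produces a token and writes it to a file. The main function reads all characters of the input file and calls this function in a loop to consume characters and produce tokens until the end of the file.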
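Internally, `get_token()` follows the textbook design: it skips whitespace, then tries in turn to read an identifier (checking whether it is a reserved word), an integer, a single-character delimiter, and a two-character delimiter. If one of these matches, the token's text and category are stored in `token` and `symbol` and written to the file; otherwise an exception is raised.

### Implementation and Refinement

To make later extension easier, lexical analysis is written as a class, `TokenAnalyze`. Whether to write the analysis result to a file is specified at construction, and the `analyze()` method runs the main lexical analysis. Former globals such as `token` and `symbol` become member variables of the class.

```c++
class TokenAnalyze {
private:
    char ch{};
    string token;
    string symbol;
    string source;
    int pos = 0;
    int line_num = 1;
    int col_num = 1;
    bool save_to_file = false;
    int int_v;

public:
    int read_char();
    int analyze(const char *, const char *);
    int get_token();
    void retract();
    static string special(char);
    static string reserver(string);
    explicit TokenAnalyze(bool save_to_file): save_to_file(save_to_file) {};
};
```

`analyze()` first reads the file content into the member variable `source`, and the integer `pos` records the index of the character being read. This makes retracting easy: it only takes `pos--`. Accordingly, each call to `read_char()` only needs to set `ch=source[pos]`, increment `pos`, and update the line number `line_num` and column number `col_num` (see below).

`reserver()` and `special()` each maintain a `map` recording the reserved words and the unambiguous delimiters, respectively. "Ambiguous" here refers to delimiters such as `>` and `>=`, which need one more character of lookahead to tell apart and are handled as special cases.

For `INTCON`, the value is recorded in the member variable `int_v`.

Lexical errors fall into the following categories:

* Unmatched quote: the end of the file is reached without finding the closing double or single quote
* Too many characters: more than one character between two single quotes
* Unknown character: a character that satisfies none of the recognition conditions

For better error messages, the line and column of the character being read are tracked and printed when an error is reported. A hint is printed as well, stating which of the categories above the error belongs to.

/Docs/语法分析.md:
--------------------------------------------------------------------------------
18231047 王肇凯

## Syntax Analysis

### Initial Design

Parsing is top-down, using recursive descent. One token is read before a subroutine is called; inside, each subroutine analyzes one nonterminal by reading tokens and recursively calling other subroutines, and writes the grammatical component to a file.

Analysis of the grammar shows that it has no left recursion. To avoid backtracking, lookahead is used: when several alternatives conflict, 1 to 3 tokens are read ahead to determine the unique choice, the parser retracts to the token before the lookahead, and then the subroutine for that choice is called.

When a terminal read from the input differs from what is expected, an error is raised directly (since there is no backtracking) and saved in the error table.

Note that `<有返回值函数调用>` and `<无返回值函数调用>` have exactly the same syntax and must be told apart semantically. A simple symbol table is therefore built (to be completed during semantic analysis): when a function is defined, a mapping "function identifier -> has return value or not" is added, and when a call needs to be classified, the identifier is looked up to choose between the returning and non-returning call.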
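The lookup used to tell the two kinds of call apart can be sketched as follows. This is a minimal illustration under assumed names (`FuncTable`, `define`, `returns_value`); the symbol table in the repository stores more than a single flag per function.

```cpp
#include <string>
#include <unordered_map>

// Minimal "function identifier -> has return value" table (assumed names).
class FuncTable {
    std::unordered_map<std::string, bool> has_ret_;

public:
    // Called when a function definition is parsed.
    void define(const std::string &name, bool has_ret) {
        has_ret_[name] = has_ret;
    }

    // Called when the parser sees `identifier (` and must choose between
    // the returning and non-returning call subroutines.
    // Unknown identifiers are treated as non-returning in this sketch.
    bool returns_value(const std::string &name) const {
        auto it = has_ret_.find(name);
        return it != has_ret_.end() && it->second;
    }
};
```

In a recursive descent parser, the statement subroutine would consult `returns_value` after reading the identifier and then dispatch to the matching call subroutine.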

### Implementation and Refinement

The `Grammar` class performs syntax analysis. It is constructed with the results of the lexical-analysis step, stored in `tokens`, as input, together with a flag saying whether to write the results to a file. Calling its `analyze()` method runs the main parsing pass.

Former global variables such as the symbol table `symTable` become member variables of the class.

Because the parser must return to the pre-lookahead position after looking ahead, pending output is stored line by line in `output_str`, and retracting deletes the output line of the most recent token.

The member variables `tk`, `sym`, and `pos` hold the current lexer result, its token category, and its position among all lexer tokens, respectively.

The methods `next_sym()`, `retract()`, `error()`, and `output()` read a token, retract after a lookahead, record an error, and output a grammatical component, respectively.

Each recursive subprogram is a method of the class.

```c++
class Grammar {
public:
    vector<Token> tokens;
    vector<Error> errors;
    map symTable;
    vector<string> output_str;

    bool save_to_file;
    Token tk{INVALID, INVALID, -1, -1, -1};
    int pos = 0;
    string sym = "";

    void error(const string &expected);
    int next_sym();
    void retract();
    vector analyze(const char *out_path);
    Grammar(vector<Token> t, bool save) : tokens(std::move(t)), save_to_file(save) {};
    void output(const string &name);

    void Program();
    void ConstDeclare();
    void ConstDef();
    void UnsignedInt();
    void Int();
    void Identifier();
    void DeclareHead();
    void Const();
    void VariableDeclare();
    void VariableDef();
    void TypeIdentifier();
    void SharedFuncDef();
    void RetFuncDef();
    void NonRetFuncDef();
    void CompoundStmt();
    void ParaList();
    void Main();
    void Expr();
    void Item();
    void Factor();
    void Stmt();
    void AssignStmt();
    void ConditionStmt();
    void Condition();
    void LoopStmt();
    void PaceLength();
    void CaseStmt();
    void CaseList();
    void CaseSubStmt();
    void Default();
    void SharedFuncCall();
    void RetFuncCall();
    void NonRetFuncCall();
    void ValueParaList();
    void StmtList();
    void ReadStmt();
    void WriteStmt();
    void ReturnStmt();
};
```

`analyze()` only needs to open and close the output file stream, read the first token, and call the `<program>` subprogram.

Basic nonterminals such as `<digit>` and `<identifier>` have already been checked during lexical analysis, so no subprograms need to be written for them.

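
The read-and-retract mechanism described above, with output lines buffered in `output_str` and dropped again on retract, can be sketched roughly as below. The member names mirror the text, but the struct `MiniParser`, the `peek()` helper, and all method bodies are assumptions made for illustration; tokens are reduced to plain strings.

```c++
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of the lookahead mechanism; not the real Grammar class.
struct MiniParser {
    std::vector<std::string> tokens;      // lexer results (category names only)
    std::vector<std::string> output_str;  // buffered output, one line per token read
    int pos = 0;                          // index of the next token to read
    std::string sym;                      // category of the current token

    explicit MiniParser(std::vector<std::string> t) : tokens(std::move(t)) {}

    // Read one token and buffer its output line. Returns false at end of input.
    bool next_sym() {
        if (pos >= static_cast<int>(tokens.size())) {
            return false;
        }
        sym = tokens[pos++];
        output_str.push_back(sym);
        return true;
    }

    // Undo the most recent next_sym(): step back and drop its output line.
    void retract() {
        if (pos == 0) {
            return;
        }
        pos--;
        output_str.pop_back();
        sym = pos > 0 ? tokens[pos - 1] : "";
    }

    // Look k tokens ahead, then retract so that no trace is left in the output.
    // Returns "" if fewer than k tokens remain.
    std::string peek(int k) {
        int read = 0;
        std::string result;
        while (read < k && next_sym()) {
            read++;
            result = sym;
        }
        bool reached = (read == k);
        while (read-- > 0) {
            retract();
        }
        return reached ? result : "";
    }
};
```

For example, after reading `INTTK` from the sequence `INTTK IDENFR LPARENT`, a call to `peek(2)` would report `LPARENT` (suggesting a function definition rather than a variable definition) while leaving `pos`, `sym`, and `output_str` exactly as they were before the lookahead.
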
Before each recursive subprogram is called, a token must first be read with `next_sym()`. The subprogram then chooses among the alternatives on the right-hand side by their first symbols, using lookahead where necessary. For a nonterminal it calls the corresponding subprogram; for a terminal it checks that the token matches the expected one and reports an error if it does not.

--------------------------------------------------------------------------------
/Docs/错误处理.md:
--------------------------------------------------------------------------------
18231047 王肇凯

## Error Handling

### Initial Design

An error class stores the error type, line number, column number, and an additional message. All error objects are collected in a single array and written to a file.

To detect semantic errors, a symbol table is needed to manage the variables at every scope level and to store information about them (such as array dimensions and function return types). When an identifier is read, it is added to or looked up in the table, and a type mismatch is reported as an error. The type of each expression must also be computed to decide whether it is `char` or `int`.

### Implementation and Refinement

Error categories are defined as macros and serve as error codes:

```c
#define ERR_LEXER 'a'
#define ERR_REDEFINED 'b'
#define ERR_UNDEFINED 'c'
#define ERR_PARA_COUNT 'd'
#define ERR_PARA_TYPE 'e'
#define ERR_CONDITION_TYPE 'f'
#define ERR_NONRET_FUNC 'g'
#define ERR_RET_FUNC 'h'
#define ERR_INDEX_CHAR 'i'
#define ERR_CONST_ASSIGN 'j'
#define ERR_SEMICOL 'k'
#define ERR_RPARENT 'l'
#define ERR_RBRACK 'm'
#define ERR_ARRAY_INIT 'n'
#define ERR_CONST_TYPE 'o'
#define ERR_SWITCH_DEFAULT 'p'
```

The `Error` class stores the information for a single error.

```c++
class Error {
public:
    string msg;
    int line{};
    int column{};
    int eid;
    char err_code{};
    string rich_msg;
    // constructors omitted
};
```

The `Errors` class keeps all error objects in a static member variable and provides static methods for writing them to a file.

```c++
class Errors {
public:
    static vector<Error> errors;

    static void add(const string &s, int line, int col, int id);

    static void add(const string &s, int id);

    static void save_to_file(const string &out_path);
    // other members omitted
};
```

Sample error output:

```
Error in line 52, column 21: Para count mismatch (EID: d)
Error in line 53, column 22: Para type mismatch (EID: e)
Error in line 55, column 21: Para type mismatch (EID: e)
```

The `SymTableItem` and `SymTable` classes represent a symbol table entry and the whole symbol table, respectively. They provide methods for adding an entry, looking up the table, and pushing or popping a layer of the stack-style symbol table.

```c++
enum
STIType {
    constant,
    var,
    para,
    func
};

enum DataType {
    integer,
    character,
    void_ret,
    invalid
};

class SymTableItem {
public:
    string name;
    STIType stiType{};
    DataType dataType{};
    int num_para = 0;
    int dim = 0;
    bool valid = true;
    vector types;

    SymTableItem(string name, STIType stiType1, DataType dataType1) :
            name(std::move(name)), stiType(stiType1), dataType(dataType1) {};

    explicit SymTableItem(bool valid) : valid(valid) {};

    SymTableItem() = default;

    string to_str() const;
};

class SymTable {
public:
    static vector<SymTableItem> items;
    static vector layers;
    static unsigned int max_name_length;

    static void add(const Token& tk, STIType stiType, DataType dataType);

    static void add(const Token& tk, STIType stiType, DataType dataType, int dim);

    static void add_func(const Token& tk, DataType dataType, int num_para, vector types);

    static void add_layer();

    static void pop_layer();

    static SymTableItem search(const Token &tk);

    static void show();
};
```

The symbol table is displayed as follows:

```
==============================
NAME             KIND TYPE DIM
------------------------------
func_switch_ch   func int  0
func_switch_int  func int  0
------------------------------
c                para char 0
tmp              var  int  0
==============================
```

Calls to `error()` are added to the parser so that errors are reported at the appropriate places and the offending input is skipped. The error information is stored in the static member variable of the `Errors` class. Once analysis of the whole program has finished, all errors are written to a file.

--------------------------------------------------------------------------------
/Error.cpp:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/10/1.
//

#include "Error.h"

vector<Error> Errors::errors;
--------------------------------------------------------------------------------
/Error.h:
--------------------------------------------------------------------------------

//
// Created by wzk on 2020/9/22.
//

#ifndef COMPILER_ERROR_H
#define COMPILER_ERROR_H

// The four standard header names were lost in extraction;
// the ones below are inferred from what this file uses.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include "utils.h"

#define NOTFOUND "NOT FOUND"
#define INVALID "NOT VALID"

#define DEBUG true

using namespace std;

class Error {
public:
    string msg;
    int line{};
    int column{};
    int eid;
    char err_code{};
    string rich_msg;

    explicit Error(string s, int id) : msg(move(s)), eid(id) {
        rich_msg = "Error: " + msg + " (EID: " + to_string(eid) + ")";
    };

    Error(string s, int ln, int col, int id) : msg(std::move(s)), line(ln), column(col), eid(id) {
        rich_msg = "Error in line " + to_string(ln) + ", column " + to_string(col)
                   + ": " + msg + " (EID: " + to_string(eid) + ")";
    }

    explicit Error(string s, char ch) : msg(move(s)), eid(1000 + ch), err_code(ch) {
        rich_msg = "Error: " + msg + " (EID: " + ch + ")";
    };

    Error(string s, int ln, int col, char ch) : msg(std::move(s)), line(ln), column(col),
                                                eid(1000 + ch), err_code(ch) {
        rich_msg = "Error in line " + to_string(ln) + ", column " + to_string(col)
                   + ": " + msg + " (EID: " + ch + ")";
    }
};

class Errors {
public:
    static vector<Error> errors;

    static void debug_add() {
        if (DEBUG) {
            cout << errors[errors.size() - 1].rich_msg << endl;
        }
    }

    static void add(const string &s, int line, int col, int id) {
        errors.emplace_back(s, line, col, id);
        debug_add();
    }

    static void add(const string &s, int id) {
        errors.emplace_back(s, id);
        debug_add();
    }

    static void add(const string &s, int line, int col, char ch) {
        errors.emplace_back(s, line, col, ch);
        debug_add();
    }

    static void add(const string &s, char ch) {
        errors.emplace_back(s, ch);
        debug_add();
    }

    static bool terminate() {
        if (errors.empty()) {
            cout << endl << "All correct." << endl;
            return false;
        }

        if (errors.size() == 1) {
            cerr << endl << "1 error. Listed as below." << endl;
        } else {
            cerr << errors.size() << " error(s). Listed as below." << endl;
        }

        for (auto &err: errors) {
            cerr << err.rich_msg << endl;
        }
        return true;
    }

    static void save_to_file(const string &out_path) {
        ofstream out(out_path);
        int prev_line = 0;
        for (auto &err: errors) {
            if (err.eid > 1000 && err.line != 0 && err.line != prev_line) {
                out << err.line << " " << err.err_code << endl;
                prev_line = err.line;
//                if (err.err_code == 'j') {
//                    out << err.msg << endl;
//                }
                if (DEBUG) {
                    cout << err.line << " " << err.err_code << endl;
                }
            }
        }
        out.close();
    }
};

#define E_EMPTY_FILE 1

//Lexer
#define E_UNEXPECTED_CHAR 2
#define E_UNKNOWN_CHAR 3
#define E_UNEXPECTED_EOF 4

//Grammar
#define E_GRAMMAR 5

//Semantic
#define E_UNDEFINED_IDENTF 6
#define E_REDEFINED_IDENTF 7


//error process homework
#define ERR_LEXER 'a'
#define ERR_REDEFINED 'b'
#define ERR_UNDEFINED 'c'
#define ERR_PARA_COUNT 'd'
#define ERR_PARA_TYPE 'e'
#define ERR_CONDITION_TYPE 'f'
#define ERR_NONRET_FUNC 'g'
#define ERR_RET_FUNC 'h'
#define ERR_INDEX_CHAR 'i'
#define ERR_CONST_ASSIGN 'j'
#define ERR_SEMICOL 'k'
#define ERR_RPARENT 'l'
#define ERR_RBRACK 'm'
#define ERR_ARRAY_INIT 'n'
#define ERR_CONST_TYPE 'o'
#define
ERR_SWITCH_DEFAULT 'p' 150 | 151 | #endif -------------------------------------------------------------------------------- /Grammar.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include "Lexer.h" 8 | #include "Error.h" 9 | #include "SymTable.h" 10 | #include "PseudoCode.h" 11 | 12 | class TreeNode { 13 | public: 14 | string str; 15 | string type; 16 | int parent; 17 | vector child; 18 | 19 | TreeNode(string name, string type, int parent) : str(std::move(name)), type(std::move(type)), parent(parent) {} 20 | }; 21 | 22 | enum GrammarMode { 23 | grammar_check, 24 | gen_inline, 25 | semantic_analyze, 26 | }; 27 | 28 | /* 29 | * 第一遍grammar check: 30 | * 分析语法,检查错误 31 | * 记录函数体及是否能inline(静态变量function_tokens_index) 32 | * 保存至新文件 33 | * 第二遍gen inline: 34 | * 有返回值函数开头增加"&RET&"声明 35 | * 对于能inline的函数,直接替换为函数体(以复合语句形式),其中对参数进行替换(借助符号表); 36 | * 保存至新文件 37 | * 第三遍semantic analyze: 38 | * 根据生成中间代码、建立符号表(实际上为了错误检查,前两遍也有此步骤,但已经通过reset清除) 39 | * 针对inline替换后的代码,此时的文法允许语句列内出现复合语句 40 | * 41 | * 若不开启inline,则只进行第一遍grammar check 42 | */ 43 | 44 | class FunctionIndex { 45 | public: 46 | int begin; 47 | int end; 48 | string name; 49 | 50 | FunctionIndex(int begin, int end, string name) : begin(begin), end(end), name(std::move(name)) {} 51 | }; 52 | 53 | class Grammar { 54 | public: 55 | GrammarMode mode; 56 | Lexer lexer; 57 | vector output_str; 58 | vector cur_lex_results; 59 | vector new_lex_results; 60 | int pos = 0; 61 | 62 | Token tk{INVALID}; 63 | string sym = ""; 64 | int local_addr = LOCAL_ADDR_INIT; 65 | 66 | vector> tmp_paras; 67 | int tmp_dim1{}; 68 | int tmp_dim2{}; 69 | int tmp_para_count = 0; 70 | 71 | DataType funcdef_ret = invalid_dt; 72 | bool has_returned = false; 73 | string cur_func = GLOBAL; 74 | int func_count = 0; 75 | 76 | vector nodes; 77 | unsigned int cur_node = 0; 78 | 79 | static vector function_tokens_index; 80 | int function_call_start_index = -1; 
81 | int statement_begin_index = -1; 82 | int ret_index = 0; 83 | bool para_assigned = false; 84 | bool in_factor = false; 85 | 86 | void error(const string &expected); 87 | 88 | int next_sym(bool); 89 | 90 | void retract(); 91 | 92 | int analyze(); 93 | 94 | explicit Grammar(const string &in_path, GrammarMode mode) : 95 | lexer(in_path, mode != grammar_check), mode(mode) {}; 96 | 97 | string add_midcode(const string &op, const string &n1, const string &n2, const string &r) const; 98 | 99 | string const_replace(string symbol) const; 100 | 101 | void output(const string &name); 102 | 103 | void save_to_file(const string &out_path); 104 | 105 | void Program(); 106 | 107 | void ConstDeclare(); 108 | 109 | void ConstDef(); 110 | 111 | void UnsignedInt(); 112 | 113 | string Int(); 114 | 115 | string Identifier(); 116 | 117 | pair Const(); 118 | 119 | void VariableDeclare(); 120 | 121 | void VariableDef();; 122 | 123 | void TypeIdentifier(); 124 | 125 | void SharedFuncDefHead(); 126 | 127 | void SharedFuncDefBody(); 128 | 129 | void RetFuncDef(); 130 | 131 | void NonRetFuncDef(); 132 | 133 | void CompoundStmt(); 134 | 135 | void ParaList(); 136 | 137 | void Main(); 138 | 139 | pair Expr(); 140 | 141 | pair Item(); 142 | 143 | pair Factor(); 144 | 145 | void Stmt(); 146 | 147 | void AssignStmt(); 148 | 149 | void ConditionStmt(); 150 | 151 | pair Condition(); 152 | 153 | void LoopStmt(); 154 | 155 | void CaseStmt(); 156 | 157 | void SharedFuncCall(); 158 | 159 | void RetFuncCall(); 160 | 161 | void NonRetFuncCall(); 162 | 163 | void StmtList(); 164 | 165 | void ReadStmt(); 166 | 167 | void WriteStmt(); 168 | 169 | void ReturnStmt(); 170 | 171 | string add_sub(const string &num1, const string &num2); 172 | 173 | string add_2d_array(const string &op, const string &n1, const string &n2, const string &n3, const string &r); 174 | 175 | string 176 | add_2d_array(const string &op, const string &n1, const string &n2, const string &n3, const string &r, 177 | int dim2_size); 178 | 
    void add_node(const string &name);

    void add_leaf();

    void tree_backward();

    void dfs_show(const TreeNode &, int);

    void show_tree();

    void save_lexer_results(const string &path);


    void change_name();
};



--------------------------------------------------------------------------------
/Lexer.cpp:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/9/22.
//

#include "Lexer.h"


Token Lexer::get_token() {
    token.clear();
    read_char();
    while (isspace(ch) && pos < source.length()) {
        read_char();
    }

    if (isspace(ch)) {
        //last character of file
        return Token(INVALID);
    }

    Token r(INVALID, INVALID, line_num, col_num, pos);

    if (isalpha(ch) || ch == '_' || (replace_mode && ch == '&')) {
        while (isalnum(ch) || ch == '_' || (replace_mode && ch == '&')) {
            token += ch;
            read_char();
        }
        retract();
        string reserver_value = reserved(token);
        symbol = (reserver_value == NOTFOUND) ? "IDENFR" : reserver_value;
    } else if (isdigit(ch)) {
        while (isdigit(ch)) {
            token += ch;
            read_char();
        }
        retract();
        symbol = "INTCON";
    } else if (ch == '\'') {
        read_char();
        token = ch;
        symbol = "CHARCON";
        read_char();
        if (ch != '\'') {
            // string(1, ch) builds a one-character string; string(&ch) would
            // treat &ch as a C string and read past the single character.
            string err = isspace(ch) ? " " : string(1, ch);
//            Errors::add("expected single quote sign ' , got " + err + " instead", line_num, col_num,
//                        E_UNEXPECTED_CHAR);
            Errors::add("expected single quote sign ' , got " + err + " instead", line_num, col_num,
                        ERR_LEXER);
        }
    } else if (ch == '\"') {
        while (true) {
            read_char();
            if (ch == '\"') {
                break;
            }
            token += ch;
        }
        symbol = "STRCON";
    } else if (special(ch) != NOTFOUND) {
        token = ch;
        symbol = special(ch);
    } else if (ch == '!') {
        read_char();
        if (ch != '=') {
            string err = isspace(ch) ? " " : string(1, ch);
//            Errors::add("expected '!=', got !" + err + " instead", line_num, col_num, E_UNEXPECTED_CHAR);
            Errors::add("expected '!=', got !" + err + " instead", line_num, col_num, ERR_LEXER);
        }
        token = "!=";
        symbol = "NEQ";
    } else if (ch == '=') {
        read_char();
        if (ch == '=') {
            token = "==";
            symbol = "EQL";
        } else {
            token = "=";
            symbol = "ASSIGN";
            retract();
        }
    } else if (ch == '<') {
        read_char();
        if (ch == '=') {
            token = "<=";
            symbol = "LEQ";
        } else {
            token = "<";
            symbol = "LSS";
            retract();
        }
    } else if (ch == '>') {
        read_char();
        if (ch == '=') {
            token = ">=";
            symbol = "GEQ";
        } else {
            token = ">";
            symbol = "GRE";
            retract();
        }
    } else {
//        Errors::add("unknown character: " + string(1, ch), line_num, col_num, E_UNKNOWN_CHAR);
        Errors::add("unknown character: " + string(1, ch), line_num, col_num, ERR_LEXER);
    }

    r.type = symbol;
    r.original_str = token;
    r.str = symbol == "CHARCON" || symbol == "STRCON" ? token : lower(token);
    if (symbol == "INTCON") {
        r.v_int = (int) strtol(token.c_str(), nullptr, 10);
    }
    else if (symbol == "STRCON") {
        if (token.empty()) {
//            Errors::add("empty string", line_num, col_num, E_UNEXPECTED_CHAR);
            Errors::add("empty string", line_num, col_num, ERR_LEXER);
        }
        for (auto &c: token) {
            if (c <= 31 || c == 34 || c >= 127) {
//                Errors::add("invalid ascii character in string: " + string(1, c),
//                            line_num, col_num, E_UNKNOWN_CHAR);
                Errors::add("invalid ascii character in string: " + string(1, c),
                            line_num, col_num, ERR_LEXER);
            }
        }
    }
    else if (symbol == "CHARCON") {
        r.v_char = token.c_str()[0];
        if (r.v_char != '+' && r.v_char != '-' && r.v_char != '*'
            && r.v_char != '/' && r.v_char != '_' && !isalnum(r.v_char)) {
//            Errors::add("invalid character: " + string(1, r.v_char),
//                        line_num, col_num, E_UNKNOWN_CHAR);
            Errors::add("invalid character: " + string(1, r.v_char),
                        line_num, col_num, ERR_LEXER);
        }
    }
    if (r.type == "ELSETK") {

    }
    return r;

}

Token Lexer::analyze() {
    try {
        return get_token();
    } catch (const exception &) {
        // read_char() throws when the source is exhausted
//        Errors::add("unexpected end of file", E_UNEXPECTED_EOF);
        Errors::add("unexpected end of file", ERR_LEXER);
    }
    return Token(INVALID);
}

int Lexer::read_char() {
    if (pos >= source.length()) {
        throw exception();
    }
    ch = source[pos++];
    if (ch == '\n') {
        line_num++;
        if (line_num == 24) {

        }
        col_num = 1;
    } else {
        col_num++;
    }
    return 0;
}

void Lexer::retract() {
    if (ch == '\n') {
        line_num--;
    }
    pos--;
    col_num--;
}

string Lexer::reserved(string tk) {
    // reserved word (lower case) -> token category
    static const map<string, string> reserves = {
            {"const",   "CONSTTK"},
            {"int",     "INTTK"},
            {"char",    "CHARTK"},
            {"void",    "VOIDTK"},
            {"main",    "MAINTK"},
            {"if",      "IFTK"},
            {"else",    "ELSETK"},
            {"switch",  "SWITCHTK"},
            {"case",    "CASETK"},
            {"default", "DEFAULTTK"},
            {"while",   "WHILETK"},
            {"for",     "FORTK"},
            {"scanf",   "SCANFTK"},
            {"printf",  "PRINTFTK"},
            {"return",  "RETURNTK"},
    };
    auto iter = reserves.find(lower(std::move(tk)));
    if (iter != reserves.end()) {
        return iter->second;
    }
    return NOTFOUND;
}

string Lexer::special(char tk) {
    // single-character delimiter -> token category
    static const map<char, string> specials = {
            {'+', "PLUS"},
            {'-', "MINU"},
            {'*', "MULT"},
            {'/', "DIV"},
            {':', "COLON"},
            {';', "SEMICN"},
            {',', "COMMA"},
            {'(', "LPARENT"},
            {')', "RPARENT"},
            {'[', "LBRACK"},
            {']', "RBRACK"},
            {'{', "LBRACE"},
            {'}', "RBRACE"}
    };
    auto iter = specials.find(tk);
    if (iter != specials.end()) {
        return iter->second;
    }
    return NOTFOUND;
}


--------------------------------------------------------------------------------
/Lexer.h:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/9/22.
3 | // 4 | 5 | #ifndef COMPILER_LEXER_H 6 | #define COMPILER_LEXER_H 7 | #include 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | 14 | #include "Error.h" 15 | 16 | using namespace std; 17 | 18 | class Token { 19 | public: 20 | string type; 21 | string str; 22 | string original_str; 23 | int v_int = -1; 24 | char v_char = 'E'; 25 | int line{}; 26 | int column{}; 27 | int pos{}; 28 | 29 | Token(string t, string s, int l, int c, int p) : 30 | type(std::move(t)), str(std::move(s)), line(l), column(c), pos(p) {}; 31 | explicit Token(string t) : type(std::move(t)) {}; 32 | Token(string t, string s) : type(std::move(t)), str(std::move(s)) {}; 33 | }; 34 | 35 | class Lexer { 36 | public: 37 | char ch{}; 38 | string token; 39 | string symbol; 40 | string source; 41 | int pos = 0; 42 | int line_num = 1; 43 | int col_num = 1; 44 | bool replace_mode; 45 | 46 | 47 | Token analyze(); 48 | Token get_token(); 49 | int read_char(); 50 | void retract(); 51 | static string special(char); 52 | static string reserved(string); 53 | explicit Lexer(const string& in_path, bool replace) { 54 | replace_mode = replace; 55 | ifstream in(in_path); 56 | stringstream buffer; 57 | buffer << in.rdbuf(); 58 | source = buffer.str() + "\n"; 59 | if (source.empty()) { 60 | Errors::add("file not found or empty", E_EMPTY_FILE); 61 | } 62 | in.close(); 63 | }; 64 | }; 65 | 66 | 67 | #endif 68 | 69 | //标识符 IDENFR else ELSETK - MINU = ASSIGN 70 | //整形常量 INTCON switch SWITCHTK * MULT ; SEMICN 71 | //字符常量 CHARCON case CASETK / DIV , COMMA 72 | //字符串 STRCON default DEFAULTTK < LSS ( LPARENT 73 | //const CONSTTK while WHILETK <= LEQ ) RPARENT 74 | //int INTTK for FORTK > GRE [ LBRACK 75 | //char CHARTK scanf SCANFTK >= GEQ ] RBRACK 76 | //void VOIDTK printf PRINTFTK == EQL { LBRACE 77 | //main MAINTK return RETURNTK != NEQ } RBRACE 78 | //if IFTK + PLUS : COLON 79 | -------------------------------------------------------------------------------- /MipsGenerator.cpp: 
-------------------------------------------------------------------------------- 1 | #pragma clang diagnostic push 2 | #pragma ide diagnostic ignored "performance-inefficient-string-concatenation" 3 | // 4 | // Created by wzk on 2020/11/5. 5 | // 6 | 7 | #include "MipsGenerator.h" 8 | 9 | #include 10 | 11 | //TODO:用a寄存器且不区分ast;sp直接硬编码; 12 | 13 | void MipsGenerator::generate(const string &code) { 14 | mips.push_back(code); 15 | if (DEBUG) { 16 | cout << code << endl; 17 | } 18 | } 19 | 20 | void MipsGenerator::generate(const string &op, const string &num1) { 21 | generate(op + " " + num1); 22 | } 23 | 24 | void MipsGenerator::generate(const string &op, const string &num1, const string &num2) { 25 | generate(op + " " + num1 + ", " + num2); 26 | release(num2); 27 | if (op == "sw" || op == "bltz" || op == "blez" || op == "bgtz" || op == "bgez") { 28 | release(num1); 29 | } 30 | } 31 | 32 | void MipsGenerator::generate(const string &op, const string &num1, const string &num2, const string &num3) { 33 | generate(op + " " + num1 + ", " + num2 + ", " + num3); 34 | if (op == "addu" || op == "subu" || op == "mul" || op == "div" || op == "sll" || op == "sra") { 35 | if (num1 != num2) { 36 | release(num2); 37 | } 38 | if (num1 != num3) { 39 | release(num3); 40 | } 41 | } else if (op == "beq" || op == "bne") { 42 | release(num1); 43 | } 44 | } 45 | 46 | void MipsGenerator::release(string addr) { 47 | if (addr[0] == '$' && addr[1] == 't') { 48 | t_reg_table[addr[2] - '0'] = VACANT; 49 | generate("# RELEASE " + addr); 50 | } 51 | } 52 | 53 | void MipsGenerator::translate() { 54 | generate(".data"); 55 | for (auto &item: SymTable::global) { 56 | if (item.dim >= 1) { 57 | generate("arr__" + item.name + "_: .space " + to_string(item.size)); 58 | } 59 | } 60 | for (auto &it: SymTable::local) { 61 | if (SymTable::search_func(it.first).recur_func) { 62 | // cout << "recursive function: " << it.first << endl; 63 | continue; 64 | } 65 | for (auto &item: it.second) { 66 | if (item.dim >= 
1) { 67 | generate("arr_" + it.first + "_" + item.name + "_: .space " + to_string(item.size + 4)); 68 | } 69 | } 70 | } 71 | 72 | for (int i = 0; i < strcons.size(); i++) { 73 | if (strcons[i][0] != '@') { 74 | generate("str__" + to_string(i) + ": .asciiz \"" + strcons[i] + "\""); 75 | } else { 76 | generate("str__" + to_string(i) + ": .ascii \"" + strcons[i].substr(1) + "\""); 77 | } 78 | } 79 | generate(R"(newline__: .asciiz "\n")"); 80 | generate(".text"); 81 | 82 | bool init = true; 83 | 84 | vector s_old; 85 | 86 | int idx = 0; 87 | 88 | for (auto &code:mid) { 89 | generate(""); 90 | generate("# === " + code.to_str() + " ==="); 91 | //generate("# === " + to_string(idx) + " " + code.to_str() + " ==="); 92 | idx++; 93 | string op = code.op, num1 = code.num1, num2 = code.num2, result = code.result; 94 | if (op == OP_FUNC) { 95 | //进入新函数 96 | 97 | if (init) { 98 | //此前为全局变量初始化,在此调用主函数 99 | generate("addi $sp, $sp, -" + to_string(LOCAL_ADDR_INIT + SymTable::func_size("main"))); 100 | generate("j main"); 101 | init = false; 102 | } 103 | 104 | cur_func = num2; 105 | call_func_sp_offset = 0; 106 | generate(num2 + ":"); 107 | //被调用者保护s 108 | for (int i = 0; i < 8; i++) { 109 | s_reg_table[i] = VACANT; 110 | } 111 | for (auto &it: SymTable::search_func(cur_func).paras) { 112 | string sreg = assign_s_reg(it.second); 113 | if (sreg != INVALID) { 114 | generate("lw", sreg, to_string(SymTable::search(cur_func, it.second).addr 115 | + call_func_sp_offset) + "($sp)"); 116 | } else { 117 | break; 118 | } 119 | } 120 | 121 | } else if (op == OP_RETURN) { 122 | //恢复s 123 | if (num1 != VACANT) { 124 | load_value(num1, "$v0"); 125 | } 126 | if (cur_func == "main") { 127 | generate("li $v0, 10"); 128 | generate("syscall"); 129 | } else { 130 | generate("jr $ra"); 131 | } 132 | } else if (op == OP_PREPARE_CALL) { 133 | sp_size.push_back(LOCAL_ADDR_INIT + SymTable::func_size(num1)); 134 | call_func_sp_offset = sum(sp_size); 135 | generate("addi $sp, $sp, -" + 
to_string(sp_size.back())); 136 | call_func_paras.push_back(SymTable::local[num1]); 137 | } else if (op == OP_PUSH_PARA) { 138 | /* 139 | * a在内存,b在寄存器:sw b,a 140 | * a在内存,b在内存:lw reg,b sw reg,a 141 | * a在内存,b为常量 li reg,b sw reg,a 142 | * 143 | * a在寄存器,b在寄存器:move b,a 144 | * a在寄存器,b在内存:lw a,b 145 | * a在寄存器,b为常量 li a,b 146 | */ 147 | 148 | string para_addr = to_string(call_func_paras.back().begin()->addr) + "($sp)"; 149 | assertion(call_func_paras.back().begin()->stiType == para); 150 | call_func_paras.back().erase(call_func_paras.back().begin()); 151 | string reg = "$a1"; 152 | 153 | 154 | bool b_in_reg = in_reg(num1) || assign_reg(num1, true); 155 | string b = symbol_to_addr(num1); 156 | if (b_in_reg) { 157 | generate("sw", b, para_addr); 158 | } else if (is_const(num1)) { 159 | generate("li", reg, b); 160 | generate("sw", reg, para_addr); 161 | } else { 162 | generate("lw", reg, b); 163 | generate("sw", reg, para_addr); 164 | } 165 | 166 | } else if (op == OP_CALL) { 167 | //调用者保护t 168 | show_reg_status(); 169 | vector saved_s, saved_t; 170 | vector t_old = t_reg_table; 171 | for (int i = 0; i < 10; i++) { 172 | if (t_reg_table[i] != VACANT) { 173 | saved_t.push_back(i); 174 | generate("sw", "$t" + to_string(i), 175 | to_string(STACK_T_BEGIN + 4 * i) + "($sp)"); 176 | t_reg_table[i] = VACANT; 177 | } 178 | } 179 | 180 | s_old = s_reg_table; 181 | for (int i = 0; i < 8; i++) { 182 | if (s_reg_table[i] != VACANT) { 183 | saved_s.push_back(i); 184 | generate("sw", "$s" + to_string(i), 185 | to_string(STACK_S_BEGIN + 4 * i) + "($sp)"); 186 | s_reg_table[i] = VACANT; 187 | } 188 | } 189 | if (cur_func != "main") { 190 | generate("sw", "$ra", STACK_RA); 191 | } 192 | assertion(call_func_paras.back().empty() || call_func_paras.back().begin()->stiType != para); 193 | generate("jal", num1); 194 | 195 | if (cur_func != "main") { 196 | generate("lw", "$ra", STACK_RA); 197 | } 198 | for (int i: saved_s) { 199 | generate("lw", "$s" + to_string(i), 200 | to_string(STACK_S_BEGIN 
+ 4 * i) + "($sp)"); 201 | } 202 | s_reg_table = s_old; 203 | 204 | //调用者恢复t 205 | for (int i: saved_t) { 206 | generate("lw", "$t" + to_string(i), 207 | to_string(STACK_T_BEGIN + 4 * i) + "($sp)"); 208 | } 209 | t_reg_table = t_old; 210 | if (sp_size.back() != 0) { 211 | generate("addi $sp, $sp, " + to_string(sp_size.back())); 212 | sp_size.pop_back(); 213 | } 214 | 215 | call_func_sp_offset = sum(sp_size); 216 | call_func_paras.pop_back(); 217 | } else if (op == OP_END_FUNC) { 218 | // show_reg_status(); 219 | } else if (op == OP_LABEL) { 220 | generate(num1 + ":"); 221 | } else if (op == OP_PRINT) { 222 | if (num2 == "strcon") { 223 | generate("la $a0, str__" + num1); 224 | generate("li $v0, 4"); 225 | generate("syscall"); 226 | } else if (num1 == ENDL) { 227 | generate("la $a0, newline__"); 228 | generate("li $v0, 4"); 229 | generate("syscall"); 230 | } else { 231 | //PRINT 表达式 232 | load_value(num1, "$a0"); 233 | 234 | if (num2 == "int") { 235 | generate("li $v0, 1"); 236 | } else { 237 | generate("li $v0, 11"); 238 | } 239 | generate("syscall"); 240 | } 241 | 242 | } else if (op == OP_SCANF) { 243 | SymTableItem it = SymTable::search(cur_func, num1); 244 | if (it.dataType == integer) { 245 | generate("li $v0, 5"); 246 | } else { 247 | generate("li $v0, 12"); 248 | } 249 | generate("syscall"); 250 | save_value("$v0", num1); 251 | } else if (op == OP_ASSIGN) { 252 | gen_assign(num1, num2); 253 | } else if (is_arith(op)) { 254 | /* 对于a=b+c: 255 | * abc都在寄存器/常量: add a,b,c 256 | * ab在寄存器/常量,c在内存(或反过来): lw reg2,c add a,b,reg2 257 | * a在寄存器,bc在内存: lw reg1,b lw reg2,c add a,reg1,reg2 258 | * a在内存,bc在寄存器/常量: add reg1,b,c sw reg1,a 259 | * ab在内存,c在寄存器/常量(或反过来): lw reg1,b add reg1,reg1,c sw reg1,a 260 | * abc都在内存: lw reg1,b lw reg2,c add reg1,reg1,reg2 sw reg1,a 261 | */ 262 | 263 | string instr = op_to_instr.find(op)->second; 264 | string reg1 = "$a1"; 265 | string reg2 = "$a2"; 266 | bool a_in_reg = in_reg(result) || assign_reg(result); 267 | 
//only_para:防止先被存到内存后被分配寄存器的情况 268 | bool b_in_reg_or_const = is_const(num1) || in_reg(num1) || assign_reg(num1, true); 269 | bool c_in_reg_or_const = is_const(num2) || in_reg(num2) || assign_reg(num2, true); 270 | 271 | string a = symbol_to_addr(result); 272 | string b = symbol_to_addr(num1); 273 | string c = symbol_to_addr(num2); 274 | 275 | bool is_2_pow_1 = is_const(code.num1) && is_2_power(stoi(code.num1)); 276 | bool is_2_pow_2 = is_const(code.num2) && is_2_power(stoi(code.num2)); 277 | bool sra = false; 278 | if ((op == OP_MUL || op == OP_DIV) && optimize_muldiv) { 279 | if (is_2_pow_1 && op == OP_MUL) { 280 | // mul a, 8, c : sll a, c, 3 281 | instr = "sll"; 282 | b = c; 283 | c = to_string(int(log2(stoi(code.num1)))); 284 | b_in_reg_or_const = c_in_reg_or_const; 285 | c_in_reg_or_const = true; 286 | } else if (is_2_pow_2 && op == OP_MUL) { 287 | // mul a, b, 8 : sll a, b, 3 288 | instr = "sll"; 289 | c = to_string(int(log2(stoi(code.num2)))); 290 | } 291 | 292 | else if (is_2_pow_2 && op == OP_DIV) { 293 | // mul a, b, 8 : sll a, b, 3 294 | instr = "sra"; 295 | c = to_string(int(log2(stoi(code.num2)))); 296 | } 297 | } 298 | 299 | if (a_in_reg) { 300 | if (b_in_reg_or_const && c_in_reg_or_const) { 301 | gen_arithmetic(instr, a, b, c); 302 | } else if (b_in_reg_or_const) { 303 | generate("lw", reg1, c); 304 | gen_arithmetic(instr, a, b, reg1); 305 | } else if (c_in_reg_or_const) { 306 | generate("lw", reg1, b); 307 | gen_arithmetic(instr, a, reg1, c); 308 | } else { 309 | generate("lw", reg1, b); 310 | generate("lw", reg2, c); 311 | gen_arithmetic(instr, a, reg1, reg2); 312 | } 313 | } else { 314 | if (b_in_reg_or_const && c_in_reg_or_const) { 315 | gen_arithmetic(instr, reg1, b, c); 316 | generate("sw", reg1, a); 317 | } else if (b_in_reg_or_const) { 318 | generate("lw", reg1, c); 319 | gen_arithmetic(instr, reg1, b, reg1); 320 | generate("sw", reg1, a); 321 | } else if (c_in_reg_or_const) { 322 | generate("lw", reg1, b); 323 | gen_arithmetic(instr, reg1, 
                                   reg1, c);
                    generate("sw", reg1, a);
                } else {
                    generate("lw", reg1, b);
                    generate("lw", reg2, c);
                    gen_arithmetic(instr, reg1, reg1, reg2);
                    generate("sw", reg1, a);
                }
            }

        } else if (op == OP_ARR_LOAD || op == OP_ARR_SAVE) {
            /* For a=b[c]:
             * load c into reg, then sll reg,reg,2
             * a in register, b is a global array: lw a,b(reg)
             * a in memory, b is a global array: lw reg2,b(reg) sw reg2,a
             * a in register, b is a local array: add reg,reg,offset add reg,reg,$sp lw a,0(reg)
             * a in memory, b is a local array: add reg,reg,offset add reg,reg,$sp lw reg2,0(reg) sw reg2,a
             */

            /* For b[c]=a:
             * load c into reg, then sll reg,reg,2
             * a in register, b is a global array: sw a,b(reg)
             * a in memory, b is a global array: lw reg2,a sw reg2,b(reg)
             * a in register, b is a local array: add reg,reg,offset add reg,reg,$sp sw a,0(reg)
             * a in memory, b is a local array: add reg,reg,offset add reg,reg,$sp lw reg2,a sw reg2,0(reg)
             * a is a constant, b is a global array: li reg2,a sw reg2,b(reg)
             * a is a constant, b is a local array: add reg,reg,offset add reg,reg,$sp li reg2,a sw reg2,0(reg)
             */

            string symbol = "^" + num1 + "[" + num2 + "]";

            bool a_in_reg = in_reg(result) || assign_reg(result, op == OP_ARR_SAVE);
            string a = symbol_to_addr(result);
            string reg = "$a1";
            string reg2 = "$a2";
            string item_addr;

            bool array_in_global = SymTable::in_global(cur_func, num1);

            if (begins_num(num2)) { // the index is a constant, so the offset is computed at compile time
                int offset = 4 * stoi(num2);
                if (array_in_global) {
                    item_addr = "arr__" + num1 + "_+" + to_string(offset) + "($zero)";
                }

                else if (!SymTable::search_func(cur_func).recur_func){
                    item_addr = "arr_" + cur_func + "_" + num1 + "_+" + to_string(offset) + "($zero)";
                }

                else {
                    offset += SymTable::search(cur_func, num1).addr + call_func_sp_offset;
                    item_addr = to_string(offset) + "($sp)";
                }
            } else { // the index is a variable, in memory or in a register: 4*num2+sp+call_func_sp_offset
                bool index_in_reg = in_reg(num2) || assign_reg(num2, true);
                string index = symbol_to_addr(num2);
                if (index_in_reg) {
                    generate("sll", reg, index, "2");
                } else {
                    generate("lw", reg, index);
                    generate("sll", reg, reg, "2");
                }

                if (array_in_global) {
                    item_addr = "arr__" + num1 + "_(" + reg + ")";
                }

                else if (!SymTable::search_func(cur_func).recur_func){
                    item_addr = "arr_" + cur_func + "_" + num1 + "_(" + reg + ")";
                }

                else {
                    generate("addu", reg, reg, to_string(SymTable::search(cur_func, num1).addr + call_func_sp_offset));
                    generate("addu", reg, reg, "$sp");
                    item_addr = "0(" + reg + ")";
                }
            }


            if (op == OP_ARR_LOAD) {
                if (a_in_reg) {
                    generate("lw", a, item_addr);
                } else {
                    generate("lw", reg2, item_addr);
                    generate("sw", reg2, a);
                }
            } else {
                if (a_in_reg) {
                    generate("sw", a, item_addr);
                } else if (is_const(result)) {
                    generate("li", reg2, a);
                    generate("sw", reg2, item_addr);
                } else {
                    generate("lw", reg2, a);
                    generate("sw", reg2, item_addr);
                }
            }
        } else if (op == OP_JUMP_UNCOND) {
            generate("j", num1);
        } else if (op == OP_JUMP_IF) {
            if (begins_num(num1)) {
                int v = stoi(num1);
                bool jump = num2 == "<=0" ? v <= 0 : num2 == ">=0" ? v >= 0 : num2 == "==0" ? v == 0 :
                            num2 == ">0" ? v > 0: num2 == "<0" ? v < 0: num2 == "!=0" ?
                            v != 0 : true;
                if (jump) {
                    generate("j", result);
                } else {
                    // the condition is statically false: never jumps, emit nothing
                }
                continue;
            }
            bool a_in_reg = in_reg(num1) || assign_reg(num1, true);
            string a = symbol_to_addr(num1);
            string reg = "$a1";
            if (a_in_reg) {
                reg = a;
            } else if (is_const(num1)) {
                generate("li", reg, a);
            } else {
                generate("lw", reg, a);
            }
            if (num2 == "<0") {
                generate("bltz", reg, result);
            } else if (num2 == "<=0") {
                generate("blez", reg, result);
            } else if (num2 == ">0") {
                generate("bgtz", reg, result);
            } else if (num2 == ">=0") {
                generate("bgez", reg, result);
            } else if (num2 == "==0") {
                generate("beq", reg, "$zero", result);
            } else if (num2 == "!=0") {
                generate("bne", reg, "$zero", result);
            } else {
                panic("unknown operator in jump_if: " + num2);
            }
        }
    }
}

void MipsGenerator::gen_arithmetic(const string &instr, const string &num1, const string &num2, const string &num3) {
    string reg3 = "$a3";
    if (is_const(num2)) {
        if (instr == "addu") {
            generate("addiu", num1, num3, num2);
        } else if (instr == "mul") {
            generate(instr, num1, num3, num2);
        } else if (instr == "div" || instr == "subu") {
            // div a,5,b: li reg3,5 then div a,reg3,b (the constant must be in a register first)

            generate("li", reg3, num2);
            if (instr == "div" && optimize_muldiv) {
                generate("div", reg3, num3);
                generate("mflo", num1);
                if (num1 != num3) {
                    release(num3);
                }
            } else {
                generate(instr, num1, reg3, num3);
            }
        } else {
            generate(instr, num1, num2, num3);
        }
    } else if (instr == "div" && !is_const(num3) && optimize_muldiv) {
        generate("div", num2, num3);
        generate("mflo", num1);
        if (num1 != num2) {
            release(num2);
        }
        if (num1 != num3) {
            release(num3);
        }
    } else if (instr ==
"sra") { 496 | string label = assign_label(); 497 | string label2 = assign_label(); 498 | generate("bgez", num2, label); 499 | generate("subu", reg3, "$zero", num2); 500 | generate(instr, num1, reg3, num3); 501 | generate("subu", num1, "$zero", num1); 502 | generate("j", label2); 503 | generate(label + ":"); 504 | generate(instr, num1, num2, num3); 505 | generate(label2 + ":"); 506 | } else if (instr == "addu" && is_const(num3)) { 507 | if (num3 == "1073741824") { 508 | generate("lui", reg3, "0x4000"); 509 | generate("addu", num1, num2, reg3); 510 | } else { 511 | generate("addiu", num1, num2, num3); 512 | } 513 | } else if (instr == "subu" && is_const(num3)) { 514 | if (num3 == "1073741824") { 515 | generate("lui", reg3, "0xc000"); 516 | generate("addu", num1, num2, reg3); 517 | } else { 518 | string neg = num3[0] == '+' ? '-' + num3.substr(1) 519 | : num3[0] == '-' ? num3.substr(1) : '-' + num3; 520 | generate("addiu", num1, num2, neg); 521 | } 522 | } else { 523 | generate(instr, num1, num2, num3); 524 | } 525 | } 526 | 527 | 528 | void MipsGenerator::gen_assign(const string &num1, const string &num2) { 529 | /* 对于a=b: 530 | * a在寄存器,b在寄存器:move a,b 531 | * a在寄存器,b在内存:lw a,b 532 | * a在寄存器,b为常量:li a,b 533 | * 534 | * a在内存,b在寄存器:sw b,a 535 | * a在内存,b在内存:lw reg,b sw reg,a 536 | * a在内存,b为常量 li reg,b sw reg,a 537 | */ 538 | 539 | bool a_in_reg = in_reg(num1) || assign_reg(num1); 540 | 541 | string a = symbol_to_addr(num1); 542 | 543 | string reg = "$a1"; 544 | 545 | if (a_in_reg) { 546 | load_value(num2, a); 547 | } else { 548 | bool b_in_reg = in_reg(num2) || assign_reg(num2, true); 549 | string b = symbol_to_addr(num2); 550 | if (b_in_reg) { 551 | generate("sw", b, a); 552 | } else if (is_const(num2)) { 553 | generate("li", reg, b); 554 | generate("sw", reg, a); 555 | } else { 556 | generate("lw", reg, b); 557 | generate("sw", reg, a); 558 | } 559 | } 560 | } 561 | 562 | 563 | //将symbol的值读到对应寄存器 564 | void MipsGenerator::load_value(const string &symbol, const string 
        &reg) {
    bool inreg = in_reg(symbol) || assign_reg(symbol, true);
    string addr = symbol_to_addr(symbol);
    if (inreg) {
        generate("move", reg, addr);
    } else if (is_const(symbol)) {
        generate("li", reg, addr);
    } else {
        generate("lw", reg, addr);
    }
}

// Store the value held in reg to the location of symbol
void MipsGenerator::save_value(const string &reg, const string &symbol) {
    bool inreg = in_reg(symbol) || assign_reg(symbol);
    string addr = symbol_to_addr(symbol);
    if (inreg) {
        generate("move " + addr + ", " + reg);
    } else if (!is_const(symbol)) {
        generate("sw " + reg + ", " + addr);
    } else {
        panic(symbol + " not in memory or reg");
    }
}

bool MipsGenerator::assign_reg(const string &symbol, bool only_para) {
    if (!SymTable::in_global(cur_func, symbol) && optimize_assign_reg && !is_const(symbol)) {
        SymTableItem item = SymTable::search(cur_func, symbol);
        if (item.stiType == var && !only_para) {
            string sreg = assign_s_reg(symbol);
            if (sreg != INVALID) {
                return true;
            }
        }

//        else if (item.stiType == para) {
//            string sreg = assign_s_reg(symbol);
//            if (sreg != INVALID) {
//                generate("lw", sreg, to_string(item.addr + call_func_sp_offset) + "($sp)");
//                return true;
//            }
//        }

        else if (item.stiType == tmp && !only_para) {
            string treg = assign_t_reg(symbol);
            if (treg != INVALID) {
                return true;
            }
        }
    }
    return false;
}

bool MipsGenerator::in_reg(const string &symbol) {
    if (symbol == "0" || symbol == "%RET") {
        return true;
    }
    if (is_const(symbol)) {
        return false;
    }
    for (int i = 0; i < 10; i++) {
        if (t_reg_table[i] == symbol) {
            return true;
        }
    }
    for (int i = 0; i < 8; i++) {
        if (s_reg_table[i] == symbol) {
            return true;
        }
    }

    return false;
}

bool MipsGenerator::in_memory(const string &symbol) {
    return !is_const(symbol) && !in_reg(symbol);
}

bool MipsGenerator::is_const(const string &symbol) {
    return begins_num(symbol) && symbol != "0";
}

// Return the register, memory address, or constant value that symbol maps to
string MipsGenerator::symbol_to_addr(const string &symbol) {
    if (begins_num(symbol)) {
        if (symbol == "0") {
            return "$zero"; // the zero register
        }
        return symbol; //int
    }

    if (symbol == "%RET") {
        return "$v0";
    }

    for (int i = 0; i < 10; i++) {
        if (t_reg_table[i] == symbol) {
            string t_reg = "$t" + to_string(i);
            return t_reg;
        }
    }
    for (int i = 0; i < 8; i++) {
        if (s_reg_table[i] == symbol) {
            return "$s" + to_string(i);
        }
    }

    SymTableItem item = SymTable::search(cur_func, symbol);
    if (SymTable::in_global(cur_func, symbol)) { // symbol is not a local variable
        SymTableItem global = SymTable::try_search(cur_func, symbol, true);
        if (global.valid && (global.stiType == var || global.stiType == tmp)) {
            return to_string(item.addr - LOCAL_ADDR_INIT) + "($gp)";
        }
    }
    return to_string(item.addr + call_func_sp_offset) + "($sp)";
}

string MipsGenerator::assign_t_reg(const string &name) {
    for (int i = 0; i < 10; i++) {
        if (t_reg_table[i] == VACANT) {
            t_reg_table[i] = name;
            return "$t" + to_string(i);
        }
    }
    return INVALID;
}

string MipsGenerator::assign_s_reg(const string &name) {
    for (int i = 0; i < 8; i++) {
        if (s_reg_table[i] == VACANT) {
            s_reg_table[i] = name;
            return "$s" + to_string(i);
        }
    }

    return INVALID;
}

void MipsGenerator::show_reg_status() {
    cout << "==========REG TABLE==========" << endl;
    for (int i = 0; i < 10; i += 2) {
        cout << "$t" << i << ": " <<
                t_reg_table[i] << " ";
        cout << "$t" << i + 1 << ": " << t_reg_table[i + 1] << endl;
    }
    for (int i = 0; i < 8; i += 2) {
        cout << "$s" << i << ": " << s_reg_table[i] << " ";
        cout << "$s" << i + 1 << ": " << s_reg_table[i + 1] << endl;
    }
    cout << "=============================" << endl;
}

#pragma clang diagnostic pop
--------------------------------------------------------------------------------
/MipsGenerator.h:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/11/5.
//

#ifndef COMPILER_MIPSGENERATOR_H
#define COMPILER_MIPSGENERATOR_H

#include
#include

#include "PseudoCode.h"
#include "Error.h"
#include "SymTable.h"

#define STACK_RA "0($sp)"
#define STACK_V0 "4($sp)" //unused
#define STACK_A_BEGIN 8
#define STACK_S_BEGIN 24
#define STACK_T_BEGIN 56
#define STACK_RESERVED "96($sp)"



class MipsGenerator {
public:
    vector<PseudoCode> mid;
    vector<string> mips;
    vector<string> strcons;
    vector<string> s_reg_table = {
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT
    };
    vector<string> t_reg_table = {
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT,
            VACANT
    };
    string regs[32] = {
            "$zero", "$at", "$v0", "$v1",
            "$a0", "$a1", "$a2", "$a3",
            "$t0", "$t1", "$t2", "$t3",
            "$t4", "$t5", "$t6", "$t7",
            "$s0", "$s1", "$s2", "$s3",
            "$s4", "$s5", "$s6", "$s7",
            "$t8", "$t9", "$k0", "$k1",
            "$gp", "$sp", "$fp", "$ra"
    };
    map<string, string> op_to_instr = {
            {OP_ADD, "addu"},
            {OP_SUB, "subu"},
            {OP_MUL, "mul"},
            {OP_DIV, "div"},
    };
    string cur_func = GLOBAL;
    vector> call_func_paras;
    vector<int> sp_size = {0};
    int call_func_sp_offset = 0;
    int tmp_label_idx = 1;

    bool optimize_assign_reg = false;
    bool optimize_muldiv = false;

    MipsGenerator(): mid(PseudoCodeList::codes), strcons(PseudoCodeList::strcons) {};

    void generate(const string &code);

    void generate(const string &op, const string &num1);

    void generate(const string &op, const string &num1, const string& num2);

    void generate(const string &op, const string &num1, const string& num2, const string& num3);

    void gen_assign(const string &num1, const string &num2);

    void translate();

    void gen_arithmetic(const string &op, const string &num1, const string& num2, const string& num3);

    void show() {
        cout << "=============Mips code=============" << endl;
        for (auto &code: mips) {
            cout << code << endl;
        }
    }

    void save_to_file(const string &out_path) {
        ofstream out(out_path);
        for (auto &c: mips) {
            out << c << endl;
        }
        out.close();
    }

    void load_value(const string &symbol, const string &reg);

    void save_value(const string &reg, const string &symbol);

    string symbol_to_addr(const string &);

    string assign_t_reg(const string &);

    string assign_s_reg(const string &);

    bool assign_reg(const string& symbol, bool only_para=false);

    bool in_reg(const string& symbol);

    bool in_memory(const string& symbol);

    static bool is_const(const string& symbol) ;

    void show_reg_status();

    void release(string);

    string assign_label() {
        string ret = "tmp_label_" + to_string(tmp_label_idx);
        tmp_label_idx++;
        return ret;
    }

//    string allocate_memory();
};



#endif //COMPILER_MIPSGENERATOR_H

--------------------------------------------------------------------------------
/PseudoCode.cpp:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/11/5.
//

#include "PseudoCode.h"

vector<PseudoCode> PseudoCodeList::codes;
int PseudoCodeList::code_index = 1;
vector<string> PseudoCodeList::strcons;
int PseudoCodeList::label_index = 1;
map> PseudoCodeList::blocks;
vector<DAGNode> PseudoCodeList::DAGNodes;
map PseudoCodeList::NodesMap;
int PseudoCodeList::call_times = 0;


string PseudoCode::to_str() const {
    if (op == OP_FUNC || op == OP_END_FUNC) {
        return "=========" + op + " " + num1 + " " + num2 + "=========";
    }
    if (op == OP_ASSIGN) {
        return num1 + " = " + num2;
    }
    if (op == OP_ARR_LOAD) {
        return result + " = " + num1 + "[" + num2 + "]";
    }
    if (op == OP_ARR_SAVE) {
        return num1 + "[" + num2 + "] = " + result;
    }
    if (op == OP_JUMP_IF) {
        return op + " " + num1 + num2 + " " + result;
    }
    if (result != VACANT) {
        return result + " = " + num1 + " " + op + " " + num2;
    }
    if (num1 == VACANT && num2 == VACANT) {
        return op;
    }
    if (num2 == VACANT) {
        return op + " " + num1;
    }
    return op + " " + num1 + " " + num2;
}

string PseudoCode::to_standard_format() const {
    if (op == OP_FUNC) {
        return num1 + " " + num2 + "()";
    }
    return INVALID;
}

void PseudoCodeList::refactor() {
    string cur_func = GLOBAL;
    for (auto &code : codes) {
        string n1 = code.num1;
        string n2 = code.num2;
        string op = code.op;
        string r = code.result;
        if (op == OP_FUNC) {
            cur_func = code.num2;
        }
        if (n1[0] == '\'') {
            code.num1 = to_string(n1[1]); //char
            n1 = code.num1;
        }
        if (n2[0] == '\'') {
            code.num2 = to_string(n2[1]); //char
            n2 = code.num2;
        }
        if (r[0] == '\'') {
            code.result = to_string(r[1]); //char
            r = code.result;
        }
        if (op == OP_ADD ||
            op == OP_MUL) {
            if (begins_num(n1) && begins_num(n2)) {
                int result = (op == OP_ADD) ? stoi(n1) + stoi(n2) : stoi(n1) * stoi(n2);
                code = PseudoCode(OP_ASSIGN, r, to_string(result), VACANT);
            } else if (begins_num(n1)) { // only the form addu $t1, $t2, 5 is allowed, so swap the two operands
                code.num1 = n2;
                code.num2 = n1;
                n1 = code.num1;
                n2 = code.num2;
            }
        }
        if (op == OP_SUB) {
            if (begins_num(n1) && begins_num(n2)) {
                code = PseudoCode(OP_ASSIGN, r, to_string(stoi(n1) - stoi(n2)), VACANT);
            } else if (begins_num(n1)) { //y=5-x: z=x-5; z=-y
                // already handled in Grammar
            }
        }
        if (op == OP_DIV) {
            if (begins_num(n1) && begins_num(n2)) {
                code = PseudoCode(OP_ASSIGN, r, to_string(stoi(n1) / stoi(n2)), VACANT);
            } else if (begins_num(n1)) { //y=5/x: z=5; y=z/x
                // already handled in Grammar
            }
        }
    }
}

void PseudoCodeList::remove_redundant_assign() {
    vector<PseudoCode> new_codes;
    for (int i = 0; i < codes.size() - 1; i++) {
        PseudoCode c1 = codes[i];
        PseudoCode c2 = codes[i + 1];
        if (c2.op == OP_ASSIGN && c2.num1[0] != '#' && c1.result[0] == '#' && c1.result == c2.num2 &&
            (is_arith(c1.op) || c1.op == OP_ARR_LOAD)) {
            new_codes.emplace_back(c1.op, c1.num1, c1.num2, c2.num1);
            i++;
        } else if (c1.op == OP_ASSIGN && c1.num1[0] == '#' && c1.num1 == c2.num1 &&
                   (is_arith(c2.op) || c2.op == OP_ARR_SAVE)) {
            new_codes.emplace_back(c2.op, c1.num2, c2.num2, c2.result);
            i++;
        } else if (c1.op == OP_ASSIGN && c1.num1[0] == '#' && c1.num1 == c2.num2 &&
                   (is_arith(c2.op) || c2.op == OP_ARR_SAVE)) {
            new_codes.emplace_back(c2.op, c2.num1, c1.num2, c2.result);
            i++;
        } else if (c1.op == OP_PRINT && c2.op == OP_PRINT && c1.num2 == "strcon" && c2.num1 == ENDL) {
            strcons[stoi(c1.num1)] += "\\n";
            new_codes.emplace_back(OP_PRINT, c1.num1, c1.num2, VACANT);
            i++;
        } else if (i == codes.size() - 2) {
            new_codes.push_back(c1);
            new_codes.push_back(c2);
        } else {
            new_codes.push_back(c1);
        }
    }
    codes = new_codes;
}

void PseudoCodeList::const_broadcast() {
    string cur_func = GLOBAL;
    vector<PseudoCode> new_codes;
    for (auto &c : codes) {
        SymTableItem *it1, *it2, *itr;
        if (!begins_num(c.num1)) {
            it1 = &SymTable::ref_search(cur_func, c.num1);
        }
        if (!begins_num(c.num2)) {
            it2 = &SymTable::ref_search(cur_func, c.num2);
        }
        if (!begins_num(c.result) && c.result != VACANT) {
            itr = &SymTable::ref_search(cur_func, c.result);
        }
        bool num1_can_cal = begins_num(c.num1) || !it1->const_value.empty();
        bool num2_can_cal = begins_num(c.num2) || !it2->const_value.empty();

        if (c.op == OP_ARR_SAVE && c.result == "#T33") {
            cout << (c.op == OP_ARR_SAVE) << endl;
            cout << begins_num(c.result) << endl;
            cout << !itr->const_value.empty() << endl;
        }

        if (c.op == OP_ASSIGN && num2_can_cal && it1->stiType == tmp) {
            it1->const_value = begins_num(c.num2) ? c.num2 : it2->const_value;
        } else if (num1_can_cal && num2_can_cal) {
            int v1 = stoi(begins_num(c.num1) ? c.num1 : it1->const_value);
            int v2 = stoi(begins_num(c.num2) ? c.num2 : it2->const_value);
            if (is_arith(c.op)) {
                int r = c.op == OP_ADD ? v1 + v2 : c.op == OP_SUB ? v1 - v2 :
                        c.op == OP_MUL ? v1 * v2 : c.op == OP_DIV ? v1 / v2 : 23333;
                if (itr->stiType == tmp) {
                    itr->const_value = to_string(r);
                } else {
                    new_codes.emplace_back(OP_ASSIGN, c.result, to_string(r), VACANT);
                }
            } else {
                new_codes.emplace_back(c.op, to_string(v1), to_string(v2), c.result);
            }

        } else if (num1_can_cal) {
            int v1 = stoi(begins_num(c.num1) ?
                          c.num1 : it1->const_value);

            if (c.op == OP_SUB || c.op == OP_DIV) {
                if (c.op == OP_DIV && v1 == 0) {
                    //y=0/x: y=0
                    if (itr->stiType == tmp) {
                        itr->const_value = "0";
                    } else {
                        new_codes.emplace_back(OP_ASSIGN, c.result, "0", VACANT);
                    }
                } else {
                    // y=5-x y=5/x
                    if (c.num1 != to_string(v1)) {
                        new_codes.emplace_back(OP_ASSIGN, c.num1, to_string(v1), VACANT);
                    }
                    new_codes.push_back(c);
                }
            } else if ((c.op == OP_ADD && v1 == 0) || (c.op == OP_MUL && v1 == 1)) {
                //y=0+x: y=x
                new_codes.emplace_back(OP_ASSIGN, c.result, c.num2, VACANT);
            } else if ((c.op == OP_MUL && v1 == 0)) {
                //y=0*x: y=0
                if (itr->stiType == tmp) {
                    itr->const_value = "0";
                } else {
                    new_codes.emplace_back(OP_ASSIGN, c.result, "0", VACANT);
                }
            } else {
                new_codes.emplace_back(c.op, to_string(v1), c.num2, c.result);
            }

        } else if (c.op == OP_ARR_SAVE && (begins_num(c.result) || !itr->const_value.empty())) {
            int vr = stoi(begins_num(c.result) ? c.result : itr->const_value);
            if (num2_can_cal) {
                int v2 = stoi(begins_num(c.num2) ? c.num2 : it2->const_value);
                new_codes.emplace_back(c.op, c.num1, to_string(v2), to_string(vr));
            } else {
                new_codes.emplace_back(c.op, c.num1, c.num2, to_string(vr));
            }
        } else if (num2_can_cal) {
            int v2 = stoi(begins_num(c.num2) ?
                          c.num2 : it2->const_value);
            if ((c.op == OP_ADD && v2 == 0) || (c.op == OP_SUB && v2 == 0) ||
                (c.op == OP_MUL && v2 == 1) || (c.op == OP_DIV && v2 == 1)) {
                //y=a+0, y=a-0, y=a*1, y=a/1
                new_codes.emplace_back(OP_ASSIGN, c.result, c.num1, VACANT);
            } else if ((c.op == OP_MUL && v2 == 0)) {
                //y=a*0
                if (itr->stiType == tmp) {
                    itr->const_value = "0";
                } else {
                    new_codes.emplace_back(OP_ASSIGN, c.result, "0", VACANT);
                }
            } else {
                new_codes.emplace_back(c.op, c.num1, to_string(v2), c.result);
            }
        } else if (c.op == OP_FUNC) {
            cur_func = c.num2;
            new_codes.push_back(c);
        } else {
            new_codes.push_back(c);
        }
    }
    codes = new_codes;
}

void PseudoCodeList::remove_redundant_tmp() {

    bool modified;
    do {
        string cur_func = GLOBAL;
        vector<PseudoCode> new_codes;
        modified = false;
        for (int i = 0; i < codes.size() - 1; i++) {
            PseudoCode c1 = codes[i];
            PseudoCode c2 = codes[i + 1];
            SymTableItem *it11, *it12, *it1r;
            SymTableItem *it21, *it22, *it2r;
            if (!begins_num(c1.num1)) {
                it11 = &SymTable::ref_search(cur_func, c1.num1);
            }
            if (!begins_num(c1.num2)) {
                it12 = &SymTable::ref_search(cur_func, c1.num2);
            }
            if (!begins_num(c1.result) && c1.result != VACANT) {
                it1r = &SymTable::ref_search(cur_func, c1.result);
            }
            if (!begins_num(c2.num1)) {
                it21 = &SymTable::ref_search(cur_func, c2.num1);
            }
            if (!begins_num(c2.num2)) {
                it22 = &SymTable::ref_search(cur_func, c2.num2);
            }
            if (!begins_num(c2.result) && c2.result != VACANT) {
                it2r = &SymTable::ref_search(cur_func, c2.result);
            }

            if (c1.op != OP_ASSIGN && c2.op != OP_ASSIGN && c2.result != VACANT &&
                !begins_num(c2.result) && !begins_num(c1.result) && !begins_num(c2.num1) &&
                it2r->stiType == tmp && it21->stiType == tmp && it1r->name
                == it21->name
                && begins_num(c1.num2) && begins_num(c2.num2)) {
                if (c1.op == c2.op) {
                    if (c1.op == OP_ADD || c1.op == OP_SUB) {
                        new_codes.emplace_back(c1.op, c1.num1, to_string(stoi(c1.num2) + stoi(c2.num2)), c2.result);
                        modified = true;
                        i++;
                    } else if (c1.op == OP_MUL || c1.op == OP_DIV) {
                        new_codes.emplace_back(c1.op, c1.num1, to_string(stoi(c1.num2) * stoi(c2.num2)), c2.result);
                        modified = true;
                        i++;
                    } else {
                        new_codes.push_back(c1);
                    }
                } else if ((c1.op == OP_ADD && c2.op == OP_SUB) || (c1.op == OP_SUB && c2.op == OP_ADD)) {
                    new_codes.emplace_back(c1.op, c1.num1, to_string(stoi(c1.num2) - stoi(c2.num2)), c2.result);
                    modified = true;
                    i++;
                }

//                else if (((c1.op == OP_MUL && c2.op == OP_DIV) || (c1.op == OP_DIV && c2.op == OP_MUL)) &&
//                (c1.num2 == c2.num2) && stoi(c1.num2) <= 2) {
//                    new_codes.emplace_back(OP_ASSIGN, c2.result, c1.num1, VACANT);
//                    modified = true;
//                    i++;
//                }

                else {
                    new_codes.push_back(c1);
                }

            } else if (c1.op == OP_FUNC) {
                cur_func = c1.num2;
                new_codes.push_back(c1);
            } else if (i == codes.size() - 2) {
                new_codes.push_back(c1);
                new_codes.push_back(c2);
            } else {
                new_codes.push_back(c1);
            }
        }
        codes = new_codes;
    } while (modified);
}

void PseudoCodeList::remove_tripple() {
    string cur_func = GLOBAL;
    vector<PseudoCode> new_codes;
    for (int i = 0; i < codes.size() - 2; i++) {
        PseudoCode c1 = codes[i];
        PseudoCode c2 = codes[i + 1];
        PseudoCode c3 = codes[i + 2];
        SymTableItem *it11, *it12, *it1r;
        SymTableItem *it21, *it22, *it2r;
        if (!begins_num(c1.num1)) {
            it11 = &SymTable::ref_search(cur_func, c1.num1);
        }
        if (!begins_num(c1.num2)) {
            it12 = &SymTable::ref_search(cur_func, c1.num2);
        }
        if (!begins_num(c1.result) &&
            c1.result != VACANT) {
            it1r = &SymTable::ref_search(cur_func, c1.result);
        }
        if (!begins_num(c2.num1)) {
            it21 = &SymTable::ref_search(cur_func, c2.num1);
        }
        if (!begins_num(c2.num2)) {
            it22 = &SymTable::ref_search(cur_func, c2.num2);
        }
        if (!begins_num(c2.result) && c2.result != VACANT) {
            it2r = &SymTable::ref_search(cur_func, c2.result);
        }
        if (is_arith(c1.op) && c1.op == c2.op && c1.num1 == c2.num1 && c1.num2 == c2.num2 &&
            it1r->stiType == tmp && it2r->stiType == tmp && c3.num1 == c1.result && c3.num2 == c2.result) {
            if (c3.op == OP_SUB) {
                // t1 = x op y; t2 = x op y; r = t1 - t2  =>  r = 0
                new_codes.emplace_back(OP_ASSIGN, c3.result, "0", VACANT);
                i += 2;
            } else if (c3.op == OP_DIV) {
                // t1 = x op y; t2 = x op y; r = t1 / t2  =>  r = 1
                new_codes.emplace_back(OP_ASSIGN, c3.result, "1", VACANT);
                i += 2;
            } else {
                // c3 is not foldable: keep c1 instead of silently dropping it
                new_codes.push_back(c1);
                if (i == codes.size() - 3) {
                    new_codes.push_back(c2);
                    new_codes.push_back(c3);
                }
            }
        } else if (i == codes.size() - 3) {
            new_codes.push_back(c1);
            new_codes.push_back(c2);
            new_codes.push_back(c3);
        } else if (c1.op == OP_FUNC) {
            cur_func = c1.num2;
            new_codes.push_back(c1);
        } else {
            new_codes.push_back(c1);
        }
    }
    codes = new_codes;
}

void PseudoCodeList::interpret() {
    //TODO: array
    vector<PseudoCode> new_codes;
    int i = 0;
    while (codes[i].op != OP_FUNC) {
        if (codes[i].op != OP_ASSIGN) {
            panic("global code not assign statement: " + codes[i].to_str());
        }
        if (!begins_num(codes[i].num2)) {
            panic("not begins num: " + codes[i].num2);
        }
        SymTableItem *it1 = &SymTable::ref_search(GLOBAL, codes[i].num1);
        it1->const_value = codes[i].num2;
        new_codes.push_back(codes[i]);
        i++;
    }
    string cur_func = GLOBAL;
    for (; i < codes.size(); i++) {
        PseudoCode c = codes[i];
        SymTableItem *it1, *it2, *itr;
        if (!begins_num(c.num1)) {
            it1 = &SymTable::ref_search(cur_func, c.num1);
        }
        if (!begins_num(c.num2)) {
            it2 = &SymTable::ref_search(cur_func,
                                          c.num2);
        }
        if (!begins_num(c.result) && c.result != VACANT) {
            itr = &SymTable::ref_search(cur_func, c.result);
        }


        if (c.op == OP_SCANF) {
            // flush every modified value back out as an explicit assignment
            for (auto &item: SymTable::global) {
                if (item.modified) {
                    new_codes.emplace_back(OP_ASSIGN, item.name, item.const_value, VACANT);
                    item.modified = false;
                }
            }
            for (auto &item: SymTable::local[cur_func]) {
                if (item.modified && item.stiType != tmp) {
                    new_codes.emplace_back(OP_ASSIGN, item.name, item.const_value, VACANT);
                    item.modified = false;
                }
            }
            while (codes[i].op != OP_END_FUNC) {
                new_codes.push_back(codes[i]);
                i++;
            }
            new_codes.push_back(codes[i]);
        } else if (c.op == OP_PRINT) {
            if (c.num2 == "int" || c.num2 == "char") {
                if (it1->valid && it1->dataType == character) {
                    char ch = (char) stoi(it1->const_value);
                    // string(1, ch) builds a one-character string; string(&ch) would read past ch
                    PseudoCodeList::strcons.push_back("@" + string(1, ch));
                } else if (c.num2 == "char") {
                    char ch = (char) stoi(c.num1);
                    PseudoCodeList::strcons.push_back("@" + string(1, ch));
                } else if (it1->valid && it1->dataType == integer) {
                    PseudoCodeList::strcons.push_back(it1->const_value);
                } else {
                    PseudoCodeList::strcons.push_back(c.num1);
                }
                new_codes.emplace_back(OP_PRINT, to_string(PseudoCodeList::strcons.size() - 1), "strcon", VACANT);
            } else {
                new_codes.push_back(codes[i]);
            }
        } else if (c.op == OP_END_FUNC) {
            for (auto &item: SymTable::global) {
                if (item.modified) {
                    new_codes.emplace_back(OP_ASSIGN, item.name, item.const_value, VACANT);
                    item.modified = false;
                }
            }
            new_codes.push_back(codes[i]);
        } else if (c.op == OP_FUNC) {
            cur_func = c.num2;
            new_codes.push_back(codes[i]);
        } else if (c.op == OP_ASSIGN) {
            it1->const_value = begins_num(c.num2) ?
                               c.num2 : it2->const_value;
            it1->modified = true;
        } else if (is_arith(c.op)) {
            int v1 = stoi(begins_num(c.num1) ? c.num1 : it1->const_value);
            int v2 = stoi(begins_num(c.num2) ? c.num2 : it2->const_value);
            int r = c.op == OP_ADD ? v1 + v2 : c.op == OP_SUB ? v1 - v2 :
                    c.op == OP_MUL ? v1 * v2 : c.op == OP_DIV ? v1 / v2 : 23333;
            itr->const_value = to_string(r);
            itr->modified = true;
        }
    }
    codes = new_codes;
}

void PseudoCodeList::divide_basic_blocks() {
    map<string, set<int>> basic_block_idx;
    string cur_func = GLOBAL;
    for (int i = 0; i < codes.size(); i++) {
        string op = codes[i].op;
        PseudoCode c = codes[i];
        if (op == OP_FUNC) {
            cur_func = c.num2;
            basic_block_idx[cur_func] = set<int>();
            // the first statement of a function is a block leader
            for (int j = i + 1; codes[j].op != OP_END_FUNC; j++) {
                if (codes[j + 1].op == OP_END_FUNC) {
                    basic_block_idx[cur_func].insert(j + 1);
                }
                if (codes[j].op != OP_LABEL) {
                    basic_block_idx[cur_func].insert(j);
                    break;
                }
            }
        } else if (cur_func == GLOBAL) {
            // global variable assignments are not divided into basic blocks
            continue;
        }

        if (op == OP_LABEL || op == OP_JUMP_IF || op == OP_JUMP_UNCOND
            || op == OP_RETURN || op == OP_CALL) {
            for (int j = i + 1; codes[j].op != OP_END_FUNC; j++) {
                if (codes[j + 1].op == OP_END_FUNC) {
                    basic_block_idx[cur_func].insert(j + 1);
                }
                if (codes[j].op != OP_LABEL) {
                    basic_block_idx[cur_func].insert(j);
                    break;
                }
            }
        }
        if (op == OP_END_FUNC) {
            basic_block_idx[cur_func].insert(i);
        }
    }


    int index = 0;
    for (auto &it: basic_block_idx) {
        blocks[it.first] = vector();
        vector<int> vec(basic_block_idx[it.first].begin(), basic_block_idx[it.first].end());
        for (int i = 0; i < vec.size() - 1; i++) {
            blocks[it.first].emplace_back(index++, vec[i], vec[i + 1] - 1);
        }
    }

511 | cout << "=====basic blocks:=====" << endl; 512 | for (auto &it: blocks) { 513 | cout << "===function " << it.first << "===" << endl; 514 | for (auto &b: blocks[it.first]) { 515 | cout << "#" << b.index << " " << b.start << "~" << b.end << endl; 516 | } 517 | } 518 | 519 | //TODO: build the flow graph 520 | } 521 | 522 | void PseudoCodeList::gen_DAG_graph(int begin, int end) { 523 | for (int idx = begin; idx <= end; idx++) { 524 | PseudoCode c = codes[idx]; 525 | string num1 = c.op == OP_ARR_SAVE ? c.num2 : c.num1; 526 | string num2 = c.op == OP_ARR_SAVE ? c.result : c.num2; 527 | string result = c.op == OP_ARR_SAVE ? c.num1 : c.result; 528 | if (c.op == OP_ASSIGN) { 529 | unsigned int k = -1; 530 | for (auto &n: NodesMap) { 531 | if (n.first == num2) { 532 | k = n.second; 533 | break; 534 | } 535 | } 536 | if (k == -1) { 537 | k = DAGNodes.size(); 538 | string name = num2[0] == '#' ? num2 : num2 + ""; 539 | DAGNodes.emplace_back(k, name, true); 540 | NodesMap[num2] = k; 541 | } 542 | DAGNodes[k].symbols.push_back(num1); 543 | 544 | bool find = false; 545 | for (auto &n: NodesMap) { 546 | if (n.first == num1) { 547 | n.second = k; 548 | find = true; 549 | break; 550 | } 551 | } 552 | if (!find) { 553 | NodesMap[num1] = k; 554 | } 555 | 556 | continue; 557 | } 558 | 559 | unsigned int i = -1; 560 | for (auto &n: NodesMap) { 561 | if (n.first == num1) { 562 | i = n.second; 563 | break; 564 | } 565 | } 566 | if (i == -1) { 567 | i = DAGNodes.size(); 568 | string name = num1[0] == '#' ? num1 : num1 + ""; 569 | DAGNodes.emplace_back(i, name, true); 570 | NodesMap[num1] = i; 571 | } 572 | 573 | unsigned int j = -1; 574 | for (auto &n: NodesMap) { 575 | if (n.first == num2) { 576 | j = n.second; 577 | break; 578 | } 579 | } 580 | if (j == -1) { 581 | j = DAGNodes.size(); 582 | string name = num2[0] == '#' ?
num2 : num2 + ""; 583 | DAGNodes.emplace_back(j, name, true); 584 | NodesMap[num2] = j; 585 | } 586 | 587 | unsigned int k = -1; 588 | for (auto &n: DAGNodes) { 589 | if (n.name == c.op && n.children[0] == i && n.children[1] == j) { 590 | k = n.index; 591 | break; 592 | } 593 | } 594 | if (k == -1) { 595 | k = DAGNodes.size(); 596 | DAGNode new_node(DAGNodes.size(), c.op, false); 597 | new_node.children.push_back(i); 598 | new_node.children.push_back(j); 599 | DAGNodes[i].parents.push_back(k); 600 | DAGNodes[j].parents.push_back(k); 601 | DAGNodes.push_back(new_node); 602 | } 603 | DAGNodes[k].symbols.push_back(result); 604 | 605 | bool find = false; 606 | for (auto &n1: NodesMap) { 607 | if (n1.first == result) { 608 | n1.second = k; 609 | find = true; 610 | break; 611 | } 612 | } 613 | if (!find) { 614 | NodesMap[result] = k; 615 | } 616 | } 617 | 618 | for (auto &n: DAGNodes) { 619 | bool has_find = false; 620 | for (auto &name: n.symbols) { 621 | if (name[0] != '#') { 622 | n.primary_symbol = name; 623 | has_find = true; 624 | break; 625 | } 626 | } 627 | if (!has_find) { 628 | n.primary_symbol = n.symbols[0]; 629 | } 630 | } 631 | } 632 | 633 | vector<PseudoCode> PseudoCodeList::DAG_output() { 634 | vector<PseudoCode> ret; 635 | vector<DAGNode> queue; 636 | cout << "total number of DAGNodes: " << DAGNodes.size() << endl; 637 | while (true) { 638 | bool break_flag = true; 639 | for (auto &n: DAGNodes) { 640 | if (!n.is_leaf && !n.in_queue) { 641 | break_flag = false; 642 | //cout << "dag output loop1" << endl; 643 | } 644 | } 645 | if (break_flag) { 646 | break; 647 | } 648 | 649 | int i; 650 | for (i = 0; i < DAGNodes.size(); i++) { 651 | if (DAGNodes[i].parents.empty() && !DAGNodes[i].is_leaf && !DAGNodes[i].in_queue) { 652 | queue.push_back(DAGNodes[i]); 653 | DAGNodes[i].in_queue = true; 654 | cout << "Node" << i << "enqueue" << endl; 655 | break; 656 | } 657 | } 658 | for (auto &node: DAGNodes) { 659 | vector<int> new_parents; 660 | for (auto &parent: node.parents) { 661 | if (parent != i) {
662 | new_parents.push_back(parent); 663 | } 664 | } 665 | node.parents = new_parents; 666 | } 667 | if (!DAGNodes[i].children.empty()) { 668 | int child_id = DAGNodes[i].children[0]; 669 | DAGNode cur = DAGNodes[child_id]; 670 | while (cur.parents.empty() && !cur.is_leaf && !DAGNodes[i].in_queue) { 671 | queue.push_back(cur); 672 | cur.in_queue = true; 673 | for (auto &node: DAGNodes) { 674 | vector new_parents; 675 | for (auto &parent: node.parents) { 676 | if (parent != child_id) { 677 | new_parents.push_back(parent); 678 | } 679 | } 680 | node.parents = new_parents; 681 | } 682 | if (!cur.children.empty()) { 683 | child_id = cur.children[0]; 684 | cur = DAGNodes[child_id]; 685 | } else { 686 | break; 687 | } 688 | } 689 | } 690 | } 691 | 692 | for (auto &n : DAGNodes) { 693 | if (n.is_leaf) { 694 | for (auto &name: n.symbols) { 695 | if (name[0] != '#' && name != n.primary_symbol && NodesMap[name] == n.index) { 696 | ret.emplace_back(OP_ASSIGN, name, n.primary_symbol, VACANT); 697 | } 698 | } 699 | } 700 | } 701 | 702 | for (int i = (int) queue.size() - 1; i >= 0; i--) { 703 | // cout << "queue[" << i << "]: " << queue[i].primary_symbol << endl; 704 | assertion(can_dag(queue[i].name)); 705 | bool has_print = false; 706 | for (auto &name: queue[i].symbols) { 707 | if (name[0] != '#') { 708 | if (queue[i].name == OP_ARR_SAVE) { 709 | ret.emplace_back(queue[i].name, name, DAGNodes[queue[i].children[0]].primary_symbol, 710 | DAGNodes[queue[i].children[1]].primary_symbol); 711 | } else { 712 | ret.emplace_back(queue[i].name, DAGNodes[queue[i].children[0]].primary_symbol, 713 | DAGNodes[queue[i].children[1]].primary_symbol, name); 714 | } 715 | has_print = true; 716 | } 717 | } 718 | if (!has_print) { 719 | if (queue[i].name == OP_ARR_SAVE) { 720 | ret.emplace_back(queue[i].name, queue[i].primary_symbol, DAGNodes[queue[i].children[0]].primary_symbol, 721 | DAGNodes[queue[i].children[1]].primary_symbol); 722 | } else { 723 | ret.emplace_back(queue[i].name, 
DAGNodes[queue[i].children[0]].primary_symbol, 724 | DAGNodes[queue[i].children[1]].primary_symbol, queue[i].primary_symbol); 725 | } 726 | } 727 | } 728 | 729 | cout << "AFTER DAG" << endl; 730 | for (auto &code: ret) { 731 | cout << code.to_str() << endl; 732 | } 733 | return ret; 734 | } 735 | 736 | void PseudoCodeList::DAG_optimize() { 737 | for (auto &it: blocks) { 738 | for (auto &b: blocks[it.first]) { 739 | for (int i = b.start; i < b.end; i++) { 740 | if (!can_dag(codes[i].op)) continue; 741 | cout << "BEFORE DAG" << endl; 742 | cout << codes[i].to_str() << endl; 743 | int j = i + 1; 744 | while (can_dag(codes[j].op)) { 745 | cout << codes[j].to_str() << endl; 746 | j++; 747 | } 748 | if (j - i <= 1) continue; 749 | 750 | cout << "DAG for " << i << "~" << j - 1 << endl; 751 | gen_DAG_graph(i, j - 1); 752 | cout << "gen DAG graph done" << endl; 753 | show_DAG_tree(); 754 | if (!DAGNodes.empty()) { 755 | vector dag_outputs = DAG_output(); 756 | assertion(dag_outputs.size() <= j - i); 757 | for (int m = i; m <= j - 1; m++) { 758 | if (m - i < dag_outputs.size()) { 759 | codes[m] = dag_outputs[m - i]; 760 | } else { 761 | codes[m] = PseudoCode(OP_PLACEHOLDER, VACANT, VACANT, VACANT); 762 | } 763 | } 764 | //TODO: replace old with new 765 | } 766 | i = j; 767 | 768 | DAGNodes.clear(); 769 | NodesMap.clear(); 770 | } 771 | } 772 | } 773 | 774 | vector new_codes; 775 | for (auto &c: codes) { 776 | if (c.op != OP_PLACEHOLDER) { 777 | new_codes.push_back(c); 778 | } 779 | } 780 | codes = new_codes; 781 | } 782 | 783 | void PseudoCodeList::dfs_show(const DAGNode &node, int depth) { 784 | for (int i = 0; i < depth - 1; i++) { 785 | cout << "| "; 786 | } 787 | if (depth != 0) { 788 | cout << "|-----"; 789 | } 790 | cout << "'" << node.name << "' " << node.primary_symbol << "(" << node.index << ") ["; 791 | for (auto &s: node.symbols) { 792 | if (s != node.primary_symbol) { 793 | cout << s << ","; 794 | } 795 | } 796 | cout << "]" << (node.is_leaf ? 
" LEAF" : "") << endl; 797 | for (auto &child: node.children) { 798 | dfs_show(DAGNodes[child], depth + 1); 799 | } 800 | } 801 | 802 | void PseudoCodeList::show_DAG_tree() { 803 | DAGNode root(-1, "ROOT", false); 804 | for (auto &node: DAGNodes) { 805 | if (node.parents.empty()) { 806 | root.children.push_back(node.index); 807 | } 808 | } 809 | dfs_show(root, 0); 810 | // for (auto &it: NodesMap) { 811 | // cout << it.first << ": " << it.second << endl; 812 | // } 813 | } 814 | 815 | void PseudoCodeList::inline_function() { 816 | vector new_codes; 817 | map> func_codes; 818 | string cur_func = INVALID; 819 | for (auto &code: codes) { 820 | if (code.op == OP_FUNC) { 821 | cur_func = code.num2; 822 | func_codes[cur_func] = vector(); 823 | } else if (code.op == OP_END_FUNC) { 824 | cur_func = INVALID; 825 | } else if (cur_func != INVALID) { 826 | func_codes[cur_func].push_back(code); 827 | } 828 | } 829 | 830 | vector> paras; 831 | vector call_func; 832 | cur_func = GLOBAL; 833 | for (auto &code: codes) { 834 | if (code.op == OP_FUNC) { 835 | cur_func = code.num2; 836 | } 837 | 838 | if (code.op == OP_PREPARE_CALL) { 839 | if (SymTable::search_func(code.num1).recur_func 840 | || func_codes[code.num1].size() > 20) { 841 | cout << code.num1 << " size: " << func_codes[code.num1].size() << endl; 842 | new_codes.push_back(code); 843 | continue; 844 | } 845 | paras.emplace_back(); 846 | call_func.push_back(code.num1); 847 | call_times++; 848 | cout << "inline func: " << call_func.back() << endl; 849 | } else if (code.op == OP_PUSH_PARA && !call_func.empty()) { 850 | if (code.num1[0] == '#') { 851 | string var_name = "@V" + to_string(code_index); 852 | code_index++; 853 | new_codes.emplace_back(OP_ASSIGN, var_name, code.num1, VACANT); 854 | int addr = SymTable::func_size(cur_func) + 4; 855 | SymTable::add(cur_func, var_name, var, SymTable::try_search(cur_func, code.num1, true).dataType, addr); 856 | paras.back().push_back(var_name); 857 | } else { 858 | 
paras.back().push_back(code.num1); 859 | } 860 | } else if (code.op == OP_CALL && !call_func.empty()) { 861 | assertion(!call_func.empty()); 862 | vector> call_paras = SymTable::search_func(call_func.back()).paras; 863 | string label_end_func = assign_label(); 864 | vector call_codes = func_codes[call_func.back()]; 865 | for (int i = 0; i < call_codes.size(); i++) { 866 | PseudoCode c = call_codes[i]; 867 | string result = rename_inline_var(c.result, call_paras, paras.back(), call_times, call_func.back(), 868 | cur_func); 869 | string num1 = rename_inline_var(c.num1, call_paras, paras.back(), call_times, call_func.back(), 870 | cur_func); 871 | string num2 = rename_inline_var(c.num2, call_paras, paras.back(), call_times, call_func.back(), 872 | cur_func); 873 | if (c.op == OP_RETURN) { 874 | if (c.num1 != VACANT) { 875 | new_codes.emplace_back(OP_ASSIGN, "%RET", num1, VACANT); 876 | } 877 | if (i == call_codes.size() - 1 878 | || (c.op == OP_RETURN && call_codes[i + 1].to_str() == c.to_str())) { 879 | 880 | } else { 881 | new_codes.emplace_back(OP_JUMP_UNCOND, label_end_func, VACANT, VACANT); 882 | } 883 | } else { 884 | new_codes.emplace_back(c.op, num1, num2, result); 885 | } 886 | } 887 | 888 | new_codes.emplace_back(OP_LABEL, label_end_func, VACANT, VACANT); 889 | call_func.pop_back(); 890 | paras.pop_back(); 891 | } else { // if (call_func.empty()) 892 | new_codes.push_back(code); 893 | } 894 | } 895 | 896 | codes = new_codes; 897 | } 898 | 899 | string rename_inline_var(string name, vector> call_paras, 900 | vector real_paras, int call_times, const string &call_func, const string &cur_func) { 901 | for (int i = 0; i < call_paras.size(); i++) { 902 | if (call_paras[i].second == name) { 903 | cout << "replace " << name << " with " << real_paras[i] << endl; 904 | return real_paras[i]; 905 | } 906 | } 907 | if (begins_num(name) || SymTable::in_global(call_func, name)) { 908 | return name; 909 | } 910 | SymTableItem item = SymTable::try_search(call_func, name, 
false); 911 | if (item.valid) { 912 | string ret = "_" + to_string(call_times) + "_" + name; 913 | if (!SymTable::try_search(cur_func, ret, false).valid) { 914 | int addr = SymTable::func_size(cur_func) + 4; 915 | SymTable::add(cur_func, ret, item.stiType, item.dataType, addr, item.dim1_size, item.dim2_size); 916 | } 917 | return ret; 918 | } else if (name.substr(0, 5) == "label") { 919 | return name + "_" + to_string(call_times) + "_"; 920 | } 921 | 922 | return name; 923 | } 924 | 925 | bool is_arith(const string &op) { 926 | return (op == OP_ADD || op == OP_SUB || op == OP_MUL || op == OP_DIV); 927 | } 928 | 929 | bool can_dag(const string &op) { 930 | return is_arith(op) || op == OP_ASSIGN; //|| op == OP_ARR_LOAD || op == OP_ARR_SAVE; 931 | } -------------------------------------------------------------------------------- /PseudoCode.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by wzk on 2020/11/5. 3 | // 4 | 5 | #ifndef COMPILER_Pseudo_H 6 | #define COMPILER_Pseudo_H 7 | 8 | #include 9 | #include 10 | #include 11 | #include 12 | #include 13 | #include 14 | 15 | #include "utils.h" 16 | #include "SymTable.h" 17 | 18 | #define VACANT "#VACANT" 19 | #define AUTO "#AUTO" 20 | #define AUTO_VAR "#AUTO_VAR" 21 | #define ENDL "#ENDL" 22 | #define AUTO_LABEL "#AUTO_LABEL" 23 | 24 | #define OP_PRINT "PRINT" 25 | #define OP_SCANF "SCANF" 26 | #define OP_ASSIGN ":=" 27 | #define OP_ADD "+" 28 | #define OP_SUB "-" 29 | #define OP_MUL "*" 30 | #define OP_DIV "/" 31 | #define OP_FUNC "FUNC" 32 | #define OP_END_FUNC "END_FUNC" 33 | 34 | #define OP_ARR_LOAD "ARR_LOAD" 35 | #define OP_ARR_SAVE "ARR_SAVE" 36 | #define OP_LABEL "LABEL" 37 | #define OP_JUMP_IF "JUMP_IF" 38 | #define OP_JUMP_UNCOND "JUMP" 39 | 40 | #define OP_PREPARE_CALL "PREPARE_CALL" 41 | #define OP_CALL "CALL" 42 | #define OP_PUSH_PARA "PUSH_PARA" 43 | #define OP_RETURN "RETURN" 44 | 45 | #define OP_EMPTY "EMPTY" 46 | #define OP_PLACEHOLDER 
"PLACEHOLDER" 47 | 48 | 49 | #define OP_VAR "VAR" 50 | #define OP_CONST "CONST" 51 | #define OP_EQL "EQL" 52 | #define OP_NEQ "NEQ" 53 | #define OP_GEQ "GEQ" 54 | #define OP_GRE "GRE" 55 | #define OP_LEQ "LEQ" 56 | #define OP_LSS "LSS" 57 | #define OP_GOTO "GOTO" 58 | #define OP_BNZ "BNZ" 59 | #define OP_BZ "BZ" 60 | 61 | using namespace std; 62 | 63 | class PseudoCode { 64 | public: 65 | string op; 66 | string num1; 67 | string num2; 68 | string result; 69 | 70 | PseudoCode(string op, string n1, string n2, string r) : 71 | op(std::move(op)), num1(std::move(n1)), num2(std::move(n2)), result(std::move(r)) {}; 72 | 73 | PseudoCode() = default; 74 | 75 | string to_str() const; 76 | 77 | string to_standard_format() const; 78 | }; 79 | 80 | class BasicBlock { 81 | public: 82 | int index; 83 | int start; 84 | int end; 85 | vector codes; 86 | vector prev; 87 | vector next; 88 | 89 | BasicBlock(int index, int start, int end) : index(index), start(start), end(end) {} 90 | }; 91 | 92 | class DAGNode { 93 | public: 94 | int index; 95 | string name; 96 | bool is_leaf; 97 | bool in_queue = false; 98 | string primary_symbol; 99 | vector symbols; 100 | vector children; 101 | vector parents; 102 | 103 | DAGNode(int index, const string& name, bool is_leaf) : index(index), name(name), is_leaf(is_leaf) { 104 | if (is_leaf) { 105 | symbols.push_back(name); 106 | } 107 | } 108 | 109 | // DAGNode(int index, string name, bool is_leaf, const string &name2) : 110 | // index(index), name(std::move(name)), is_leaf(is_leaf) { 111 | // symbols.push_back(name2); 112 | // } 113 | }; 114 | 115 | class PseudoCodeList { 116 | public: 117 | static vector codes; 118 | static int code_index; 119 | static int label_index; 120 | static vector strcons; 121 | static map> blocks; 122 | static vector DAGNodes; 123 | static map NodesMap; 124 | static int call_times; 125 | 126 | static void reset() { 127 | codes.clear(); 128 | strcons.clear(); 129 | blocks.clear(); 130 | code_index = 1; 131 | label_index = 1; 
132 | DAGNodes.clear(); 133 | NodesMap.clear(); 134 | } 135 | 136 | static string add(const string &op, const string &n1, const string &n2, const string &r) { 137 | string result = r; 138 | if (result == AUTO) { 139 | result = "#T" + to_string(code_index); 140 | code_index++; 141 | } else if (result == AUTO_LABEL) { 142 | result = assign_label(); 143 | } else if (result == AUTO_VAR) { 144 | result = "@V" + to_string(code_index); 145 | code_index++; 146 | } 147 | codes.emplace_back(op, n1, n2, result); 148 | return result; 149 | } 150 | 151 | static string assign_label() { 152 | label_index++; 153 | return "label_" + to_string(label_index - 1); 154 | } 155 | 156 | static void refactor(); 157 | 158 | static void remove_redundant_tmp(); 159 | 160 | static void remove_redundant_assign(); 161 | 162 | static void remove_tripple(); 163 | 164 | static void const_broadcast(); 165 | 166 | static void interpret(); 167 | 168 | static void divide_basic_blocks(); 169 | 170 | static void dfs_show(const DAGNode &node, int depth); 171 | 172 | static void show_DAG_tree(); 173 | 174 | static void DAG_optimize(); 175 | 176 | static vector DAG_output(); 177 | 178 | static void gen_DAG_graph(int, int); 179 | 180 | static void inline_function(); 181 | 182 | static void show() { 183 | cout << "========MID CODES========" << endl; 184 | for (auto &c: codes) { 185 | cout << c.to_str() << endl; 186 | 187 | } 188 | cout << "=========================" << endl; 189 | } 190 | 191 | static void save_to_file(const string &out_path) { 192 | ofstream out(out_path); 193 | for (int i = 0; i < strcons.size(); i++) { 194 | out << "str" << i << ": " << strcons[i] << endl; 195 | } 196 | out << "===============" << endl; 197 | for (int i = 0; i < codes.size(); i++) { 198 | //out << i << ": " << codes[i].to_str() << endl; 199 | out << codes[i].to_str() << endl; 200 | } 201 | out.close(); 202 | } 203 | 204 | static void show_standard_format() { 205 | cout << "========MID CODES========" << endl; 206 | for 
(auto &c: codes) { 207 | string format = c.to_standard_format(); 208 | if (format != INVALID) { 209 | cout << format << endl; 210 | } 211 | } 212 | cout << "=========================" << endl; 213 | } 214 | 215 | static void save_to_file_standard_format(const string &out_path) { 216 | ofstream out(out_path); 217 | for (auto &c: codes) { 218 | string format = c.to_standard_format(); 219 | if (format != INVALID) { 220 | out << format << endl; 221 | } 222 | } 223 | out.close(); 224 | } 225 | }; 226 | 227 | bool is_arith(const string &op); 228 | 229 | bool can_dag(const string &op); 230 | 231 | string rename_inline_var(string name, vector> call_paras, 232 | vector real_paras, int call_times, const string& call_func, const string& cur_func); 233 | 234 | #endif //COMPILER_Pseudo_H 235 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # BUAA compiler 2020 2 | 3 | Coursework for the compiler lab course at Beihang University (BUAA) 4 | 5 | 6 | 7 | See the [design document](https://github.com/wzk1015/compiler/blob/correct/Docs/%E7%BC%96%E8%AF%91%E5%99%A8%E8%AE%BE%E8%AE%A1%E6%96%87%E6%A1%A3%2018231047%E7%8E%8B%E8%82%87%E5%87%AF.md) for usage instructions and for the design and implementation 8 | 9 | 10 | 11 | *This branch is the final version submitted for the speed competition and for the excellence application; the master branch has more optimizations but contains bugs, ~~anyone interested is welcome to help debug~~* 12 | 13 | The final speed-competition ranking was approximately 25/240 14 | -------------------------------------------------------------------------------- /SymTable.cpp: -------------------------------------------------------------------------------- 1 | // 2 | // Created by wzk on 2020/10/1.
3 | // 4 | 5 | #include "SymTable.h" 6 | 7 | #include 8 | 9 | vector SymTable::global; 10 | map> SymTable::local; 11 | unsigned int SymTable::max_name_length = 5; 12 | SymTableItem SymTable::invalid = SymTableItem(false); 13 | 14 | 15 | void SymTable::add(const string &func, const string &name, STIType stiType, DataType dataType, int addr) { 16 | if (try_search(func, name, false).valid) { 17 | Errors::add("redefined identifier '" + name + "'", ERR_REDEFINED); 18 | return; 19 | } 20 | SymTableItem a(name, stiType, dataType, addr); 21 | a.size = size_of(dataType); 22 | if (func == GLOBAL) { 23 | global.push_back(a); 24 | } else { 25 | local[func].push_back(a); 26 | } 27 | max_name_length = max_name_length > name.length() ? max_name_length : name.length(); 28 | } 29 | 30 | void SymTable::add(const string &func, const Token &tk, STIType stiType, DataType dataType, int addr) { 31 | add(func, tk, stiType, dataType, addr, 0, 0); 32 | } 33 | 34 | void 35 | SymTable::add(const string &func, const Token &tk, STIType stiType, DataType dataType, int addr, int dim1, int dim2) { 36 | if (try_search(func, tk.str, false).valid) { 37 | Errors::add("redefined identifier '" + tk.str + "'", tk.line, tk.column, ERR_REDEFINED); 38 | return; 39 | } 40 | SymTableItem a(tk.str, stiType, dataType, addr); 41 | if (dim1 == 0) { 42 | a.dim = 0; 43 | } else if (dim2 == 0) { 44 | a.dim = 1; 45 | a.dim1_size = dim1; 46 | a.size = size_of(dataType) * dim1; 47 | // a.arr_value = vector(dim1); 48 | } else { 49 | a.dim = 2; 50 | a.dim1_size = dim1; 51 | a.dim2_size = dim2; 52 | a.size = size_of(dataType) * dim1 * dim2; 53 | // a.arr_value = vector(dim1 * dim2); 54 | } 55 | if (func == GLOBAL) { 56 | global.push_back(a); 57 | } else { 58 | local[func].push_back(a); 59 | } 60 | max_name_length = max_name_length > tk.str.length() ? 
max_name_length : tk.str.length(); 61 | } 62 | 63 | void 64 | SymTable::add(const string &func, const string &name, STIType stiType, DataType dataType, int addr, int dim1, int dim2) { 65 | if (try_search(func, name, false).valid) { 66 | Errors::add("redefined identifier '" + name + "'", ERR_REDEFINED); 67 | return; 68 | } 69 | SymTableItem a(name, stiType, dataType, addr); 70 | if (dim1 == 0) { 71 | a.dim = 0; 72 | } else if (dim2 == 0) { 73 | a.dim = 1; 74 | a.dim1_size = dim1; 75 | a.size = size_of(dataType) * dim1; 76 | // a.arr_value = vector(dim1); 77 | } else { 78 | a.dim = 2; 79 | a.dim1_size = dim1; 80 | a.dim2_size = dim2; 81 | a.size = size_of(dataType) * dim1 * dim2; 82 | // a.arr_value = vector(dim1 * dim2); 83 | } 84 | if (func == GLOBAL) { 85 | global.push_back(a); 86 | } else { 87 | local[func].push_back(a); 88 | } 89 | max_name_length = max_name_length > name.length() ? max_name_length : name.length(); 90 | } 91 | 92 | void SymTable::add_const(const string &func, const Token &tk, DataType dataType, string const_value) { 93 | if (try_search(func, tk.str, false).valid) { 94 | Errors::add("redefined const '" + tk.str + "'", tk.line, tk.column, ERR_REDEFINED); 95 | return; 96 | } 97 | SymTableItem a(tk.str, constant, dataType, 0); 98 | a.const_value = std::move(const_value); 99 | if (func == GLOBAL) { 100 | global.push_back(a); 101 | } else { 102 | local[func].push_back(a); 103 | } 104 | max_name_length = max_name_length > tk.str.length() ? 
max_name_length : tk.str.length(); 105 | } 106 | 107 | int SymTable::add_func(const Token &tk, DataType dataType, vector> paras) { 108 | if (try_search(GLOBAL, tk.str, true).valid) { 109 | Errors::add("redefined function '" + tk.str + "'", tk.line, tk.column, ERR_REDEFINED); 110 | return -1; 111 | } 112 | 113 | local[tk.str] = vector(); 114 | //local.insert(make_pair(tk.str, vector())); 115 | 116 | SymTableItem a(tk.str, func, dataType, 0); 117 | a.paras = std::move(paras); 118 | global.push_back(a); 119 | max_name_length = max_name_length > tk.str.length() ? max_name_length : tk.str.length(); 120 | return int(global.size() - 1); 121 | } 122 | 123 | SymTableItem SymTable::search(const string &func, const Token &tk) { 124 | if (func != GLOBAL) { 125 | if (!search_func(func).valid) { 126 | return SymTableItem(false); 127 | } 128 | vector loc = local.find(func)->second; 129 | for (auto &item: loc) { 130 | if (item.name == tk.str) { 131 | return item; 132 | } 133 | } 134 | } 135 | for (auto &item: global) { 136 | if (item.name == tk.str) { 137 | return item; 138 | } 139 | } 140 | // Errors::add("undefined identifier '" + tk.str + "'", tk.line, tk.column, E_UNDEFINED_IDENTF); 141 | Errors::add("undefined identifier '" + tk.str + "'", tk.line, tk.column, ERR_UNDEFINED); 142 | return SymTableItem(false); 143 | } 144 | 145 | SymTableItem SymTable::search(const string &func, const string &str) { 146 | if (func != GLOBAL) { 147 | if (!search_func(func).valid) { 148 | return SymTableItem(false); 149 | } 150 | vector loc = local.find(func)->second; 151 | for (auto &item: loc) { 152 | if (item.name == str) { 153 | return item; 154 | } 155 | } 156 | } 157 | for (auto &item: global) { 158 | if (item.name == str) { 159 | return item; 160 | } 161 | } 162 | Errors::add("undefined identifier '" + str + "'", ERR_UNDEFINED); 163 | return SymTableItem(false); 164 | } 165 | 166 | SymTableItem &SymTable::ref_search(const string &func, const string &str) { 167 | if (func != GLOBAL) { 168 | 
if (!search_func(func).valid) { 169 | return invalid; 170 | } 171 | for (auto &item: local.find(func)->second) { 172 | if (item.name == str) { 173 | return item; 174 | } 175 | } 176 | } 177 | for (auto &item: global) { 178 | if (item.name == str) { 179 | return item; 180 | } 181 | } 182 | //Errors::add("undefined identifier '" + str + "'", ERR_UNDEFINED); 183 | return invalid; 184 | } 185 | 186 | SymTableItem SymTable::try_search(const string &func, const string &str, bool include_global) { 187 | if (func != GLOBAL) { 188 | if (!search_func(func).valid) { 189 | return SymTableItem(false); 190 | } 191 | vector loc = local.find(func)->second; 192 | for (auto &item: loc) { 193 | if (item.name == str) { 194 | return item; 195 | } 196 | } 197 | } 198 | if (include_global) { 199 | for (auto &item: global) { 200 | if (item.name == str) { 201 | return item; 202 | } 203 | } 204 | } 205 | return SymTableItem(false); 206 | } 207 | 208 | SymTableItem SymTable::search_func(const string &func_name) { 209 | for (auto &item: global) { 210 | if (item.name == func_name && item.stiType == func) { 211 | return item; 212 | } 213 | } 214 | Errors::add("undefined function '" + func_name + "'", ERR_UNDEFINED); 215 | return invalid; 216 | } 217 | 218 | void SymTable::show() { 219 | string sep1, sep2; 220 | for (int i = 0; i < max_name_length + 26; i++) { 221 | sep1 += "="; 222 | sep2 += "-"; 223 | } 224 | cout << sep1 << endl << "NAME"; 225 | for (int i = 0; i < max_name_length - 3; i++) { 226 | cout << " "; 227 | } 228 | cout << "KIND TYPE DIM ADDR VALUE" << endl << sep2 << endl; 229 | cout << "---" << GLOBAL << endl; 230 | for (auto &item: global) { 231 | cout << item.to_str() << endl; 232 | } 233 | for (auto &f: local) { 234 | cout << sep2 << endl; 235 | cout << "---" << f.first << endl; 236 | for (auto &item: f.second) { 237 | cout << item.to_str() << endl; 238 | } 239 | } 240 | cout << sep1 << endl; 241 | } 242 | 243 | bool SymTable::in_global(const string &func, const string &str) { 
244 | return !try_search(func, str, false).valid && try_search(GLOBAL, str, true).valid; 245 | } 246 | 247 | int SymTable::func_size(const string &func) { 248 | vector vec = func == GLOBAL ? global : local[func]; 249 | if (!vec.empty()) { 250 | for (auto it = vec.end() - 1; it >= vec.begin(); it--) { 251 | if (it->stiType != constant) { 252 | return it->addr + it->size; 253 | } 254 | } 255 | } 256 | return 0; 257 | } 258 | 259 | string SymTableItem::to_str() const { 260 | map stitype_str = { 261 | {constant, "const"}, 262 | {var, "var "}, 263 | {tmp, "tmp "}, 264 | {para, "para "}, 265 | {func, "func "} 266 | }; 267 | 268 | map datatype_str = { 269 | {integer, "int "}, 270 | {character, "char"}, 271 | {void_ret, "void"} 272 | }; 273 | string ans = name; 274 | for (int i = 0; i < SymTable::max_name_length - name.length(); i++) { 275 | ans += " "; 276 | } 277 | string sep = addr >= 10000 ? " " : addr >= 1000 ? " " : addr >= 100 ? " " : addr >= 10 ? " " : " "; 278 | return ans + " " + stitype_str[stiType] + " " + datatype_str[dataType] + " " + to_string(dim) + " " + 279 | to_string(addr) + sep + const_value; 280 | } 281 | 282 | void SymTable::reset() { 283 | global.clear(); 284 | local.clear(); 285 | max_name_length = 5; 286 | } -------------------------------------------------------------------------------- /SymTable.h: -------------------------------------------------------------------------------- 1 | // 2 | // Created by wzk on 2020/10/1. 3 | // 4 | 5 | #ifndef COMPILER_SYMTABLE_H 6 | #define COMPILER_SYMTABLE_H 7 | 8 | #include 9 | #include 10 | #include 11 | #include "Error.h" 12 | #include "Lexer.h" 13 | 14 | #define GLOBAL "#GLOBAL" 15 | #define size_of(dt) (4) 16 | //(dt == integer ? 4 : 1) 17 | #define type_to_str(dt) (dt == integer? 
"int" : "char") 18 | 19 | #define LOCAL_ADDR_INIT 100 20 | 21 | using namespace std; 22 | 23 | enum STIType { 24 | invalid_sti, 25 | constant, 26 | var, 27 | tmp, 28 | para, 29 | func 30 | }; 31 | 32 | enum DataType { 33 | invalid_dt, 34 | integer, 35 | character, 36 | void_ret 37 | 38 | }; 39 | 40 | class SymTableItem { 41 | public: 42 | string name; 43 | STIType stiType{}; 44 | DataType dataType{}; 45 | int dim = 0; 46 | bool valid = true; 47 | vector> paras; 48 | int addr{}; 49 | int size{}; 50 | int dim1_size{}; 51 | int dim2_size{}; 52 | string const_value; 53 | bool modified = false; 54 | bool recur_func = false; 55 | 56 | 57 | SymTableItem(string name, STIType stiType1, DataType dataType1, int addr) : 58 | name(std::move(name)), stiType(stiType1), dataType(dataType1), addr(addr) {}; 59 | 60 | explicit SymTableItem(bool valid) : valid(valid) {}; 61 | 62 | SymTableItem() = default; 63 | 64 | string to_str() const; 65 | }; 66 | 67 | class SymTable { 68 | public: 69 | static vector global; 70 | static map> local; 71 | static unsigned int max_name_length; 72 | static SymTableItem invalid; 73 | 74 | static void add(const string &func, const string &name, STIType stiType, DataType dataType, int addr); 75 | 76 | static void add(const string &func, const string &name, STIType stiType, DataType dataType, int addr, int dim1, int dim2); 77 | 78 | static void add(const string &func, const Token &tk, STIType stiType, DataType dataType, int addr); 79 | 80 | static void add(const string &func, const Token &tk, STIType stiType, DataType dataType, int addr, int dim1, int dim2); 81 | 82 | static void add_const(const string &func, const Token &tk, DataType dataType, string const_value); 83 | 84 | static int add_func(const Token &tk, DataType dataType, vector> paras); 85 | 86 | static SymTableItem search(const string &func, const Token &tk); 87 | 88 | static SymTableItem search(const string &func, const string &str); 89 | 90 | static SymTableItem try_search(const string &func, 
const string &str, bool include_global); 91 | 92 | static bool in_global(const string &func, const string &str); 93 | 94 | static SymTableItem search_func(const string &func_name); 95 | 96 | static void show(); 97 | 98 | static SymTableItem &ref_search(const string &func, const string &str); 99 | 100 | static int func_size(const string &func); 101 | 102 | static void reset(); 103 | }; 104 | 105 | 106 | #endif //COMPILER_SYMTABLE_H 107 | -------------------------------------------------------------------------------- /error.txt: -------------------------------------------------------------------------------- 1 | 28 l 2 | 35 f 3 | 40 f 4 | -------------------------------------------------------------------------------- /main.cpp: -------------------------------------------------------------------------------- 1 | #include "Grammar.h" 2 | #include "MipsGenerator.h" 3 | 4 | using namespace std; 5 | 6 | int main() { 7 | cout << ":::::::::::::::::::::::::::::::::::::::::::::::::::::" << endl; 8 | cout << ":: ::" << endl; 9 | cout << ":: wzk's compiler V1.0 ::" << endl; 10 | cout << ":: ::" << endl; 11 | cout << ":::::::::::::::::::::::::::::::::::::::::::::::::::::" << endl; 12 | 13 | // syntax analysis and error handling 14 | Grammar grammar("testfile.txt", grammar_check); 15 | grammar.analyze(); 16 | grammar.save_to_file("output.txt"); 17 | Errors::save_to_file("error.txt"); 18 | 19 | 20 | 21 | // intermediate code optimization 22 | 23 | PseudoCodeList::refactor(); 24 | 25 | 26 | 27 | PseudoCodeList::remove_redundant_assign(); 28 | PseudoCodeList::const_broadcast(); 29 | PseudoCodeList::remove_redundant_tmp(); 30 | 31 | PseudoCodeList::save_to_file("pseudo_code_old.txt"); 32 | 33 | PseudoCodeList::inline_function(); 34 | // 35 | PseudoCodeList::const_broadcast(); 36 | 37 | PseudoCodeList::save_to_file("pseudo_code.txt"); 38 | 39 | // target code generation 40 | MipsGenerator mips; 41 | mips.optimize_muldiv = true; 42 | mips.optimize_assign_reg = true; 43 | mips.translate(); 44 | mips.save_to_file("mips.txt"); 45 | // 46 |
//SymTable::show(); 47 | 48 | //grammar.show_tree(); 49 | 50 | if (Errors::terminate()) { 51 | return 0; 52 | } 53 | 54 | return 0; 55 | } 56 | -------------------------------------------------------------------------------- /mips.txt: -------------------------------------------------------------------------------- 1 | .data 2 | str__0: .asciiz "-----\n" 3 | str__1: .asciiz "func1_print_99 done!\n" 4 | str__2: .asciiz "The result is: " 5 | str__3: .asciiz "-----\n" 6 | str__4: .asciiz "-----\n" 7 | str__5: .asciiz "fun2_print_return_999 done!\n" 8 | str__6: .asciiz "The result is below: \n" 9 | str__7: .asciiz "-----\n" 10 | str__8: .asciiz "nice\n" 11 | str__9: .asciiz "18231052\n" 12 | str__10: .asciiz "nice\n" 13 | newline__: .asciiz "\n" 14 | .text 15 | 16 | # === =========FUNC void func1_print_99========= === 17 | addi $sp, $sp, -256 18 | j main 19 | func1_print_99: 20 | 21 | # === PRINT 0 strcon === 22 | la $a0, str__0 23 | li $v0, 4 24 | syscall 25 | 26 | # === PRINT 1 strcon === 27 | la $a0, str__1 28 | li $v0, 4 29 | syscall 30 | 31 | # === PRINT 2 strcon === 32 | la $a0, str__2 33 | li $v0, 4 34 | syscall 35 | 36 | # === PRINT 99 int === 37 | li $a0, 99 38 | li $v0, 1 39 | syscall 40 | 41 | # === PRINT #ENDL === 42 | la $a0, newline__ 43 | li $v0, 4 44 | syscall 45 | 46 | # === PRINT 3 strcon === 47 | la $a0, str__3 48 | li $v0, 4 49 | syscall 50 | 51 | # === RETURN === 52 | jr $ra 53 | 54 | # === RETURN === 55 | jr $ra 56 | 57 | # === =========END_FUNC void func1_print_99========= === 58 | 59 | # === =========FUNC int fun2_print_return_999========= === 60 | fun2_print_return_999: 61 | 62 | # === a = 111 === 63 | li $s0, 111 64 | 65 | # === b = 9 === 66 | li $s1, 9 67 | 68 | # === PRINT 4 strcon === 69 | la $a0, str__4 70 | li $v0, 4 71 | syscall 72 | 73 | # === PRINT 5 strcon === 74 | la $a0, str__5 75 | li $v0, 4 76 | syscall 77 | 78 | # === PRINT 6 strcon === 79 | la $a0, str__6 80 | li $v0, 4 81 | syscall 82 | 83 | # === #T2 = a * b === 84 | mul $t0, 
$s0, $s1 85 | 86 | # === PRINT #T2 int === 87 | move $a0, $t0 88 | # RELEASE $t0 89 | li $v0, 1 90 | syscall 91 | 92 | # === PRINT #ENDL === 93 | la $a0, newline__ 94 | li $v0, 4 95 | syscall 96 | 97 | # === PRINT 7 strcon === 98 | la $a0, str__7 99 | li $v0, 4 100 | syscall 101 | 102 | # === #T3 = a * b === 103 | mul $t0, $s0, $s1 104 | 105 | # === RETURN #T3 === 106 | move $v0, $t0 107 | # RELEASE $t0 108 | jr $ra 109 | 110 | # === =========END_FUNC int fun2_print_return_999========= === 111 | 112 | # === =========FUNC void errfun========= === 113 | errfun: 114 | 115 | # === RETURN === 116 | jr $ra 117 | 118 | # === RETURN === 119 | jr $ra 120 | 121 | # === =========END_FUNC void errfun========= === 122 | 123 | # === =========FUNC void main========= === 124 | main: 125 | 126 | # === b = 100 === 127 | li $s0, 100 128 | 129 | # === c = -100 === 130 | li $s1, -100 131 | 132 | # === ff = 102 === 133 | li $s2, 102 134 | 135 | # === #T4 = ff - 10 === 136 | addiu $t0, $s2, -10 137 | 138 | # === JUMP_IF #T4<=0 label_1 === 139 | blez $t0, label_1 140 | # RELEASE $t0 141 | 142 | # === PRINT 8 strcon === 143 | la $a0, str__8 144 | li $v0, 4 145 | syscall 146 | 147 | # === LABEL label_1 === 148 | label_1: 149 | 150 | # === #T5 = b * c === 151 | mul $t0, $s0, $s1 152 | 153 | # === #T6 = b * b === 154 | mul $t1, $s0, $s0 155 | 156 | # === c = #T5 - #T6 === 157 | subu $s1, $t0, $t1 158 | # RELEASE $t0 159 | # RELEASE $t1 160 | 161 | # === PRINT 9 strcon === 162 | la $a0, str__9 163 | li $v0, 4 164 | syscall 165 | 166 | # === JUMP_IF -95<=0 label_2 === 167 | j label_2 168 | 169 | # === PRINT 10 strcon === 170 | la $a0, str__10 171 | li $v0, 4 172 | syscall 173 | 174 | # === LABEL label_2 === 175 | label_2: 176 | 177 | # === PRINT 0 strcon === 178 | la $a0, str__0 179 | li $v0, 4 180 | syscall 181 | 182 | # === PRINT 1 strcon === 183 | la $a0, str__1 184 | li $v0, 4 185 | syscall 186 | 187 | # === PRINT 2 strcon === 188 | la $a0, str__2 189 | li $v0, 4 190 | syscall 191 | 192 | # 
=== PRINT 99 int === 193 | li $a0, 99 194 | li $v0, 1 195 | syscall 196 | 197 | # === PRINT #ENDL === 198 | la $a0, newline__ 199 | li $v0, 4 200 | syscall 201 | 202 | # === PRINT 3 strcon === 203 | la $a0, str__3 204 | li $v0, 4 205 | syscall 206 | 207 | # === LABEL label_3 === 208 | label_3: 209 | 210 | # === _2_a = 111 === 211 | li $s3, 111 212 | 213 | # === _2_b = 9 === 214 | li $s4, 9 215 | 216 | # === PRINT 4 strcon === 217 | la $a0, str__4 218 | li $v0, 4 219 | syscall 220 | 221 | # === PRINT 5 strcon === 222 | la $a0, str__5 223 | li $v0, 4 224 | syscall 225 | 226 | # === PRINT 6 strcon === 227 | la $a0, str__6 228 | li $v0, 4 229 | syscall 230 | 231 | # === _2_#T2 = _2_a * _2_b === 232 | mul $t0, $s3, $s4 233 | 234 | # === PRINT _2_#T2 int === 235 | move $a0, $t0 236 | # RELEASE $t0 237 | li $v0, 1 238 | syscall 239 | 240 | # === PRINT #ENDL === 241 | la $a0, newline__ 242 | li $v0, 4 243 | syscall 244 | 245 | # === PRINT 7 strcon === 246 | la $a0, str__7 247 | li $v0, 4 248 | syscall 249 | 250 | # === _2_#T3 = _2_a * _2_b === 251 | mul $t0, $s3, $s4 252 | 253 | # === %RET = _2_#T3 === 254 | move $v0, $t0 255 | # RELEASE $t0 256 | 257 | # === LABEL label_4 === 258 | label_4: 259 | 260 | # === a = %RET === 261 | move $s5, $v0 262 | 263 | # === RETURN === 264 | li $v0, 10 265 | syscall 266 | 267 | # === =========END_FUNC void main========= === 268 | -------------------------------------------------------------------------------- /output.txt: -------------------------------------------------------------------------------- 1 | CONSTTK const 2 | INTTK int 3 | IDENFR CONST_INT_2 4 | ASSIGN = 5 | INTCON 2 6 | <无符号整数> 7 | <整数> 8 | <常量定义> 9 | SEMICN ; 10 | <常量说明> 11 | VOIDTK void 12 | IDENFR func1_print_99 13 | LPARENT ( 14 | <参数表> 15 | RPARENT ) 16 | LBRACE { 17 | CONSTTK const 18 | INTTK int 19 | IDENFR a 20 | ASSIGN = 21 | INTCON 9 22 | <无符号整数> 23 | <整数> 24 | <常量定义> 25 | SEMICN ; 26 | CONSTTK const 27 | INTTK int 28 | IDENFR b 29 | ASSIGN = 30 | INTCON 11 31 | 
<无符号整数> 32 | <整数> 33 | <常量定义> 34 | SEMICN ; 35 | <常量说明> 36 | PRINTFTK printf 37 | LPARENT ( 38 | STRCON ----- 39 | <字符串> 40 | RPARENT ) 41 | <写语句> 42 | SEMICN ; 43 | <语句> 44 | PRINTFTK printf 45 | LPARENT ( 46 | STRCON func1_print_99 done! 47 | <字符串> 48 | RPARENT ) 49 | <写语句> 50 | SEMICN ; 51 | <语句> 52 | PRINTFTK printf 53 | LPARENT ( 54 | STRCON The result is: 55 | <字符串> 56 | COMMA , 57 | IDENFR a 58 | <因子> 59 | MULT * 60 | IDENFR b 61 | <因子> 62 | <项> 63 | <表达式> 64 | RPARENT ) 65 | <写语句> 66 | SEMICN ; 67 | <语句> 68 | PRINTFTK printf 69 | LPARENT ( 70 | STRCON ----- 71 | <字符串> 72 | RPARENT ) 73 | <写语句> 74 | SEMICN ; 75 | <语句> 76 | RETURNTK return 77 | <返回语句> 78 | SEMICN ; 79 | <语句> 80 | <语句列> 81 | <复合语句> 82 | RBRACE } 83 | <无返回值函数定义> 84 | INTTK int 85 | IDENFR fun2_print_return_999 86 | <声明头部> 87 | LPARENT ( 88 | <参数表> 89 | RPARENT ) 90 | LBRACE { 91 | INTTK int 92 | IDENFR a 93 | ASSIGN = 94 | INTCON 111 95 | <无符号整数> 96 | <整数> 97 | <常量> 98 | <变量定义及初始化> 99 | <变量定义> 100 | SEMICN ; 101 | INTTK int 102 | IDENFR b 103 | ASSIGN = 104 | INTCON 9 105 | <无符号整数> 106 | <整数> 107 | <常量> 108 | <变量定义及初始化> 109 | <变量定义> 110 | SEMICN ; 111 | <变量说明> 112 | PRINTFTK printf 113 | LPARENT ( 114 | STRCON ----- 115 | <字符串> 116 | RPARENT ) 117 | <写语句> 118 | SEMICN ; 119 | <语句> 120 | PRINTFTK printf 121 | LPARENT ( 122 | STRCON fun2_print_return_999 done! 
123 | <字符串> 124 | RPARENT ) 125 | <写语句> 126 | SEMICN ; 127 | <语句> 128 | PRINTFTK printf 129 | LPARENT ( 130 | STRCON The result is below: 131 | <字符串> 132 | RPARENT ) 133 | <写语句> 134 | SEMICN ; 135 | <语句> 136 | PRINTFTK printf 137 | LPARENT ( 138 | IDENFR a 139 | <因子> 140 | MULT * 141 | IDENFR b 142 | <因子> 143 | <项> 144 | <表达式> 145 | RPARENT ) 146 | <写语句> 147 | SEMICN ; 148 | <语句> 149 | PRINTFTK printf 150 | LPARENT ( 151 | STRCON ----- 152 | <字符串> 153 | RPARENT ) 154 | <写语句> 155 | SEMICN ; 156 | <语句> 157 | RETURNTK return 158 | LPARENT ( 159 | IDENFR a 160 | <因子> 161 | MULT * 162 | IDENFR b 163 | <因子> 164 | <项> 165 | <表达式> 166 | RPARENT ) 167 | <返回语句> 168 | SEMICN ; 169 | <语句> 170 | <语句列> 171 | <复合语句> 172 | RBRACE } 173 | <有返回值函数定义> 174 | VOIDTK void 175 | IDENFR errfun 176 | LPARENT ( 177 | LBRACE { 178 | RETURNTK return 179 | <返回语句> 180 | SEMICN ; 181 | <语句> 182 | <语句列> 183 | <复合语句> 184 | RBRACE } 185 | <无返回值函数定义> 186 | VOIDTK void 187 | MAINTK main 188 | LPARENT ( 189 | RPARENT ) 190 | LBRACE { 191 | CONSTTK const 192 | CHARTK char 193 | IDENFR fki 194 | ASSIGN = 195 | CHARCON k 196 | <常量定义> 197 | SEMICN ; 198 | <常量说明> 199 | INTTK int 200 | IDENFR a 201 | <变量定义无初始化> 202 | <变量定义> 203 | SEMICN ; 204 | INTTK int 205 | IDENFR b 206 | ASSIGN = 207 | PLUS + 208 | INTCON 100 209 | <无符号整数> 210 | <整数> 211 | <常量> 212 | <变量定义及初始化> 213 | <变量定义> 214 | SEMICN ; 215 | INTTK int 216 | IDENFR c 217 | ASSIGN = 218 | MINU - 219 | INTCON 100 220 | <无符号整数> 221 | <整数> 222 | <常量> 223 | <变量定义及初始化> 224 | <变量定义> 225 | SEMICN ; 226 | CHARTK char 227 | IDENFR ff 228 | ASSIGN = 229 | CHARCON f 230 | <常量> 231 | <变量定义及初始化> 232 | <变量定义> 233 | SEMICN ; 234 | <变量说明> 235 | IFTK if 236 | LPARENT ( 237 | IDENFR ff 238 | <因子> 239 | <项> 240 | <表达式> 241 | GRE > 242 | INTCON 10 243 | <无符号整数> 244 | <整数> 245 | <因子> 246 | <项> 247 | <表达式> 248 | <条件> 249 | RPARENT ) 250 | PRINTFTK printf 251 | LPARENT ( 252 | STRCON nice 253 | <字符串> 254 | RPARENT ) 255 | <写语句> 256 | SEMICN ; 257 | <语句> 258 | <条件语句> 259 | 
<语句> 260 | IDENFR c 261 | ASSIGN = 262 | IDENFR b 263 | <因子> 264 | MULT * 265 | IDENFR c 266 | <因子> 267 | <项> 268 | MINU - 269 | LPARENT ( 270 | IDENFR b 271 | <因子> 272 | MULT * 273 | IDENFR b 274 | <因子> 275 | <项> 276 | <表达式> 277 | RPARENT ) 278 | <因子> 279 | <项> 280 | <表达式> 281 | <赋值语句> 282 | SEMICN ; 283 | <语句> 284 | PRINTFTK printf 285 | LPARENT ( 286 | STRCON 18231052 287 | <字符串> 288 | RPARENT ) 289 | <写语句> 290 | SEMICN ; 291 | <语句> 292 | IFTK if 293 | LPARENT ( 294 | INTCON 12 295 | <无符号整数> 296 | <整数> 297 | <因子> 298 | <项> 299 | <表达式> 300 | GRE > 301 | IDENFR fki 302 | <因子> 303 | <项> 304 | <表达式> 305 | <条件> 306 | RPARENT ) 307 | PRINTFTK printf 308 | LPARENT ( 309 | STRCON nice 310 | <字符串> 311 | RPARENT ) 312 | <写语句> 313 | SEMICN ; 314 | <语句> 315 | <条件语句> 316 | <语句> 317 | IDENFR func1_print_99 318 | LPARENT ( 319 | <值参数表> 320 | RPARENT ) 321 | <无返回值函数调用语句> 322 | SEMICN ; 323 | <语句> 324 | IDENFR a 325 | ASSIGN = 326 | IDENFR fun2_print_return_999 327 | LPARENT ( 328 | <值参数表> 329 | RPARENT ) 330 | <有返回值函数调用语句> 331 | <因子> 332 | <项> 333 | <表达式> 334 | <赋值语句> 335 | SEMICN ; 336 | <语句> 337 | <语句列> 338 | <复合语句> 339 | RBRACE } 340 | <主函数> 341 | <程序> 342 | -------------------------------------------------------------------------------- /testfile.txt: -------------------------------------------------------------------------------- 1 | 2 | const int CONST_INT_2 = 2; 3 | 4 | void func1_print_99() { 5 | const int a = 9; 6 | const int b = 11; 7 | 8 | printf("-----"); 9 | printf("func1_print_99 done!"); 10 | printf("The result is: ", a*b); 11 | printf("-----"); 12 | 13 | return ; 14 | } 15 | 16 | int fun2_print_return_999() { 17 | int a = 111; 18 | int b = 9; 19 | 20 | printf("-----"); 21 | printf("fun2_print_return_999 done!"); 22 | printf("The result is below: "); 23 | printf(a*b); 24 | printf("-----"); 25 | 26 | return (a*b); 27 | } 28 | void errfun( {return;} 29 | void main() { 30 | const char fki = 'k'; 31 | int a; 32 | int b = +100; 33 | int c = -100; 34 | char ff= 'f'; 35 
| if(ff > 10) printf("nice"); 36 | 37 | c = b*c - (b*b); 38 | 39 | printf("18231052"); 40 | if(12 > fki) printf("nice"); 41 | 42 | func1_print_99(); 43 | 44 | a = fun2_print_return_999(); 45 | 46 | }
--------------------------------------------------------------------------------
/utils.cpp:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/10/24.
//

#include <cctype>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include "utils.h"

// Return a copy of wd with ASCII upper-case letters converted to lower case.
string lower(string wd) {
    string s;
    for (char c : wd) {
        if (c >= 'A' && c <= 'Z') {
            s += (char) (c + 'a' - 'A');
        } else {
            s += c;
        }
    }
    return s;
}

int max(int a, int b) {
    return a > b ? a : b;
}

int min(int a, int b) {
    return a < b ? a : b;
}

// True only for positive powers of two; the bare bit trick would also accept 0.
bool is_2_power(int x) {
    return x > 0 && (x & (x - 1)) == 0;
}

// True if symbol starts like a numeric literal (a digit or a sign).
bool begins_num(string symbol) {
    return isdigit((unsigned char) symbol[0]) || symbol[0] == '+' || symbol[0] == '-';
}

// True if symbol is a numeric literal or a character literal such as 'a'.
bool num_or_char(string symbol) {
    return begins_num(symbol) || symbol[0] == '\'';
}

// Print msg to stderr and abort with a non-zero exit code.
void panic(const string &msg) {
    cerr << msg << endl;
    exit(1);
}

void assertion(bool flag) {
    if (!flag) {
        // mips_debug();
        panic("assertion failed");
    }
}

// Replace every occurrence of from in str with to, scanning left to right.
string str_replace(string str, const string &from, const string &to) {
    if (from.empty()) {
        return str; // nothing to match; also avoids an endless loop
    }
    string s;
    size_t len = str.size();
    for (size_t i = 0; i < len; i++) {
        if (str.substr(i, from.size()) == from) {
            s += to;
            i += from.size() - 1;
        } else {
            s += str[i];
        }
    }
    return s;
}

int sum(const vector<int>& arr) {
    int v = 0;
    for (int i : arr) {
        v += i;
    }
    return v;
}

// Debug aid: overwrite mips.txt with a single marker line and exit.
void mips_debug() {
    ofstream out("mips.txt");
    out << "addu 0, 0, 0" << endl;
    out.close();
    exit(0);
}
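--------------------------------------------------------------------------------
/utils.h:
--------------------------------------------------------------------------------
//
// Created by wzk on 2020/10/24.
//

#ifndef COMPILER_UTILS_H
#define COMPILER_UTILS_H

#include <string>
#include <vector>

using namespace std;

string lower(string);

int max(int, int);

int min(int, int);

bool is_2_power(int);

bool begins_num(string);

bool num_or_char(string);

void panic(const string&);

string str_replace(string str, const string& from, const string& to);

void assertion(bool);

int sum(const vector<int>& arr);

void mips_debug();

#endif //COMPILER_UTILS_H
--------------------------------------------------------------------------------
/理论课复习.md:
--------------------------------------------------------------------------------
# Chapter 1: Introduction

## Basic concepts

Source program: a program written in assembly language or a high-level language.

Target program: a program expressed in the target language.

Target language: an intermediate language (between the source language and machine language), machine language, or assembly language.

Translator: a program that converts a source program into a target program (assemblers, compilers, and various conversion programs).

Assembler: a translator from assembly-language programs to machine-language programs.

Compiler: a translator from high-level-language programs to target programs.

Compile-then-interpret execution: first compile to an intermediate form, then feed that form together with the input data to an interpreter to obtain the output.

## The compilation process

Front end: lexical analysis, syntax analysis, semantic analysis with intermediate-code generation, code optimization (depends on the source language).

Back end: generation of the target program (depends on the target machine).

Other: symbol-table management, error handling.

Pre-processor: from the source program to relocatable machine code.

Post-processor: relocatable machine code is linked into an executable program, which the loader turns into runnable machine code.

Pass: one scan over the source program.

# Chapter 2: Grammars and Languages

## Preliminaries

![image-20201218171246280](编译复习.assets/image-20201218171246280.png)

![image-20201218171400954](编译复习.assets/image-20201218171400954.png)

![image-20201218171457550](编译复习.assets/image-20201218171457550.png)

If characters are taken as the symbols, a word is a string of symbols and the set of words is a set of strings.

If words are taken as the symbols, a sentence is a string of symbols and the set of all sentences (that is, the language) is a set of strings.

## Formal definition of a grammar

![image-20201218171836337](编译复习.assets/image-20201218171836337.png)

Grammar: nonterminals, terminals, a set of productions (rules), and the start symbol.

![image-20201218172026311](编译复习.assets/image-20201218172026311.png)

![image-20201218172244030](编译复习.assets/image-20201218172244030.png)

**Canonical derivation = rightmost derivation**

![image-20201218172325093](编译复习.assets/image-20201218172325093.png)
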
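
A small worked example may help with canonical derivations. The grammar is an assumed textbook expression grammar, not one taken from the slides: $G[E]$ with $E \to E+T \mid T$, $T \to T*F \mid F$, $F \to (E) \mid i$. A canonical (rightmost) derivation of $i+i*i$ rewrites the rightmost nonterminal at every step:

$$
\begin{aligned}
E &\Rightarrow E+T \Rightarrow E+T*F \Rightarrow E+T*i \Rightarrow E+F*i \\
  &\Rightarrow E+i*i \Rightarrow T+i*i \Rightarrow F+i*i \Rightarrow i+i*i
\end{aligned}
$$

Read from right to left, the same sequence reduces the sentence $i+i*i$ back to $E$.
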
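![image-20201218172522000](编译复习.assets/image-20201218172522000.png)

### Phrases, simple phrases, and handles

![image-20201218172739904](编译复习.assets/image-20201218172739904.png)

Handle: the leftmost simple phrase.

![image-20201218190028035](编译复习.assets/image-20201218190028035.png)

**Canonical reduction: reduce the handle at every step (leftmost reduction)**

![image-20201218191319672](编译复习.assets/image-20201218191319672.png)

![image-20201218191757062](编译复习.assets/image-20201218191757062.png)

Ambiguity means that the handle is not unique.

![image-20201218192949362](编译复习.assets/image-20201218192949362.png)

![image-20201218193000041](编译复习.assets/image-20201218193000041.png)

## Classification of languages

![image-20201218193407568](编译复习.assets/image-20201218193407568.png)

![image-20201218193423032](编译复习.assets/image-20201218193423032.png)

![image-20201218193431134](编译复习.assets/image-20201218193431134.png)

![image-20201218193442511](编译复习.assets/image-20201218193442511.png)

## Eliminating useless rules

![image-20201221141410119](编译复习.assets/image-20201221141410119.png)

# Chapter 3: Lexical Analysis

Token categories: reserved words, identifiers, constants, delimiters (operators).

![image-20201218194615594](编译复习.assets/image-20201218194615594.png)

![image-20201218194802577](编译复习.assets/image-20201218194802577.png)

(Bottom-up analysis)

The handle is the first state reached.

# Chapter 4: Syntax Analysis

### Left recursion

Top-down analysis cannot handle left recursion.

* **Elimination method 1: rewrite using BNF**

![image-20201218213638153](编译复习.assets/image-20201218213638153.png)

![image-20201218213653036](编译复习.assets/image-20201218213653036.png)

* **Method 2: convert to right recursion**

![image-20201218213842335](编译复习.assets/image-20201218213842335.png)

* Eliminating general left recursion:

![image-20201218214617169](编译复习.assets/image-20201218214617169.png)

### Backtracking

![image-20201218215236369](编译复习.assets/image-20201218215236369.png)

* Elimination method 1: rewrite the grammar

![image-20201218215744752](编译复习.assets/image-20201218215744752.png)

* Look-ahead scanning

**Necessary and sufficient condition for parsing without backtracking**

![image-20201218221426852](编译复习.assets/image-20201218221426852.png)

**Algorithm for constructing FIRST sets: work through the productions from bottom to top**

![image-20201218223508003](编译复习.assets/image-20201218223508003.png)

![image-20201218223518758](编译复习.assets/image-20201218223518758.png)

### Algorithm for constructing FOLLOW sets

![image-20201218223755686](编译复习.assets/image-20201218223755686.png)

The recursive-subprogram (recursive-descent) method corresponds to a leftmost derivation.

# Chapter 5: Symbol Table Management

![image-20201218231224384](编译复习.assets/image-20201218231224384.png)

![image-20201218231428059](编译复习.assets/image-20201218231428059.png)

Symbol tables for languages without block structure: a global symbol table plus local symbol tables.

Symbol tables for block-structured languages (**stack-style symbol table**):

Example

![image-20201218232652191](编译复习.assets/image-20201218232652191.png)

# Chapter 6: Storage Management
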
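Static storage allocation: storage for the variables of the source program is assigned by the compiler at compile time.

It requires that the size of the storage can be determined at compile time.

Dynamic storage allocation: storage for the variables of the source program is assigned by the target program at run time.

![image-20201218233350071](编译复习.assets/image-20201218233350071.png)

### Activation records

**Ordered by call sequence (not by compilation order)**

![image-20201218235603775](编译复习.assets/image-20201218235603775.png)

![image-20201219000237993](编译复习.assets/image-20201219000237993.png)

![image-20201219000218823](编译复习.assets/image-20201219000218823.png)

![image-20201219000544158](编译复习.assets/image-20201219000544158.png)

![image-20201219000551450](编译复习.assets/image-20201219000551450.png)

**prevabp points to the display area of the previous AR; the display area points to the base addresses of the enclosing modules that may be accessed; ret addr is the name of the function to return to**

# Chapter 7: Intermediate Forms of the Source Program

## Polish notation

![image-20201219024920861](编译复习.assets/image-20201219024920861.png)

![image-20201219025057751](编译复习.assets/image-20201219025057751.png)

![image-20201219025119781](编译复习.assets/image-20201219025119781.png)

BMZ: branch if the value is ≤ 0.

## N-tuples

Indirect triples: a separate table records the execution order.

![image-20201219025400821](编译复习.assets/image-20201219025400821.png)

## Abstract machine code

BP: base address of the activation record; SP: stack pointer; NP: heap pointer.

![image-20201219025629886](编译复习.assets/image-20201219025629886.png)

P-code is an intermediate code in Polish-notation form.

# Chapter 8: Error Handling

* Syntax errors
* Exceeding system limits
* Violations of semantic rules
* Data or storage-allocation overflow

Run-time error detection in the target program: the compiler emits checking code (array bounds violations, result overflow, overflow of the dynamically allocated data area).

## Localized error handling

Syntax analysis: skip the syntactic construct that contains the error (a phrase or a statement), advance to the statement's right delimiter (a successor/stop symbol of the construct), then resume analysis from the next statement.

# Chapter 9: Syntax-Directed Translation

![image-20201219132342904](编译复习.assets/image-20201219132342904.png)

Activity sequence: terminals (the input sequence) plus action symbols (the action sequence).

Translation grammar: the terminal symbols comprise input symbols and action symbols, and the productions are rewritten accordingly.

Syntax-directed translation: translation carried out according to a translation grammar.

## Attribute translation grammars: synthesized attributes

![image-20201219132645535](编译复习.assets/image-20201219132645535.png)

Synthesized attributes are computed bottom-up, from right to left.

## Inherited attributes

![image-20201219133304008](编译复习.assets/image-20201219133304008.png)

Inherited attributes are computed from left to right, top-down.

## L-attributed translation grammars

![image-20201219145150654](编译复习.assets/image-20201219145150654.png)

![image-20201219145432922](编译复习.assets/image-20201219145432922.png)

![image-20201219150410470](编译复习.assets/image-20201219150410470.png)

![image-20201219150633664](编译复习.assets/image-20201219150633664.png)

A synthesized attribute of the left-hand side is passed by address and holds its value on return.

A synthesized attribute of a right-hand-side symbol: declare a variable and assign to it.

# Chapter 11: Lexical Analysis

## Regular expressions

Regular expressions and Type-3 grammars are equivalent.

![image-20201220205917778](编译复习.assets/image-20201220205917778.png)

## NFA determinization

![image-20201220211842025](编译复习.assets/image-20201220211842025.png)
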
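
The slides in this section work through determinization by hand. As a rough sketch of the same idea (the subset construction) in C++, assuming a hypothetical `Nfa`/`Dfa` representation in which `EPS` marks an ε-transition (none of these names come from this repository):

```cpp
#include <map>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Illustrative types only; nothing here is taken from the compiler sources.
const char EPS = '\0';  // assumed marker for an epsilon transition

struct Nfa {
    std::map<std::pair<int, char>, std::set<int>> delta;  // (state, symbol) -> next states
    int start = 0;
    std::set<int> accepting;
};

struct Dfa {
    std::map<std::pair<int, char>, int> delta;  // (state, symbol) -> next state
    std::set<int> accepting;
    int state_count = 0;  // DFA state 0 is the start state
};

// Extend `states` with everything reachable through epsilon transitions alone.
std::set<int> epsilon_closure(const Nfa& nfa, std::set<int> states) {
    std::vector<int> work(states.begin(), states.end());
    while (!work.empty()) {
        int s = work.back();
        work.pop_back();
        auto it = nfa.delta.find(std::make_pair(s, EPS));
        if (it == nfa.delta.end()) continue;
        for (int t : it->second) {
            if (states.insert(t).second) work.push_back(t);
        }
    }
    return states;
}

// All NFA states reachable from `states` by consuming exactly one symbol c.
std::set<int> move_on(const Nfa& nfa, const std::set<int>& states, char c) {
    std::set<int> out;
    for (int s : states) {
        auto it = nfa.delta.find(std::make_pair(s, c));
        if (it != nfa.delta.end()) out.insert(it->second.begin(), it->second.end());
    }
    return out;
}

// Subset construction: each DFA state stands for one epsilon-closed set of NFA states.
Dfa determinize(const Nfa& nfa, const std::vector<char>& alphabet) {
    Dfa dfa;
    std::map<std::set<int>, int> ids;      // NFA state set -> DFA state number
    std::queue<std::set<int>> pending;     // sets whose transitions are not built yet

    std::set<int> start = epsilon_closure(nfa, std::set<int>{nfa.start});
    ids[start] = dfa.state_count++;
    pending.push(start);

    while (!pending.empty()) {
        std::set<int> cur = pending.front();
        pending.pop();
        int cur_id = ids[cur];
        // A DFA state accepts if it contains at least one accepting NFA state.
        for (int s : cur) {
            if (nfa.accepting.count(s)) { dfa.accepting.insert(cur_id); break; }
        }
        for (char c : alphabet) {
            std::set<int> next = epsilon_closure(nfa, move_on(nfa, cur, c));
            if (next.empty()) continue;    // no transition on c: leave it undefined
            auto res = ids.insert(std::make_pair(next, dfa.state_count));
            if (res.second) {              // first time this set is seen
                ++dfa.state_count;
                pending.push(next);
            }
            dfa.delta[std::make_pair(cur_id, c)] = res.first->second;
        }
    }
    return dfa;
}
```

DFA states are numbered in discovery order, so state 0 is always the ε-closure of the NFA start state. Missing transitions are simply left out of `delta` rather than routed to an explicit dead state.
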
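![image-20201220212048559](编译复习.assets/image-20201220212048559.png)

![image-20201220212201248](编译复习.assets/image-20201220212201248.png)

![image-20201220212214729](编译复习.assets/image-20201220212214729.png)

![image-20201220212223103](编译复习.assets/image-20201220212223103.png)

![image-20201220213018574](编译复习.assets/image-20201220213018574.png)

## Minimization

![image-20201220213359773](编译复习.assets/image-20201220213359773.png)

![image-20201220213408759](编译复习.assets/image-20201220213408759.png)

![image-20201220213422417](编译复习.assets/image-20201220213422417.png)

![image-20201220213451847](编译复习.assets/image-20201220213451847.png)

![image-20201220213500740](编译复习.assets/image-20201220213500740.png)

# Chapter 12: Syntax Analysis

## LL parsing

![image-20201220225611291](编译复习.assets/image-20201220225611291.png)

![image-20201220225654409](编译复习.assets/image-20201220225654409.png)

![image-20201220225814036](编译复习.assets/image-20201220225814036.png)

**It performs a leftmost derivation**

![image-20201220234813574](编译复习.assets/image-20201220234813574.png)

![image-20201221000655194](编译复习.assets/image-20201221000655194.png)

## Operator-precedence parsing

![image-20201221004641803](编译复习.assets/image-20201221004641803.png)

Not necessarily a leftmost reduction.

![image-20201221005619335](编译复习.assets/image-20201221005619335.png)

![image-20201221005841526](编译复习.assets/image-20201221005841526.png)

![image-20201221010257570](编译复习.assets/image-20201221010257570.png)

![image-20201221010521124](编译复习.assets/image-20201221010521124.png)

![image-20201221010616923](编译复习.assets/image-20201221010616923.png)

![image-20201221011537195](编译复习.assets/image-20201221011537195.png)

## LR parsing

LR parsing is a canonical reduction: every step reduces the handle.

The symbol string on the stack is a viable prefix of a canonical sentential form; together with the remaining input it makes up that canonical sentential form.

![image-20201221012140350](编译复习.assets/image-20201221012140350.png)

![image-20201221012409669](编译复习.assets/image-20201221012409669.png)

![image-20201221012420398](编译复习.assets/image-20201221012420398.png)

![image-20201221020740988](编译复习.assets/image-20201221020740988.png)

![image-20201221021325389](编译复习.assets/image-20201221021325389.png)

![image-20201221021601571](编译复习.assets/image-20201221021601571.png)

![image-20201221022024457](编译复习.assets/image-20201221022024457.png)

To find the valid items, consult the state-transition diagram.

![image-20201221022403232](编译复习.assets/image-20201221022403232.png)

![image-20201221142509301](编译复习.assets/image-20201221142509301.png)

# Chapter 14: Code Optimization

Categories:

* Local optimization: within a basic block, e.g. local common subexpressions
* Global optimization: within a function/procedure, across basic blocks, e.g. data-flow analysis
* Interprocedural optimization

## Basic blocks

![image-20201219183747445](编译复习.assets/image-20201219183747445.png)

Partitioning algorithm: **identify the leader statements (the first statement, the first statement after a jump, and the first statement at a jump target)**

## DAGs

![image-20201219185021620](编译复习.assets/image-20201219185021620.png)

![image-20201219185340471](编译复习.assets/image-20201219185340471.png)

![image-20201219185447743](编译复习.assets/image-20201219185447743.png)

![image-20201219190127485](编译复习.assets/image-20201219190127485.png)

## Reaching-definition analysis
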
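
Reaching definitions are usually solved by iterating the data-flow equations to a fixed point. A minimal C++ sketch, assuming each basic block is summarized by hypothetical `gen`/`kill` sets of definition ids and a list of predecessor indices (these types are illustrative, not part of this repository):

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical per-block summary used only for this sketch.
struct Block {
    std::set<int> gen;        // definitions created in the block that survive to its end
    std::set<int> kill;       // definitions elsewhere that the block overwrites
    std::vector<int> preds;   // indices of predecessor blocks in the flow graph
};

struct ReachingDefs {
    std::vector<std::set<int>> in;   // definitions reaching the entry of each block
    std::vector<std::set<int>> out;  // definitions reaching the exit of each block
};

// Forward analysis: repeat the equations until no out set changes.
ReachingDefs reaching_definitions(const std::vector<Block>& blocks) {
    ReachingDefs r;
    r.in.resize(blocks.size());
    r.out.resize(blocks.size());
    bool changed = true;
    while (changed) {
        changed = false;
        for (std::size_t b = 0; b < blocks.size(); ++b) {
            // in[B] = union of out[P] over all predecessors P of B
            std::set<int> in;
            for (int p : blocks[b].preds) {
                in.insert(r.out[p].begin(), r.out[p].end());
            }
            // out[B] = gen[B] ∪ (in[B] − kill[B])
            std::set<int> out = blocks[b].gen;
            for (int d : in) {
                if (!blocks[b].kill.count(d)) out.insert(d);
            }
            if (out != r.out[b]) changed = true;
            r.in[b] = in;
            r.out[b] = out;
        }
    }
    return r;
}
```

All sets start empty and only ever grow, so the loop terminates once a full pass leaves every `out` set unchanged.
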
![image-20201219191139115](编译复习.assets/image-20201219191139115.png)

![image-20201219191323265](编译复习.assets/image-20201219191323265.png)

![image-20201219191605715](编译复习.assets/image-20201219191605715.png)

![image-20201219194516616](编译复习.assets/image-20201219194516616.png)

**Iterate until nothing changes**

![image-20201219192209848](编译复习.assets/image-20201219192209848.png)

## Live-variable analysis

![image-20201219193800340](编译复习.assets/image-20201219193800340.png)

**Computed backwards along the flow graph**

![image-20201219194548072](编译复习.assets/image-20201219194548072.png)

**Iterate until nothing changes**

![image-20201219194749266](编译复习.assets/image-20201219194749266.png)

## Interference graph

![image-20201219195348919](编译复习.assets/image-20201219195348919.png)

![image-20201219195531303](编译复习.assets/image-20201219195531303.png)

## Definition-use chains

![image-20201219195924384](编译复习.assets/image-20201219195924384.png)

![image-20201219195941673](编译复习.assets/image-20201219195941673.png)

![image-20201219195953750](编译复习.assets/image-20201219195953750.png)

# Chapter 15: Target-Code Optimization

## Reference counting

![image-20201219172904468](编译复习.assets/image-20201219172904468.png)

![image-20201219172934173](编译复习.assets/image-20201219172934173.png)

## Graph-coloring algorithm

![image-20201219173208674](编译复习.assets/image-20201219173208674.png)

![image-20201219173228439](编译复习.assets/image-20201219173228439.png)

![image-20201219173252271](编译复习.assets/image-20201219173252271.png)

# Chapter 16: Methods for Building Compilers

## Self-compilation

![image-20201219163528891](编译复习.assets/image-20201219163528891.png)

![image-20201219163703622](编译复习.assets/image-20201219163703622.png)

## Bootstrapping

![image-20201219164223017](编译复习.assets/image-20201219164223017.png)

## Porting

![image-20201219164554022](编译复习.assets/image-20201219164554022.png)

![image-20201219164637499](编译复习.assets/image-20201219164637499.png)

--------------------------------------------------------------------------------
/编译复习.assets/image-20201218171246280.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218171246280.png
--------------------------------------------------------------------------------
/编译复习.assets/image-20201218171400954.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218171400954.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218171457550.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218171457550.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218171836337.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218171836337.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218172026311.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218172026311.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218172244030.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218172244030.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218172325093.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218172325093.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218172522000.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218172522000.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218172739904.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218172739904.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218190028035.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218190028035.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218191319672.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218191319672.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218191757062.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218191757062.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218192949362.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218192949362.png -------------------------------------------------------------------------------- 
/编译复习.assets/image-20201218193000041.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218193000041.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218193407568.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218193407568.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218193423032.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218193423032.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218193431134.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218193431134.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218193442511.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218193442511.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218194615594.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218194615594.png 
-------------------------------------------------------------------------------- /编译复习.assets/image-20201218194802577.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218194802577.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218213638153.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218213638153.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218213653036.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218213653036.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218213842335.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218213842335.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218214617169.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218214617169.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218215236369.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218215236369.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218215744752.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218215744752.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218221426852.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218221426852.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218223508003.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218223508003.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218223518758.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218223518758.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218223755686.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218223755686.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218231224384.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218231224384.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218231428059.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218231428059.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218232652191.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218232652191.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218232936168.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218232936168.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218233350071.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218233350071.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201218235603775.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201218235603775.png -------------------------------------------------------------------------------- 
/编译复习.assets/image-20201219000218823.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219000218823.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219000237993.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219000237993.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219000544158.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219000544158.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219000551450.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219000551450.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219024920861.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219024920861.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219025057751.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219025057751.png 
-------------------------------------------------------------------------------- /编译复习.assets/image-20201219025119781.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219025119781.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219025400821.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219025400821.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219025629886.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219025629886.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219030553694.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219030553694.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219132342904.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219132342904.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219132645535.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219132645535.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219133304008.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219133304008.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219145150654.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219145150654.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219145432922.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219145432922.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219150410470.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219150410470.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219150633664.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219150633664.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219163528891.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219163528891.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219163703622.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219163703622.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219164223017.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219164223017.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219164554022.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219164554022.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219164637499.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219164637499.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219172904468.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219172904468.png -------------------------------------------------------------------------------- 
/编译复习.assets/image-20201219172934173.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219172934173.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219173036638.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219173036638.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219173208674.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219173208674.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219173228439.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219173228439.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219173252271.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219173252271.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219174124518.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219174124518.png 
-------------------------------------------------------------------------------- /编译复习.assets/image-20201219183747445.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219183747445.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219185021620.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219185021620.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219185340471.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219185340471.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219185447743.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219185447743.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219190127485.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219190127485.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219191139115.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219191139115.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219191323265.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219191323265.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219191605715.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219191605715.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219191730110.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219191730110.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219192209848.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219192209848.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219193800340.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219193800340.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219194516616.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219194516616.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219194548072.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219194548072.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219194749266.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219194749266.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219195348919.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219195348919.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219195531303.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219195531303.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219195924384.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219195924384.png -------------------------------------------------------------------------------- 
/编译复习.assets/image-20201219195941673.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219195941673.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201219195953750.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201219195953750.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220205917778.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220205917778.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220211842025.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220211842025.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220212048559.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220212048559.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220212201248.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220212201248.png 
-------------------------------------------------------------------------------- /编译复习.assets/image-20201220212214729.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220212214729.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220212223103.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220212223103.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213018574.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213018574.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213359773.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213359773.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213408759.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213408759.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213422417.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213422417.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213451847.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213451847.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220213500740.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220213500740.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220225611291.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220225611291.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220225654409.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220225654409.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220225814036.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220225814036.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201220234813574.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201220234813574.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221000655194.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221000655194.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221004641803.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221004641803.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221005619335.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221005619335.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221005841526.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221005841526.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221010257570.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221010257570.png -------------------------------------------------------------------------------- 
/编译复习.assets/image-20201221010521124.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221010521124.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221010616923.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221010616923.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221011537195.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221011537195.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221012129929.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221012129929.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221012140350.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221012140350.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221012409669.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221012409669.png 
-------------------------------------------------------------------------------- /编译复习.assets/image-20201221012410679.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221012410679.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221012420398.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221012420398.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221020740988.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221020740988.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221021325389.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221021325389.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221021601571.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221021601571.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221022024457.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221022024457.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221022403232.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221022403232.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221141410119.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221141410119.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221142503522.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221142503522.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221142504459.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221142504459.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221142506694.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221142506694.png -------------------------------------------------------------------------------- /编译复习.assets/image-20201221142509301.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wzk1015/compiler/fd71bbd8f8c4b460eb8a23c5e094136696803635/编译复习.assets/image-20201221142509301.png --------------------------------------------------------------------------------