├── LICENSE
├── Makefile
├── README.md
├── as.c
├── c5.c
└── lk.c
/LICENSE:
--------------------------------------------------------------------------------
1 | GNU GENERAL PUBLIC LICENSE
2 | Version 2, June 1991
3 |
4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
6 | Everyone is permitted to copy and distribute verbatim copies
7 | of this license document, but changing it is not allowed.
8 |
9 | Preamble
10 |
11 | The licenses for most software are designed to take away your
12 | freedom to share and change it. By contrast, the GNU General Public
13 | License is intended to guarantee your freedom to share and change free
14 | software--to make sure the software is free for all its users. This
15 | General Public License applies to most of the Free Software
16 | Foundation's software and to any other program whose authors commit to
17 | using it. (Some other Free Software Foundation software is covered by
18 | the GNU Lesser General Public License instead.) You can apply it to
19 | your programs, too.
20 |
21 | When we speak of free software, we are referring to freedom, not
22 | price. Our General Public Licenses are designed to make sure that you
23 | have the freedom to distribute copies of free software (and charge for
24 | this service if you wish), that you receive source code or can get it
25 | if you want it, that you can change the software or use pieces of it
26 | in new free programs; and that you know you can do these things.
27 |
28 | To protect your rights, we need to make restrictions that forbid
29 | anyone to deny you these rights or to ask you to surrender the rights.
30 | These restrictions translate to certain responsibilities for you if you
31 | distribute copies of the software, or if you modify it.
32 |
33 | For example, if you distribute copies of such a program, whether
34 | gratis or for a fee, you must give the recipients all the rights that
35 | you have. You must make sure that they, too, receive or can get the
36 | source code. And you must show them these terms so they know their
37 | rights.
38 |
39 | We protect your rights with two steps: (1) copyright the software, and
40 | (2) offer you this license which gives you legal permission to copy,
41 | distribute and/or modify the software.
42 |
43 | Also, for each author's protection and ours, we want to make certain
44 | that everyone understands that there is no warranty for this free
45 | software. If the software is modified by someone else and passed on, we
46 | want its recipients to know that what they have is not the original, so
47 | that any problems introduced by others will not reflect on the original
48 | authors' reputations.
49 |
50 | Finally, any free program is threatened constantly by software
51 | patents. We wish to avoid the danger that redistributors of a free
52 | program will individually obtain patent licenses, in effect making the
53 | program proprietary. To prevent this, we have made it clear that any
54 | patent must be licensed for everyone's free use or not licensed at all.
55 |
56 | The precise terms and conditions for copying, distribution and
57 | modification follow.
58 |
59 | GNU GENERAL PUBLIC LICENSE
60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
61 |
62 | 0. This License applies to any program or other work which contains
63 | a notice placed by the copyright holder saying it may be distributed
64 | under the terms of this General Public License. The "Program", below,
65 | refers to any such program or work, and a "work based on the Program"
66 | means either the Program or any derivative work under copyright law:
67 | that is to say, a work containing the Program or a portion of it,
68 | either verbatim or with modifications and/or translated into another
69 | language. (Hereinafter, translation is included without limitation in
70 | the term "modification".) Each licensee is addressed as "you".
71 |
72 | Activities other than copying, distribution and modification are not
73 | covered by this License; they are outside its scope. The act of
74 | running the Program is not restricted, and the output from the Program
75 | is covered only if its contents constitute a work based on the
76 | Program (independent of having been made by running the Program).
77 | Whether that is true depends on what the Program does.
78 |
79 | 1. You may copy and distribute verbatim copies of the Program's
80 | source code as you receive it, in any medium, provided that you
81 | conspicuously and appropriately publish on each copy an appropriate
82 | copyright notice and disclaimer of warranty; keep intact all the
83 | notices that refer to this License and to the absence of any warranty;
84 | and give any other recipients of the Program a copy of this License
85 | along with the Program.
86 |
87 | You may charge a fee for the physical act of transferring a copy, and
88 | you may at your option offer warranty protection in exchange for a fee.
89 |
90 | 2. You may modify your copy or copies of the Program or any portion
91 | of it, thus forming a work based on the Program, and copy and
92 | distribute such modifications or work under the terms of Section 1
93 | above, provided that you also meet all of these conditions:
94 |
95 | a) You must cause the modified files to carry prominent notices
96 | stating that you changed the files and the date of any change.
97 |
98 | b) You must cause any work that you distribute or publish, that in
99 | whole or in part contains or is derived from the Program or any
100 | part thereof, to be licensed as a whole at no charge to all third
101 | parties under the terms of this License.
102 |
103 | c) If the modified program normally reads commands interactively
104 | when run, you must cause it, when started running for such
105 | interactive use in the most ordinary way, to print or display an
106 | announcement including an appropriate copyright notice and a
107 | notice that there is no warranty (or else, saying that you provide
108 | a warranty) and that users may redistribute the program under
109 | these conditions, and telling the user how to view a copy of this
110 | License. (Exception: if the Program itself is interactive but
111 | does not normally print such an announcement, your work based on
112 | the Program is not required to print an announcement.)
113 |
114 | These requirements apply to the modified work as a whole. If
115 | identifiable sections of that work are not derived from the Program,
116 | and can be reasonably considered independent and separate works in
117 | themselves, then this License, and its terms, do not apply to those
118 | sections when you distribute them as separate works. But when you
119 | distribute the same sections as part of a whole which is a work based
120 | on the Program, the distribution of the whole must be on the terms of
121 | this License, whose permissions for other licensees extend to the
122 | entire whole, and thus to each and every part regardless of who wrote it.
123 |
124 | Thus, it is not the intent of this section to claim rights or contest
125 | your rights to work written entirely by you; rather, the intent is to
126 | exercise the right to control the distribution of derivative or
127 | collective works based on the Program.
128 |
129 | In addition, mere aggregation of another work not based on the Program
130 | with the Program (or with a work based on the Program) on a volume of
131 | a storage or distribution medium does not bring the other work under
132 | the scope of this License.
133 |
134 | 3. You may copy and distribute the Program (or a work based on it,
135 | under Section 2) in object code or executable form under the terms of
136 | Sections 1 and 2 above provided that you also do one of the following:
137 |
138 | a) Accompany it with the complete corresponding machine-readable
139 | source code, which must be distributed under the terms of Sections
140 | 1 and 2 above on a medium customarily used for software interchange; or,
141 |
142 | b) Accompany it with a written offer, valid for at least three
143 | years, to give any third party, for a charge no more than your
144 | cost of physically performing source distribution, a complete
145 | machine-readable copy of the corresponding source code, to be
146 | distributed under the terms of Sections 1 and 2 above on a medium
147 | customarily used for software interchange; or,
148 |
149 | c) Accompany it with the information you received as to the offer
150 | to distribute corresponding source code. (This alternative is
151 | allowed only for noncommercial distribution and only if you
152 | received the program in object code or executable form with such
153 | an offer, in accord with Subsection b above.)
154 |
155 | The source code for a work means the preferred form of the work for
156 | making modifications to it. For an executable work, complete source
157 | code means all the source code for all modules it contains, plus any
158 | associated interface definition files, plus the scripts used to
159 | control compilation and installation of the executable. However, as a
160 | special exception, the source code distributed need not include
161 | anything that is normally distributed (in either source or binary
162 | form) with the major components (compiler, kernel, and so on) of the
163 | operating system on which the executable runs, unless that component
164 | itself accompanies the executable.
165 |
166 | If distribution of executable or object code is made by offering
167 | access to copy from a designated place, then offering equivalent
168 | access to copy the source code from the same place counts as
169 | distribution of the source code, even though third parties are not
170 | compelled to copy the source along with the object code.
171 |
172 | 4. You may not copy, modify, sublicense, or distribute the Program
173 | except as expressly provided under this License. Any attempt
174 | otherwise to copy, modify, sublicense or distribute the Program is
175 | void, and will automatically terminate your rights under this License.
176 | However, parties who have received copies, or rights, from you under
177 | this License will not have their licenses terminated so long as such
178 | parties remain in full compliance.
179 |
180 | 5. You are not required to accept this License, since you have not
181 | signed it. However, nothing else grants you permission to modify or
182 | distribute the Program or its derivative works. These actions are
183 | prohibited by law if you do not accept this License. Therefore, by
184 | modifying or distributing the Program (or any work based on the
185 | Program), you indicate your acceptance of this License to do so, and
186 | all its terms and conditions for copying, distributing or modifying
187 | the Program or works based on it.
188 |
189 | 6. Each time you redistribute the Program (or any work based on the
190 | Program), the recipient automatically receives a license from the
191 | original licensor to copy, distribute or modify the Program subject to
192 | these terms and conditions. You may not impose any further
193 | restrictions on the recipients' exercise of the rights granted herein.
194 | You are not responsible for enforcing compliance by third parties to
195 | this License.
196 |
197 | 7. If, as a consequence of a court judgment or allegation of patent
198 | infringement or for any other reason (not limited to patent issues),
199 | conditions are imposed on you (whether by court order, agreement or
200 | otherwise) that contradict the conditions of this License, they do not
201 | excuse you from the conditions of this License. If you cannot
202 | distribute so as to satisfy simultaneously your obligations under this
203 | License and any other pertinent obligations, then as a consequence you
204 | may not distribute the Program at all. For example, if a patent
205 | license would not permit royalty-free redistribution of the Program by
206 | all those who receive copies directly or indirectly through you, then
207 | the only way you could satisfy both it and this License would be to
208 | refrain entirely from distribution of the Program.
209 |
210 | If any portion of this section is held invalid or unenforceable under
211 | any particular circumstance, the balance of the section is intended to
212 | apply and the section as a whole is intended to apply in other
213 | circumstances.
214 |
215 | It is not the purpose of this section to induce you to infringe any
216 | patents or other property right claims or to contest validity of any
217 | such claims; this section has the sole purpose of protecting the
218 | integrity of the free software distribution system, which is
219 | implemented by public license practices. Many people have made
220 | generous contributions to the wide range of software distributed
221 | through that system in reliance on consistent application of that
222 | system; it is up to the author/donor to decide if he or she is willing
223 | to distribute software through any other system and a licensee cannot
224 | impose that choice.
225 |
226 | This section is intended to make thoroughly clear what is believed to
227 | be a consequence of the rest of this License.
228 |
229 | 8. If the distribution and/or use of the Program is restricted in
230 | certain countries either by patents or by copyrighted interfaces, the
231 | original copyright holder who places the Program under this License
232 | may add an explicit geographical distribution limitation excluding
233 | those countries, so that distribution is permitted only in or among
234 | countries not thus excluded. In such case, this License incorporates
235 | the limitation as if written in the body of this License.
236 |
237 | 9. The Free Software Foundation may publish revised and/or new versions
238 | of the General Public License from time to time. Such new versions will
239 | be similar in spirit to the present version, but may differ in detail to
240 | address new problems or concerns.
241 |
242 | Each version is given a distinguishing version number. If the Program
243 | specifies a version number of this License which applies to it and "any
244 | later version", you have the option of following the terms and conditions
245 | either of that version or of any later version published by the Free
246 | Software Foundation. If the Program does not specify a version number of
247 | this License, you may choose any version ever published by the Free Software
248 | Foundation.
249 |
250 | 10. If you wish to incorporate parts of the Program into other free
251 | programs whose distribution conditions are different, write to the author
252 | to ask for permission. For software which is copyrighted by the Free
253 | Software Foundation, write to the Free Software Foundation; we sometimes
254 | make exceptions for this. Our decision will be guided by the two goals
255 | of preserving the free status of all derivatives of our free software and
256 | of promoting the sharing and reuse of software generally.
257 |
258 | NO WARRANTY
259 |
260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
268 | REPAIR OR CORRECTION.
269 |
270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
278 | POSSIBILITY OF SUCH DAMAGES.
279 |
280 | END OF TERMS AND CONDITIONS
281 |
282 | How to Apply These Terms to Your New Programs
283 |
284 | If you develop a new program, and you want it to be of the greatest
285 | possible use to the public, the best way to achieve this is to make it
286 | free software which everyone can redistribute and change under these terms.
287 |
288 | To do so, attach the following notices to the program. It is safest
289 | to attach them to the start of each source file to most effectively
290 | convey the exclusion of warranty; and each file should have at least
291 | the "copyright" line and a pointer to where the full notice is found.
292 |
293 | {description}
294 | Copyright (C) {year} {fullname}
295 |
296 | This program is free software; you can redistribute it and/or modify
297 | it under the terms of the GNU General Public License as published by
298 | the Free Software Foundation; either version 2 of the License, or
299 | (at your option) any later version.
300 |
301 | This program is distributed in the hope that it will be useful,
302 | but WITHOUT ANY WARRANTY; without even the implied warranty of
303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
304 | GNU General Public License for more details.
305 |
306 | You should have received a copy of the GNU General Public License along
307 | with this program; if not, write to the Free Software Foundation, Inc.,
308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
309 |
310 | Also add information on how to contact you by electronic and paper mail.
311 |
312 | If the program is interactive, make it output a short notice like this
313 | when it starts in an interactive mode:
314 |
315 | Gnomovision version 69, Copyright (C) year name of author
316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
317 | This is free software, and you are welcome to redistribute it
318 | under certain conditions; type `show c' for details.
319 |
320 | The hypothetical commands `show w' and `show c' should show the appropriate
321 | parts of the General Public License. Of course, the commands you use may
322 | be called something other than `show w' and `show c'; they could even be
323 | mouse-clicks or menu items--whatever suits your program.
324 |
325 | You should also get your employer (if you work as a programmer) or your
326 | school, if any, to sign a "copyright disclaimer" for the program, if
327 | necessary. Here is a sample; alter the names:
328 |
329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program
330 | `Gnomovision' (which makes passes at compilers) written by James Hacker.
331 |
332 | {signature of Ty Coon}, 1 April 1989
333 | Ty Coon, President of Vice
334 |
335 | This General Public License does not permit incorporating your program into
336 | proprietary programs. If your program is a subroutine library, you may
337 | consider it more useful to permit linking proprietary applications with the
338 | library. If this is what you want to do, use the GNU Lesser General
339 | Public License instead of this License.
340 |
341 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | ALL : c5 as lk
2 |
3 | .PHONY: ALL
4 |
5 | c5 : c5.c
6 | cc -m32 -g -o c5 c5.c
7 |
8 | as : as.c
9 | cc -m32 -g -o as as.c
10 |
11 | lk : lk.c
12 | cc -m32 -g -o lk lk.c
13 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | c5 - C to MIPS compiler in five functions
2 | ====
3 |
4 | Forked from [rswier/c4](https://github.com/rswier/c4) and modified to generate MIPS assembly code.
5 |
6 | The assembler (`as`) then converts the assembly code into a MERL binary.
7 |
8 | Finally, the linker (`lk`) links multiple binaries into one.
9 |
10 | Try the following:
11 |
12 | ```
13 | make
14 |
15 | ./c5 -o c5.asm c5.c
16 | ./as -o c5.o -m c5.asm
17 |
18 | ./c5 -o other.asm .c
19 | ./as -o other.o -m other.asm
20 |
21 | ./lk -o combine -m c5.o other.o
22 | ```
23 |
24 | NOTE: the `-m` flag selects MERL output. Without it, linking fails with errors such as `printf not found`. All symbols must be resolved before running.
25 |
--------------------------------------------------------------------------------
/as.c:
--------------------------------------------------------------------------------
1 | #include <stdio.h>
2 | #include <stdlib.h>
3 | #include <string.h>
4 | #include <unistd.h>
5 | #include <fcntl.h>
6 |
7 | enum {
8 | Tk,
9 | Hash,
10 | Name,
11 | Value,
12 | IdSize
13 | };
14 |
15 | enum {
16 | _ = 255,
17 | ADD ,ADDU ,SUB ,SUBU ,AND ,OR ,XOR ,NOR ,SLT ,SLTU ,SLL ,SRL ,SRA ,SLLV ,SRLV ,SRAV ,JR ,JALR ,SCALL,ERET ,MFCO ,MTCO ,
18 | ADDI ,ADDIU,ANDI ,ORI ,XORI ,LUI ,LW ,LH ,LB ,SW ,SH ,SB ,SLTI ,SLTIU,
19 | BEQ ,BNE ,J ,JAL ,
20 | DB ,DW ,DD ,STR ,
21 | _ZERO,_AT ,_V0 ,_V1 ,_A0 ,_A1 ,_A2 ,_A3 ,_T0 ,_T1 ,_T2 ,_T3 ,_T4 ,_T5 ,_T6 ,_T7 ,
22 | _S0 ,_S1 ,_S2 ,_S3 ,_S4 ,_S5 ,_S6 ,_S7 ,_T8 ,_T9 ,_K0 ,_K1 ,_GP ,_SP ,_FP ,_RA ,
23 | GLOB ,EXTN
24 | };
25 |
26 | enum { Id = 128, Reg, Imm, Labl, Directive };
27 |
28 | char *p,
29 | *buf,
30 | *file,
31 | *output;
32 |
33 | int *sym,
34 | *id,
35 | *e,
36 | tk,
37 | ival,
38 | line,
39 | merl
40 | ;
41 |
42 | void
43 | next()
44 | {
45 | char *pp;
46 | int sign;
47 |
48 | while ((tk = *p)) {
49 | ++p;
50 | if (tk == '\n') ++line;
51 | else if (tk == '#') {
52 | while (*p != 0 && *p != '\n') ++p;
53 | }
54 | else if (tk == '$') {
55 | if (*p >= '0' && *p <= '9') {
56 | ival = 0;
57 | while (*p >= '0' && *p <= '9') ival = ival * 10 + *p++ - '0';
58 | tk = Reg;
59 | return;
60 | }
61 | else {
62 | pp = p; tk = 0;
63 | while ((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z') || (*p >= '0' && *p <= '9') || *p == '_')
64 | tk = tk * 147 + *p++;
65 | tk = (tk << 6) + (p - pp);
66 | id = sym;
67 | while (id[Tk]) {
68 | if (tk == id[Hash] && !memcmp((char*)id[Name], pp, p - pp) && id[Tk] >= _ZERO && id[Tk] <= _RA) {
69 | tk = Reg;
70 | ival = id[Tk] - _ZERO;
71 | return;
72 | }
73 | id = id + IdSize;
74 | }
75 | printf("%s:%d: bad register `%.*s'\n", file, line, p - pp, pp);
76 | exit(-1);
77 | }
78 | }
79 | else if (tk == '.') {
80 | pp = p; tk = 0;
81 | while ((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z') || (*p >= '0' && *p <= '9') || *p == '_')
82 | tk = tk * 147 + *p++;
83 | tk = (tk << 6) + (p - pp);
84 | id = sym;
85 | while (id[Tk]) {
86 | if (tk == id[Hash] && !memcmp((char*)id[Name], pp, p - pp) && id[Tk] >= GLOB && id[Tk] <= EXTN) {
87 | tk = Directive;
88 | ival = id[Tk];
89 | return;
90 | }
91 | id = id + IdSize;
92 | }
93 | printf("%s:%d: bad directive `%.*s'\n", file, line, p - pp, pp);
94 | exit(-1);
95 | }
96 | else if ((tk >= 'a' && tk <= 'z') || (tk >= 'A' && tk <= 'Z') || tk == '_') {
97 | pp = p - 1;
98 | while ((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z') || (*p >= '0' && *p <= '9')|| *p == '_')
99 | tk = tk * 147 + *p++;
100 | tk = (tk << 6) + (p - pp);
101 | id = sym;
102 | while (id[Tk]) {
103 | if (tk == id[Hash] && !memcmp((char*)id[Name], pp, p - pp)) {
104 | tk = Id;
105 | ival = id[Value] == ~0 ? ~0 : (id[Value] < 0 ? -id[Value] : id[Value]); // calculate abs(id[Value])
106 | return;
107 | }
108 | id = id + IdSize;
109 | }
110 | id[Name] = (int)pp;
111 | id[Hash] = tk;
112 | tk = id[Tk] = Id;
113 | ival = id[Value] = ~0;
114 | return;
115 | }
116 | else if (tk == '-' || (tk >= '0' && tk <= '9')) {
117 | if ((sign = (tk == '-'))) tk = *p++;
118 | if ((ival = tk - '0')) { while (*p >= '0' && *p <= '9') ival = ival * 10 + *p++ - '0'; }
119 | else if (*p == 'x' || *p == 'X') {
120 | while ((tk = *++p) && ((tk >= '0' && tk <= '9') || (tk >= 'a' && tk <= 'f') || (tk >= 'A' && tk <= 'F')))
121 | ival = ival * 16 + (tk & 15) + (tk >= 'A' ? 9 : 0);
122 | }
123 | else { while (*p >= '0' && *p <= '7') ival = ival * 8 + *p++ - '0'; }
124 | if (sign) ival = -ival;
125 | tk = Imm;
126 | return;
127 | }
128 | else if (tk == '\'' || tk == '"') {
129 | pp = buf;
130 | while (*p != 0 && *p != tk) {
131 | if ((ival = *p++) == '\\') {
132 | if ((ival = *p++) == 'n') ival = '\n';
133 | }
134 | if (tk == '"') *pp++ = ival;
135 | }
136 | ++p;
137 | if (tk == '"') ival = pp - buf; else tk = Imm;
138 | return;
139 | }
140 | else if ( tk == ',' || tk == '(' || tk == ')' || tk == ':' || tk == '[' || tk == ']') return;
141 | else if ( tk != ' ' && tk != '\n' && tk != '\t' && tk != '\r' && tk != '\b') {
142 | printf("%s:%d: bad token '%c'(%d)\n", file, line, tk, tk);
143 | exit(-1);
144 | }
145 | }
146 | }
147 |
148 | int
149 | main(int argc, char **argv)
150 | {
151 | int fd, poolsz, i, *lsym, *le, *rel, t, *lid, ltk;
152 | char *tp;
153 |
154 | int offset;
155 |
156 | merl = 0;
157 |
158 | --argc; ++argv;
159 |
160 | while (argc && **argv == '-') {
161 | if ((*argv)[1] == 'o') {
162 | if (! --argc) { printf("no output file\n"); exit(-1); }
163 | output = *++argv;
164 | }
165 | else if ((*argv)[1] == 'm') {
166 | merl = 1;
167 | }
168 | else { printf("unknown argument `%s'\n", *argv); exit(-1); }
169 | --argc; ++argv;
170 | }
171 |
172 | if (!output) { printf("no output file\n"); exit(-1); }
173 | if (!argc) { printf("usage: as -o output file ...\n"); exit(-1); }
174 |
175 | poolsz = 25600 * 1024; // arbitrary size
176 |
177 | if (!(sym = malloc(poolsz))) { printf("could not malloc(%d) symbol area\n", poolsz); exit(-1); }
178 | if (!(buf = malloc(poolsz))) { printf("could not malloc(%d) buffer area\n", poolsz); exit(-1); }
179 | if (!(le = e = malloc(poolsz))) { printf("could not malloc(%d) exec area\n", poolsz); exit(-1); }
180 |
181 | memset(sym, 0, poolsz);
182 | memset(buf, 0, poolsz);
183 | memset(e, 0, poolsz);
184 |
185 | p = "add addu sub subu and or xor nor slt sltu sll srl sra sllv srlv srav jr jalr syscall eret mfco mtco "
186 | "addi addiu andi ori xori lui lw lh lb sw sh sb slti sltiu "
187 | "beq bne j jal "
188 | "db dw dd string "
189 | "zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 "
190 | "s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra "
191 | "global extern ";
192 | i = ADD;
193 | while (i <= EXTN) { next(); id[Tk] = i++; }
194 |
195 | if (!(tp = p = malloc(poolsz))) { printf("could not malloc(%d) source area\n", poolsz); exit(-1); }
196 |
197 | offset = merl ? 12 : 0;
198 | lsym = sym;
199 | while (argc--) {
200 | if ((fd = open(file = *argv, 0)) < 0) { printf("could not open(%s)\n", *argv); exit(-1); }
201 | if ((i = read(fd, p, poolsz - (p - tp))) <= 0) { printf("read() returned %d\n", i); exit(-1); }
202 | close(fd);
203 |
204 | line = 1;
205 | next();
206 | while (tk) {
207 | if (tk == Id) {
208 | if ((id[Tk] >= ADD && id[Tk] <= SLTU) || (id[Tk] >= SLLV && id[Tk] <= SRAV)) {
209 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
210 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
211 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
212 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
213 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
214 | next();
215 | offset = offset + 4;
216 | }
217 | else if (
218 | (id[Tk] >= SLL && id[Tk] <= SRA) ||
219 | (id[Tk] >= ADDI && id[Tk] <= XORI) ||
220 | (id[Tk] >= SLTI && id[Tk] <= SLTIU) ||
221 | (id[Tk] >= BEQ && id[Tk] <= BNE)
222 | ) {
223 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
224 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
225 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
226 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
227 | next(); if (tk != Id && tk != Imm) { printf("%s:%d expect imm\n", file, line); exit(-1); }
228 | next(); if (tk == '[') {
229 | next(); if (tk != Id) { printf("%s:%d expect label\n", file, line); exit(-1); }
230 | next(); if (tk != ']') { printf("%s:%d expect `]'\n", file, line); exit(-1); }
231 | next();
232 | }
233 | offset = offset + 4;
234 | }
235 | else if (id[Tk] == JR) {
236 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
237 | next();
238 | offset = offset + 4;
239 | }
240 | else if (id[Tk] == JALR) {
241 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
242 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
243 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
244 | next();
245 | offset = offset + 4;
246 | }
247 | else if (id[Tk] == SCALL) {
248 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
249 | next();
250 | offset = offset + 4;
251 | }
252 | else if (id[Tk] == ERET) {
253 | next();
254 | offset = offset + 4;
255 | }
256 | else if ((id[Tk] >= MFCO && id[Tk] <= MTCO) || id[Tk] == LUI) {
257 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
258 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
259 | next(); if (tk != Imm) { printf("%s:%d expect imm\n", file, line); exit(-1); }
260 | next();
261 | offset = offset + 4;
262 | }
263 | else if (id[Tk] >= LW && id[Tk] <= SB) {
264 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
265 | next(); if (tk != ',') { printf("%s:%d expect `,'\n", file, line); exit(-1); }
266 | next(); if (tk != Id && tk != Imm) { printf("%s:%d expect offset\n", file, line); exit(-1); }
267 | next();
268 | if (tk == '[') {
269 | next(); if (tk != Id) { printf("%s:%d expect label\n", file, line); exit(-1); }
270 | next(); if (tk != ']') { printf("%s:%d expect `]'\n", file, line); exit(-1); }
271 | next();
272 | }
273 | if (tk != '(') { printf("%s:%d expect `('\n", file, line); exit(-1); }
274 | next(); if (tk != Reg) { printf("%s:%d expect register\n", file, line); exit(-1); }
275 | next(); if (tk != ')') { printf("%s:%d expect `)'\n", file, line); exit(-1); }
276 | next();
277 | offset = offset + 4;
278 | }
279 | else if (id[Tk] >= J && id[Tk] <= JAL) {
280 | next(); if (tk != Id && tk != Imm) { printf("%s:%d expect address\n", file, line); exit(-1); }
281 | next();
282 | offset = offset + 4;
283 | }
284 | else if (id[Tk] == DB || id[Tk] == DW || id[Tk] == DD) {
285 | next(); if (tk != Id && tk != Imm) { printf("%s:%d expect imm\n", file, line); exit(-1); }
286 | next(); if (tk == '[') {
287 | next(); if (tk != Id) { printf("%s:%d expect label\n", file, line); exit(-1); }
288 | next(); if (tk != ']') { printf("%s:%d expect `]'\n", file, line); exit(-1); }
289 | next();
290 | }
291 | offset = offset + 4;
292 | }
293 | else if (id[Tk] == STR) {
294 | next(); if (tk != '"') { printf("%s:%d expect string\n", file, line); exit(-1); }
295 | offset = ((offset + ival) & -sizeof(int)) + 4;
296 | next();
297 | }
298 | else {
299 | if (id[Tk] == Labl && id[Value] != ~0) { printf("%s:%d duplicate label `%.*s'\n", file, line, id[Hash] & 0x3F, (char*)id[Name]); }
300 | id[Tk] = Labl;
301 | id[Value] = offset;
302 | next();
303 | if (tk != ':') { printf("%s:%d bad label\n", file, line); exit(-1); }
304 | next();
305 | }
306 | }
307 | else if (tk == Directive) {
308 | if (ival == GLOB) {
309 | next();
310 | if (tk != Id) { printf("%s:%d bad global directive `%.*s'\n", file, line, id[Hash] & 0x3F, (char*)id[Name]); exit(-1); }
311 | next();
312 | }
313 | else if (ival == EXTN) {
314 | next();
315 | if (tk != Id) { printf("%s:%d bad extern directive `%.*s'\n", file, line, id[Hash] & 0x3F, (char*)id[Name]); exit(-1); }
316 | if (id[Tk] != Labl) {
317 | // not defined by any earlier input file
318 | id[Tk] = Labl;
319 | id[Value] = ~0;
320 | }
321 | next();
322 | }
323 | else {
324 | printf("%s:%d unsupported directive `%.*s'\n", file, line, id[Hash] & 0x3F, (char*)id[Name]);
325 | exit(-1);
326 | }
327 | }
328 | else { printf("%s:%d bad inst\n", file, line); exit(-1); }
329 | }
330 |
331 | ++argv;
332 | }
333 |
334 | rel = (int*)((int)e + offset);
335 |
336 | if (merl) {
337 | *e++ = (0x04 << 26) | (0x02);
338 | *e++ = offset;
339 | e++; // padding for file length
340 | }
341 |
342 | /**
343 | * *--------*--------*--------*
344 | * | | Reloc | Extern |
345 | * *--------*--------*--------*
346 | * | B-Type | / | 0x02 |
347 | * *--------*--------*--------*
348 | * | I-Type | 0x11 | 0x12 |
349 | * *--------*--------*--------*
350 | * | J-Type | 0x21 | 0x22 |
351 | * *--------*--------*--------*
352 | * | DD | 0x31 | 0x32 |
353 | * *--------*--------*--------*
354 | *
355 | * Global : 0x00
356 | */
357 |
358 | p = tp;
359 | line = 1;
360 | next();
361 | while (tk) {
362 | if (tk == Id && id[Tk] != STR && id[Tk] != Labl) {
363 | i = 0;
364 | if (
365 | id[Tk] >= ADD && id[Tk] <= SLTU
366 | ) {
367 | if (id[Tk] == ADD) i = i | 0x20;
368 | else if (id[Tk] == ADDU) i = i | 0x21;
369 | else if (id[Tk] == SUB) i = i | 0x22;
370 | else if (id[Tk] == SUBU) i = i | 0x23;
371 | else if (id[Tk] == AND) i = i | 0x24;
372 | else if (id[Tk] == OR) i = i | 0x25;
373 | else if (id[Tk] == XOR) i = i | 0x26;
374 | else if (id[Tk] == NOR) i = i | 0x27;
375 | else if (id[Tk] == SLT) i = i | 0x2A;
376 | else if (id[Tk] == SLTU) i = i | 0x2B;
377 | next(); i = i | ((ival & 0x1F) << 11); next();
378 | next(); i = i | ((ival & 0x1F) << 21); next();
379 | next(); i = i | ((ival & 0x1F) << 16); next();
380 | }
381 | else if (
382 | id[Tk] >= SLLV && id[Tk] <= SRAV
383 | ) {
384 | if (id[Tk] == SLLV) i = i | 0x04;
385 | else if (id[Tk] == SRLV) i = i | 0x06;
386 | else if (id[Tk] == SRAV) i = i | 0x07;
387 | next(); i = i | ((ival & 0x1F) << 11); next();
388 | next(); i = i | ((ival & 0x1F) << 16); next();
389 | next(); i = i | ((ival & 0x1F) << 21); next();
390 | }
391 | else if (
392 | id[Tk] >= SLL && id[Tk] <= SRA
393 | ) {
394 | if (id[Tk] == SLL) i = i | 0x00;
395 | else if (id[Tk] == SRL) i = i | 0x02;
396 | else if (id[Tk] == SRA) i = i | 0x03;
397 | next(); i = i | ((ival & 0x1F) << 11); next();
398 | next(); i = i | ((ival & 0x1F) << 16); next();
399 | next(); i = i | ((ival & 0x1F) << 6); next();
400 | }
401 | else if (
402 | id[Tk] == JR
403 | ) {
404 | i = i | 0x08;
405 | next(); i = i | ((ival & 0x1F) << 21); next();
406 | }
407 | else if (
408 | id[Tk] == JALR
409 | ) {
410 | i = i | 0x09;
411 | next(); i = i | ((ival & 0x1F) << 21); next();
412 | next(); i = i | ((ival & 0x1F) << 11); next();
413 | }
414 | else if (
415 | id[Tk] == SCALL
416 | ) {
417 | next(); i = i | ((ival & 0x1F) << 16); next();
418 | i = i | 0x0C;
419 | }
420 | else if (
421 | id[Tk] == ERET
422 | ) {
423 | i = (0x10 << 26) | (0x18);
424 | next();
425 | }
426 | else if (
427 | id[Tk] == MFCO || id[Tk] == MTCO
428 | ) {
429 | if (id[Tk] == MTCO) i = i | 0x04;
430 | next(); i = i | ((ival & 0x1F) << 16); next();
431 | next(); i = i | ((ival & 0x1F) << 11); next();
432 | i = i | (0x10 << 26);
433 | }
434 | else if (
435 | (id[Tk] >= ADDI && id[Tk] <= XORI) ||
436 | (id[Tk] >= SLTI && id[Tk] <= SLTIU)
437 | ) {
438 | if (id[Tk] == ADDI) i = i | (0x08 << 26);
439 | else if (id[Tk] == ADDIU) i = i | (0x09 << 26);
440 | else if (id[Tk] == ANDI) i = i | (0x0C << 26);
441 | else if (id[Tk] == ORI) i = i | (0x0D << 26);
442 | else if (id[Tk] == XORI) i = i | (0x0E << 26);
443 | else if (id[Tk] == SLTI) i = i | (0x0A << 26);
444 | else if (id[Tk] == SLTIU) i = i | (0x0B << 26);
445 | next(); i = i | ((ival & 0x1F) << 16); next();
446 | next(); i = i | ((ival & 0x1F) << 21); next();
447 | next(); t = ival; lid = id; ltk = tk; next();
448 |
449 | if (tk == '[') {
450 | next();
451 | if (ival == ~0) { printf("%s:%d refer to unresolved symbol\n", file, line); exit(-1); }
452 | t = t - ival;
453 | next(); next();
454 | }
455 | else {
456 | if (merl && ltk == Id && (lid[Tk] == Labl || lid[Tk] == Id)) {
457 | if (t == ~0) {
458 | *rel++ = 0x12;
459 | *rel++ = (int)e - (int)le;
460 | *rel++ = lid[Hash] & 0x3F;
461 | memcpy((char*)rel, (char*)lid[Name], lid[Hash] & 0x3F);
462 | rel = (int*)((int)rel + (lid[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
463 | }
464 | else {
465 | *rel++ = 0x11;
466 | *rel++ = (int)e - (int)le;
467 | }
468 | }
469 | else if (ltk == Id && ival == ~0) {
470 | printf("unresolved label: `%.*s'\n", lid[Hash] & 0x3F, (char*)lid[Name]);
471 | exit(-1);
472 | }
473 | }
474 | if (t > 32767 || t < -32768) { printf("%s:%d imm/label too large (%d)\n", file, line, t); }
475 | i = i | (t & ((1 << 16) - 1));
476 | }
477 | else if (
478 | id[Tk] == LUI
479 | ) {
480 | next(); i = i | ((ival & 0x1F) << 16); next();
481 | next(); i = i | (ival & ((1 << 16) - 1)); next();
482 | i = i | (0x0F << 26);
483 | }
484 | else if (
485 | id[Tk] >= LW && id[Tk] <= SB
486 | ) {
487 | if (id[Tk] == LW) i = i | (0x23 << 26);
488 | else if (id[Tk] == LH) i = i | (0x21 << 26);
489 | else if (id[Tk] == LB) i = i | (0x20 << 26);
490 | else if (id[Tk] == SW) i = i | (0x2B << 26);
491 | else if (id[Tk] == SH) i = i | (0x29 << 26);
492 | else if (id[Tk] == SB) i = i | (0x28 << 26);
493 | next(); i = i | ((ival & 0x1F) << 16); next();
494 | next(); t = ival; lid = id; ltk = tk; next();
495 | if (tk == '[') {
496 | next();
497 | if (ival == ~0) { printf("%s:%d refer to unresolved symbol\n", file, line); exit(-1); }
498 | t = t - ival;
499 | next(); next();
500 | }
501 | else {
502 | if (merl && ltk == Id && (lid[Tk] == Labl || lid[Tk] == Id)) {
503 | if (t == ~0) {
504 | // external labels
505 | *rel++ = 0x12;
506 | *rel++ = (int)e - (int)le;
507 | *rel++ = lid[Hash] & 0x3F;
508 | memcpy((char*)rel, (char*)lid[Name], lid[Hash] & 0x3F);
509 | rel = (int*)((int)rel + (lid[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
510 | }
511 | else {
512 | *rel++ = 0x11;
513 | *rel++ = (int)e - (int)le;
514 | }
515 | }
516 | else if (ltk == Id && ival == ~0) {
517 | printf("unresolved label: `%.*s'\n", lid[Hash] & 0x3F, (char*)lid[Name]);
518 | exit(-1);
519 | }
520 | }
521 | if (t > 32767 || t < -32768) { printf("%s:%d imm/label too large (%d)\n", file, line, t); }
522 | i = i | (t & ((1 << 16) - 1));
523 | next(); i = i | ((ival & 0x1F) << 21); next(); next();
524 | }
525 | else if (
526 | id[Tk] >= BEQ && id[Tk] <= BNE
527 | ) {
528 | if (id[Tk] == BEQ) i = i | (0x04 << 26);
529 | else if (id[Tk] == BNE) i = i | (0x05 << 26);
530 | next(); i = i | ((ival & 0x1F) << 16); next();
531 | next(); i = i | ((ival & 0x1F) << 21); next();
532 | next();
533 | if (merl && tk == Id && (id[Tk] == Labl || id[Tk] == Id) && ival == ~0) {
534 | // external labels
535 | *rel++ = 0x02;
536 | *rel++ = (int)e - (int)le;
537 | *rel++ = id[Hash] & 0x3F;
538 | memcpy((char*)rel, (char*)id[Name], id[Hash] & 0x3F);
539 | rel = (int*)((int)rel + (id[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
540 | }
541 | else if (tk == Id && ival == ~0) {
542 | printf("unresolved label: `%.*s'\n", id[Hash] & 0x3F, (char*)id[Name]);
543 | exit(-1);
544 | }
545 | if (id[Tk] == Labl) ival = (ival - ((int)e - (int)le) - 4) >> 2;
546 | if (ival > 32767 || ival < -32768) { printf("%s:%d imm/label too large (%d)\n", file, line, ival); }
547 | i = i | (ival & ((1 << 16) - 1));
548 | next();
549 | }
550 | else if (
551 | id[Tk] == J || id[Tk] == JAL
552 | ) {
553 | i = ((id[Tk] == J ? 0x02 : 3) << 26);
554 | next();
555 | if (id[Tk] == Labl || tk == Imm) ival = (ival) >> 2;
556 | if (merl && tk == Id && (id[Tk] == Labl || id[Tk] == Id)) {
557 | if (ival == ~0) {
558 | // external labels
559 | *rel++ = 0x22;
560 | *rel++ = (int)e - (int)le;
561 | *rel++ = id[Hash] & 0x3F;
562 | memcpy((char*)rel, (char*)id[Name], id[Hash] & 0x3F);
563 | rel = (int*)((int)rel + (id[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
564 | }
565 | else {
566 | *rel++ = 0x21;
567 | *rel++ = (int)e - (int)le;
568 | }
569 | }
570 | else if (tk == Id && ival == ~0) {
571 | printf("unresolved label: `%.*s'\n", id[Hash] & 0x3F, (char*)id[Name]);
572 | exit(-1);
573 | }
574 | i = i | (ival & ((1 << 26) - 1));
575 | next();
576 | }
577 | else if (
578 | id[Tk] >= DB && id[Tk] <= DD
579 | ) {
580 | next();
581 | i = ival;
582 | if (merl && tk == Id && (id[Tk] == Labl || id[Tk] == Id)) {
583 | if (ival == ~0) {
584 | // external labels
585 | *rel++ = 0x32;
586 | *rel++ = (int)e - (int)le;
587 | *rel++ = id[Hash] & 0x3F;
588 | memcpy((char*)rel, (char*)id[Name], id[Hash] & 0x3F);
589 | rel = (int*)((int)rel + (id[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
590 | }
591 | else {
592 | *rel++ = 0x31;
593 | *rel++ = (int)e - (int)le;
594 | }
595 | }
596 | else if (tk == Id && ival == ~0) {
597 | printf("unresolved label: `%.*s'\n", id[Hash] & 0x3F, (char*)id[Name]);
598 | exit(-1);
599 | }
600 | next();
601 | }
602 | *e++ = i;
603 | }
604 | else if (tk == Id && id[Tk] == STR) {
605 | next();
606 | memcpy((char*)e, buf, ival);
607 | e = (int*)((int)e + ival + sizeof(int) & -sizeof(int));
608 | next();
609 | }
610 | else if (tk == Id && id[Tk] == Labl) {
611 | next();
612 | next();
613 | }
614 | else if (tk == Directive) {
615 | if (ival == GLOB) {
616 | next();
617 | if (tk != Id || id[Tk] != Labl) { printf("%s:%d bad global directive `%.*s'\n", file, line, id[Hash] & 0x3F, (char*)id[Name]); exit(-1); }
618 | id[Value] = -id[Value];
619 | }
620 | else {
621 | next();
622 | }
623 | next();
624 | }
625 | }
626 |
627 | if (merl) {
628 | id = sym;
629 | // globals
630 | while (id[Tk]) {
631 | if (id[Tk] == Labl && id[Value] != ~0 && id[Value] < 0) {
632 | *rel++ = 0x00;
633 | *rel++ = -id[Value];
634 | *rel++ = id[Hash] & 0x3F;
635 | memcpy((char*)rel, (char*)id[Name], id[Hash] & 0x3F);
636 | rel = (int*)((int)rel + (id[Hash] & 0x3F) + sizeof(int) & -sizeof(int));
637 | }
638 | id = id + IdSize;
639 | }
640 | le[2] = (int)rel - (int)le;
641 | }
642 |
643 | if ((fd = open(output,
644 | O_CREAT | O_WRONLY | O_TRUNC,
645 | S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP)) < 0) { printf("open returned %d\n", fd); exit(-1); }
646 |
647 | write(fd, le, (int)rel - (int)le);
648 | close(fd);
649 |
650 | free(tp);
651 | free(sym);
652 | free(buf);
653 | free(le);
654 |
655 | return 0;
656 | }
657 |
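The relocation table written by the second pass of as.c uses two record shapes: a two-word record (kind, offset) for local relocations, and a named record (kind, offset, name length, padded name) for externs and globals. The following is a minimal sketch of that layout, assuming 32-bit words. The helper names `emit_plain_record`, `emit_named_record` and `padded_name_bytes` and the `REL_*` constants are hypothetical and do not appear in as.c, and the sketch zeroes the padding bytes, which as.c leaves uninitialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Record kinds, copied from the table in the as.c comment. */
enum {
    REL_GLOBAL    = 0x00,
    REL_B_EXTERN  = 0x02,
    REL_I_RELOC   = 0x11, REL_I_EXTERN  = 0x12,
    REL_J_RELOC   = 0x21, REL_J_EXTERN  = 0x22,
    REL_DD_RELOC  = 0x31, REL_DD_EXTERN = 0x32
};

/* Bytes a name occupies after padding. Like the expression
 * (rel + len + sizeof(int)) & -sizeof(int) in as.c, this rounds up to the
 * NEXT multiple of 4, so a 4-byte name still gains 4 bytes of padding. */
size_t padded_name_bytes(size_t len)
{
    return (len + sizeof(uint32_t)) & ~(sizeof(uint32_t) - 1);
}

/* Local relocation: kind word followed by the offset of the patched word. */
size_t emit_plain_record(uint32_t *out, uint32_t kind, uint32_t offset)
{
    out[0] = kind;
    out[1] = offset;
    return 2;                       /* words written */
}

/* Extern or global: kind, offset, name length, then the padded name. */
size_t emit_named_record(uint32_t *out, uint32_t kind, uint32_t offset,
                         const char *name, size_t len)
{
    size_t padded = padded_name_bytes(len);

    assert(len <= 0x3F);            /* length must fit the 6-bit hash field */
    out[0] = kind;
    out[1] = offset;
    out[2] = (uint32_t)len;
    memset(&out[3], 0, padded);     /* assumption: zero the padding */
    memcpy(&out[3], name, len);
    return 3 + padded / sizeof(uint32_t);
}
```

A linker reading the table can skip a named record by reading the length word and applying the same rounding, which is why the padding rule has to match on both sides.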
--------------------------------------------------------------------------------
/c5.c:
--------------------------------------------------------------------------------
1 | // c4.c - C in four functions
2 |
3 | // char, int, and pointer types
4 | // if, while, return, and expression statements
5 | // just enough features to allow self-compilation and a bit more
6 |
7 | // Written by Robert Swierczek
8 |
9 | #include <stdio.h>
10 | #include <stdlib.h>
11 | #include <memory.h>
12 | #include <unistd.h>
13 | #include <fcntl.h>
14 |
15 | char *p, *lp, // current position in source code
16 | *data, // data/bss pointer
17 | *buff;
18 |
19 | int *e, *le, // current position in emitted code
20 | *id, // currently parsed identifier
21 | *sym, // symbol table (simple list of identifiers)
22 | *current_func,
23 | tk, // current token
24 | ival, // current token value
25 | ty, // current expression type
26 | loc, // local variable offset
27 | line, // current line number
28 | st,
29 | tl;
30 |
31 | // tokens and classes (operators last and in precedence order)
32 | enum {
33 | Num = 128, Fun, Sys, Glo, Arg, Loc, Id,
34 | Char, Else, Enum, If, Int, Return, Sizeof, While,
35 | Assign, Cond, Lor, Lan, Or, Xor, And, Eq, Ne, Lt, Gt, Le, Ge, Shl, Shr, Add, Sub, Mul, Div, Mod, Inc, Dec, Brak
36 | };
37 |
38 | // opcodes
39 | enum {
40 | STR ,GLO ,LOC ,ARG ,IMM ,JMP ,CALL,BZ ,BNZ ,LEV ,LI ,LC ,SI ,SC ,PSH ,
41 | OR ,XOR ,AND ,EQ ,NE ,LT ,GT ,LE ,GE ,SHL ,SHR ,ADD ,SUB ,MUL ,DIV ,MOD ,LABL,CMMT
42 | };
43 |
44 | // types
45 | enum { CHAR, INT, PTR };
46 |
47 | // identifier offsets (since we can't create an ident struct)
48 | enum { Tk, Hash, Name, Class, Type, Val, HClass, HType, HVal, Idsz };
49 |
50 | void next()
51 | {
52 | char *pp;
53 |
54 | while ((tk = *p)) {
55 | ++p;
56 | if (tk == '\n') ++line;
57 | else if (tk == '#') {
58 | while (*p != 0 && *p != '\n') ++p;
59 | }
60 | else if ((tk >= 'a' && tk <= 'z') || (tk >= 'A' && tk <= 'Z') || tk == '_') {
61 | pp = p - 1;
62 | while ((*p >= 'a' && *p <= 'z') || (*p >= 'A' && *p <= 'Z') || (*p >= '0' && *p <= '9') || *p == '_')
63 | tk = tk * 147 + *p++;
64 | tk = (tk << 6) + (p - pp);
65 | id = sym;
66 | while (id[Tk]) {
67 | if (tk == id[Hash] && !memcmp((char *)id[Name], pp, p - pp)) { tk = id[Tk]; return; }
68 | id = id + Idsz;
69 | }
70 | id[Name] = (int)pp;
71 | id[Hash] = tk;
72 | id[Class] = 0;
73 | tk = id[Tk] = Id;
74 | return;
75 | }
76 | else if (tk >= '0' && tk <= '9') {
77 | if ((ival = tk - '0')) { while (*p >= '0' && *p <= '9') ival = ival * 10 + *p++ - '0'; }
78 | else if (*p == 'x' || *p == 'X') {
79 | while ((tk = *++p) && ((tk >= '0' && tk <= '9') || (tk >= 'a' && tk <= 'f') || (tk >= 'A' && tk <= 'F')))
80 | ival = ival * 16 + (tk & 15) + (tk >= 'A' ? 9 : 0);
81 | }
82 | else { while (*p >= '0' && *p <= '7') ival = ival * 8 + *p++ - '0'; }
83 | tk = Num;
84 | return;
85 | }
86 | else if (tk == '/') {
87 | if (*p == '/') {
88 | ++p;
89 | while (*p != 0 && *p != '\n') ++p;
90 | }
91 | else {
92 | tk = Div;
93 | return;
94 | }
95 | }
96 | else if (tk == '\'' || tk == '"') {
97 | pp = data;
98 | while (*p != 0 && *p != tk) {
99 | if ((ival = *p++) == '\\') {
100 | if ((ival = *p) == 'n') { ival = '\n'; ++p; }
101 | else if ((ival = *p) == 't') { ival = '\t'; ++p; }
102 | }
103 | if (tk == '"') *data++ = ival;
104 | }
105 | ++p;
106 | if (tk == '"') ival = (int)pp; else tk = Num;
107 | return;
108 | }
109 | else if (tk == '=') { if (*p == '=') { ++p; tk = Eq; } else tk = Assign; return; }
110 | else if (tk == '+') { if (*p == '+') { ++p; tk = Inc; } else tk = Add; return; }
111 | else if (tk == '-') { if (*p == '-') { ++p; tk = Dec; } else tk = Sub; return; }
112 | else if (tk == '!') { if (*p == '=') { ++p; tk = Ne; } return; }
113 | else if (tk == '<') { if (*p == '=') { ++p; tk = Le; } else if (*p == '<') { ++p; tk = Shl; } else tk = Lt; return; }
114 | else if (tk == '>') { if (*p == '=') { ++p; tk = Ge; } else if (*p == '>') { ++p; tk = Shr; } else tk = Gt; return; }
115 | else if (tk == '|') { if (*p == '|') { ++p; tk = Lor; } else tk = Or; return; }
116 | else if (tk == '&') { if (*p == '&') { ++p; tk = Lan; } else tk = And; return; }
117 | else if (tk == '^') { tk = Xor; return; }
118 | else if (tk == '%') { tk = Mod; return; }
119 | else if (tk == '*') { tk = Mul; return; }
120 | else if (tk == '[') { tk = Brak; return; }
121 | else if (tk == '?') { tk = Cond; return; }
122 | else if (tk == '~' || tk == ';' || tk == '{' || tk == '}' || tk == '(' || tk == ')' || tk == ']' || tk == ',' || tk == ':') return;
123 | }
124 | }
125 |
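The identifier branch of `next()` above stores the name length in the low 6 bits of the hash, which is why the rest of the code recovers a name's length with `id[Hash] & 0x3F`. Below is a minimal sketch of that scheme, assuming names shorter than 64 characters. `ident_hash` and `ident_len` are hypothetical names, and the sketch uses unsigned arithmetic to avoid signed overflow, whereas the source accumulates into an `int`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the hash in next(): multiply-accumulate
 * over the characters, then shift left by 6 and add the length. */
unsigned ident_hash(const char *name, size_t len)
{
    unsigned h;
    size_t i;

    assert(len >= 1 && len < 64);   /* the length must fit in 6 bits */
    h = (unsigned char)name[0];
    for (i = 1; i < len; ++i)
        h = h * 147 + (unsigned char)name[i];
    return (h << 6) + (unsigned)len;
}

/* Recover the name length from a hash, as `id[Hash] & 0x3F` does. */
unsigned ident_len(unsigned hash)
{
    return hash & 0x3F;
}
```

In the source, two identifiers are treated as the same symbol only when both the hash and a `memcmp` over the name bytes match, so a hash collision alone cannot merge two symbols.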
126 | void expr(int lev)
127 | {
128 | int t, *d;
129 |
130 | if (!tk) { printf("%d: unexpected eof in expression\n", line); exit(-1); }
131 | else if (tk == Num) { *++e = IMM; *++e = ival; next(); ty = INT; }
132 | else if (tk == '"') {
133 | *++e = STR; *++e = ival; next();
134 | while (tk == '"') next();
135 | data = (char *)((int)data + sizeof(int) & -sizeof(int)); ty = PTR;
136 | }
137 | else if (tk == Sizeof) {
138 | next(); if (tk == '(') next(); else { printf("%d: open paren expected in sizeof\n", line); exit(-1); }
139 | ty = INT; if (tk == Int) next(); else if (tk == Char) { next(); ty = CHAR; }
140 | while (tk == Mul) { next(); ty = ty + PTR; }
141 | if (tk == ')') next(); else { printf("%d: close paren expected in sizeof\n", line); exit(-1); }
142 | *++e = IMM; *++e = (ty == CHAR) ? sizeof(char) : sizeof(int);
143 | ty = INT;
144 | }
145 | else if (tk == Id) {
146 | d = id; next();
147 | if (tk == '(') {
148 | next();
149 | t = 0;
150 | while (tk != ')') { expr(Assign); *++e = PSH; ++t; if (tk == ',') next(); }
151 | next();
152 | if ((d[Class] == Fun || d[Class] == 0)) { *++e = CALL; *++e = t; *++e = (int)d; }
153 | else { printf("%d: bad function call\n", line); exit(-1); }
154 | ty = d[Type];
155 | }
156 | else if (d[Class] == Num) { *++e = IMM; *++e = d[Val]; ty = INT; }
157 | else {
158 | if (d[Class] == Loc) { *++e = LOC; *++e = d[Val]; }
159 | else if (d[Class] == Arg) { *++e = ARG; *++e = d[Val]; }
160 | else if (d[Class] == Glo) { *++e = GLO; *++e = (int)d; }
161 | else { *++e = GLO; *++e = (int)d; d[Type] = INT; } // undefined symbol treated as extern variable
162 | *++e = ((ty = d[Type]) == CHAR) ? LC : LI;
163 | }
164 | }
165 | else if (tk == '(') {
166 | next();
167 | if (tk == Int || tk == Char) {
168 | t = (tk == Int) ? INT : CHAR; next();
169 | while (tk == Mul) { next(); t = t + PTR; }
170 | if (tk == ')') next(); else { printf("%d: bad cast\n", line); exit(-1); }
171 | expr(Inc);
172 | ty = t;
173 | }
174 | else {
175 | expr(Assign);
176 | if (tk == ')') next(); else { printf("%d: close paren expected\n", line); exit(-1); }
177 | }
178 | }
179 | else if (tk == Mul) {
180 | next(); expr(Inc);
181 | if (ty > INT) ty = ty - PTR; else { printf("%d: bad dereference\n", line); exit(-1); }
182 | *++e = (ty == CHAR) ? LC : LI;
183 | }
184 | else if (tk == And) {
185 | next(); expr(Inc);
186 | if (*e == LC || *e == LI) --e; else { printf("%d: bad address-of\n", line); exit(-1); }
187 | ty = ty + PTR;
188 | }
189 | else if (tk == '!') { next(); expr(Inc); *++e = PSH; *++e = IMM; *++e = 0; *++e = EQ; ty = INT; }
190 | else if (tk == '~') { next(); expr(Inc); *++e = PSH; *++e = IMM; *++e = -1; *++e = XOR; ty = INT; }
191 | else if (tk == Add) { next(); expr(Inc); ty = INT; }
192 | else if (tk == Sub) {
193 | next(); *++e = IMM;
194 | if (tk == Num) { *++e = -ival; next(); } else { *++e = 0; *++e = PSH; expr(Inc); *++e = SUB; }
195 | ty = INT;
196 | }
197 | else if (tk == Inc || tk == Dec) {
198 | t = tk; next(); expr(Inc);
199 | if (*e == LC) { *e = PSH; *++e = LC; }
200 | else if (*e == LI) { *e = PSH; *++e = LI; }
201 | else { printf("%d: bad lvalue in pre-increment\n", line); exit(-1); }
202 | *++e = PSH;
203 | *++e = IMM; *++e = (ty > PTR) ? sizeof(int) : sizeof(char);
204 | *++e = (t == Inc) ? ADD : SUB;
205 | *++e = (ty == CHAR) ? SC : SI;
206 | }
207 | else { printf("%d: bad expression\n", line); exit(-1); }
208 |
209 | while (tk >= lev) { // "precedence climbing" or "Top Down Operator Precedence" method
210 | t = ty;
211 | if (tk == Assign) {
212 | next();
213 | if (*e == LC || *e == LI) *e = PSH; else { printf("%d: bad lvalue in assignment\n", line); exit(-1); }
214 | expr(Assign); *++e = ((ty = t) == CHAR) ? SC : SI;
215 | }
216 | else if (tk == Cond) {
217 | next();
218 | *++e = BZ; d = ++e;
219 | expr(Assign);
220 | if (tk == ':') next(); else { printf("%d: conditional missing colon\n", line); exit(-1); }
221 | *d = (int)(e + 3); *++e = JMP; d = ++e; *++e = LABL;
222 | expr(Cond);
223 | *(int*)(*d = (int)++e) = LABL;
224 | }
225 | else if (tk == Lor) {
226 | next();
227 | *++e = PSH;
228 | *++e = BNZ;
229 | d = ++e;
230 | *++e = OR;
231 | expr(Lan);
232 | *++e = PSH;
233 | *(int*)(*d = (int)++e) = LABL;
234 | *++e = OR;
235 | ty = INT;
236 | }
237 | else if (tk == Lan) {
238 | next();
239 | *++e = PSH;
240 | *++e = BZ;
241 | d = ++e;
242 | *++e = AND;
243 | expr(Or);
244 | *++e = PSH;
245 | *(int*)(*d = (int)++e) = LABL;
246 | *++e = AND;
247 | ty = INT;
248 | }
249 | else if (tk == Or) { next(); *++e = PSH; expr(Xor); *++e = OR; ty = INT; }
250 | else if (tk == Xor) { next(); *++e = PSH; expr(And); *++e = XOR; ty = INT; }
251 | else if (tk == And) { next(); *++e = PSH; expr(Eq); *++e = AND; ty = INT; }
252 | else if (tk == Eq) { next(); *++e = PSH; expr(Lt); *++e = EQ; ty = INT; }
253 | else if (tk == Ne) { next(); *++e = PSH; expr(Lt); *++e = NE; ty = INT; }
254 | else if (tk == Lt) { next(); *++e = PSH; expr(Shl); *++e = LT; ty = INT; }
255 | else if (tk == Gt) { next(); *++e = PSH; expr(Shl); *++e = GT; ty = INT; }
256 | else if (tk == Le) { next(); *++e = PSH; expr(Shl); *++e = LE; ty = INT; }
257 | else if (tk == Ge) { next(); *++e = PSH; expr(Shl); *++e = GE; ty = INT; }
258 | else if (tk == Shl) { next(); *++e = PSH; expr(Add); *++e = SHL; ty = INT; }
259 | else if (tk == Shr) { next(); *++e = PSH; expr(Add); *++e = SHR; ty = INT; }
260 | else if (tk == Add) {
261 | next(); *++e = PSH; expr(Mul);
262 | if ((ty = t) > PTR) { *++e = PSH; *++e = IMM; *++e = 2; *++e = SHL; }
263 | *++e = ADD;
264 | }
265 | else if (tk == Sub) {
266 | next(); *++e = PSH; expr(Mul);
267 | if (t > PTR && t == ty) { *++e = SUB; *++e = PSH; *++e = IMM; *++e = sizeof(int); *++e = DIV; ty = INT; }
268 | else if ((ty = t) > PTR) { *++e = PSH; *++e = IMM; *++e = 2; *++e = SHL; *++e = SUB; }
269 | else *++e = SUB;
270 | }
271 | else if (tk == Mul) { next(); *++e = PSH; expr(Inc); *++e = MUL; ty = INT; }
272 | else if (tk == Div) { next(); *++e = PSH; expr(Inc); *++e = DIV; ty = INT; }
273 | else if (tk == Mod) { next(); *++e = PSH; expr(Inc); *++e = MOD; ty = INT; }
274 | else if (tk == Inc || tk == Dec) {
275 | if (*e == LC) { *e = PSH; *++e = LC; }
276 | else if (*e == LI) { *e = PSH; *++e = LI; }
277 | else { printf("%d: bad lvalue in post-increment\n", line); exit(-1); }
278 | *++e = PSH; *++e = IMM; *++e = (ty > PTR) ? sizeof(int) : sizeof(char);
279 | *++e = (tk == Inc) ? ADD : SUB;
280 | *++e = (ty == CHAR) ? SC : SI;
281 | *++e = PSH; *++e = IMM; *++e = (ty > PTR) ? sizeof(int) : sizeof(char);
282 | *++e = (tk == Inc) ? SUB : ADD;
283 | next();
284 | }
285 | else if (tk == Brak) {
286 | next(); *++e = PSH; expr(Assign);
287 | if (tk == ']') next(); else { printf("%d: close bracket expected\n", line); exit(-1); }
288 | if (t > PTR) { *++e = PSH; *++e = IMM; *++e = 2; *++e = SHL; }
289 | else if (t < PTR) { printf("%d: pointer type expected\n", line); exit(-1); }
290 | *++e = ADD;
291 | *++e = ((ty = t - PTR) == CHAR) ? LC : LI;
292 | }
293 | else { printf("%d: compiler error tk=%d\n", line, tk); exit(-1); }
294 | }
295 | }
296 |
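The `while (tk >= lev)` loop in `expr()` above is the precedence-climbing step: parse one operand, then keep consuming operators whose precedence is at least `lev`, recursing with a higher level for the right-hand side. The sketch below shows the same control flow on a tiny integer calculator. It is an illustration under stated assumptions, not the compiler's code: it evaluates directly instead of emitting opcodes, uses an explicit `prec()` table instead of the token enum ordering, and all names (`eval`, `parse_expr`, `parse_primary`, `prec`) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { P_ADD = 1, P_MUL = 2 };      /* higher value binds tighter */

const char *src;                    /* cursor into the input text */

int parse_expr(int lev);

/* Precedence of a binary operator, or 0 if c is not an operator. */
int prec(char c)
{
    if (c == '+' || c == '-') return P_ADD;
    if (c == '*' || c == '/') return P_MUL;
    return 0;
}

void skip_spaces(void) { while (*src == ' ') ++src; }

/* A primary is a decimal number or a parenthesised expression. */
int parse_primary(void)
{
    int v = 0;

    skip_spaces();
    if (*src == '(') {
        ++src;
        v = parse_expr(P_ADD);
        skip_spaces();
        if (*src == ')') ++src;
        return v;
    }
    while (isdigit((unsigned char)*src)) v = v * 10 + (*src++ - '0');
    return v;
}

/* Consume operators while their precedence is at least lev. */
int parse_expr(int lev)
{
    int lhs = parse_primary();

    for (;;) {
        char op;
        int rhs;

        skip_spaces();
        op = *src;
        if (prec(op) == 0 || prec(op) < lev) return lhs;
        ++src;
        rhs = parse_expr(prec(op) + 1);   /* +1 gives left associativity */
        if (op == '+') lhs = lhs + rhs;
        else if (op == '-') lhs = lhs - rhs;
        else if (op == '*') lhs = lhs * rhs;
        else lhs = rhs ? lhs / rhs : 0;   /* assumption: x / 0 yields 0 */
    }
}

int eval(const char *text)
{
    assert(text != NULL);
    src = text;
    return parse_expr(P_ADD);
}
```

Recursing with `prec(op) + 1` plays the role of calls such as `expr(Mul)` after an `Add` token in the source: the right operand may only absorb operators that bind tighter, so `10 - 4 - 3` groups as `(10 - 4) - 3`.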
297 | void
298 | codegen(int *e, int *le)
299 | {
300 | int i, lst;
301 |
302 | st = -1;
303 | lst = 0; // last in stack
304 | while (e != le) {
305 | if (*e == STR) {
306 | tl = sprintf(buff, " addi $v0, $gp, s%u\n", *++e);
307 | buff = buff + tl; lst = 0;
308 | }
309 | else if (*e == GLO) {
310 | ++e;
311 | lst = 0;
312 | if (*(e + 1) == LI) {
313 | tl = sprintf(buff, " lw $v0, %.*s($gp)\n", ((int*)(*e))[Hash] & 0x3F, (char*)((int*)(*e))[Name]);
314 | buff = buff + tl; ++e;
315 | }
316 | else if (*(e + 1) == LC) {
317 | tl = sprintf(buff, " lb $v0, %.*s($gp)\n", ((int*)(*e))[Hash] & 0x3F, (char*)((int*)(*e))[Name]);
318 | buff = buff + tl; ++e;
319 | }
320 | else if (*(e + 1) == PSH) {
321 | tl = sprintf(buff, " addi $t%d, $gp, %.*s\n", ++st, ((int*)(*e))[Hash] & 0x3F, (char*)((int*)(*e))[Name]);
322 | buff = buff + tl; ++e; lst = 1;
323 | }
324 | else {
325 | tl = sprintf(buff, " addi $v0, $gp, %.*s\n", ((int*)(*e))[Hash] & 0x3F, (char*)((int*)(*e))[Name]);
326 | buff = buff + tl;
327 | }
328 | }
329 | else if (*e == LOC) {
330 | lst = 0;
331 | if (*(e + 2) == LI) {
332 | tl = sprintf(buff, " lw $v0, -%d($fp)\n", (*++e + 1) << 2);
333 | buff = buff + tl; ++e;
334 | }
335 | else if (*(e + 2) == LC) {
336 | tl = sprintf(buff, " lb $v0, -%d($fp)\n", (*++e + 1) << 2);
337 | buff = buff + tl; ++e;
338 | }
339 | else if (*(e + 2) == PSH) {
340 | tl = sprintf(buff, " addi $t%d, $fp, -%d\n", ++st, (*++e + 1) << 2);
341 | buff = buff + tl; ++e; lst = 1;
342 | }
343 | else {
344 | tl = sprintf(buff, " addi $v0, $fp, -%d\n", (*++e + 1) << 2);
345 | buff = buff + tl;
346 | }
347 | }
348 | else if (*e == ARG) {
349 | lst = 0;
350 | if (*(e + 2) == LI) { tl = sprintf(buff, " lw $v0, %d($fp)\n", (*++e + 2) << 2); buff = buff + tl; ++e; }
351 | else if (*(e + 2) == LC) { tl = sprintf(buff, " lb $v0, %d($fp)\n", (*++e + 2) << 2); buff = buff + tl; ++e; }
352 | else if (*(e + 2) == PSH) {
353 | tl = sprintf(buff, " addi $t%d, $fp, %d\n", ++st, (*++e + 2) << 2);
354 | buff = buff + tl; ++e; lst = 1;
355 | }
356 | else { tl = sprintf(buff, " addi $v0, $fp, %d\n", (*++e + 2) << 2); buff = buff + tl; }
357 | }
358 | else if (*e == IMM) {
359 | ++e;
360 | if (*e <= 32767 && *e >= -32768) {
361 | lst = 0;
362 | if (*(e + 1) == LI || *(e + 1) == LC) {
363 | tl = sprintf(buff, " %s $v0, %d($gp)\n", *(e + 1) == LI ? "lw" : "lb", *e);
364 | buff = buff + tl; ++e;
365 | }
366 | else if (*(e + 1) == PSH) {
367 | tl = sprintf(buff, " addi $t%d, $zero, %d\n", ++st, *e);
368 | buff = buff + tl; ++e; lst = 1;
369 | }
370 | else if (*(e + 1) == OR) { tl = sprintf(buff, " ori $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
371 | else if (*(e + 1) == XOR) { tl = sprintf(buff, " xori $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
372 | else if (*(e + 1) == AND) { tl = sprintf(buff, " andi $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
373 | else if (*(e + 1) == NE) { tl = sprintf(buff, " addi $v0, $t%d, %d\n", st--, -*e); buff = buff + tl; ++e; }
374 | else if (*(e + 1) == SHL) { tl = sprintf(buff, " sll $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
375 | else if (*(e + 1) == SHR) { tl = sprintf(buff, " srl $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
376 | else if (*(e + 1) == ADD) { tl = sprintf(buff, " addi $v0, $t%d, %d\n", st--, *e); buff = buff + tl; ++e; }
377 | else if (*(e + 1) == SUB) { tl = sprintf(buff, " addi $v0, $t%d, %d\n", st--, -*e); buff = buff + tl; ++e; }
378 | else { tl = sprintf(buff, " addi $v0, $zero, %d\n", *e); buff = buff + tl; }
379 | }
380 | else { printf("Imm too large: %d\n", *e); exit(-1); }
381 | }
382 | else if (*e == JMP) {
383 | // address-independent code
384 | tl = sprintf(buff, " beq $zero, $zero, _%.*s_%u\n", current_func[Hash] & 0x3F, (char*)current_func[Name], *++e);
385 | buff = buff + tl;
386 | }
387 | else if (*e == CALL) {
388 | ++e; i = 0; lst = 0;
389 | tl = sprintf(buff, " addi $sp, $sp, -%d\n", (st + 1) << 2); buff = buff + tl;
390 | while (i <= st - *e) {
391 | tl = sprintf(buff, " sw $t%d, %d($sp)\n", i, (st - i) << 2);
392 | buff = buff + tl; ++i;
393 | } // save temp stack
394 | while (i <= st) {
395 | tl = sprintf(buff, " sw $t%d, %d($sp)\n", (st - *e) + (st - i) + 1, (st - i) << 2); buff = buff + tl; ++i;
396 | }
397 |
398 | tl = sprintf(buff, " jal %.*s\n", ((int*)(*(e + 1)))[Hash] & 0x3F, (char*)((int*)(*(e + 1)))[Name]);
399 | buff = buff + tl;
400 |
401 | i = 0;
402 | while (i <= st - *e) { tl = sprintf(buff, " lw $t%d, %d($sp)\n", i, (st - i) << 2); buff = buff + tl; ++i; }
403 | tl = sprintf(buff, " addi $sp, $sp, %d\n", (st + 1) << 2); buff = buff + tl;
404 | st = st - *e;
405 |
406 | ++e;
407 | }
408 | else if (*e == BZ) {
409 | if (lst) {
410 | tl = sprintf(buff, " beq $t%d, $zero, _%.*s_%u\n",
411 | st, current_func[Hash] & 0x3F, (char*)current_func[Name], *++e);
412 | buff = buff + tl;
413 | }
414 | else {
415 | tl = sprintf(buff, " beq $v0, $zero, _%.*s_%u\n", current_func[Hash] & 0x3F, (char*)current_func[Name], *++e);
416 | buff = buff + tl;
417 | }
418 | }
419 | else if (*e == BNZ) {
420 | if (lst) {
421 | tl = sprintf(buff, " bne $t%d, $zero, _%.*s_%u\n",
422 | st, current_func[Hash] & 0x3F, (char*)current_func[Name], *++e);
423 | buff = buff + tl;
424 | }
425 | else {
426 | tl = sprintf(buff, " bne $v0, $zero, _%.*s_%u\n",
427 | current_func[Hash] & 0x3F, (char*)current_func[Name], *++e);
428 | buff = buff + tl;
429 | }
430 | }
431 | else if (*e == LEV) {
432 | if (e + 1 != le) { // at the end of function
433 | tl = sprintf(buff, " j _%.*s_end\n", current_func[Hash] & 0x3F, (char*)current_func[Name]); buff = buff + tl;
434 | }
435 | }
436 | else if (*e == LI) {
437 | if (lst) { tl = sprintf(buff, " lw $v0, 0($t%d)\n", st); buff = buff + tl; }
438 | else { tl = sprintf(buff, " lw $v0, 0($v0)\n"); buff = buff + tl; }
439 | lst = 0;
440 | }
441 | else if (*e == LC) {
442 | if (lst) { tl = sprintf(buff, " lb $v0, 0($t%d)\n", st); buff = buff + tl; }
443 | else { tl = sprintf(buff, " lb $v0, 0($v0)\n"); buff = buff + tl;}
444 | lst = 0;
445 | }
446 | else if (*e == SI) { lst = 0; tl = sprintf(buff, " sw $v0, 0($t%d)\n", st--); buff = buff + tl; }
447 | else if (*e == SC) { lst = 0; tl = sprintf(buff, " sb $v0, 0($t%d)\n", st--); buff = buff + tl; }
448 | else if (*e == PSH) { tl = sprintf(buff, " addi $t%d, $v0, 0\n", ++st); buff = buff + tl; }
449 | else if (*e == OR) { lst = 0; tl = sprintf(buff, " or $v0, $t%d, $v0\n", st--); buff = buff + tl; }
450 | else if (*e == XOR) { lst = 0; tl = sprintf(buff, " xor $v0, $t%d, $v0\n", st--); buff = buff + tl; }
451 | else if (*e == AND) { lst = 0; tl = sprintf(buff, " and $v0, $t%d, $v0\n", st--); buff = buff + tl; }
452 | else if (*e == EQ) {
453 | lst = 0;
454 | if (*(e + 1) == BNZ) {
455 | e = e + 2;
456 | tl = sprintf(buff, " beq $v0, $t%d, _%.*s_%u\n",
457 | st--, current_func[Hash] & 0x3F, (char*)current_func[Name], *e);
458 | buff = buff + tl;
459 | }
460 | else if (*(e + 1) == BZ) {
461 | e = e + 2;
462 | tl = sprintf(buff, " bne $v0, $t%d, _%.*s_%u\n",
463 | st--, current_func[Hash] & 0x3F, (char*)current_func[Name], *e);
464 | buff = buff + tl;
465 | }
466 | else {
467 | tl = sprintf(buff,
468 | " beq $v0, $t%d, 2\n"
469 | " addi $v0, $zero, 0\n"
470 | " beq $zero, $zero, 1\n"
471 | " addi $v0, $zero, 1\n", st--
472 | );
473 | buff = buff + tl;
474 | }
475 | }
476 | else if (*e == NE) { tl = sprintf(buff, " sub $v0, $t%d, $v0\n", st--); buff = buff + tl; }
477 | else if (*e == LT) { tl = sprintf(buff, " slt $v0, $t%d, $v0\n", st--); buff = buff + tl; }
478 | else if (*e == LE) { tl = sprintf(buff, " slt $v0, $v0, $t%d\n addi $v0, $v0, -1\n", st--); buff = buff + tl; }
479 | else if (*e == GT) { tl = sprintf(buff, " slt $v0, $v0, $t%d\n", st--); buff = buff + tl; }
480 | else if (*e == GE) { tl = sprintf(buff, " slt $v0, $t%d, $v0\n addi $v0, $v0, -1\n", st--); buff = buff + tl; }
481 | else if (*e == SHL) { tl = sprintf(buff, " sllv $v0, $t%d, $v0\n", st--); buff = buff + tl; }
482 | else if (*e == SHR) { tl = sprintf(buff, " srlv $v0, $t%d, $v0\n", st--); buff = buff + tl; }
483 | else if (*e == ADD) { tl = sprintf(buff, " add $v0, $t%d, $v0\n", st--); buff = buff + tl; }
484 | else if (*e == SUB) { tl = sprintf(buff, " sub $v0, $t%d, $v0\n", st--); buff = buff + tl; }
485 | else if (*e == MUL || *e == DIV || *e == MOD) {
486 | if (!lst) { tl = sprintf(buff, " addi $t%d, $v0, 0\n", ++st); buff = buff + tl; }
487 | i = 0;
488 | tl = sprintf(buff, " addi $sp, $sp, -%d\n", (st + 1) << 2); buff = buff + tl;
489 | while (i <= st - 2) {
490 | tl = sprintf(buff, " sw $t%d, %d($sp)\n", i, (st - i) << 2);
491 | buff = buff + tl; ++i;
492 | } // save temp stack
493 | while (i <= st) {
494 | tl = sprintf(buff, " sw $t%d, %d($sp)\n", st + st - i - 1, (st - i) << 2);
495 | buff = buff + tl; ++i;
496 | }
497 | if (*e == MUL) { tl = sprintf(buff, " jal mul\n"); buff = buff + tl; }
498 | else if (*e == DIV) { tl = sprintf(buff, " jal div\n"); buff = buff + tl; }
499 | else if (*e == MOD) { tl = sprintf(buff, " jal mod\n"); buff = buff + tl; }
500 | i = 0;
501 | while (i <= st - 2) { tl = sprintf(buff, " lw $t%d, %d($sp)\n", i, (st - i) << 2); buff = buff + tl; ++i; }
502 | tl = sprintf(buff, " addi $sp, $sp, %d\n", (st + 1) << 2); buff = buff + tl;
503 | st = st - 2;
504 | }
505 | else if (*e == LABL) {
506 | tl = sprintf(buff, "_%.*s_%u:\n", current_func[Hash] & 0x3F, (char*)current_func[Name], (int)e);
507 | buff = buff + tl;
508 | }
509 | else if (*e == CMMT) { tl = sprintf(buff, "## line %d\n", *++e); buff = buff + tl; }
510 | else { printf("Unknown inst: %d\n", *e); exit(-1); }
511 | ++e;
512 | }
513 | }
514 |
515 | void stmt()
516 | {
517 | int *a, *b;
518 |
519 | if (tk == If) {
520 | *++e = CMMT; *++e = line;
521 | next();
522 | if (tk == '(') next(); else { printf("%d: open paren expected\n", line); exit(-1); }
523 | expr(Assign);
524 | if (tk == ')') next(); else { printf("%d: close paren expected\n", line); exit(-1); }
525 | *++e = BZ; b = ++e;
526 | stmt();
527 | if (tk == Else) {
528 | *b = (int)(e + 3); *++e = JMP; b = ++e; *++e = LABL;
529 | next();
530 | stmt();
531 | }
532 | *(int*)(*b = (int)++e) = LABL;
533 | }
534 | else if (tk == While) {
535 | *++e = CMMT; *++e = line;
536 | next();
537 | *(int*)(a = ++e) = LABL;
538 | if (tk == '(') next(); else { printf("%d: open paren expected\n", line); exit(-1); }
539 | expr(Assign);
540 | if (tk == ')') next(); else { printf("%d: close paren expected\n", line); exit(-1); }
541 | *++e = BZ; b = ++e;
542 | stmt();
543 | *++e = JMP; *++e = (int)a;
544 | *(int*)(*b = (int)++e) = LABL;
545 | }
546 | else if (tk == Return) {
547 | *++e = CMMT; *++e = line;
548 | next();
549 | if (tk != ';') expr(Assign);
550 | *++e = LEV;
551 | if (tk == ';') next(); else { printf("%d: semicolon expected\n", line); exit(-1); }
552 | }
553 | else if (tk == '{') {
554 | next();
555 | while (tk != '}') stmt();
556 | next();
557 | }
558 | else if (tk == ';') {
559 | next();
560 | }
561 | else {
562 | *++e = CMMT; *++e = line;
563 | expr(Assign);
564 | if (tk == ';') next(); else { printf("%d: semicolon expected\n", line); exit(-1); }
565 | }
566 | }
567 |
568 | int main(int argc, char **argv)
569 | {
570 | int fd, bt, ty, poolsz;
571 | int i; // temps
572 | int *le;
573 | int *te;
574 | char *td, *t;
575 | char *tb;
576 | char *output;
577 |
578 | output = 0; --argc; ++argv;
579 | if (argc < 1) { printf("usage: c5 -o output file ...\n"); return -1; }
580 |
581 | while (argc && **argv == '-') {
582 | if ((*argv)[1] == 'o') {
583 | if (! --argc) { printf("no output file\n"); exit(-1); }
584 | output = *++argv;
585 | }
586 | --argc; ++argv;
587 | }
588 | if (!output || argc < 1) { printf("usage: c5 -o output file ...\n"); return -1; } // -o and at least one source file are required
589 | poolsz = 256*1024; // arbitrary size
590 | if (!(sym = malloc(poolsz))) { printf("could not malloc(%d) symbol area\n", poolsz); return -1; }
591 | if (!(le = e = malloc(poolsz))) { printf("could not malloc(%d) text area\n", poolsz); return -1; }
592 | if (!(data = malloc(poolsz))) { printf("could not malloc(%d) data area\n", poolsz); return -1; }
593 | if (!(tb = buff = malloc(poolsz))) { printf("could not malloc(%d) buffer area\n", poolsz); return -1; }
594 |
595 | memset(sym, 0, poolsz);
596 | memset(e, 0, poolsz);
597 | memset(data, 0, poolsz);
598 | memset(buff, 0, poolsz);
599 |
600 | te = e;
601 | td = data;
602 |
603 | p = "char else enum if int return sizeof while void ";
604 | i = Char; while (i <= While) { next(); id[Tk] = i++; } // add keywords to symbol table
605 | next(); id[Tk] = Char; // handle void type
606 |
607 | if (!(lp = p = malloc(poolsz))) { printf("could not malloc(%d) source area\n", poolsz); return -1; }
608 |
609 | while (argc--) {
610 | p = lp;
611 | if ((fd = open(*argv, 0)) < 0) { printf("could not open(%s)\n", *argv); return -1; } ++argv; // advance only after the name has been reported
612 | if ((i = read(fd, p, poolsz-1)) <= 0) { printf("read() returned %d\n", i); return -1; }
613 | p[i] = 0;
614 | close(fd);
615 |
616 | // parse declarations
617 | line = 1;
618 | next();
619 | while (tk) {
620 | bt = INT; // basetype
621 | if (tk == Int) next();
622 | else if (tk == Char) { next(); bt = CHAR; }
623 | else if (tk == Enum) {
624 | next();
625 | if (tk != '{') next();
626 | if (tk == '{') {
627 | next();
628 | i = 0;
629 | while (tk != '}') {
630 | if (tk != Id) { printf("%d: bad enum identifier %d\n", line, tk); return -1; }
631 | next();
632 | if (tk == Assign) {
633 | next();
634 | if (tk != Num) { printf("%d: bad enum initializer\n", line); return -1; }
635 | i = ival;
636 | next();
637 | }
638 | id[Class] = Num; id[Type] = INT; id[Val] = i++;
639 | if (tk == ',') next();
640 | }
641 | next();
642 | }
643 | }
644 | while (tk != ';' && tk != '}') {
645 | ty = bt;
646 | while (tk == Mul) { next(); ty = ty + PTR; }
647 | if (tk != Id) { printf("%d: bad global declaration\n", line); return -1; }
648 | if (id[Class]) {
649 | printf("%d: duplicate global definition: %.*s\n", line, id[Hash] & 0x3F, (char*)id[Name]);
650 | return -1;
651 | }
652 | next();
653 | id[Type] = ty;
654 | if (tk == '(') { // function
655 | id[Class] = Fun;
656 | id[Val] = (int)(e + 1);
657 | current_func = id;
658 | next(); i = 0;
659 | while (tk != ')') {
660 | ty = INT;
661 | if (tk == Int) next();
662 | else if (tk == Char) { next(); ty = CHAR; }
663 | while (tk == Mul) { next(); ty = ty + PTR; }
664 | if (tk != Id) { printf("%d: bad parameter declaration\n", line); return -1; }
665 | if (id[Class] == Loc || id[Class] == Arg) { printf("%d: duplicate parameter definition\n", line); return -1; }
666 | id[HClass] = id[Class]; id[Class] = Arg;
667 | id[HType] = id[Type]; id[Type] = ty;
668 | id[HVal] = id[Val]; id[Val] = i++;
669 | next();
670 | if (tk == ',') next();
671 | }
672 | next();
673 | if (tk == '{') {
674 | i = 0;
675 | next();
676 | while (tk == Int || tk == Char) {
677 | bt = (tk == Int) ? INT : CHAR;
678 | next();
679 | while (tk != ';') {
680 | ty = bt;
681 | while (tk == Mul) { next(); ty = ty + PTR; }
682 | if (tk != Id) { printf("%d: bad local declaration\n", line); return -1; }
683 | if (id[Class] == Loc || id[Class] == Arg) { printf("%d: duplicate local definition\n", line); return -1; }
684 | id[HClass] = id[Class]; id[Class] = Loc;
685 | id[HType] = id[Type]; id[Type] = ty;
686 | id[HVal] = id[Val]; id[Val] = i++;
687 | next();
688 | if (tk == ',') next();
689 | }
690 | next();
691 | }
692 | le = e;
693 | while (tk != '}') stmt();
694 | tl = sprintf(
695 | buff,
696 | "\n%.*s:\n"
697 | " addi $sp, $sp, -4\n"
698 | " sw $fp, 0($sp)\n"
699 | " addi $sp, $sp, -4\n"
700 | " sw $ra, 0($sp)\n"
701 | " addi $fp, $sp, 0\n"
702 | "\n",
703 | current_func[Hash] & 0x3F, (char*)current_func[Name]
704 | );
705 | buff = buff + tl;
706 | if (i) {
707 | tl = sprintf(buff, " addi $sp, $sp, -%d\n", i << 2);
708 | buff = buff + tl;
709 | }
710 | codegen(le + 1, e + 1);
711 | tl = sprintf(
712 | buff,
713 | "\n"
714 | "_%.*s_end:\n",
715 | current_func[Hash] & 0x3F, (char*)current_func[Name]
716 | );
717 | buff = buff + tl;
718 | if (i) {
719 | tl = sprintf(buff, " addi $sp, $sp, %d\n", i << 2);
720 | buff = buff + tl;
721 | }
722 | tl = sprintf(
723 | buff,
724 | " lw $ra, 0($sp)\n"
725 | " addi $sp, $sp, 4\n"
726 | " lw $fp, 0($sp)\n"
727 | " addi $sp, $sp, 4\n"
728 | " jr $ra\n"
729 | "\n.global %.*s\n",
730 | current_func[Hash] & 0x3F, (char*)current_func[Name]
731 | );
732 | buff = buff + tl;
733 | }
734 | else if (tk != ';') { printf("%d: bad function decl\n", line); exit(-1); }
735 | else {
736 | tl = sprintf(buff, "\n.extern %.*s\n", current_func[Hash] & 0x3F, (char*)current_func[Name]);
737 | buff = buff + tl;
738 | }
739 | id = sym; // unwind symbol table locals
740 | while (id[Tk]) {
741 | if (id[Class] == Loc || id[Class] == Arg) {
742 | id[Class] = id[HClass];
743 | id[Type] = id[HType];
744 | id[Val] = id[HVal];
745 | }
746 | id = id + Idsz;
747 | }
748 | }
749 | else {
750 | id[Class] = Glo;
751 | id[Val] = (int)data;
752 | tl = sprintf(buff, "\n%.*s:\n dd 0\n.global %.*s\n", id[Hash] & 0x3F, (char*)id[Name], id[Hash] & 0x3F, (char*)id[Name]);
753 | buff = buff + tl;
754 | }
755 | if (tk == ',') next();
756 | }
757 | next();
758 | }
759 | }
760 |
761 | t = td;
762 | while (t < data) {
763 | tl = sprintf(buff, "s%u:\n", (int)t); buff = buff + tl;
764 | tl = sprintf(buff, " string \""); buff = buff + tl;
765 | while (*t) {
766 | if (*t == '\n') { tl = sprintf(buff, "\\n"); buff = buff + tl; }
767 | else if (*t == '"') { tl = sprintf(buff, "\\\""); buff = buff + tl; }
768 | else if (*t == '\'') { tl = sprintf(buff, "\\\'"); buff = buff + tl; }
769 | else if (*t == '\\') { tl = sprintf(buff, "\\\\"); buff = buff + tl; }
770 | else { tl = sprintf(buff, "%c", *t); buff = buff + tl; }
771 | ++t;
772 | }
773 | tl = sprintf(buff, "\"\n\n"); buff = buff + tl;
774 | t = (char*)((int)t + sizeof(int) & -sizeof(int));
775 | }
776 |
777 | if ((fd = open(output,
778 | O_CREAT | O_WRONLY | O_TRUNC, // truncate so a shorter output does not keep stale bytes
779 | S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP)) < 0) { printf("open returned %d\n", fd); exit(-1); }
780 | write(fd, tb, (int)buff - (int)tb);
781 | close(fd);
782 |
783 | free(tb);
784 | free(sym);
785 | free(te);
786 | free(td);
787 |
788 | return 0;
789 | }
790 |
--------------------------------------------------------------------------------
/lk.c:
--------------------------------------------------------------------------------
1 | #include <stdio.h>
2 | #include <stdlib.h>
3 | #include <string.h>
4 | #include <unistd.h>
5 | #include <fcntl.h>
6 |
7 | int
8 | main(int argc, char **argv)
9 | {
10 | int poolsz, merl, fd, offset, codelen, start;
11 | char *output, *file, *c, *label, *nbuf, *nn;
12 | int *orig,
13 | *dest,
14 | *o, *d, *nd, *rel,
15 | *t, *lo,
16 | i, tlen,
17 | *glob, **tg, *g,
18 | lfd;
19 |
20 | --argc; ++argv;
21 |
22 | merl = 0;
23 | label = 0; // NULL
24 | start = 0; output = 0;
25 |
26 | while (argc && **argv == '-') {
27 | if ((*argv)[1] == 'm') { merl = 1; }
28 | else if ((*argv)[1] == 'o') {
29 | if (! --argc) { printf("no output file\n"); exit(-1); }
30 | output = *++argv;
31 | }
32 | else if ((*argv)[1] == 'l') {
33 | if (! --argc) { printf("no label file\n"); exit(-1); }
34 | label = *++argv;
35 | }
36 | else if ((*argv)[1] == 's') {
37 | if (! --argc) { printf("no start\n"); exit(-1); }
38 | c = *++argv;
39 | while (*c && *c >= '0' && *c <= '9') {
40 | start = start * 10 + (*c++ - '0');
41 | }
42 | }
43 | --argc; ++argv;
44 | }
45 |
46 | if (!argc || !output) { printf("Usage: lk -o output [-m] [-l labels] [-s start] file ...\n"); exit(-1); }
47 |
48 | poolsz = 256 * 1024;
49 | if (!(orig = malloc(poolsz))) { printf("could not malloc(%d) original space\n", poolsz); exit(-1); }
50 | if (!(dest = malloc(poolsz))) { printf("could not malloc(%d) destination space\n", poolsz); exit(-1); }
51 |
52 | if (label && (lfd = open(label, O_CREAT | O_WRONLY | O_TRUNC,
53 | S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP)) < 0) { printf("open label returned %d\n", lfd); exit(-1); }
54 | if (label && (!(nbuf = malloc(12)))) { printf("could not malloc(12) number buffer\n"); exit(-1); }
55 |
56 | o = orig;
57 | glob = 0;
58 | tg = &glob;
59 | codelen = 0;
60 | offset = merl ? 12 : 0;
61 | while (argc) {
62 | if ((fd = open(file = *argv, 0)) < 0) { printf("could not open(%s)\n", file); exit(-1); }
63 | if ((i = read(fd, (char*)o, poolsz - ((int)o - (int)orig))) <= 0) { printf("read() returned %d\n", i); exit(-1); } // remaining space in bytes, not in ints
64 | close(fd);
65 | if (*o != ((0x04 << 26) | (0x02))) { printf("file %s is not MERL\n", file); exit(-1); }
66 | if (o[2] != i) { printf("out of pool size, when processing %s (filesize: %d, but %d read)\n", file, o[2], i); exit(-1); }
67 |
68 | tlen = o[1] - 12;
69 | codelen = codelen + tlen;
70 |
71 | t = (int*)((int)o + o[1]);
72 | o = (int*)((int)o + o[2]);
73 |
74 | while (t < o) {
75 | if (*t == 0x00) {
76 | // global symbol
77 | *tg = t;
78 | tg = (int**)(t);
79 | i = 0;
80 | c = (char*)(t + 3);
81 | while (*c) { i = i * 147 + *c++; }
82 | t[1] = t[1] + offset - 12 + (merl ? 0 : start);
83 | t[2] = (i << 6) + t[2];
84 | if (label) {
85 | i = t[1]; nn = nbuf + 12;
86 | if (i) {
87 | while (i) { *--nn = (i % 10) + '0'; i = i / 10; }
88 | write(lfd, nn, nbuf + 12 - nn);
89 | }
90 | else { write(lfd, "0", 1); }
91 | write(lfd, " ", 1);
92 | write(lfd, (char*)(t + 3), t[2] & 0x3F);
93 | write(lfd, "\n", 1);
94 | }
95 | t = (int*)((int)t + 12 + (t[2] & 0x3F) + sizeof(int) & -sizeof(int));
96 | }
97 | else if ((*t & 0xF) == 2) {
98 | // extern symbol
99 | t = (int*)((int)t + 12 + t[2] + sizeof(int) & -sizeof(int));
100 | }
101 | else if ((*t & 0xF) == 1) {
102 | // relocation address
103 | t = t + 2;
104 | }
105 | }
106 |
107 | --argc; ++argv;
108 | offset = offset + tlen;
109 | }
110 |
111 | *tg = 0;
112 | offset = merl ? 12 : 0;
113 | rel = (int*)((int)dest + (merl ? codelen + 12 : codelen));
114 |
115 | o = orig;
116 | d = dest;
117 | lo = t;
118 |
119 | if (merl) {
120 | *d++ = ((0x04 << 26) | (0x02));
121 | *d++ = codelen + 12;
122 | d++;
123 | }
124 |
125 | if (label) {
126 | close(lfd);
127 | free(nbuf);
128 | }
129 |
130 | while (o < lo) {
131 | memcpy((char*)d, (char*)(o + 3), (o[1] - 12));
132 | tlen = o[1] - 12;
133 | nd = (int*)((int)d + tlen);
134 | t = (int*)((int)o + o[1]);
135 | o = (int*)((int)o + o[2]);
136 |
137 | while (t < o) {
138 | if ((*t & 3) == 0) { // modified when calculating hash
139 | if (merl) {
140 | *rel++ = 0x00;
141 | *rel++ = t[1];
142 | *rel++ = t[2] & 0x3F;
143 | memcpy((char*)rel, (char*)(t + 3), t[2] & 0x3F);
144 | rel = (int*)((int)rel + (t[2] & 0x3F) + sizeof(int) & -sizeof(int));
145 | }
146 | t = (int*)((int)t + 12 + (t[2] & 0x3F) + sizeof(int) & -sizeof(int));
147 | }
148 | else if ((*t & 0xF) == 2) { // external symbols
149 | i = 0;
150 | c = (char*)(t + 3);
151 | while (*c) { i = i * 147 + *c++; }
152 | i = (i << 6) + t[2];
153 | g = glob;
154 | while (g && i != g[2]) { g = (int*)*g; }
155 | if (g) {
156 | if (*t == 0x02) { // B-type
157 | i = *(int*)((int)d + t[1] - 12);
158 | *(int*)((int)d + t[1] - 12) = (((g[1] - (t[1] + offset - 12) - 4) >> 2) & ((1 << 16) - 1))
159 | | (i & (((1 << 16) - 1) << 16));
160 | }
161 | else if (*t == 0x12) { // I-type
162 | i = *(int*)((int)d + t[1] - 12);
163 | *(int*)((int)d + t[1] - 12) = (g[1] & ((1 << 16) - 1)) | (i & (((1 << 16) - 1) << 16));
164 | if (merl) {
165 | *rel++ = 0x11;
166 | *rel++ = (t[1] + offset - 12);
167 | }
168 | }
169 | else if (*t == 0x22) { // J-type
170 | i = *(int*)((int)d + t[1] - 12);
171 | *(int*)((int)d + t[1] - 12) = ((g[1] >> 2) & ((1 << 26) - 1)) | (i & (((1 << 6) - 1) << 26));
172 | if (merl) {
173 | *rel++ = 0x21;
174 | *rel++ = (t[1] + offset - 12);
175 | }
176 | }
177 | else if (*t == 0x32) { // DD
178 | *(int*)((int)d + t[1] - 12) = g[1];
179 | if (merl) {
180 | *rel++ = 0x31;
181 | *rel++ = (t[1] + offset - 12);
182 | }
183 | }
184 | }
185 | else {
186 | if (merl) {
187 | *rel++ = *t;
188 | *rel++ = t[1] + offset - 12;
189 | *rel++ = t[2];
190 | memcpy((char*)rel, (char*)(t + 3), t[2]);
191 | rel = (int*)((int)rel + t[2] + sizeof(int) & -sizeof(int));
192 | }
193 | else { printf("unresolved symbol: %s\n", (char*)(t + 3)); exit(-1); }
194 | }
195 | t = (int*)((int)t + 12 + t[2] + sizeof(int) & -sizeof(int));
196 | }
197 | else if (*t == 0x11) { // I-type
198 | i = *(int*)((int)d + t[1] - 12);
199 | *(int*)((int)d + t[1] - 12) = ((i + offset - 12) & ((1 << 16) - 1)) | (i & (((1 << 16) - 1) << 16));
200 | if (merl) {
201 | *rel++ = 0x11;
202 | *rel++ = (t[1] + offset - 12);
203 | }
204 | t = t + 2;
205 | }
206 | else if (*t == 0x21) { // J-type
207 | i = *(int*)((int)d + t[1] - 12);
208 | *(int*)((int)d + t[1] - 12) = ((i + ((offset - 12 + (merl ? 0 : start)) >> 2)) & ((1 << 26) - 1)) | (i & (((1 << 6) - 1) << 26));
209 | if (merl) {
210 | *rel++ = 0x21;
211 | *rel++ = (t[1] + offset - 12);
212 | }
213 | t = t + 2;
214 | }
215 | else if (*t == 0x31) { // DD
216 | *(int*)((int)d + t[1] - 12) = *(int*)((int)d + t[1] - 12) + offset - 12 + (merl ? 0 : start);
217 | if (merl) {
218 | *rel++ = 0x31;
219 | *rel++ = (t[1] + offset - 12);
220 | }
221 | t = t + 2;
222 | }
223 | }
224 |
225 | d = nd;
226 | offset = offset + tlen;
227 | }
228 |
229 | if ((fd = open(output,
230 | O_CREAT | O_WRONLY | O_TRUNC, // truncate so a shorter output does not keep stale bytes
231 | S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP)) < 0) { printf("open returned %d\n", fd); exit(-1); }
232 |
233 | if (merl) {
234 | dest[2] = (int)rel - (int)dest;
235 | write(fd, dest, (int)rel - (int)dest);
236 | }
237 | else {
238 | write(fd, dest, (int)d - (int)dest);
239 | }
240 | close(fd);
241 |
242 | free(orig);
243 | free(dest);
244 |
245 | return 0;
246 | }
247 |
--------------------------------------------------------------------------------