├── Cristina-Cifuentes
    ├── dcc.html
    ├── decompilation_thesis.dvi.gz
    ├── decompilation_thesis.pdf
    └── decompilation_thesis.ps.gz
├── arm
    └── QRC0001_UAL-quick-reference.pdf
├── calling-conventions
    ├── calling_conventions.pdf
    └── gcc
    │   └── x86
├── cis570
    ├── lecture01.pdf
    ├── lecture02.pdf
    ├── lecture03.pdf
    ├── lecture04.pdf
    ├── lecture05.pdf
    ├── lecture06.pdf
    ├── lecture07.pdf
    ├── lecture08.pdf
    ├── lecture09.pdf
    ├── lecture10.pdf
    ├── lecture11.pdf
    ├── lecture12.pdf
    ├── lecture13.pdf
    ├── lecture14.pdf
    ├── lecture15.pdf
    ├── lecture16.pdf
    ├── lecture17.pdf
    ├── lecture18.pdf
    └── lecture19.pdf
├── control-flow
    ├── CS640-2-02.ppt
    ├── CSE-2009-27.pdf
    ├── Oakland2013-CCFIR-CR.pdf
    └── zadeck_structure_analysis.diff
├── data-flow
    └── data-flow.pdf
├── decompilers
    ├── ACMMA-27.pdf
    ├── moll_bsc.pdf
    ├── paper.pdf
    └── revgen.pdf
├── file-formats
    ├── ELF_Format.pdf
    └── dotguide.pdf
├── gccsummit
    └── gccsummit-2003-proceedings.pdf
├── graph
    └── 10.1.1.56.9735.pdf
├── intel
    ├── 32bit-register-expansion-to-64bit.txt
    ├── Optimizing-248966.pdf
    ├── Volume1-253665.pdf
    ├── Volume2A-253666.pdf
    ├── Volume2B-253667.pdf
    ├── Volume3A-253668.pdf
    ├── Volume3B-253669.pdf
    └── asm64-handout.pdf
├── llvm
    ├── 2003-10-01-LLVA.ps
    ├── [LLVMdev] LLVM IR is a compiler IR.html
    ├── lnq.pdf
    └── x86-llvm-translator-chipounov_2.pdf
├── optimizing
    ├── optimizing_cpp.pdf
    └── test61.c
├── rtl
    ├── rep.txt
    └── test.txt
├── second-write
    ├── Anand.pdf
    ├── readme.txt
    └── secondwrite.sec11.pdf
├── software-maintenance
    └── 10.1.1.89.2073-graphs.pdf
├── source-source-translator
    └── ROSE-Tutorial.pdf
└── type-detection
    ├── Harris05WBIA.ps
    ├── TIE - Principled Reverse Engineering of Types in Binary Programs.pdf
    └── data-type-determination.txt

/Cristina-Cifuentes/dcc.html:
--------------------------------------------------------------------------------
Decompilation of Binary Programs - dcc

School of ITEE, The University of Queensland



The dcc Decompiler


The dcc decompiler was developed by Cristina Cifuentes while a PhD student at the Queensland University of Technology (QUT), Australia, 1991-4, under the supervision of Professor John Gough. Mike Van Emmerik developed the library signature recognition algorithms while employed by QUT. The dcc distribution is made available under the GPL license. The readme file provides information about the distribution. We do not provide support for this decompiler; if you email, you'll get the standard reply. However, we participate in the Boomerang open source project, which aims at creating a retargetable decompiler based on some of the dcc and UQBT ideas, design, and/or implementations.


The dcc Decompiler
Example of Decompilation
PhD Thesis
Related Publications
Future Work - A Retargetable Decompiler
dcc Distribution, available under a GPL license

Notice
Decompilation is a technique that allows you to recover lost source code. It is also needed in some cases for computer security, interoperability and error correction. dcc, and any decompiler in general, should not be used for "cracking" other programs, as programs are protected by copyright. Cracking of programs is not only illegal but it rides on others' creative effort. See the ethics of decompilation for more information.


dcc

The dcc decompiler decompiles .exe files from the (i386, DOS) platform to C programs. The final C program contains assembler code for any subroutines that cannot be decompiled to a higher level than assembler.

The analysis performed by dcc is based on traditional compiler optimization techniques and graph theory. The former is capable of eliminating registers and intermediate instructions to reconstruct high-level statements; the latter is capable of determining the control structures in each subroutine.

Please note that at present, only C source is produced; dcc cannot (as yet) produce C++ source.

The structure of a decompiler resembles that of a compiler: a front-, middle-, and back-end which perform separate tasks. The front-end is a machine-language dependent module that reads in machine code for a particular machine and transforms it into an intermediate, machine-independent representation of the program. The middle-end (aka the Universal Decompiling Machine or UDM) is a machine and language independent module that performs the core of the decompiling analysis: data flow and control flow analysis. Finally, the back-end is high-level language dependent and generates code for the program (C in the case of dcc).

In practice, several programs are used with the decompiler to create the high-level program. These programs aid in the detection of compiler and library signatures, hence augmenting the readability of programs and eliminating compiler start-up and library routines from the decompilation analysis.
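
As a rough illustration of what matching a signature can involve, the sketch below compares the leading bytes of a routine against a stored pattern in which some positions are wildcards, since operands such as frame sizes or addresses differ between programs. This is not dccSign's actual algorithm or file format, which this page does not describe; the names `sig_match` and `SIG_WILD` and the pattern shown in the note are hypothetical.

```c
#include <stddef.h>

/* Hypothetical wildcard marker: matches any byte at that position. */
#define SIG_WILD (-1)

/* Return 1 if the first pat_len bytes of code match pattern, else 0.
 * A pattern entry is either a byte value (0..255) or SIG_WILD. */
static int sig_match(const unsigned char *code, size_t code_len,
                     const int *pattern, size_t pat_len)
{
    size_t i;

    if (code_len < pat_len)
        return 0;                       /* not enough bytes to compare */
    for (i = 0; i < pat_len; i++) {
        if (pattern[i] != SIG_WILD && pattern[i] != (int)code[i])
            return 0;                   /* fixed byte differs */
    }
    return 1;
}
```

For instance, the first six bytes of main in Figure 1 are `55 8B EC 83 EC 04` (PUSH bp; MOV bp, sp; SUB sp, 4). An illustrative pattern `{0x55, 0x8B, 0xEC, 0x83, 0xEC, SIG_WILD}` would match that prologue regardless of the frame size in the last byte.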


Example of Decompilation

We illustrate the decompilation of a Fibonacci program (see Figure 4). Figure 1 illustrates the relevant machine code of this binary. No library or compiler start-up code is included. Figure 2 presents the disassembly of the binary program. All calls to library routines were detected by dccSign (the signature matcher), and thus not included in the analysis. Figure 3 is the final output from dcc. This C program can be compared with the original C program in Figure 4.


         55 8B EC 83 EC 04 56 57 1E B8 94 00 50 9A
   0E 00 3C 17 59 59 16 8D 46 FC 50 1E B8 B1 00 50
   9A 07 00 F0 17 83 C4 08 BE 01 00 EB 3B 1E B8 B4
   00 50 9A 0E 00 3C 17 59 59 16 8D 46 FE 50 1E B8
   C3 00 50 9A 07 00 F0 17 83 C4 08 FF 76 FE 9A 7C
   00 3B 16 59 8B F8 57 FF 76 FE 1E B8 C6 00 50 9A
   0E 00 3C 17 83 C4 08 46 3B 76 FC 7E C0 33 C0 50
   9A 0A 00 49 16 59 5F 5E 8B E5 5D CB 55 8B EC 56
   8B 76 06 83 FE 02 7E 1E 8B C6 48 50 0E E8 EC FF
   59 50 8B C6 05 FE FF 50 0E E8 E0 FF 59 8B D0 58
   03 C2 EB 07 EB 05 B8 01 00 EB 00 5E 5D CB

Figure 1 - Machine Code for Fibonacci.exe


                proc_1  PROC  FAR
000 00053C 55                  PUSH           bp
001 00053D 8BEC                MOV            bp, sp
002 00053F 56                  PUSH           si
003 000540 8B7606              MOV            si, [bp+6]
004 000543 83FE02              CMP            si, 2
005 000546 7E1E                JLE            L1
006 000548 8BC6                MOV            ax, si
007 00054A 48                  DEC            ax
008 00054B 50                  PUSH           ax
009 00054C 0E                  PUSH           cs
010 00054D E8ECFF              CALL  near ptr proc_1
011 000550 59                  POP            cx
012 000551 50                  PUSH           ax
013 000552 8BC6                MOV            ax, si
014 000554 05FEFF              ADD            ax, 0FFFEh
015 000557 50                  PUSH           ax
016 000558 0E                  PUSH           cs
017 000559 E8E0FF              CALL  near ptr proc_1
018 00055C 59                  POP            cx
019 00055D 8BD0                MOV            dx, ax
020 00055F 58                  POP            ax
021 000560 03C2                ADD            ax, dx
023 00056B 5E             L2:  POP            si
024 00056C 5D                  POP            bp
025 00056D CB                  RETF
026 000566 B80100         L1:  MOV            ax, 1
027 000569 EB00                JMP            L2
                proc_1  ENDP

                main  PROC  FAR
000 0004C2 55                  PUSH           bp
001 0004C3 8BEC                MOV            bp, sp
002 0004C5 83EC04              SUB            sp, 4
003 0004C8 56                  PUSH           si
004 0004C9 57                  PUSH           di
005 0004CA 1E                  PUSH           ds
006 0004CB B89400              MOV            ax, 94h
007 0004CE 50                  PUSH           ax
008 0004CF 9A0E004D01          CALL   far ptr printf
009 0004D4 59                  POP            cx
010 0004D5 59                  POP            cx
011 0004D6 16                  PUSH           ss
012 0004D7 8D46FC              LEA            ax, [bp-4]
013 0004DA 50                  PUSH           ax
014 0004DB 1E                  PUSH           ds
015 0004DC B8B100              MOV            ax, 0B1h
016 0004DF 50                  PUSH           ax
017 0004E0 9A07000102          CALL   far ptr scanf
018 0004E5 83C408              ADD            sp, 8
019 0004E8 BE0100              MOV            si, 1
021 000528 3B76FC         L3:  CMP            si, [bp-4]
022 00052B 7EC0                JLE            L4
023 00052D 33C0                XOR            ax, ax
024 00052F 50                  PUSH           ax
025 000530 9A0A005A00          CALL   far ptr exit
026 000535 59                  POP            cx
027 000536 5F                  POP            di
028 000537 5E                  POP            si
029 000538 8BE5                MOV            sp, bp
030 00053A 5D                  POP            bp
031 00053B CB                  RETF
032 0004ED 1E             L4:  PUSH           ds
033 0004EE B8B400              MOV            ax, 0B4h
034 0004F1 50                  PUSH           ax
035 0004F2 9A0E004D01          CALL   far ptr printf
036 0004F7 59                  POP            cx
037 0004F8 59                  POP            cx
038 0004F9 16                  PUSH           ss
039 0004FA 8D46FE              LEA            ax, [bp-2]
040 0004FD 50                  PUSH           ax
041 0004FE 1E                  PUSH           ds
042 0004FF B8C300              MOV            ax, 0C3h
043 000502 50                  PUSH           ax
044 000503 9A07000102          CALL   far ptr scanf
045 000508 83C408              ADD            sp, 8
046 00050B FF76FE              PUSH  word ptr [bp-2]
047 00050E 9A7C004C00          CALL   far ptr proc_1
048 000513 59                  POP            cx
049 000514 8BF8                MOV            di, ax
050 000516 57                  PUSH           di
051 000517 FF76FE              PUSH  word ptr [bp-2]
052 00051A 1E                  PUSH           ds
053 00051B B8C600              MOV            ax, 0C6h
054 00051E 50                  PUSH           ax
055 00051F 9A0E004D01          CALL   far ptr printf
056 000524 83C408              ADD            sp, 8
057 000527 46                  INC            si
058                            JMP            L3         ;Synthetic inst
                main  ENDP

Figure 2 - Code produced by the Disassembler
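
The branch and call targets in the listing can be checked by hand from the instruction bytes. The sketch below is a worked calculation rather than anything taken from dcc: it assumes the standard 8086 rules that relative displacements are measured from the offset of the next instruction and that a far pointer resolves to segment * 16 + offset. The function names are hypothetical, and the six-digit addresses in the listing are treated as plain offsets, which gives the same answer for relative branches as long as everything stays inside one segment.

```c
#include <stdint.h>

/* Near CALL/JMP with a 16-bit displacement (e.g. opcode E8):
 * target = offset of next instruction + displacement, wrapped to 16 bits. */
static uint16_t near_rel16_target(uint16_t insn_off, uint16_t insn_len,
                                  uint16_t disp)
{
    return (uint16_t)(insn_off + insn_len + disp);
}

/* Short jump (e.g. JLE = 7E, JMP short = EB): 2-byte instruction with a
 * signed 8-bit displacement relative to the next instruction. */
static uint16_t short_rel8_target(uint16_t insn_off, uint8_t disp)
{
    int sdisp = (disp < 0x80) ? (int)disp : (int)disp - 0x100; /* sign-extend */
    return (uint16_t)(insn_off + 2 + sdisp);
}

/* Real-mode far pointer: linear address = segment * 16 + offset. */
static uint32_t far_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Applied to the listing: `E8ECFF` at 00054D gives `near_rel16_target(0x054D, 3, 0xFFEC)`, which is 0x053C, the start of proc_1. `7E1E` at 000546 gives `short_rel8_target(0x0546, 0x1E)`, which is 0x0566, the address labelled L1. The far call `9A7C004C00` carries offset 0x007C and segment 0x004C, and `far_linear(0x004C, 0x007C)` is again 0x053C.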


/*
 * Input file   : fibo.exe
 * File type    : EXE
 */

int proc_1 (int arg0)
/* Takes 2 bytes of parameters.
 * High-level language prologue code.
 * C calling convention.
 */
{
int loc1;
int loc2; /* ax */

    loc1 = arg0;
    if (loc1 > 2) {
        loc2 = (proc_1 ((loc1 - 1)) + proc_1 ((loc1 + 0xFFFE)));
    }
    else {
        loc2 = 1;
    }
    return (loc2);
}


void main ()
/* Takes no parameters.
 * High-level language prologue code.
 */
{
int loc1;
int loc2;
int loc3;
int loc4;

    printf ("Input number of iterations: ");
    scanf ("%d", &loc1);
    loc3 = 1;
    while ((loc3 <= loc1)) {
        printf ("Input number: ");
        scanf ("%d", &loc2);
        loc4 = proc_1 (loc2);
        printf ("fibonacci(%d) = %u\n", loc2, loc4);
        loc3 = (loc3 + 1);
    } /* end of while */
    exit (0);
}

Figure 3 - Code produced by dcc in C
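
The expression `loc1 + 0xFFFE` in the decompiled proc_1 comes straight from the instruction `ADD ax, 0FFFEh` in Figure 2, while the original source (Figure 4) has `x - 2`. The two agree because 16-bit addition wraps modulo 65536, so adding 0xFFFE is the same as subtracting 2. The sketch below demonstrates that equivalence with explicit 16-bit types; the helper names are hypothetical and are not part of dcc's output.

```c
#include <stdint.h>

/* 16-bit addition as the 8086 ADD instruction performs it: wraps mod 2^16. */
static uint16_t add16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* 16-bit subtraction, also wrapped to 16 bits. */
static uint16_t sub16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a - b);
}
```

For example, `add16(10, 0xFFFE)` and `sub16(10, 2)` both yield 8. The equivalence depends on the 16-bit width: if the decompiled source were recompiled with a wider `int` (an assumption about the target, not something the page discusses), `loc1 + 0xFFFE` would add 65534 instead of wrapping, so the constant would need to be rewritten as `- 2`.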


#include <stdio.h>

int main()
{ int i, numtimes, number;
  unsigned value, fib();

   printf("Input number of iterations: ");
   scanf ("%d", &numtimes);
   for (i = 1; i <= numtimes; i++)
   {
      printf ("Input number: ");
      scanf ("%d", &number);
      value = fib(number);
      printf("fibonacci(%d) = %u\n", number, value);
   }
   exit(0);
}

unsigned fib(x)                 /* compute fibonacci number recursively */
int x;
{
   if (x > 2)
      return (fib(x - 1) + fib(x - 2));
   else
      return (1);
}

Figure 4 - Initial C Program


PhD Thesis

C Cifuentes. Reverse Compilation Techniques, Queensland University of Technology, Department of Computer Science, PhD thesis. July 1994. (474 Kb compressed postscript file). Also available in compressed dvi format (365 Kb).

ABSTRACT

Techniques for writing reverse compilers or decompilers are presented in this thesis. These techniques are based on compiler and optimization theory, and are applied to decompilation in a unique way; these techniques have never before been published.

A decompiler is composed of several phases which are grouped into modules dependent on language or machine features. The front-end is a machine dependent module that parses the binary program, analyzes the semantics of the instructions in the program, and generates an intermediate low-level representation of the program, as well as a control flow graph of each subroutine. The universal decompiling machine is a language and machine independent module that analyzes the low-level intermediate code and transforms it into a high-level representation available in any high-level language, and analyzes the structure of the control flow graph(s) and transforms them into graphs that make use of high-level control structures. Finally, the back-end is a target language dependent module that generates code for the target language.

Decompilation is a process that involves the use of tools to load the binary program into memory, parse or disassemble such a program, and decompile or analyze the program to generate a high-level language program. This process benefits from compiler and library signatures to recognize particular compilers and library subroutines. Whenever a compiler signature is recognized in the binary program, all compiler start-up and library subroutines are not decompiled; in the former case, the routines are eliminated from the final target program and the entry point to the main program is used for the decompiler analysis, in the latter case the subroutines are replaced by their library name.

The presented techniques were implemented in a prototype decompiler for the Intel i80286 architecture running under the DOS operating system, dcc, which produces target C programs for source .exe or .com files. Sample decompiled programs, comparisons against the initial high-level language program, and an analysis of results are presented in Chapter 9.

Chapter 1 gives an introduction to decompilation from a compiler point of view, Chapter 2 gives an overview of the history of decompilation since its appearance in the early 1960s, Chapter 3 presents the relations between the static binary code of the source binary program and the actions performed at run-time to implement the program, Chapter 4 describes the phases of the front-end module, Chapter 5 defines data optimization techniques to analyze the intermediate code and transform it into a higher-level representation, Chapter 6 defines control structure transformation techniques to analyze the structure of the control flow graph and transform it into a graph of high-level control structures, Chapter 7 describes the back-end module, Chapter 8 presents the decompilation tool programs, Chapter 9 gives an overview of the implementation of dcc and the results obtained, and Chapter 10 gives the conclusions and future work of this research.

The techniques presented in this thesis expand on earlier work described in the literature. Previous work in decompilation did not document the interprocedural register analysis required to determine register arguments and register return values, the analysis required to eliminate stack-related instructions (i.e. push and pop), or the structuring of a generic set of control structures. Innovative work done for this research is described in Chapters 5, 6, and 8. Chapter 5, Sections 5.2 and 5.4 illustrate and describe nine different types of optimizations that transform the low-level intermediate code into a high-level representation. These optimizations take into account condition codes, subroutine calls (i.e. interprocedural analysis) and register spilling, eliminating all low-level features of the intermediate instructions (such as condition codes and registers) and introducing the high-level concept of expressions into the intermediate representation. Chapter 6, Sections 6.2 and 6.6 illustrate and describe algorithms to structure different types of loops and conditionals, including multi-way branch conditionals (e.g. case statements). Previous work in this area has concentrated on the structuring of loops; few papers attempt to structure 2-way conditional branches, and no work on multi-way conditional branches is described in the literature. This thesis presents a complete method for structuring all types of structures based on a predetermined, generic set of high-level control structures. A criterion for determining the generic set of control structures is given in Chapter 6, Section 6.4. Chapter 8 describes all tools used to decompile programs; the most important tool is the signature generator (Section 8.2), which is used to determine compiler and library signatures in architectures that have an operating system that does not share libraries, such as the DOS operating system.


Future Work - A Retargetable Decompiler

A retargetable decompiler engine can be built based on ideas and code from the UQBT project, by reusing the frontend of that framework and writing a new backend that supports the RTL and HRTL intermediate representation of the UQBT system. Please refer to the open source project Boomerang.


dcc Distribution

The dcc source code distribution is made available under the GNU General Public License (GPL).

The dcc distribution is available in gzip tar format for Unix users, dcc.tar.gz and dcc_oo.tar.gz, and in its individual .zip files for PC users, dcc files pages. Read the readme file for a description of what is included in the distribution and installation instructions. If you do not have the tar and/or pkunzip programs, contact your system's administrator.

There is now a second version of the decompiler; mainly to distinguish it from the first, we'll call it the "OO" version (it has the beginnings of Object Orientation, but there is still much to be done). This version has a bug fixed which caused the output to be wrong some of the time (randomly; successive runs would result in different output). It is also converted to C++ (the source for dcc; dcc does not produce C++ source), so those users wishing to use a C compiler without C++ facilities will have to stick to the original version. The file dccsrcoo.zip has the source for the later version, and dcc_oo.tar.gz has the whole distribution, with dccsrcoo.zip instead of dccsrc.zip and dcc32.zip instead of dcc.zip. This version has a better chance of working on PC compilers such as Microsoft Visual C++ and Borland C++. There is no longer any use of the curses library; it was found to be too much of a distribution hassle.

The OO version of dcc is the most recent, and has bug fixes that the original does not. For most purposes, the OO version is the one to start working with.

Support
Please note that the authors are not currently working on this project and therefore cannot support any changes required on dcc. Source code is provided "as is". Read the documentation first.

Likewise, please don't email the authors with requests for modifications to dcc, or specific questions about its inner workings. If you do, you will just get a reply with this form letter.

Note
Dcc has a fundamental implementation flaw that limits it to about 30KB of input binary program, i.e. it currently handles toy programs only! The problem is that pointers are kept in many places; many of these pointers point to elements of arrays. The arrays are all of variable size; the realloc library function can and will change the virtual addresses of these arrays, thus invalidating the pointers. Because of this, results are unpredictable as soon as one array is resized. (However, a segmentation fault is likely when this happens). The arrays are sized such that they don't get reallocated for input binaries less than about 30KB.

Before any serious work can be done with dcc, this implementation flaw has to be corrected. As noted above, the authors do not have the time to correct this error, or to offer any suggestions as to how to do this.
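
The hazard described in this note is a general property of `realloc` and can be shown in a few lines. The sketch below is not dcc's code and is not offered as the fix the authors had in mind; `IntVec`, `vec_push` and `vec_get` are hypothetical names for a growable array. It illustrates one common way to avoid the problem: remember an element by its index and re-derive its address on every use, instead of holding a pointer across a call that may grow the array.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical growable array, standing in for a variable-size table. */
typedef struct {
    int    *data;
    size_t  len;
    size_t  cap;
} IntVec;

/* Append a value, growing the buffer with realloc when it is full.
 * Returns the index of the new element, or (size_t)-1 if allocation fails.
 * After a successful grow, any pointer previously taken into v->data may
 * be dangling, because realloc is allowed to move the block. */
static size_t vec_push(IntVec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return (size_t)-1;          /* old buffer is still valid */
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len] = value;
    return v->len++;
}

/* Index-based access: the address is recomputed from the current buffer,
 * so it stays valid however many times the array has been resized. */
static int vec_get(const IntVec *v, size_t idx)
{
    return v->data[idx];
}
```

With this layout, code that would have kept `int *p = &v.data[0]` across later `vec_push` calls keeps `size_t i = 0` instead and reads the element through `vec_get(&v, i)`.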


Last updated: 4th May 2002


This page: http://www.itee.uq.edu.au/~cristina/dcc.html

--------------------------------------------------------------------------------
/Cristina-Cifuentes/decompilation_thesis.dvi.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/Cristina-Cifuentes/decompilation_thesis.dvi.gz

--------------------------------------------------------------------------------
/Cristina-Cifuentes/decompilation_thesis.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/Cristina-Cifuentes/decompilation_thesis.pdf

--------------------------------------------------------------------------------
/Cristina-Cifuentes/decompilation_thesis.ps.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/Cristina-Cifuentes/decompilation_thesis.ps.gz

--------------------------------------------------------------------------------
/arm/QRC0001_UAL-quick-reference.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/arm/QRC0001_UAL-quick-reference.pdf

--------------------------------------------------------------------------------
/calling-conventions/calling_conventions.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/calling-conventions/calling_conventions.pdf

--------------------------------------------------------------------------------
/calling-conventions/gcc/x86:
--------------------------------------------------------------------------------
32bit
fastcall with gcc: ecx, edx

64bit
fastcall with gcc: rdi, rsi, rdx, rcx, r8, r9, xmm0-7

--------------------------------------------------------------------------------
/cis570/lecture01.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture01.pdf

--------------------------------------------------------------------------------
/cis570/lecture02.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture02.pdf

--------------------------------------------------------------------------------
/cis570/lecture03.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture03.pdf

--------------------------------------------------------------------------------
/cis570/lecture04.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture04.pdf

--------------------------------------------------------------------------------
/cis570/lecture05.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture05.pdf

--------------------------------------------------------------------------------
/cis570/lecture06.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture06.pdf

--------------------------------------------------------------------------------
/cis570/lecture07.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture07.pdf

--------------------------------------------------------------------------------
/cis570/lecture08.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture08.pdf

--------------------------------------------------------------------------------
/cis570/lecture09.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture09.pdf

--------------------------------------------------------------------------------
/cis570/lecture10.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture10.pdf

--------------------------------------------------------------------------------
/cis570/lecture11.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture11.pdf

--------------------------------------------------------------------------------
/cis570/lecture12.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture12.pdf

--------------------------------------------------------------------------------
/cis570/lecture13.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture13.pdf

--------------------------------------------------------------------------------
/cis570/lecture14.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture14.pdf

--------------------------------------------------------------------------------
/cis570/lecture15.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture15.pdf

--------------------------------------------------------------------------------
/cis570/lecture16.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture16.pdf

--------------------------------------------------------------------------------
/cis570/lecture17.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture17.pdf

--------------------------------------------------------------------------------
/cis570/lecture18.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture18.pdf

--------------------------------------------------------------------------------
/cis570/lecture19.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/cis570/lecture19.pdf

--------------------------------------------------------------------------------
/control-flow/CS640-2-02.ppt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/control-flow/CS640-2-02.ppt

--------------------------------------------------------------------------------
/control-flow/CSE-2009-27.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/control-flow/CSE-2009-27.pdf

--------------------------------------------------------------------------------
/control-flow/Oakland2013-CCFIR-CR.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/control-flow/Oakland2013-CCFIR-CR.pdf

--------------------------------------------------------------------------------
/data-flow/data-flow.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/data-flow/data-flow.pdf

--------------------------------------------------------------------------------
/decompilers/ACMMA-27.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/decompilers/ACMMA-27.pdf

--------------------------------------------------------------------------------
/decompilers/moll_bsc.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/decompilers/moll_bsc.pdf

--------------------------------------------------------------------------------
/decompilers/paper.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/decompilers/paper.pdf
-------------------------------------------------------------------------------- /decompilers/revgen.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/decompilers/revgen.pdf -------------------------------------------------------------------------------- /file-formats/ELF_Format.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/file-formats/ELF_Format.pdf -------------------------------------------------------------------------------- /file-formats/dotguide.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/file-formats/dotguide.pdf -------------------------------------------------------------------------------- /gccsummit/gccsummit-2003-proceedings.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/gccsummit/gccsummit-2003-proceedings.pdf -------------------------------------------------------------------------------- /graph/10.1.1.56.9735.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/graph/10.1.1.56.9735.pdf -------------------------------------------------------------------------------- /intel/32bit-register-expansion-to-64bit.txt: -------------------------------------------------------------------------------- 1 | In Intel X86_64 bit mode: 2 | 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register. 
3 | So, MOV EAX, EBX causes the upper 32 bits of RAX (the destination register) to be zeroed. 4 | 5 | For ADDL, the result is always 32 bits, with the upper 32 bits of the 64-bit destination register all zero. 6 | ADDL: 0xffffffff + 0x22 = 0x21 32bit 7 | ADDQ: 0xffffffff + 0x22 = 0x100000021 64bit 8 | 9 | -------------------------------------------------------------------------------- /intel/Optimizing-248966.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Optimizing-248966.pdf -------------------------------------------------------------------------------- /intel/Volume1-253665.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Volume1-253665.pdf -------------------------------------------------------------------------------- /intel/Volume2A-253666.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Volume2A-253666.pdf -------------------------------------------------------------------------------- /intel/Volume2B-253667.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Volume2B-253667.pdf -------------------------------------------------------------------------------- /intel/Volume3A-253668.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Volume3A-253668.pdf -------------------------------------------------------------------------------- /intel/Volume3B-253669.pdf: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/Volume3B-253669.pdf -------------------------------------------------------------------------------- /intel/asm64-handout.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/intel/asm64-handout.pdf -------------------------------------------------------------------------------- /llvm/[LLVMdev] LLVM IR is a compiler IR.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | [LLVMdev] LLVM IR is a compiler IR 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |

[LLVMdev] LLVM IR is a compiler IR


30 |

31 | 32 |

In this email, I argue that LLVM IR is a poor system for building a
 33 | Platform, by which I mean any system where LLVM IR would be a
 34 | format in which programs are stored or transmitted for subsequent
 35 | use on multiple underlying architectures.
 36 | 
 37 | LLVM IR initially seems like it would work well here. I myself was
 38 | once attracted to this idea. I was even motivated to put a bunch of
 39 | my own personal time into making some of LLVM's optimization passes
 40 | more robust in the absence of TargetData a while ago, even with no
 41 | specific project in mind. There are several things still missing,
 42 | but one could easily imagine that this is just a matter of people
 43 | writing some more code.
 44 | 
 45 | However, there are several ways in which LLVM IR differs from actual
 46 | platforms, both high-level VMs like Java or .NET and actual low-level
 47 | ISAs like x86 or ARM.
 48 | 
 49 | First, the boundaries of what capabilities LLVM provides are nebulous.
 50 | LLVM IR contains:
 51 | 
 52 |  * Explicitly Target-specific features. These aren't secret;
 53 |    x86_fp80's reason for being is pretty clear.
 54 | 
 55 |  * Target-specific ABI code. In order to interoperate with native
 56 |    C ABIs, LLVM requires front-ends to emit target-specific IR.
 57 |    Pretty much everyone around here has run into this.
 58 | 
 59 |  * Implicitly Target-specific features. The most obvious examples of
 60 |    these are all the different Linkage kinds. These are all basically
 61 |    just gateways to features in real linkers, and real linkers vary
 62 |    quite a lot. LLVM has its own IR-level Linker, but it doesn't
 63 |    do all the stuff that native linkers do.
 64 | 
 65 |  * Target-specific limitations in seemingly portable features.
 66 |    How big can the alignment be on an alloca? Or a GlobalVariable?
 67 |    What's the widest supported integer type? LLVM's various backends
 68 |    all have different answers to questions like these.
 69 | 
 70 | Even ignoring the fact that the quality of the backends in the
 71 | LLVM source tree varies widely, the question of "What can LLVM IR do?"
 72 | has numerous backend-specific facets. This can be problematic for
 73 | producers as well as consumers.
 74 | 
 75 | Second, and more fundamentally, LLVM IR is a fundamentally
 76 | vague language. It has:
 77 | 
 78 |  * Undefined Behavior. LLVM is, at its heart, a C compiler, and
 79 |    Undefined Behavior is one of its cornerstones.
 80 | 
 81 |    High-level VMs typically raise predictable exceptions when they
 82 |    encounter program errors. Physical machines typically document
 83 |    their behavior very extensively. LLVM is fundamentally different
 84 |    from both: it presents a bunch of rules to follow and then offers
 85 |    no description of what happens if you break them.
 86 | 
 87 |    LLVM's optimizers are built on the assumption that the rules
 88 |    are never broken, so when rules do get broken, the code just
 89 |    goes off the rails and runs into whatever happens to be in
 90 |    the way. Sometimes it crashes loudly. Sometimes it silently
 91 |    corrupts data and keeps running.
 92 | 
 93 |    There are some tools that can help locate violations of the
 94 |    rules. Valgrind is a very useful tool. But they can't find
 95 |    everything. There are even some kinds of undefined behavior that
 96 |    I've never heard anyone even propose a method of detection for.
 97 | 
 98 |  * Intentional vagueness. There is a strong preference for defining
 99 |    LLVM IR semantics intuitively rather than formally. This is quite
100 |    practical; formalizing a language is a lot of work, it reduces
101 |    future flexibility, and it tends to draw attention to troublesome
102 |    edge cases which could otherwise be largely ignored.
103 | 
104 |    I've done work to try to formalize parts of LLVM IR, and the
105 |    results have been largely fruitless. I got bogged down in
106 |    edge cases that no one is interested in fixing.
107 | 
108 |  * Floating-point arithmetic is not always consistent. Some backends
109 |    don't fully implement IEEE-754 arithmetic rules even without
110 |    -ffast-math and friends, to get better performance.
111 | 
112 | If you're familiar with "write once, debug everywhere" in Java,
113 | consider the situation in LLVM IR, which is fundamentally opposed
114 | to even trying to provide that level of consistency. And if you allow
115 | the optimizer to do subtarget-specific optimizations, you increase
116 | the chances that some bit of undefined behavior or vagueness will be
117 | exposed.
118 | 
119 | Third, LLVM is a low level system that doesn't represent high-level
120 | abstractions natively. It forces them to be chopped up into lots of
121 | small low-level instructions.
122 | 
123 |  * It makes LLVM's Interpreter really slow. The amount of work
124 |    performed by each instruction is relatively small, so the interpreter
125 |    has to execute a relatively large number of instructions to do simple
126 |    tasks, such as virtual method calls. Languages built for interpretation
127 |    do more with fewer instructions, and have lower per-instruction
128 |    overhead.
129 | 
130 |  * Similarly, it makes really-fast JITing hard. LLVM is fast compared
131 |    to some other static C compilers, but it's not fast compared to
132 |    real JIT compilers. Compiling one LLVM IR level instruction at a
133 |    time can be relatively simple, ignoring the weird stuff, but this
134 |    approach generates comically bad code. Fixing this requires
135 |    recognizing patterns in groups of instructions, and then emitting
136 |    code for the patterns. This works, but it's more involved.
137 | 
138 |  * Lowering high-level language features into low-level code locks
139 |    in implementation details. This is less severe in native code,
140 |    because a compiled blob is limited to a single hardware platform
141 |    as well. But a platform which advertizes architecture independence
142 |    which still has all the ABI lock-in of HLL implementation details
143 |    presents a much more frightening backwards compatibility specter.
144 | 
145 |  * Apple has some LLVM IR transformations for Objective-C, however
146 |    the transformations have to reverse-engineer the high-level semantics
147 |    out of the lowered code, which is awkward. Further, they're
148 |    reasoning about high-level semantics in a way that isn't guaranteed
149 |    to be safe by LLVM IR rules alone. It works for the kinds of code
150 |    clang generates for Objective C, but it wouldn't necessarily be
151 |    correct if run on code produced by other front-ends. LLVM IR
152 |    isn't capable of representing the necessary semantics for this
153 |    unless we start embedding Objective C into it.
154 | 
155 | 
156 | In conclusion, consider the task of writing an independent implementation
157 | of an LLVM IR Platform. The set of capabilities it provides depends on who
158 | you talk to. Semantic details are left to chance. There are features
159 | which require a bunch of complicated infrastructure to implement which
160 | are rarely used. And if you want light-weight execution, you'll
161 | probably need to translate it into something else better suited for it
162 | first. This all doesn't sound very appealing.
163 | 
164 | LLVM isn't actually a virtual machine. It's widely acknowledged that the
165 | name "LLVM" is a historical artifact which doesn't reliably connote what
166 | LLVM actually grew to be. LLVM IR is a compiler IR.
167 | 
168 | Dan
169 | 
170 |

214 | 215 | -------------------------------------------------------------------------------- /llvm/lnq.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/llvm/lnq.pdf -------------------------------------------------------------------------------- /llvm/x86-llvm-translator-chipounov_2.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/llvm/x86-llvm-translator-chipounov_2.pdf -------------------------------------------------------------------------------- /optimizing/optimizing_cpp.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/optimizing/optimizing_cpp.pdf -------------------------------------------------------------------------------- /optimizing/test61.c: -------------------------------------------------------------------------------- 1 | /* Test a condition that results in no branches when compiled with -O2 */ 2 | /* 0000000000000000 : 3 | 0: 83 ff 01 cmp $0x1,%edi 4 | 3: 19 c0 sbb %eax,%eax 5 | 5: 83 e0 df and $0xffffffdf,%eax 6 | 8: 83 c0 61 add $0x61,%eax 7 | b: c3 retq 8 | */ 9 | /* Suggest re-write it to: 10 | 0000000000000000 : 11 | 0: cmp $0x1,%edi Node A 12 | 1: jb 4: Node A 13 | 2: mov 0x61,%eax Node B 14 | 3: jmp 5: Node B 15 | 4: mov 0x40,%eax Node C 16 | 5: retq Node D (Due to join point) 17 | */ 18 | 19 | int test61 ( unsigned value ); 20 | 21 | int test61 ( unsigned value ) { 22 | int ret; 23 | if (value < 1) 24 | ret = 0x40; 25 | else 26 | ret = 0x61; 27 | return ret; 28 | } 29 | 30 | -------------------------------------------------------------------------------- /rtl/rep.txt: -------------------------------------------------------------------------------- 1 | 
The REP prefix. 2 | 3 | The variables repz and repnz will each be either 1 or 0. 4 | Any decoded instruction can use those variables to decide how to craft the RTL. 5 | 6 | E.g. in amd64 att format: 7 | rep movs rsi, rdi; 8 | The RTL for this will be: 9 | begin: 10 | cmp rcx, 0; 11 | jz next_instruction; 12 | mov rsi, rdi; 13 | add 4, rsi; 14 | add 4, rdi; 15 | sub 1, rcx; 16 | jmp begin; 17 | next_instruction: 18 | 19 | repz cmps rsi, rdi; 20 | The RTL for this will be: 21 | begin: 22 | cmp rcx, 0; 23 | jz next_instruction; 24 | cmp rsi, rdi; 25 | add 4, rsi; 26 | add 4, rdi; 27 | sub 1, rcx; 28 | jnz next_instruction; 29 | jmp begin; 30 | next_instruction: 31 | Here the add and sub steps are assumed not to modify the flags, so jnz tests the result of the cmp. For repnz, use jz instead of jnz. 32 | 33 | The above code arrangement has the advantage that all jump instructions target amd64 instruction boundaries and none target specific rtl instructions. I.e. No jumps to the middle of a set of rtl instructions decoded from a single amd64 instruction. 34 | 35 | 36 | 37 | -------------------------------------------------------------------------------- /rtl/test.txt: -------------------------------------------------------------------------------- 1 | Pseudo code for the TEST instruction. 2 | 3 | TEMP = SRC1 AND SRC2; 4 | SF = MSB(TEMP); 5 | IF TEMP = 0 6 | ZF = 1; 7 | ELSE 8 | ZF = 0; 9 | FI; 10 | PF = BitwiseXNOR(TEMP[0:7]); 11 | CF = 0; 12 | OF = 0; 13 | (* AF is undefined *) 14 | -------------------------------------------------------------------------------- /second-write/Anand.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/second-write/Anand.pdf -------------------------------------------------------------------------------- /second-write/readme.txt: -------------------------------------------------------------------------------- 1 | There does not appear to be any open source code available. 
2 | -------------------------------------------------------------------------------- /second-write/secondwrite.sec11.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/second-write/secondwrite.sec11.pdf -------------------------------------------------------------------------------- /software-maintenance/10.1.1.89.2073-graphs.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/software-maintenance/10.1.1.89.2073-graphs.pdf -------------------------------------------------------------------------------- /source-source-translator/ROSE-Tutorial.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/source-source-translator/ROSE-Tutorial.pdf -------------------------------------------------------------------------------- /type-detection/TIE - Principled Reverse Engineering of Types in Binary Programs.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jcdutton/reference/3c950860cc90a22e24a8669a0912fe65e1403395/type-detection/TIE - Principled Reverse Engineering of Types in Binary Programs.pdf -------------------------------------------------------------------------------- /type-detection/data-type-determination.txt: -------------------------------------------------------------------------------- 1 | Copyright (C) 2004-2017 James Courtier-Dutton 2 | 3 | IDEAS regarding data type determination. 
4 | 5 | P1,P2 are pointers 6 | I1,I2 are integers 7 | 8 | OP1 OP2 9 | P1 + P2 = Invalid 10 | P1 - P2 = Integer 11 | P1 * P2 = Invalid 12 | P1 / P2 = Invalid 13 | 14 | P1 + I1 = Pointer 15 | P1 - I1 = Pointer 16 | P1 * I1 = Invalid 17 | P1 / I1 = Invalid 18 | 19 | I1 + I2 = Integer 20 | I1 - I2 = Integer 21 | I1 * I2 = Integer 22 | I1 / I2 = Integer 23 | 24 | I1 + P1 = Pointer 25 | I1 - P1 = Invalid 26 | I1 * P1 = Invalid 27 | I1 / P1 = Invalid 28 | 29 | + can be P1 + I1, I1 + P1, I1 + I2 30 | - can be P1 - P2, P1 - I1, I1 - I2 31 | * can be I1 * I2 32 | / can be I1 / I2 33 | 34 | 1) A variable is a pointer if it is used to load/store another variable in memory. 35 | 2) Signed/Unsigned integer can be determined from branch statements and other hints. 36 | 3) Integer or String can be determined depending on how it is used. 37 | E.g. If it is used as the format variable in a printf() it is a string. 38 | If it is used in the variable args of a printf() statement and the format string identifies it as %s, it is a string. 39 | 40 | 41 | IDEAS and work in progress: 42 | Create a graph for each value or label. 43 | At the top of the graph have the instruction that directly determines the type of a value. 44 | E.g. A STORE or LOAD determines that it is a pointer and the size of the value being loaded or stored. 45 | The nodes of the graph are a combined key of value and instruction. 46 | 47 | Each value has a list of instructions that use that value. 48 | One of the instructions caused the creation of the SSA value. 49 | The rest of the instructions used the value. 50 | 51 | The problem is how to link these value-instruction nodes? 52 | Investigate a rule that type inference inherits only from above. 53 | Can the nodes be arranged so that this works? 54 | 55 | Rules: 56 | 1) If one valueA-instruction is a LOAD/STORE so the type is clear, all other valueA-instructions have the same type, so you can raise them to the top of the graph. 
57 | 2) A sanity check needs to be made so that all valueA-instructions have the same type. 58 | 3) Links between nodes are based on instructions. So, an instruction with 3 params will have 1 node at the top, and 2 nodes below it, with links. 59 | But then the next nodes below those will be the previous nodes above them. So, do we then have triangle graph links? 60 | 4) If one of the nodes below is a top node, a horizontal or upwards link is used. 61 | 5) Also need another link type for different instructions of the same value. 62 | 6) Need to work out if the order of instructions affects the type. 63 | 7) Is this a problem that can be solved with linear programming? 64 | Each value has two properties: 65 | 1) Pointer or Int 66 | 2) size 67 | Note: (1) Pointer or Int is wrong. We instead need the type of pointer. 68 | 69 | 70 | Maybe use a scatter diagram. 71 | Each node is a value-instruction. 72 | Each link between the nodes infers inheritance in the link direction. 73 | Sometimes, an inheritance is inferred only from a combination of more than one node. 74 | So, a new node type is needed, a decision node, which combines more than one link into a single link. 75 | Also, a node that is just the "value" with the links determining the type of the value. 76 | 77 | IDEA 3: Type Inference Propagation 78 | Alternatively, create a long list of "if (...) then ..." statements, and then try to solve them. 79 | Need a way to resolve circular statements. 80 | 81 | Maybe, if the type is clear, the inheritance should be clear, and thus the value can be removed from the model. 82 | The "if (...) then ..." statements can be added to and removed from the context depending on whether the types of the params in the (...) are clear or not. 83 | First pass, selects the ones in context, sorts them to move them to the top of the list. Some of them might have (...) being 1, i.e. always true. 84 | Next, the ones selected are executed. 85 | Based on the result, see if any others can be added to the context. 
86 | Next, these new ones are executed. 87 | Repeat until no new ones can be added to the context. 88 | 89 | If, when executing the rules, a type is changed, e.g. 32bit goes to 64bit, then we need to modify the instruction stream with bitcasts etc. and re-run the analysis. 90 | If a bitcast is done, retain the bitcast relationship so if a bitcast is needed later on, the same one can be utilised. 91 | If a bitcast is added, the types on all the affected instructions could be reset to unknown, and then the rules can be run again on those instructions 92 | in order to gain type information. This area is difficult because we don't know exactly how to unwind the previous induction steps. 93 | So, we don't know which instructions have been used to make the type inductions, so we will not know if the changed instructions previously contributed to 94 | the type induction. 95 | Adding a bitcast instruction might modify the SSA labeling. A label would be split in 2. The label up until the new bitcast would be the same. 96 | The label after the bitcast would be new. 97 | The reason to add the bitcast instruction is that the immediately following instruction needs it. So, with this pairing in mind, we can limit the number 98 | of instructions that might have contributed to the induction. 99 | 100 | 101 | The final check is to then run all the rules, and verify that they are all valid. 102 | Each rule has several expressions. 103 | One expression to decide to include the rule or not in the context. (e.g. If value1 and value2 defined, and value3 undefined.) 104 | One expression to validate the input data. (e.g. if value1 == ptr and value2 == int) 105 | One expression to update the type data, or use to check the type data. (e.g. value3 == ptr.) 106 | Maybe add negative rules, i.e. for a MUL expression, if either of the values is a ptr, abort, or do ptr_to_int casts. 107 | 108 | Implementation: 109 | 1) Add structure to the label_s struct. 
110 | Add to the label_s structure a list/array containing all the instructions that reference this label. 111 | The list would be an array of structs. This can probably be filled in 112 | by adding functionality to the "register_label()" function. 113 | For each instruction in the list, we would have: 114 | struct tip_s { 115 | int valid; /* Is this entry valid? More useful for when we need to 116 | delete individual entries */ 117 | int inst_number; /* Number of the inst_log entry */ 118 | int operand; /* Which operand of the instruction? 1 = srcA/value1, 2 = 119 | srcB/value2, 3 = dstA/value3 */ 120 | int lab_pointer_first; /* Is this a pointer */ 121 | int lab_pointer_inferred; 122 | int size_bits_first; 123 | int size_bits_inferred; 124 | int lab_pointer_size_first; 125 | int lab_pointer_size_inferred; 126 | }; 127 | 128 | So, for each label we now have a list of each instruction it appears in. 129 | The contents of the structure are then filled with what type 130 | information can be gained from this instruction for this label. 131 | 132 | Each variable in the structure has one of these values: 133 | 0 = not known, 134 | 1 or more = valid value 135 | 136 | The choice between "first" and "inferred" depends on how the type was discovered. 137 | "first" is used when we are 100% sure about the type. 138 | "inferred" is used when the type has been inferred based on the type 139 | of some other label. 140 | 141 | Once that array is set up for each label, the next stage is to fill 142 | the entries in. 143 | One would do a first pass of all the instructions, and fill in the 144 | "first" values where possible. 145 | One would then start doing passes to fill in the inferred ones. 146 | 147 | 2) 148 | Then build a dependency tree. 149 | So, instead of one label being linked to each of the instructions it is used in, 150 | create a dependency tree of just the dependency between the labels. 151 | I.e. label A's type depends in some way on label B. 152 | Each dependency link will have a rule. 
153 | Rules can be: 154 | label A's type IS_THE_SAME_AS label B's type. IS_THE_SAME_AS could be bi-directional, but we need to get to a DAG, so make it uni-directional. 155 | But the direction to choose is difficult. The node furthest from the "first values" node depends on the closer one. 156 | i.e. the deepest one depends on the closer one. 157 | If they are equal depth, then which direction to choose? 158 | label A is type X IF label B is type X and label C is type X. This is tri-directional because it is the same as: 159 | label B is type X IF label A is type X and label C is type X. 160 | label C is type X IF label A is type X and label B is type X. 161 | We need some way to determine which dependencies to suppress in order to get to a DAG. 162 | Use the rule that the deepest one depends on the closer one. 163 | If they are equal depth, then which one of the 3 to choose? 164 | 165 | We can have a first set of rules to determine if a label is a pointer or not. 166 | The second set would deal with the bit width of each label. 167 | 168 | 3) 169 | There exists a problem in that there might be conflicts in the types, with 170 | e.g. one dependency telling us it is 32bits and another telling us it is 64bits. 171 | These are resolved by adding bitcast, truncate instructions etc. 172 | The problem is when an instruction is added, the labels change, and thus the dependencies change. 173 | 174 | 4a) 175 | There exists a problem in that there might be circular dependencies. 176 | We need some way to ensure we end up with a DAG. i.e. Acyclic. 177 | Can we tell if it is acyclic even before we have inferred the types? 178 | 179 | You can check for cycles in a connected component of a graph as follows. 180 | Find a node which has only outgoing edges. If there is no such node, then there is a cycle. 181 | Start a DFS at that node. When traversing each edge, check whether the edge points back to a node already on your stack. Such an edge indicates the existence of a cycle. 
If you find no such edge, there are no cycles in that connected component.
Repeat for each node which has only outgoing edges.

4b)
If cycles are found, what can be done to break them?
a) If the dependency from both directions gives us the same type, then we can temporarily block one of the dependencies.
b) If the type comes out different depending on which dependency we use,
   then new bitcast or truncate instructions need to be added, creating a new label, and the dependency graph adjusted to include them.


NEW INFO:
We need to store types:
    Type. e.g. Int
    Pointer to Type
    Pointer to Pointer
    Pointer to Pointer to Int etc.

Simplistically this can be, for now, taken as:
    Int
    Pointer to Int
    Pointer to Pointer.


Then score direct, then score indirect (the Int bit of "Pointer to Int").

struct tip_s {
    int valid; /* Is this entry valid? Mostly of use when we need to delete individual entries */
    int node; /* The node that the inst or phi is contained in */
    int inst_number; /* Number of the inst_log entry */
    int phi_number; /* Number of the phi */
    int operand; /* Which operand of the instruction? 1 = srcA/value1, 2 = srcB/value2, 3 = dstA/value3 */
    int lab_pointer_first; /* Is this a pointer? Determined from the LOAD or STORE command */
    int lab_pointed_to_size; /* The size of the pointed-to value */
    int lab_pointer_inferred; /* This has been inferred from another label. */
    struct tip_s *tip; /* Pointer to the pointed-to type. */
    struct tip_s *tip_up; /* Pointer to the tip that pointed to this. */
    int lab_size_first; /* Bit width of the label */
    int lab_size_inferred;
    int lab_integer_first;
    int lab_unsigned_integer_first;
    int lab_signed_integer_first;
};

Type inference can get complicated.
E.g. see the following code:

ADD A + B = C
STORE 2 in [C]

We therefore know C is a Pointer.
From the rules:
P1 + P2 = Invalid
P1 + I1 = Pointer
This means, either:
A = Pointer
B = Integer
or
A = Integer
B = Pointer

How to represent this in the types?

What identifies a pointer?
A STORE or LOAD determines that it is a pointer, and the size of the value being stored or loaded.
What identifies an integer?
Signed/Unsigned integer can be determined from branch statements and other hints.
What are the other hints?
I am not sure a branch can determine that. One could compare two pointers, to see if we had reached a malloc limit.
A signed compare would probably force the type to be a signed int, and not a pointer.

Integer or String can be determined depending on how it is used.
E.g. If it is used as the format variable in a printf() it is a string.
If it is used in the variable args of a printf() statement and the format string identifies it as %s, it is a string.

What about code that has:
value1 + value2 + value3

E.g. Instructions could look like:
result = value1
result = result + value2
result = result + value3
return result

Is result an integer or a pointer?
Assume value1 is a pointer, what happens?
    value2 and value3 must be integers.
Assume value1 is an integer, what happens?
    If value2 is a pointer, value3 must be an int.
    If value3 is a pointer, value2 must be an int.
    If value2 is an int, value3 may also be an int.
Note: This is starting to become difficult!!!

What can tell us that something is an int and not a pointer?
1) A signed calculation or compare means it is an integer.
2) The integer part of an addition, with the other part known to be a pointer.
3)
P1 - P2 = Integer
I1 + I2 = Integer
I1 - I2 = Integer
I1 * I2 = Integer
I1 / I2 = Integer


--------------------------------------------------------------------------------
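The pointer/integer arithmetic rules listed in these notes (P1 + P2 = Invalid, P1 + I1 = Pointer, P1 - P2 = Integer, I1 op I2 = Integer) could be sketched in C roughly as follows. This is only an illustration: the names `enum kind`, `enum op`, `result_kind()`, `other_add_operand()` and `example_add_store()` are hypothetical and do not come from the notes, and any operand combination the notes do not list is reported as "unknown" rather than guessed.

```c
#include <assert.h>

/* Hypothetical names, not taken from the notes. K_UNKNOWN is 0, matching
 * the "0 = not known" convention used for the tip_s fields. */
enum kind { K_UNKNOWN = 0, K_INTEGER, K_POINTER, K_INVALID };
enum op { OP_ADD, OP_SUB, OP_MUL, OP_DIV };

/* Forward rule: kind of the result, given the kinds of both operands.
 * Only the combinations listed in the notes are decided. */
enum kind result_kind(enum op op, enum kind a, enum kind b)
{
    if (a == K_INVALID || b == K_INVALID)
        return K_INVALID;
    if (a == K_UNKNOWN || b == K_UNKNOWN)
        return K_UNKNOWN;
    if (a == K_INTEGER && b == K_INTEGER)
        return K_INTEGER;               /* I1 + - * / I2 = Integer */
    if (a == K_POINTER && b == K_POINTER) {
        if (op == OP_ADD)
            return K_INVALID;           /* P1 + P2 = Invalid */
        if (op == OP_SUB)
            return K_INTEGER;           /* P1 - P2 = Integer */
        return K_UNKNOWN;               /* not listed in the notes */
    }
    /* Exactly one operand is a pointer. */
    if (op == OP_ADD)
        return K_POINTER;               /* P1 + I1 = Pointer, either order */
    return K_UNKNOWN;                   /* not listed in the notes */
}

/* Backward rule for ADD: given the kind of the result and of one operand,
 * work out what the other operand must be. */
enum kind other_add_operand(enum kind result, enum kind known)
{
    if (result == K_POINTER && known == K_POINTER)
        return K_INTEGER;
    if (result == K_POINTER && known == K_INTEGER)
        return K_POINTER;
    if (result == K_INTEGER && known == K_INTEGER)
        return K_INTEGER;
    if (result == K_INTEGER && known == K_POINTER)
        return K_INVALID;               /* no listed ADD rule gives this */
    return K_UNKNOWN;
}

/* The "ADD A + B = C; STORE 2 in [C]" example from the notes. */
void example_add_store(void)
{
    enum kind c = K_POINTER;    /* the STORE through [C] makes C a pointer */
    enum kind a = K_INTEGER;    /* assume some other hint fixed A */
    enum kind b = other_add_operand(c, a);

    assert(b == K_POINTER);
    assert(result_kind(OP_ADD, a, b) == c);
}
```

Without an independent hint for A or B, `other_add_operand()` returns K_UNKNOWN, which is the ambiguity the notes point out: the STORE alone only tells us that one of the two operands is the pointer.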