├── README.md
├── lua50-lopcodes.h
├── lua51-lopcodes.h
├── lua52-lopcodes.h
├── lua53-lopcodes.h
├── luacexplain
└── tests
    ├── test1.lua
    └── test2.lua


/README.md:
--------------------------------------------------------------------------------

LuacExplain
===========

'luacexplain' is a small program that enhances the output of "luac -l",
and especially of "luac -l -l".

It's intended for whoever wants to understand the Lua VM's instructions (or
"opcodes") better.

After every VM instruction it prints two lines:

- Pseudo code explaining what the instruction does.

- If possible, the instruction's operands in a more human-readable
  format (you have to use "luac -l -l" for this!).

The program operates like a Unix pipe: you pipe the output of luac into
it.

An example
==========

Let's suppose we have the following code in test.lua:

    local c

    local function bobo(x, y)
      do
        local c = x + y
      end
      do
        local zzz = c
      end
      return "bye"
    end

Let's first run plain vanilla 'luac' on it:

    $ luac -l -l -p test.lua
    ...
    function (5 instructions at 0x9c521f8)
    2 params, 3 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
        1  [6]   ADD       2 0 1
        2  [9]   GETUPVAL  2 0   ; c
        3  [11]  LOADK     2 -1  ; "bye"
        4  [11]  RETURN    2 2
        5  [12]  RETURN    0 1
    ...

And now let's pipe it through 'luacexplain':

    $ luac -l -l -p test.lua | luacexplain
    ...
    function (5 instructions at 0x9c521f8)
    2 params, 3 slots, 1 upvalue, 4 locals, 1 constant, 0 functions
        1  [6]   ADD       2 0 1
                           ;; R(A) := RK(B) + RK(C)
                           ;; 2<?c> 0<x> 1<y>
        2  [9]   GETUPVAL  2 0   ; c
                           ;; R(A) := UpValue[B]
                           ;; 2<?zzz> 0<^c>
        3  [11]  LOADK     2 -1  ; "bye"
                           ;; R(A) := Kst(Bx)
                           ;; 2 -1<"bye">
        4  [11]  RETURN    2 2
                           ;; return R(A), ... ,R(A+B-2)
                           ;; 2 2
        5  [12]  RETURN    0 1
                           ;; return R(A), ... ,R(A+B-2)
                           ;; --RETURN nothing--
    ...

Note the two new lines, prefixed with ";;", after each instruction.

The first line is the pseudo code. The second line is a repeat of the
operands with, when possible, additional information between angle
brackets: this information is either a variable name or a constant value.

Cool, heh?

Note how it's smart enough to recognize that register #2 first refers to
variable "c" and later to variable "zzz" (and later still to no
variable). Incidentally, note how "c" and "zzz" are printed with "?" in
front: that's because the instruction sits at their place of definition
(the "local" line), where they're not "officially" recognized yet (a more
exact explanation is in the source code, if you're interested).

Upvalues are printed with "^" in front (as in "^c").

To understand the pseudo code (the meaning of "R()", "RK()", etc.) you
need to know a tiny bit about the VM. You can find all the explanatory
text you need in any of:

- ["A No-Frills Introduction to Lua 5.1 VM Instructions"](https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxydWJibGVwaWxlc3xneDo2MTdkZDIxZTZjMWFmZmJi), by Kein-Hong Man.
  You need to read **just the first two or three pages**. It's good for any Lua,
  not just 5.1.

- The comments in [lopcodes.h](http://www.lua.org/source/5.3/lopcodes.h.html).

- ["Lua 5.2 Bytecode and Virtual Machine"](https://github.com/dlaurie/lua52vm-tools), by Dirk Laurie (skip
  to "Instruction anatomy").

The pseudo code is taken from the 'lopcodes.h' header, which is shipped
with the program. There are actually four headers shipped: for Lua 5.0,
5.1, 5.2, and 5.3. All of them get parsed and merged, so 'luacexplain'
supports all these Lua versions: you don't need to tell it what version of
'luac' you're using! (Though, if you're super pedantic, you can direct
'luacexplain', via a command line option, to turn on just a specific
version of Lua opcodes; see the source code.)

Installation
============

Place all the files (the one executable and lua*-lopcodes.h) in some
directory appearing in your $PATH.

Or create a launcher script calling the executable and place it in some
'bin' folder of yours appearing in $PATH.

Operation
=========

Simply pipe luac's output to this program, as demonstrated above:

    $ luac -l -l some_script.lua | luacexplain

Or you can supply the input as an argument:

    $ luacexplain previously_saved_luac_output.txt

--------------------------------------------------------------------------------
/lua50-lopcodes.h:
--------------------------------------------------------------------------------
/*
** $Id: lopcodes.h,v 1.102 2002/08/21 18:56:09 roberto Exp $
** Opcodes for Lua virtual machine
** See Copyright Notice in lua.h
*/

#ifndef lopcodes_h
#define lopcodes_h

#include "llimits.h"


/*===========================================================================
  We assume that instructions are unsigned numbers.
  All instructions have an opcode in the first 6 bits.
  Instructions can have the following fields:
    `A' : 8 bits
    `B' : 9 bits
    `C' : 9 bits
    `Bx' : 18 bits (`B' and `C' together)
    `sBx' : signed Bx

  A signed argument is represented in excess K; that is, the number
  value is the unsigned value minus K. K is exactly the maximum value
  for that argument (so that -max is represented by 0, and +max is
  represented by 2*max), which is half the maximum for the corresponding
  unsigned argument.
28 | ===========================================================================*/ 29 | 30 | 31 | enum OpMode {iABC, iABx, iAsBx}; /* basic instruction format */ 32 | 33 | 34 | /* 35 | ** size and position of opcode arguments. 36 | */ 37 | #define SIZE_C 9 38 | #define SIZE_B 9 39 | #define SIZE_Bx (SIZE_C + SIZE_B) 40 | #define SIZE_A 8 41 | 42 | #define SIZE_OP 6 43 | 44 | #define POS_C SIZE_OP 45 | #define POS_B (POS_C + SIZE_C) 46 | #define POS_Bx POS_C 47 | #define POS_A (POS_B + SIZE_B) 48 | 49 | 50 | /* 51 | ** limits for opcode arguments. 52 | ** we use (signed) int to manipulate most arguments, 53 | ** so they must fit in BITS_INT-1 bits (-1 for sign) 54 | */ 55 | #if SIZE_Bx < BITS_INT-1 56 | #define MAXARG_Bx ((1<>1) /* `sBx' is signed */ 58 | #else 59 | #define MAXARG_Bx MAX_INT 60 | #define MAXARG_sBx MAX_INT 61 | #endif 62 | 63 | 64 | #define MAXARG_A ((1<>POS_A)) 83 | #define SETARG_A(i,u) ((i) = (((i)&MASK0(SIZE_A,POS_A)) | \ 84 | ((cast(Instruction, u)<>POS_B) & MASK1(SIZE_B,0))) 87 | #define SETARG_B(i,b) ((i) = (((i)&MASK0(SIZE_B,POS_B)) | \ 88 | ((cast(Instruction, b)<>POS_C) & MASK1(SIZE_C,0))) 91 | #define SETARG_C(i,b) ((i) = (((i)&MASK0(SIZE_C,POS_C)) | \ 92 | ((cast(Instruction, b)<>POS_Bx) & MASK1(SIZE_Bx,0))) 95 | #define SETARG_Bx(i,b) ((i) = (((i)&MASK0(SIZE_Bx,POS_Bx)) | \ 96 | ((cast(Instruction, b)< C) then R(A) := R(B) else pc++ */ 169 | 170 | OP_CALL,/* A B C R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1)) */ 171 | OP_TAILCALL,/* A B C return R(A)(R(A+1), ... ,R(A+B-1)) */ 172 | OP_RETURN,/* A B return R(A), ... ,R(A+B-2) (see note) */ 173 | 174 | OP_FORLOOP,/* A sBx R(A)+=R(A+2); if R(A) =) R(A)*/ 185 | OP_CLOSURE/* A Bx R(A) := closure(KPROTO[Bx], R(A), ... ,R(A+n)) */ 186 | } OpCode; 187 | 188 | 189 | #define NUM_OPCODES (cast(int, OP_CLOSURE+1)) 190 | 191 | 192 | 193 | /*=========================================================================== 194 | Notes: 195 | (1) In OP_CALL, if (B == 0) then B = top. 
C is the number of returns - 1, 196 | and can be 0: OP_CALL then sets `top' to last_result+1, so 197 | next open instruction (OP_CALL, OP_RETURN, OP_SETLIST) may use `top'. 198 | 199 | (2) In OP_RETURN, if (B == 0) then return up to `top' 200 | 201 | (3) For comparisons, B specifies what conditions the test should accept. 202 | 203 | (4) All `skips' (pc++) assume that next instruction is a jump 204 | ===========================================================================*/ 205 | 206 | 207 | /* 208 | ** masks for instruction properties 209 | */ 210 | enum OpModeMask { 211 | OpModeBreg = 2, /* B is a register */ 212 | OpModeBrk, /* B is a register/constant */ 213 | OpModeCrk, /* C is a register/constant */ 214 | OpModesetA, /* instruction set register A */ 215 | OpModeK, /* Bx is a constant */ 216 | OpModeT /* operator is a test */ 217 | 218 | }; 219 | 220 | 221 | extern const lu_byte luaP_opmodes[NUM_OPCODES]; 222 | 223 | #define getOpMode(m) (cast(enum OpMode, luaP_opmodes[m] & 3)) 224 | #define testOpMode(m, b) (luaP_opmodes[m] & (1 << (b))) 225 | 226 | 227 | #ifdef LUA_OPNAMES 228 | extern const char *const luaP_opnames[]; /* opcode names */ 229 | #endif 230 | 231 | 232 | 233 | /* number of list items to accumulate before a SETLIST instruction */ 234 | /* (must be a power of 2) */ 235 | #define LFIELDS_PER_FLUSH 32 236 | 237 | 238 | #endif 239 | -------------------------------------------------------------------------------- /lua51-lopcodes.h: -------------------------------------------------------------------------------- 1 | /* 2 | ** $Id: lopcodes.h,v 1.124 2005/12/02 18:42:08 roberto Exp $ 3 | ** Opcodes for Lua virtual machine 4 | ** See Copyright Notice in lua.h 5 | */ 6 | 7 | #ifndef lopcodes_h 8 | #define lopcodes_h 9 | 10 | #include "llimits.h" 11 | 12 | 13 | /*=========================================================================== 14 | We assume that instructions are unsigned numbers. 15 | All instructions have an opcode in the first 6 bits. 
16 | Instructions can have the following fields: 17 | `A' : 8 bits 18 | `B' : 9 bits 19 | `C' : 9 bits 20 | `Bx' : 18 bits (`B' and `C' together) 21 | `sBx' : signed Bx 22 | 23 | A signed argument is represented in excess K; that is, the number 24 | value is the unsigned value minus K. K is exactly the maximum value 25 | for that argument (so that -max is represented by 0, and +max is 26 | represented by 2*max), which is half the maximum for the corresponding 27 | unsigned argument. 28 | ===========================================================================*/ 29 | 30 | 31 | enum OpMode {iABC, iABx, iAsBx}; /* basic instruction format */ 32 | 33 | 34 | /* 35 | ** size and position of opcode arguments. 36 | */ 37 | #define SIZE_C 9 38 | #define SIZE_B 9 39 | #define SIZE_Bx (SIZE_C + SIZE_B) 40 | #define SIZE_A 8 41 | 42 | #define SIZE_OP 6 43 | 44 | #define POS_OP 0 45 | #define POS_A (POS_OP + SIZE_OP) 46 | #define POS_C (POS_A + SIZE_A) 47 | #define POS_B (POS_C + SIZE_C) 48 | #define POS_Bx POS_C 49 | 50 | 51 | /* 52 | ** limits for opcode arguments. 
53 | ** we use (signed) int to manipulate most arguments, 54 | ** so they must fit in LUAI_BITSINT-1 bits (-1 for sign) 55 | */ 56 | #if SIZE_Bx < LUAI_BITSINT-1 57 | #define MAXARG_Bx ((1<>1) /* `sBx' is signed */ 59 | #else 60 | #define MAXARG_Bx MAX_INT 61 | #define MAXARG_sBx MAX_INT 62 | #endif 63 | 64 | 65 | #define MAXARG_A ((1<>POS_OP) & MASK1(SIZE_OP,0))) 81 | #define SET_OPCODE(i,o) ((i) = (((i)&MASK0(SIZE_OP,POS_OP)) | \ 82 | ((cast(Instruction, o)<>POS_A) & MASK1(SIZE_A,0))) 85 | #define SETARG_A(i,u) ((i) = (((i)&MASK0(SIZE_A,POS_A)) | \ 86 | ((cast(Instruction, u)<>POS_B) & MASK1(SIZE_B,0))) 89 | #define SETARG_B(i,b) ((i) = (((i)&MASK0(SIZE_B,POS_B)) | \ 90 | ((cast(Instruction, b)<>POS_C) & MASK1(SIZE_C,0))) 93 | #define SETARG_C(i,b) ((i) = (((i)&MASK0(SIZE_C,POS_C)) | \ 94 | ((cast(Instruction, b)<>POS_Bx) & MASK1(SIZE_Bx,0))) 97 | #define SETARG_Bx(i,b) ((i) = (((i)&MASK0(SIZE_Bx,POS_Bx)) | \ 98 | ((cast(Instruction, b)< C) then pc++ */ 190 | OP_TESTSET,/* A B C if (R(B) <=> C) then R(A) := R(B) else pc++ */ 191 | 192 | OP_CALL,/* A B C R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1)) */ 193 | OP_TAILCALL,/* A B C return R(A)(R(A+1), ... ,R(A+B-1)) */ 194 | OP_RETURN,/* A B return R(A), ... ,R(A+B-2) (see note) */ 195 | 196 | OP_FORLOOP,/* A sBx R(A)+=R(A+2); 197 | if R(A) =) R(A)*/ 205 | OP_CLOSURE,/* A Bx R(A) := closure(KPROTO[Bx], R(A), ... ,R(A+n)) */ 206 | 207 | OP_VARARG/* A B R(A), R(A+1), ..., R(A+B-1) = vararg */ 208 | } OpCode; 209 | 210 | 211 | #define NUM_OPCODES (cast(int, OP_VARARG) + 1) 212 | 213 | 214 | 215 | /*=========================================================================== 216 | Notes: 217 | (*) In OP_CALL, if (B == 0) then B = top. C is the number of returns - 1, 218 | and can be 0: OP_CALL then sets `top' to last_result+1, so 219 | next open instruction (OP_CALL, OP_RETURN, OP_SETLIST) may use `top'. 
220 | 221 | (*) In OP_VARARG, if (B == 0) then use actual number of varargs and 222 | set top (like in OP_CALL with C == 0). 223 | 224 | (*) In OP_RETURN, if (B == 0) then return up to `top' 225 | 226 | (*) In OP_SETLIST, if (B == 0) then B = `top'; 227 | if (C == 0) then next `instruction' is real C 228 | 229 | (*) For comparisons, A specifies what condition the test should accept 230 | (true or false). 231 | 232 | (*) All `skips' (pc++) assume that next instruction is a jump 233 | ===========================================================================*/ 234 | 235 | 236 | /* 237 | ** masks for instruction properties. The format is: 238 | ** bits 0-1: op mode 239 | ** bits 2-3: C arg mode 240 | ** bits 4-5: B arg mode 241 | ** bit 6: instruction set register A 242 | ** bit 7: operator is a test 243 | */ 244 | 245 | enum OpArgMask { 246 | OpArgN, /* argument is not used */ 247 | OpArgU, /* argument is used */ 248 | OpArgR, /* argument is a register or a jump offset */ 249 | OpArgK /* argument is a constant or register/constant */ 250 | }; 251 | 252 | LUAI_DATA const lu_byte luaP_opmodes[NUM_OPCODES]; 253 | 254 | #define getOpMode(m) (cast(enum OpMode, luaP_opmodes[m] & 3)) 255 | #define getBMode(m) (cast(enum OpArgMask, (luaP_opmodes[m] >> 4) & 3)) 256 | #define getCMode(m) (cast(enum OpArgMask, (luaP_opmodes[m] >> 2) & 3)) 257 | #define testAMode(m) (luaP_opmodes[m] & (1 << 6)) 258 | #define testTMode(m) (luaP_opmodes[m] & (1 << 7)) 259 | 260 | 261 | LUAI_DATA const char *const luaP_opnames[NUM_OPCODES+1]; /* opcode names */ 262 | 263 | 264 | /* number of list items to accumulate before a SETLIST instruction */ 265 | #define LFIELDS_PER_FLUSH 50 266 | 267 | 268 | #endif 269 | -------------------------------------------------------------------------------- /lua52-lopcodes.h: -------------------------------------------------------------------------------- 1 | /* 2 | ** $Id: lopcodes.h,v 1.142 2011/07/15 12:50:29 roberto Exp $ 3 | ** Opcodes for Lua virtual 
machine 4 | ** See Copyright Notice in lua.h 5 | */ 6 | 7 | #ifndef lopcodes_h 8 | #define lopcodes_h 9 | 10 | #include "llimits.h" 11 | 12 | 13 | /*=========================================================================== 14 | We assume that instructions are unsigned numbers. 15 | All instructions have an opcode in the first 6 bits. 16 | Instructions can have the following fields: 17 | `A' : 8 bits 18 | `B' : 9 bits 19 | `C' : 9 bits 20 | 'Ax' : 26 bits ('A', 'B', and 'C' together) 21 | `Bx' : 18 bits (`B' and `C' together) 22 | `sBx' : signed Bx 23 | 24 | A signed argument is represented in excess K; that is, the number 25 | value is the unsigned value minus K. K is exactly the maximum value 26 | for that argument (so that -max is represented by 0, and +max is 27 | represented by 2*max), which is half the maximum for the corresponding 28 | unsigned argument. 29 | ===========================================================================*/ 30 | 31 | 32 | enum OpMode {iABC, iABx, iAsBx, iAx}; /* basic instruction format */ 33 | 34 | 35 | /* 36 | ** size and position of opcode arguments. 37 | */ 38 | #define SIZE_C 9 39 | #define SIZE_B 9 40 | #define SIZE_Bx (SIZE_C + SIZE_B) 41 | #define SIZE_A 8 42 | #define SIZE_Ax (SIZE_C + SIZE_B + SIZE_A) 43 | 44 | #define SIZE_OP 6 45 | 46 | #define POS_OP 0 47 | #define POS_A (POS_OP + SIZE_OP) 48 | #define POS_C (POS_A + SIZE_A) 49 | #define POS_B (POS_C + SIZE_C) 50 | #define POS_Bx POS_C 51 | #define POS_Ax POS_A 52 | 53 | 54 | /* 55 | ** limits for opcode arguments. 
56 | ** we use (signed) int to manipulate most arguments, 57 | ** so they must fit in LUAI_BITSINT-1 bits (-1 for sign) 58 | */ 59 | #if SIZE_Bx < LUAI_BITSINT-1 60 | #define MAXARG_Bx ((1<>1) /* `sBx' is signed */ 62 | #else 63 | #define MAXARG_Bx MAX_INT 64 | #define MAXARG_sBx MAX_INT 65 | #endif 66 | 67 | #if SIZE_Ax < LUAI_BITSINT-1 68 | #define MAXARG_Ax ((1<>POS_OP) & MASK1(SIZE_OP,0))) 90 | #define SET_OPCODE(i,o) ((i) = (((i)&MASK0(SIZE_OP,POS_OP)) | \ 91 | ((cast(Instruction, o)<>pos) & MASK1(size,0))) 94 | #define setarg(i,v,pos,size) ((i) = (((i)&MASK0(size,pos)) | \ 95 | ((cast(Instruction, v)<= R(A) + 1 */ 200 | OP_EQ,/* A B C if ((RK(B) == RK(C)) ~= A) then pc++ */ 201 | OP_LT,/* A B C if ((RK(B) < RK(C)) ~= A) then pc++ */ 202 | OP_LE,/* A B C if ((RK(B) <= RK(C)) ~= A) then pc++ */ 203 | 204 | OP_TEST,/* A C if not (R(A) <=> C) then pc++ */ 205 | OP_TESTSET,/* A B C if (R(B) <=> C) then R(A) := R(B) else pc++ */ 206 | 207 | OP_CALL,/* A B C R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1)) */ 208 | OP_TAILCALL,/* A B C return R(A)(R(A+1), ... ,R(A+B-1)) */ 209 | OP_RETURN,/* A B return R(A), ... 
,R(A+B-2) (see note) */ 210 | 211 | OP_FORLOOP,/* A sBx R(A)+=R(A+2); 212 | if R(A) > 4) & 3)) 276 | #define getCMode(m) (cast(enum OpArgMask, (luaP_opmodes[m] >> 2) & 3)) 277 | #define testAMode(m) (luaP_opmodes[m] & (1 << 6)) 278 | #define testTMode(m) (luaP_opmodes[m] & (1 << 7)) 279 | 280 | 281 | LUAI_DDEC const char *const luaP_opnames[NUM_OPCODES+1]; /* opcode names */ 282 | 283 | 284 | /* number of list items to accumulate before a SETLIST instruction */ 285 | #define LFIELDS_PER_FLUSH 50 286 | 287 | 288 | #endif 289 | -------------------------------------------------------------------------------- /lua53-lopcodes.h: -------------------------------------------------------------------------------- 1 | /* 2 | ** $Id: lopcodes.h,v 1.148 2014/10/25 11:50:46 roberto Exp $ 3 | ** Opcodes for Lua virtual machine 4 | ** See Copyright Notice in lua.h 5 | */ 6 | 7 | #ifndef lopcodes_h 8 | #define lopcodes_h 9 | 10 | #include "llimits.h" 11 | 12 | 13 | /*=========================================================================== 14 | We assume that instructions are unsigned numbers. 15 | All instructions have an opcode in the first 6 bits. 16 | Instructions can have the following fields: 17 | 'A' : 8 bits 18 | 'B' : 9 bits 19 | 'C' : 9 bits 20 | 'Ax' : 26 bits ('A', 'B', and 'C' together) 21 | 'Bx' : 18 bits ('B' and 'C' together) 22 | 'sBx' : signed Bx 23 | 24 | A signed argument is represented in excess K; that is, the number 25 | value is the unsigned value minus K. K is exactly the maximum value 26 | for that argument (so that -max is represented by 0, and +max is 27 | represented by 2*max), which is half the maximum for the corresponding 28 | unsigned argument. 29 | ===========================================================================*/ 30 | 31 | 32 | enum OpMode {iABC, iABx, iAsBx, iAx}; /* basic instruction format */ 33 | 34 | 35 | /* 36 | ** size and position of opcode arguments. 
37 | */ 38 | #define SIZE_C 9 39 | #define SIZE_B 9 40 | #define SIZE_Bx (SIZE_C + SIZE_B) 41 | #define SIZE_A 8 42 | #define SIZE_Ax (SIZE_C + SIZE_B + SIZE_A) 43 | 44 | #define SIZE_OP 6 45 | 46 | #define POS_OP 0 47 | #define POS_A (POS_OP + SIZE_OP) 48 | #define POS_C (POS_A + SIZE_A) 49 | #define POS_B (POS_C + SIZE_C) 50 | #define POS_Bx POS_C 51 | #define POS_Ax POS_A 52 | 53 | 54 | /* 55 | ** limits for opcode arguments. 56 | ** we use (signed) int to manipulate most arguments, 57 | ** so they must fit in LUAI_BITSINT-1 bits (-1 for sign) 58 | */ 59 | #if SIZE_Bx < LUAI_BITSINT-1 60 | #define MAXARG_Bx ((1<>1) /* 'sBx' is signed */ 62 | #else 63 | #define MAXARG_Bx MAX_INT 64 | #define MAXARG_sBx MAX_INT 65 | #endif 66 | 67 | #if SIZE_Ax < LUAI_BITSINT-1 68 | #define MAXARG_Ax ((1<>POS_OP) & MASK1(SIZE_OP,0))) 90 | #define SET_OPCODE(i,o) ((i) = (((i)&MASK0(SIZE_OP,POS_OP)) | \ 91 | ((cast(Instruction, o)<>pos) & MASK1(size,0))) 94 | #define setarg(i,v,pos,size) ((i) = (((i)&MASK0(size,pos)) | \ 95 | ((cast(Instruction, v)<> RK(C) */ 199 | OP_UNM,/* A B R(A) := -R(B) */ 200 | OP_BNOT,/* A B R(A) := ~R(B) */ 201 | OP_NOT,/* A B R(A) := not R(B) */ 202 | OP_LEN,/* A B R(A) := length of R(B) */ 203 | 204 | OP_CONCAT,/* A B C R(A) := R(B).. ... ..R(C) */ 205 | 206 | OP_JMP,/* A sBx pc+=sBx; if (A) close all upvalues >= R(A - 1) */ 207 | OP_EQ,/* A B C if ((RK(B) == RK(C)) ~= A) then pc++ */ 208 | OP_LT,/* A B C if ((RK(B) < RK(C)) ~= A) then pc++ */ 209 | OP_LE,/* A B C if ((RK(B) <= RK(C)) ~= A) then pc++ */ 210 | 211 | OP_TEST,/* A C if not (R(A) <=> C) then pc++ */ 212 | OP_TESTSET,/* A B C if (R(B) <=> C) then R(A) := R(B) else pc++ */ 213 | 214 | OP_CALL,/* A B C R(A), ... ,R(A+C-2) := R(A)(R(A+1), ... ,R(A+B-1)) */ 215 | OP_TAILCALL,/* A B C return R(A)(R(A+1), ... ,R(A+B-1)) */ 216 | OP_RETURN,/* A B return R(A), ... 
,R(A+B-2) (see note) */ 217 | 218 | OP_FORLOOP,/* A sBx R(A)+=R(A+2); 219 | if R(A) > 4) & 3)) 283 | #define getCMode(m) (cast(enum OpArgMask, (luaP_opmodes[m] >> 2) & 3)) 284 | #define testAMode(m) (luaP_opmodes[m] & (1 << 6)) 285 | #define testTMode(m) (luaP_opmodes[m] & (1 << 7)) 286 | 287 | 288 | LUAI_DDEC const char *const luaP_opnames[NUM_OPCODES+1]; /* opcode names */ 289 | 290 | 291 | /* number of list items to accumulate before a SETLIST instruction */ 292 | #define LFIELDS_PER_FLUSH 50 293 | 294 | 295 | #endif 296 | -------------------------------------------------------------------------------- /luacexplain: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env lua 2 | 3 | ----------------------------------- Utils ------------------------------------ 4 | 5 | local append = table.insert 6 | 7 | --- 8 | -- Reads a file's contents. 9 | local function slurp(path) 10 | local f = assert(io.open(path)) 11 | local s = f:read("*a") 12 | f:close() 13 | return s 14 | end 15 | 16 | --- 17 | -- Trims a string. 18 | local function trim(s) 19 | return s:match('^%s*(.-)%s*$') 20 | end 21 | 22 | --- 23 | -- Iterates over lines. 24 | local function nlines(s) 25 | return s:gmatch('([^\n]*)\n') 26 | end 27 | 28 | --- 29 | -- Parses the numbers in a string into a table. 30 | -- Returns *nothing* (not even an empty table) if it finds none. 31 | local function parse_nums(s) 32 | local nums = {} 33 | for num in s:gmatch('%S+') do 34 | append(nums, tonumber(num)) 35 | end 36 | return nums[1] and nums 37 | end 38 | 39 | --- 40 | -- Returns the dirname() of a path. Or an empty string if it gets baffled. 41 | local function dirname(path) 42 | return path:match('.*[/\\]') or '' 43 | end 44 | 45 | --- 46 | -- If a string has a newline, cleans the whitespaces surrounding it, and optionally indent. 47 | local function canonical_nl(s, indentation) 48 | return (s:gsub('%s*\n%s*', '\n' .. 
(indentation or '')))
end

---
-- Indents a string.
-- If the string contains several lines (newline-delimited), they will all be indented.
local function indent(s, indentation)
  return indentation .. s:gsub('\n', '\n' .. indentation)
end

---
-- Merges associative arrays.
local function merge(base, ext, ...)
  if ext then
    for k, v in pairs(ext) do
      base[k] = v
    end
    return merge(base, ...)
  else
    return base
  end
end

---
-- Parses command-line options.
--
-- For example, when the program is executed as "program -a -v --help whatever",
-- returns { a=true, v=true, help=true } and additionally removes these options
-- from 'args' (leaving "whatever" as args[1]).
--
local function parse_options(args)
  local opts = {}
  while args[1] and args[1]:match('^-') do
    opts[ args[1]:match('^-+(.*)') ] = true
    table.remove(args, 1)
  end
  return opts
end

---------------------------------- Parsers -----------------------------------

--[[

Parses the opcode information out of Lua's header file (lopcodes.h).

In other words, it turns the string:

  OP_MOVE,/* A B R(A) := R(B) */
  OP_LOADK,/* A Bx R(A) := Kst(Bx) */

Into:

  {
    MOVE = {
      signature = "A B",
      doc = "R(A) := R(B)",
      operand_types = { "R", "R" }
    },
    LOADK = {
      signature = "A Bx",
      doc = "R(A) := Kst(Bx)",
      operand_types = { "R", "Kst" }
    },
  }

]]

local function parse_ops_info(s)

  -- Extracts the operand types from the doc-string.
  local function parse_operand_types(doc)
    local args = {}
    local arg_indices = { A = 1, Ax = 1, B = 2, Bx = 2, sBx = 2, C = 3 }
    for typ, v in doc:gmatch('(%w+)[%(%[]([ABCxs]+)[%)%]]') do -- This matches "UpValue[B]", "Kst(Bx)", etc.
      args[arg_indices[v]] = typ
    end
    return args
  end

  local data = {}
  for op, args, doc in s:gmatch('OP_(%w+),?/%*\t+(.-)\t+(.-)%*/') do
    data[op] = {
      signature = args,
      doc = canonical_nl(trim(doc), ' '),
      operand_types = parse_operand_types(doc),
    }
  end
  return data
end

--[[

Parses information about functions out of luac's output. It parses just
their data: the list of constants, locals, and upvalues.

For example, it turns the string:

  function (14 instructions, 56 bytes at 0x846c980)
  1 param, 6 slots, 1 upvalue, 5 locals, 2 constants, 0 functions
    1 [92] GETUPVAL 1 0
    2 [92] LEN 1 1
    3 [92] LOADK 2 -1
    ...
    14 [97] RETURN 0 1
  constants (2) for 0x846c980:
    1 1
    2 -1
  locals (5) for 0x846c980:
    0 v 1 14
    1 (for index) 5 14
    2 (for limit) 5 14
    3 (for step) 5 14
    4 i 6 13
  upvalues (1) for 0x846c980:
    0 rgb_val

Into:

  {
    ["0x846c980"] = {
      constants = {
        { name = "1" },
        { name = "-1" }
      },
      locals = {
        [0] = {
          name = "v",
          range = { 1, 14 }
        },
        {
          name = "(for index)",
          range = { 5, 14 }
        },
        {
          name = "(for limit)",
          range = { 5, 14 }
        },
        {
          name = "(for step)",
          range = { 5, 14 }
        },
        {
          name = "i",
          range = { 6, 13 }
        }
      },
      upvalues = {
        [0] = { name = "rgb_val" }
      }
    },
  }

]]

local function parse_funcs_info(src)
  local data = {}
  local fn_addr, slot
  for ln in nlines(src) do
    if not ln:match('^\t') then
      slot = nil
    end
    if slot then
      -- A variable/constant line.
      local num_str, name, trailing = ln:match('^\t([%d-]+)\t([^\t]+)(.*)')
      local record = {
        name = name,
        range = parse_nums(trailing),
      }
      data[fn_addr][slot][tonumber(num_str)] = record
    end
    if not slot then
      -- A header line.
      slot, fn_addr = ln:match('^(%w+) .* for (0x%x+)')
      if slot then
        data[fn_addr] = data[fn_addr] or {}
        data[fn_addr][slot] = {}
      end
    end
  end
  return data
end

------------------------------- The decorator --------------------------------

--[[

Enhances luac's output.

This is the gist of this program. It turns lines of the form:

  14 [224] LOADK 12 -3

into:

  14 [224] LOADK 12 -3
           ;; R(A) := Kst(Bx)
           ;; 12 -3<"some const">
]]

--local DBG = true -- Enable this to add some debugging info. You'll need the p/pp global functions (pretty printers).
local interpret_args -- forward declaration.

local function decorate(src, ops_info, funcs_info)

  local indentation = string.rep(' ', 47) .. ';; '
  local indent = function(s) return indent(s, indentation) end

  local func = nil

  for ln in nlines(src) do
    print(ln)
    -- The following matches function headers, such as "function ... 56 bytes at 0x846c980)".
    if ln:match('^%w') then
      local fn_addr = ln:match('at (0x%x+)')
      if fn_addr then -- Yes, we're starting a new function.
        func = funcs_info[fn_addr]
        if DBG then p(func) end
      end
    end
    -- The following matches instruction lines, such as " 14 [224] LOADK 12 -3 ; whatever".
    local program_counter_str, op, nums_str = ln:match('(%d+)%s+%[[%d-]+%]%s+(%w+)%s+([-%d%s]*%d)')
    if op then
      local nums = parse_nums(nums_str)
      if ops_info[op] then
        print(indent(ops_info[op].doc))
        if func then -- When not doing "luac -l -l", we have no function data.
          if op == 'RETURN' and nums[1] == 0 and nums[2] == 1 then
            -- This is a special case. "RETURN 0 1" returns nothing and we
            -- shouldn't confuse the user with its operands.
            print(indent('--RETURN nothing--'))
          else
            print(indent(interpret_args(nums, ops_info[op].operand_types, func, tonumber(program_counter_str))))
          end
        end
      else
        print(indent('Warning: opcode ' .. op .. ' unknown.'))
      end
    end
  end
end

---
-- Finds the variable, among 'variables', which is at position 'register'.
--
-- We can't just do variables[register]: we must look only at variables
-- that are in scope.
--
local function find_var(register, variables, program_counter)

  if (register > #variables) or (not variables[0] --[[empty]]) then
    return -- speed things up.
  end

  local pos = 0
  for i = 0, #variables do
    local var = variables[i]
    if program_counter >= var.range[1] - 1 and program_counter < var.range[2] then
      if register == pos then
        -- We also return, as a second value, a "maybe" flag. This
        -- flag makes the variable name printed with "?" in front.
        --
        -- Local variables are recognized one instruction after their
        -- place of declaration, but we want to recognize them one
        -- instruction earlier so they appear on initialization lines:
        --
        --   local x = 99
        --
        -- Which gives us "LOADK 0<?x> -1<99>"
        --
        -- That's why we do 'var.range[1] - 1' in our range check.
        --
        -- However, when the initializing expression is "complex", as in:
        --
        --   local x = a + 10
        --
        -- the same register may happen to appear for the right side as well:
        --
        --   ADD 0<?x> 0<?x> -2<99>
        --
        -- Obviously here, only the first operand above is "x". The second
        -- isn't. The "?" signals to the user that the labeling may be
        -- incorrect.
        --
        return var, (program_counter == var.range[1] - 1)
      end
      pos = pos + 1
    end
  end

end

---
-- Returns, if possible, a human-readable representation of an opcode's
-- operands given in 'nums'.
--
-- E.g., converts { 3, 4, 8 } to '3 4 8<"some constant">'
--
function interpret_args(nums, operand_types, func, program_counter)
  -- Note: "rep"/"reps" stands for "nice human [rep]resentation".
  local reps = {}
  for i, num in ipairs(nums) do
    local typ = operand_types[i]
    local rep = num

    if typ == 'RK' then
      typ = (num < 0) and 'Kst' or 'R'
    end

    if typ == 'R' then
      local var, maybe = find_var(num, func['locals'], program_counter)
      if var then
        rep = rep .. '<' .. (maybe and '?' or '') .. var.name .. '>'
        if DBG then rep = rep .. pp(var.range) end
      end
    end
    if typ == 'Kst' and func['constants'][-num] then
      rep = rep .. '<' .. func['constants'][-num].name .. '>'
    end
    if typ == 'UpValue' and func['upvalues'][num] then
      rep = rep .. '<^' .. func['upvalues'][num].name .. '>'
    end

    append(reps, rep)
  end
  return table.concat(reps, ' ')
end

------------------------------------ Main ------------------------------------

-- The program's entry point.

local function print_help()
  print "Please see README.md for how to use this program."
end

local function main()

  local opts = parse_options(arg)

  local DATA_DIR = dirname(arg[0])

  local ops_info_50 = parse_ops_info(slurp(DATA_DIR .. 'lua50-lopcodes.h'))
  local ops_info_51 = parse_ops_info(slurp(DATA_DIR .. 'lua51-lopcodes.h'))
  local ops_info_52 = parse_ops_info(slurp(DATA_DIR .. 'lua52-lopcodes.h'))
  local ops_info_53 = parse_ops_info(slurp(DATA_DIR .. 'lua53-lopcodes.h'))

  local ops_info = merge({}, ops_info_50, ops_info_51, ops_info_52, ops_info_53)

  for opt, _ in pairs(opts) do
    if opt == 'lua50' then
      ops_info = ops_info_50
    elseif opt == 'lua51' then
      ops_info = ops_info_51
    elseif opt == 'lua52' then
      ops_info = ops_info_52
    elseif opt == 'lua53' then
      ops_info = ops_info_53
    elseif opt == 'help' or opt == 'h' then
      print_help()
      os.exit()
    else
      error("Unrecognized command-line option --" .. opt)
    end
  end

  if DBG then
    p(ops_info)
  end

  local src
  if arg[1] then
    src = slurp(arg[1])
  else
    src = io.read('*a')
  end

  local funcs_info = parse_funcs_info(src)

  if DBG then
    p(funcs_info)
  end

  decorate(src, ops_info, funcs_info)

end

main()

--------------------------------------------------------------------------------
/tests/test1.lua:
--------------------------------------------------------------------------------
-- This code tests whether our variable lookup (find_var()) is ok.

do
  local a = "one"
  local z = "one half"
end

do
  local b = "two" -- verify that the register is labeled "?b", not "a".
end

local function func()
  for k, v in pairs(ext) do
    base[k] = v
  end
  local x = 2 -- verify that the register is labeled "?x", not "(for generator)".
end

--------------------------------------------------------------------------------
/tests/test2.lua:
--------------------------------------------------------------------------------
-- This code tests whether we report "return" with no arguments in a friendly manner.

function f1()
  return
end

--------------------------------------------------------------------------------
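
The header comments in the lopcodes.h files above describe how an instruction word is split into fields and how 'sBx' is stored in "excess K" form. As a rough illustration only, here is a minimal Lua sketch of that decoding. It assumes the Lua 5.1 to 5.3 layout (POS_OP = 0, POS_A = 6, POS_C = 14, POS_B = 23, POS_Bx = POS_C); Lua 5.0 places the fields differently. The helpers 'field' and 'decode' are hypothetical names and are not part of luacexplain, which works on luac's textual listing and never decodes raw instruction words.

```lua
-- Illustrative sketch: 'field' and 'decode' are hypothetical helpers,
-- not part of luacexplain. Assumes the Lua 5.1-5.3 field layout.

-- Extracts 'size' bits of 'instr', starting at bit 'pos'. Plain arithmetic
-- is used so the code doesn't depend on any bit library.
local function field(instr, pos, size)
  return math.floor(instr / 2^pos) % 2^size
end

-- K for sBx: MAXARG_Bx >> 1, where MAXARG_Bx = 2^18 - 1.
local MAXARG_sBx = math.floor((2^18 - 1) / 2)

-- Splits an instruction word into its fields.
local function decode(instr)
  local bx = field(instr, 14, 18)
  return {
    op  = field(instr, 0, 6),
    a   = field(instr, 6, 8),
    c   = field(instr, 14, 9),
    b   = field(instr, 23, 9),
    bx  = bx,
    sbx = bx - MAXARG_sBx, -- excess K: the stored value minus K
  }
end

-- An instruction word built by hand: opcode 1, A = 2, Bx = 5.
local d = decode(1 + 2 * 2^6 + 5 * 2^14)
print(d.op, d.a, d.bx, d.sbx)
```

An all-zero word decodes to sbx = -MAXARG_sBx, which matches the header's remark that "-max is represented by 0".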