├── .gitignore ├── LICENSE ├── README.md ├── SYNTAX ├── __init__.py ├── codes.py ├── examples ├── euler1.vm ├── euler2.vm ├── fib.vm └── primes.vm ├── exception.py ├── main.py ├── memory.py ├── program.py ├── utils.py └── vm.py /.gitignore: -------------------------------------------------------------------------------- 1 | .idea 2 | *.pyc 3 | __pycache__ 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015, Cholerae Hu 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | * Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 10 | * Redistributions in binary form must reproduce the above copyright notice, 11 | this list of conditions and the following disclaimer in the documentation 12 | and/or other materials provided with the distribution. 13 | 14 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 15 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 16 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 17 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 18 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 19 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 20 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 21 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 22 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 23 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
24 | 25 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # TinyVM 2 | 3 | This tinyVM is an implementation of [GenTiradentes' TinyVM][tvm] in Python (the original is written in C). 4 | 5 | This is my first attempt at a VM, so huge thanks to GenTiradentes for making a minimal one that is easy to grasp. [Specter][specter] (the Go version) was also a big inspiration; thanks to its author. 6 | 7 | ## Run 8 | 9 | This tinyVM depends on the enum module. The enum module has been part of the standard library since Python 3.4; if you are using an older version of Python, please run 10 | 11 | `pip install enum34` 12 | 13 | to install it. 14 | 15 | `cd` to this directory 16 | 17 | `python ./main.py --file /path/to/source` 18 | 19 | ## License 20 | 21 | The [BSD 2-Clause license][bsd]. 22 | 23 | [bsd]: http://opensource.org/licenses/BSD-2-Clause 24 | [tvm]: https://github.com/GenTiradentes/tinyvm 25 | [specter]: https://github.com/PuerkitoBio/specter 26 | -------------------------------------------------------------------------------- /SYNTAX: -------------------------------------------------------------------------------- 1 | This virtual machine loosely follows traditional Intel x86 assembly syntax. 2 | 3 | ////////////////////////////////////////////////// 4 | // Table of Contents ///////////////////////////// 5 | ////////////////////////////////////////////////// 6 | 0. VALUES 7 | 1. REGISTERS 8 | 2. MEMORY 9 | 3. LABELS 10 | 4. INSTRUCTION LISTING 11 | I. Memory 12 | II. Stack 13 | III. Calling Conventions 14 | IV. Arithmetic Operators 15 | V. Binary Operators 16 | VI. Comparison 17 | VII. Control Flow Manipulation 18 | VIII. Input / Output 19 | 20 | ////////////////////////////////////////////////// 21 | // 0. 
VALUES ///////////////////////////////////// 22 | ////////////////////////////////////////////////// 23 | 24 | Values can be specified in decimal, octal, hexadecimal, or binary. The only difference between Intel syntax assembly and this 25 | syntax is the delimiter between the value and the base specifier. By default, values without a base specifier are assumed 26 | to be in decimal. Any value prefixed with "0x" is assumed to be in hexadecimal. 27 | 28 | Values can also be specified using base identifiers. To specify the value "32," I can use 0x20 for hexadecimal, 20|h for hexadecimal 29 | using a base identifier, 40|o for octal using a base identifier, or 100000|b for binary using a base identifier. 30 | 31 | ////////////////////////////////////////////////// 32 | // 1. REGISTERS ////////////////////////////////// 33 | ////////////////////////////////////////////////// 34 | 35 | TVM has 17 registers, modeled after x86 registers. 36 | Register names are written lower-case. 37 | Because of the implementation of the stack in Specter, all registers are general purpose and 38 | can be freely used (the instruction pointer is not available via a special register; jumps and calls 39 | must be used to alter the flow). 40 | 41 | EAX 42 | EBX 43 | ECX 44 | EDX 45 | 46 | ESI 47 | EDI 48 | 49 | ESP 50 | EBP 51 | 52 | EIP 53 | 54 | R08 - R15 55 | 56 | ////////////////////////////////////////////////// 57 | // 2. MEMORY ///////////////////////////////////// 58 | ////////////////////////////////////////////////// 59 | 60 | Memory addresses are specified using brackets, in units of four bytes. Programs running within the virtual machine have their own 61 | address space, so no positive address within the address space is off limits. 62 | 63 | Unlike TinyVM, Specter does not use a part of the heap memory for the stack; they are separate containers. So there are no areas of memory that are off limits. 
64 | 65 | To specify the 256th word in the address space, you can use [256], [100|h], [0x100], or [100000000|b]. Any syntax that's 66 | valid when specifying a value is valid when specifying an address. 67 | 68 | ////////////////////////////////////////////////// 69 | // 3. LABELS ///////////////////////////////////// 70 | ////////////////////////////////////////////////// 71 | 72 | Labels are specified by appending a colon to an identifier. Local labels are not yet supported. 73 | 74 | Labels must be specified at the beginning of a line or on their own line. 75 | 76 | ////////////////////////////////////////////////// 77 | // 4. INSTRUCTION LISTING //////////////////////// 78 | ////////////////////////////////////////////////// 79 | 80 | Instructions listed are displayed in complete usage form, with example arguments, enclosed in square brackets. 81 | The square brackets are not to be used in actual TVM programs. 82 | 83 | // I. Memory // 84 | 85 | [mov arg0, arg1] 86 | Moves value specified from arg1 to arg0 87 | 88 | // II. Stack // 89 | 90 | [push arg] 91 | Pushes arg onto the stack 92 | 93 | [pop arg] 94 | Pops a value from the stack, storing it in arg 95 | 96 | [pushf] 97 | Pushes the FLAGS register to the stack 98 | 99 | [popf arg] 100 | Pops the flag register to arg 101 | 102 | // III. Calling Conventions // 103 | 104 | [call address] 105 | Push the current address to the stack and jump to the subroutine specified 106 | 107 | [ret] 108 | Pop the previous address from the stack to the instruction pointer to return control to the caller 109 | 110 | // IV. 
Arithmetic Operators // 111 | 112 | [inc arg] 113 | Increments arg 114 | 115 | [dec arg] 116 | Decrements arg 117 | 118 | [add arg0, arg1] 119 | Adds arg1 to arg0, storing the result in arg0 120 | 121 | [sub arg0, arg1] 122 | Subtracts arg1 from arg0, storing the result in arg0 123 | 124 | [mul arg0, arg1] 125 | Multiplies arg1 and arg0, storing the result in arg0 126 | 127 | [div arg0, arg1] 128 | Divides arg0 by arg1, storing the quotient in arg0 129 | 130 | [mod arg0, arg1] 131 | Same as the '%' (modulus) operator in C. Calculates arg0 mod arg1 and stores the result in the remainder register. 132 | 133 | [rem arg] 134 | Retrieves the value stored in the remainder register, storing it in arg 135 | 136 | // V. Binary Operators // 137 | 138 | [not arg] 139 | Calculates the binary NOT of arg, storing it in arg 140 | 141 | [xor arg0, arg1] 142 | Calculates the binary XOR of arg0 and arg1, storing the result in arg0 143 | 144 | [or arg0, arg1] 145 | Calculates the binary OR of arg0 and arg1, storing the result in arg0 146 | 147 | [and arg0, arg1] 148 | Calculates the binary AND of arg0 and arg1, storing the result in arg0 149 | 150 | [shl arg0, arg1] 151 | Shift arg0 left by arg1 places 152 | 153 | [shr arg0, arg1] 154 | Shifts arg0 right by arg1 places 155 | 156 | // VI. Comparison // 157 | 158 | [cmp arg0, arg1] 159 | Compares arg0 and arg1, storing the result in the FLAGS register 160 | 161 | // VII. Control Flow Manipulation // 162 | 163 | [jmp address] 164 | Jumps to an address or label 165 | 166 | [je address] 167 | Jump if equal 168 | 169 | [jne address] 170 | Jump if not equal 171 | 172 | [jg address] 173 | Jump if greater 174 | 175 | [jge address] 176 | Jump if equal or greater 177 | 178 | [jl address] 179 | Jump if lesser 180 | 181 | [jle address] 182 | Jump if lesser or equal 183 | 184 | // VIII. 
Input / Output // 185 | 186 | [prn arg] 187 | Print an integer 188 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | __author__ = "Cholerae Hu" 2 | -------------------------------------------------------------------------------- /codes.py: -------------------------------------------------------------------------------- 1 | from enum import Enum, IntEnum 2 | 3 | class opcode(Enum): 4 | _OP_END = 1 5 | _OP_NOP = 2 6 | _OP_INT = 3 7 | _OP_MOV = 4 8 | _OP_PUSH = 5 9 | _OP_POP = 6 10 | _OP_PUSHF = 7 11 | _OP_POPF = 8 12 | _OP_INC = 9 13 | _OP_DEC = 10 14 | _OP_ADD = 11 15 | _OP_SUB = 12 16 | _OP_MUL = 13 17 | _OP_DIV= 14 18 | _OP_MOD = 15 19 | _OP_REM = 16 20 | _OP_NOT = 17 21 | _OP_XOR = 18 22 | _OP_OR = 19 23 | _OP_AND = 20 24 | _OP_SHL = 21 25 | _OP_SHR = 22 26 | _OP_CMP = 23 27 | _OP_CALL = 24 28 | _OP_JMP = 25 29 | _OP_RET = 26 30 | _OP_JE = 27 31 | _OP_JNE = 28 32 | _OP_JG = 29 33 | _OP_JGE = 30 34 | _OP_JL = 31 35 | _OP_JLE = 32 36 | _OP_PRN = 33 37 | 38 | class register(IntEnum): 39 | _RG_EAX = 0 40 | _RG_EBX = 1 41 | _RG_ECX = 2 42 | _RG_EDX = 3 43 | _RG_ESI = 4 44 | _RG_EDI = 5 45 | _RG_ESP = 6 46 | _RG_EBP = 7 47 | _RG_EIP = 8 48 | _RG_R08 = 9 49 | _RG_R09 = 10 50 | _RG_R10 = 11 51 | _RG_R11 = 12 52 | _RG_R12 = 13 53 | _RG_R13 = 14 54 | _RG_R14 = 15 55 | _RG_R15 = 16 56 | 57 | regsMap = { 58 | "eax": register._RG_EAX, 59 | "ebx": register._RG_EBX, 60 | "ecx": register._RG_ECX, 61 | "edx": register._RG_EDX, 62 | "esi": register._RG_ESI, 63 | "edi": register._RG_EDI, 64 | "esp": register._RG_ESP, 65 | "ebp": register._RG_EBP, 66 | "eip": register._RG_EIP, 67 | "r08": register._RG_R08, 68 | "r09": register._RG_R09, 69 | "r10": register._RG_R10, 70 | "r11": register._RG_R11, 71 | "r12": register._RG_R12, 72 | "r13": register._RG_R13, 73 | "r14": register._RG_R14, 74 | "r15": register._RG_R15, 75 | } 76 | 77 | regset = 
set(regsMap.keys()) 78 | 79 | opsRev = { 80 | opcode._OP_NOP: "nop", 81 | opcode._OP_INT: "int", 82 | opcode._OP_MOV: "mov", 83 | opcode._OP_PUSH: "push", 84 | opcode._OP_POP: "pop", 85 | opcode._OP_PUSHF: "pushf", 86 | opcode._OP_POPF: "popf", 87 | opcode._OP_INC: "inc", 88 | opcode._OP_DEC: "dec", 89 | opcode._OP_ADD: "add", 90 | opcode._OP_SUB: "sub", 91 | opcode._OP_MUL: "mul", 92 | opcode._OP_DIV: "div", 93 | opcode._OP_MOD: "mod", 94 | opcode._OP_REM: "rem", 95 | opcode._OP_NOT: "not", 96 | opcode._OP_XOR: "xor", 97 | opcode._OP_OR: "or", 98 | opcode._OP_AND: "and", 99 | opcode._OP_SHL: "shl", 100 | opcode._OP_SHR: "shr", 101 | opcode._OP_CMP: "cmp", 102 | opcode._OP_CALL: "call", 103 | opcode._OP_JMP: "jmp", 104 | opcode._OP_RET: "ret", 105 | opcode._OP_JE: "je", 106 | opcode._OP_JNE: "jne", 107 | opcode._OP_JG: "jg", 108 | opcode._OP_JGE: "jge", 109 | opcode._OP_JL: "jl", 110 | opcode._OP_JLE: "jle", 111 | opcode._OP_PRN: "prn", 112 | } 113 | 114 | opsMap = { 115 | "nop": opcode._OP_NOP, 116 | "int": opcode._OP_INT, 117 | "mov": opcode._OP_MOV, 118 | "push": opcode._OP_PUSH, 119 | "pop": opcode._OP_POP, 120 | "pushf": opcode._OP_PUSHF, 121 | "popf": opcode._OP_POPF, 122 | "inc": opcode._OP_INC, 123 | "dec": opcode._OP_DEC, 124 | "add": opcode._OP_ADD, 125 | "sub": opcode._OP_SUB, 126 | "mul": opcode._OP_MUL, 127 | "div": opcode._OP_DIV, 128 | "mod": opcode._OP_MOD, 129 | "rem": opcode._OP_REM, 130 | "not": opcode._OP_NOT, 131 | "xor": opcode._OP_XOR, 132 | "or": opcode._OP_OR, 133 | "and": opcode._OP_AND, 134 | "shl": opcode._OP_SHL, 135 | "shr": opcode._OP_SHR, 136 | "cmp": opcode._OP_CMP, 137 | "call": opcode._OP_CALL, 138 | "jmp": opcode._OP_JMP, 139 | "ret": opcode._OP_RET, 140 | "je": opcode._OP_JE, 141 | "jne": opcode._OP_JNE, 142 | "jg": opcode._OP_JG, 143 | "jge": opcode._OP_JGE, 144 | "jl": opcode._OP_JL, 145 | "jle": opcode._OP_JLE, 146 | "prn": opcode._OP_PRN, 147 | } 148 | 149 | opset = set(opsMap.keys()) 150 | 151 | 152 | 153 | 
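Note on codes.py: `opsRev` and `opsMap` list the same mnemonic/opcode pairs in opposite directions, so the two tables have to be edited together. A minimal sketch of deriving the reverse table from the forward one follows. It uses a trimmed, hypothetical stand-in enum (`Opcode`, three members) and hypothetical names (`ops_map`, `ops_rev`) rather than the repository's full definitions.

```python
from enum import Enum


class Opcode(Enum):
    # Hypothetical, trimmed stand-in for the `opcode` enum in codes.py.
    NOP = 2
    MOV = 4
    PRN = 33


# Forward table: mnemonic -> opcode (same shape as opsMap).
ops_map = {
    "nop": Opcode.NOP,
    "mov": Opcode.MOV,
    "prn": Opcode.PRN,
}

# Reverse table built from the forward one (same shape as opsRev),
# so a new instruction only has to be added in one place.
ops_rev = {op: name for name, op in ops_map.items()}

# Set of known mnemonics, as used by the parser for membership tests.
opset = set(ops_map)
```

This only works while every opcode has exactly one mnemonic, which holds for the tables above.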
-------------------------------------------------------------------------------- /examples/euler1.vm: -------------------------------------------------------------------------------- 1 | start: 2 | mov esi, 1000000 3 | mov eax, 0 4 | 5 | L0: 6 | mov ebx, eax 7 | mod ebx, 3 8 | rem ebx 9 | cmp ebx, 0 10 | jne L1 11 | 12 | add edx, eax 13 | je check 14 | 15 | L1: 16 | mov ebx, eax 17 | mod ebx, 5 18 | rem ebx 19 | cmp ebx, 0 20 | jne check 21 | 22 | add edx, eax 23 | 24 | check: 25 | inc eax 26 | cmp eax, esi 27 | jl L0 28 | 29 | prn edx 30 | -------------------------------------------------------------------------------- /examples/euler2.vm: -------------------------------------------------------------------------------- 1 | 2 | 3 | start: 4 | mov eax, 1 5 | mov ebx, 0 6 | 7 | mov ecx, 0 # ECX is our sum 8 | 9 | loop: add eax, ebx 10 | add ebx, eax 11 | 12 | mov edx, eax 13 | and edx, 1 14 | cmp edx, 0 15 | jne L0 16 | add ecx, eax 17 | 18 | L0: 19 | mov edx, ebx 20 | and edx, 1 21 | cmp edx, 0 22 | jne L1 23 | add ecx, ebx 24 | 25 | L1: 26 | cmp eax, 4000000 27 | jg end 28 | 29 | cmp ebx, 4000000 30 | jl loop 31 | 32 | end: 33 | prn ecx 34 | -------------------------------------------------------------------------------- /examples/fib.vm: -------------------------------------------------------------------------------- 1 | start: 2 | mov eax, 1 3 | mov ebx, 0 4 | 5 | loop: add eax, ebx 6 | add ebx, eax 7 | 8 | prn eax 9 | prn ebx 10 | 11 | prn eax 12 | cmp eax, 0 13 | jl end 14 | 15 | cmp ebx, 0 16 | jg loop 17 | 18 | end: 19 | -------------------------------------------------------------------------------- /examples/primes.vm: -------------------------------------------------------------------------------- 1 | # Simplistic prime-finding algorithm 2 | 3 | start: mov eax, 2 # EAX is prime candidate 4 | 5 | checkPrime: mov ebx, 2 # EBX is factor candidate 6 | 7 | checkFactor: cmp eax, ebx 8 | je primeFound 9 | 10 | mod eax, ebx 11 | rem ecx 12 | cmp ecx, 0 13 | je 
nextPrime 14 | 15 | inc ebx 16 | jmp checkFactor 17 | 18 | primeFound: prn eax 19 | 20 | nextPrime: inc eax 21 | cmp eax, 50 22 | jl checkPrime 23 | 24 | -------------------------------------------------------------------------------- /exception.py: -------------------------------------------------------------------------------- 1 | class ParseException(BaseException): 2 | 3 | def __init__(self, value): 4 | self.value = value 5 | 6 | def __str__(self): 7 | return repr(self.value) 8 | 9 | class NotImplementedException(BaseException): 10 | pass -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | from __future__ import with_statement, absolute_import 2 | import argparse 3 | from vm import Vm 4 | 5 | 6 | def main(): 7 | parser = argparse.ArgumentParser(description="A Tiny VM written in Python") 8 | parser.add_argument('--file', dest="filepath", help='the source file') 9 | args = parser.parse_args() 10 | with open(args.filepath) as srcfile: 11 | instance = Vm(srcfile) 12 | instance.run() 13 | 14 | if __name__ == "__main__": 15 | main() 16 | -------------------------------------------------------------------------------- /memory.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import 2 | from utils import Int 3 | 4 | class Memory(object): 5 | 6 | def __init__(self, reg_count): 7 | self.FLAGS = 0 8 | self.remainder = 0 9 | self.registers = [Int(0) for i in range(reg_count)] 10 | self.heap = [] 11 | self.stack = [] 12 | 13 | def push_stack(self, i): 14 | self.stack.append(int(i)) 15 | 16 | def pop_stack(self): 17 | ret = self.stack.pop() 18 | return ret -------------------------------------------------------------------------------- /program.py: -------------------------------------------------------------------------------- 1 | class Program(object): 2 | 3 | def __init__(self): 4 
| self.start = 0 5 | self.instrs = [] 6 | self.labels = {} 7 | self.args = [] 8 | 9 | 10 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | ''' 2 | Python does not support passing int variables by reference, 3 | so int variables are wrapped in Int objects to implement registers. 4 | The 'other' parameters below are plain int values, not Int. 5 | ''' 6 | 7 | class Int(object): 8 | 9 | def __init__(self, value): 10 | self.value = value 11 | 12 | def __str__(self): 13 | return repr(self.value) 14 | 15 | def __repr__(self): 16 | return repr(self.value) 17 | 18 | def __lt__(self, other): 19 | return self.value < other 20 | 21 | def __le__(self, other): 22 | return self.value <= other 23 | 24 | def __gt__(self, other): 25 | return self.value > other 26 | 27 | def __ge__(self, other): 28 | return self.value >= other 29 | 30 | def __eq__(self, other): 31 | return self.value == other 32 | 33 | def __ne__(self, other): 34 | return self.value != other 35 | 36 | def __iadd__(self, other): 37 | self.value += other 38 | return self 39 | 40 | def __isub__(self, other): 41 | self.value -= other 42 | return self 43 | 44 | def __imul__(self, other): 45 | self.value *= other 46 | return self 47 | 48 | def __itruediv__(self, other): 49 | self.value //= other  # integer quotient; plain / would store a float 50 | return self 51 | 52 | def __mod__(self, other): 53 | return self.value % other 54 | 55 | def __iand__(self, other): 56 | self.value &= other 57 | return self 58 | 59 | def __ior__(self, other): 60 | self.value |= other 61 | return self 62 | 63 | def __ixor__(self, other): 64 | self.value ^= other 65 | return self 66 | 67 | def __irshift__(self, other): 68 | self.value >>= other 69 | return self 70 | 71 | def __ilshift__(self, other): 72 | self.value <<= other 73 | return self 74 | 75 | def __invert__(self): 76 | return ~self.value 77 | 78 | def __int__(self): 79 | return self.value 80 | 81 | def set(self, 
v): 82 | self.value = v 83 | -------------------------------------------------------------------------------- /vm.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function, absolute_import 2 | 3 | from memory import Memory 4 | from program import Program 5 | from codes import * 6 | from exception import ParseException, NotImplementedException 7 | from utils import Int 8 | 9 | class Vm(object): 10 | 11 | def __init__(self, srcfile): 12 | self.src = srcfile 13 | self.m = Memory(17) 14 | self.p = Program() 15 | 16 | def run(self): 17 | self.parse() 18 | i = Int(0) 19 | while self.p.instrs[int(i)] != opcode._OP_END: 20 | self.runInstruction(i) 21 | i += 1 22 | 23 | def runInstruction(self, instrIndex): 24 | instr = self.p.instrs[int(instrIndex)] 25 | a0, a1 = int(instrIndex)*2, (int(instrIndex)*2)+1 26 | if instr == opcode._OP_NOP: 27 | pass 28 | elif instr == opcode._OP_INT: 29 | raise NotImplementedException 30 | elif instr == opcode._OP_MOV: 31 | self.p.args[a0].set(int(self.p.args[a1])) 32 | elif instr == opcode._OP_PUSH: 33 | self.m.push_stack(self.p.args[a0]) 34 | elif instr == opcode._OP_POP: 35 | self.p.args[a0].set(self.m.pop_stack()) 36 | elif instr == opcode._OP_PUSHF: 37 | self.m.push_stack(self.m.FLAGS) 38 | elif instr == opcode._OP_POPF: 39 | self.m.FLAGS = self.m.pop_stack() 40 | elif instr == opcode._OP_INC: 41 | self.p.args[a0] += 1 42 | elif instr == opcode._OP_DEC: 43 | self.p.args[a0] -= 1 44 | elif instr == opcode._OP_ADD: 45 | self.p.args[a0] += int(self.p.args[a1]) 46 | elif instr == opcode._OP_SUB: 47 | self.p.args[a0] -= int(self.p.args[a1]) 48 | elif instr == opcode._OP_MUL: 49 | self.p.args[a0] *= int(self.p.args[a1]) 50 | elif instr == opcode._OP_DIV: 51 | self.p.args[a0] /= int(self.p.args[a1]) 52 | elif instr == opcode._OP_MOD: 53 | self.m.remainder = self.p.args[a0] % int(self.p.args[a1]) 54 | elif instr == opcode._OP_REM: 55 | self.p.args[a0].set(int(self.m.remainder)) 
56 | elif instr == opcode._OP_AND: 57 | self.p.args[a0] &= int(self.p.args[a1]) 58 | elif instr == opcode._OP_SHL: 59 | if self.p.args[a1] > 0: 60 | self.p.args[a0] <<= int(self.p.args[a1]) 61 | elif instr == opcode._OP_SHR: 62 | if self.p.args[a1] > 0: 63 | self.p.args[a0] >>= int(self.p.args[a1]) 64 | elif instr == opcode._OP_NOT: 65 | self.p.args[a0].set(~self.p.args[a0]) 66 | elif instr == opcode._OP_OR: 67 | self.p.args[a0] |= int(self.p.args[a1]) 68 | elif instr == opcode._OP_XOR: 69 | self.p.args[a0] ^= int(self.p.args[a1]) 70 | elif instr == opcode._OP_CMP: 71 | if self.p.args[a0] == int(self.p.args[a1]): 72 | self.m.FLAGS = 0x1 73 | elif self.p.args[a0] > int(self.p.args[a1]): 74 | self.m.FLAGS = 0x2 75 | else: 76 | self.m.FLAGS = 0x0 77 | elif instr == opcode._OP_CALL: 78 | self.m.push_stack(int(instrIndex)) 79 | instrIndex.set(int(self.p.args[a0])-1) 80 | elif instr == opcode._OP_JMP: 81 | instrIndex.set(int(self.p.args[a0])-1) 82 | elif instr == opcode._OP_RET: 83 | instrIndex.set(self.m.pop_stack()) 84 | elif instr == opcode._OP_JE: 85 | if self.m.FLAGS & 0x1 != 0: 86 | instrIndex.set(int(self.p.args[a0])-1) 87 | elif instr == opcode._OP_JNE: 88 | if self.m.FLAGS & 0x1 == 0: 89 | instrIndex.set(int(self.p.args[a0])-1) 90 | elif instr == opcode._OP_JG: 91 | if self.m.FLAGS & 0x2 != 0: 92 | instrIndex.set(int(self.p.args[a0])-1) 93 | elif instr == opcode._OP_JGE: 94 | if self.m.FLAGS & 0x3 != 0: 95 | instrIndex.set(int(self.p.args[a0])-1) 96 | elif instr == opcode._OP_JL: 97 | if self.m.FLAGS & 0x3 == 0: 98 | instrIndex.set(int(self.p.args[a0])-1) 99 | elif instr == opcode._OP_JLE: 100 | if self.m.FLAGS & 0x2 == 0: 101 | instrIndex.set(int(self.p.args[a0])-1) 102 | elif instr == opcode._OP_PRN: 103 | print(self.p.args[a0]) 104 | 105 | def parse_value(self, tok, instrIndex, argIndex): 106 | number = toValue(tok) 107 | self.p.args[(instrIndex * 2) + argIndex] = number 108 | return True 109 | 110 | def parse_address(self, tok, instrIndex, argIndex): 111 | if 
tok.startswith('['): 112 | i = toValue(tok) 113 | self.p.args[(instrIndex*2)+argIndex] = self.m.heap[i] 114 | return True 115 | return False 116 | 117 | def parse_register(self, tok, instrIndex, argIndex): 118 | if tok in regset: 119 | reg = regsMap[tok] 120 | self.p.args[(instrIndex*2)+argIndex] = self.m.registers[reg] 121 | return True 122 | return False 123 | 124 | def parse_instr(self, tok): 125 | if tok in opset: 126 | op = opsMap[tok] 127 | self.p.instrs.append(op) 128 | return True 129 | return False 130 | 131 | def parse_label_value(self, tok, instrIndex, argIndex): 132 | ret = True 133 | try: 134 | label = Int(self.p.labels[tok]) 135 | self.p.args[(instrIndex*2)+argIndex] = label 136 | except KeyError: 137 | ret = False 138 | return ret 139 | 140 | def parse_label_def(self, tok): 141 | if tok.endswith(':'): 142 | label = tok[:len(tok)-1] 143 | if label in regset: 144 | raise ParseException("register name {} cannot be used as a label".format(label)) 145 | if label in self.p.labels.keys(): 146 | raise ParseException("label name {} already exists".format(label)) 147 | self.p.labels[label] = len(self.p.instrs) 148 | return True 149 | return False 150 | 151 | def parse(self): 152 | lines = [] 153 | 154 | for line in self.src: 155 | toks = parse_line(line) 156 | lines.append(toks) 157 | hasInstr = False 158 | 159 | for tok in toks: 160 | if tok.startswith('#'): 161 | break 162 | if self.parse_label_def(tok): 163 | if hasInstr: 164 | raise ParseException("cannot define label " + tok + " after an instruction in the same line") 165 | continue 166 | if self.parse_instr(tok): 167 | hasInstr = True 168 | continue 169 | 170 | self.p.args = [Int(0) for i in range(len(self.p.instrs) * 2)] 171 | 172 | instrIndex = -1 173 | for toks in lines: 174 | hasInstr = False 175 | argIndex = 0 176 | for tok in toks: 177 | if tok.startswith('#'): 178 | break 179 | if tok.endswith(':'): 180 | continue 181 | if tok in opset: 182 | instrIndex += 1 183 | hasInstr = True 184 | continue 
185 | if not hasInstr: 186 | raise ParseException("found argument token " + tok + " without instruction") 187 | if self.parse_register(tok, instrIndex, argIndex) or \ 188 | self.parse_label_value(tok, instrIndex, argIndex) or \ 189 | self.parse_address(tok, instrIndex, argIndex): 190 | argIndex += 1 191 | continue 192 | if self.parse_value(tok, instrIndex, argIndex): 193 | argIndex += 1 194 | 195 | self.p.instrs.append(opcode._OP_END) 196 | 197 | 198 | def parse_line(line): 199 | tokens = [] 200 | line = line.strip() 201 | for i in line.split(' '): 202 | tokens.extend(i.split(',')) 203 | tokens = list(filter(lambda x: x != "", tokens)) 204 | tokens = [x.lower() for x in tokens] 205 | return tokens 206 | 207 | def toValue(tok): 208 | sepIndex = tok.find('|') 209 | base = 10 210 | val = tok 211 | if sepIndex > 0 and sepIndex < len(tok)-1: 212 | val = tok[:sepIndex] 213 | baseFlag = tok[sepIndex+1:] 214 | base = { 215 | 'h': 16, 216 | 'd': 10, 217 | 'o': 8, 218 | 'b': 2, 219 | }[baseFlag] 220 | elif len(tok) >= 3 and tok.startswith('0') and not tok[1].isdigit(): 221 | val = tok[2:] 222 | baseFlag = tok[1] 223 | base = { 224 | 'x': 16, 'h': 16,  # 'x' accepts the "0x" prefix described in SYNTAX 225 | 'd': 10, 226 | 'o': 8, 227 | 'b': 2, 228 | }[baseFlag] 229 | i = parse_int(val, base) 230 | return Int(i) 231 | 232 | def parse_int(s, base): 233 | if not is_valid_number(s, base): 234 | raise ValueError("{0} is not a valid number in base {1}".format(s, base)) 235 | return int(s, base=base) 236 | 237 | def is_valid_number(s, base): 238 | for i in s: 239 | flag = False 240 | if i >= '0' and i <= '9': 241 | i = ord(i) - ord('0') 242 | flag = True 243 | elif i >= 'a' and i <= 'f': 244 | i = ord(i) - ord('a') + 10 245 | flag = True 246 | if not flag: 247 | return False 248 | if i >= base:  # a digit must be strictly less than the base 249 | return False 250 | return True 251 | 252 | 253 | 254 | 255 | --------------------------------------------------------------------------------
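Note on value parsing: SYNTAX describes four ways to write a literal (plain decimal, a "0x" prefix, a `|h`, `|d`, `|o` or `|b` base identifier, and any of those inside brackets for an address). The sketch below illustrates those rules in one standalone function. `parse_literal` and `SUFFIX_BASES` are hypothetical names that do not exist in the repository, and stripping the brackets here is an assumption about how address tokens could be handled, not a description of what `parse_address` does.

```python
# Hypothetical helper, not part of the repository: turns a literal written
# in any of the forms described in SYNTAX into a plain int.
SUFFIX_BASES = {"h": 16, "d": 10, "o": 8, "b": 2}


def parse_literal(tok):
    tok = tok.strip().lower()

    # Assumption: an address such as "[0x100]" is a literal wrapped in brackets.
    if tok.startswith("[") and tok.endswith("]"):
        tok = tok[1:-1]

    # Base identifier form: digits, '|', then one of h/d/o/b.
    if "|" in tok:
        digits, flag = tok.split("|", 1)
        if flag not in SUFFIX_BASES:
            raise ValueError("unknown base identifier: " + flag)
        return int(digits, SUFFIX_BASES[flag])

    # Prefix form: "0x" means hexadecimal.
    if tok.startswith("0x"):
        return int(tok[2:], 16)

    # No base specifier: decimal.
    return int(tok, 10)


# The four spellings of 32 listed in SYNTAX.
thirty_two = [parse_literal(t) for t in ("32", "0x20", "20|h", "40|o", "100000|b")]

# The four spellings of the 256th word listed in SYNTAX.
word_256 = [parse_literal(t) for t in ("[256]", "[100|h]", "[0x100]", "[100000000|b]")]
```

Relying on the built-in `int(s, base)` for digit validation keeps the sketch short; it raises `ValueError` for digits that do not fit the base, which is the same error type `parse_int` raises above.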