├── .gitignore
├── LICENSE
├── Makefile
├── README
├── docs
│   └── BYTECODE
├── programs
│   ├── helloworld.c
│   ├── loop.c
│   └── math.c
└── src
    ├── opcodes.h
    ├── vm.c
    └── vm.h


/.gitignore:
--------------------------------------------------------------------------------
.DS_Store


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
CC = gcc
CFLAGS = -pedantic -Wall -ansi

all:
	mkdir -p programs/bin
	$(CC) $(CFLAGS) src/vm.c programs/helloworld.c -o programs/bin/helloworld
	$(CC) $(CFLAGS) src/vm.c programs/math.c -o programs/bin/math
	$(CC) $(CFLAGS) src/vm.c programs/loop.c -o programs/bin/loop

run:
	./programs/bin/helloworld
	./programs/bin/math
	./programs/bin/loop

clean:
	rm -rf programs/bin/


--------------------------------------------------------------------------------
/README:
--------------------------------------------------------------------------------
SmallVM is just that: a small virtual machine with a focus on educating others about virtual machines.

It is written in ANSI C and is fairly well documented.

Features:
* Portable - Builds on almost any platform with an ANSI C compiler
* Full instruction set - Includes mathematical instructions, memory and register modifiers, jumps, and conditionals
* Documented - LOTS of comments
* Easy to modify - Changing the number of registers or the memory size is easy
* Under the MIT license - Please read the file LICENSE in this directory

How to build:

    git clone git@github.com:andyrothtech/SmallVM.git
    cd SmallVM
    make

    # The sample programs in the programs directory have now been compiled into the programs/bin directory.


--------------------------------------------------------------------------------
/docs/BYTECODE:
--------------------------------------------------------------------------------
This is the format of the bytecode used by SmallVM.

A typical hexadecimal instruction looks like this:

    0x01002008

Let's pick this apart...

Starting from the beginning, we can decode this instruction:

    01  is the opcode. Opcodes are always one byte, or 2 hex digits.
    002 is the "A" argument. Arguments are always 12 bits, or 3 hex digits.
    008 is the "B" argument. Arguments are always 12 bits, or 3 hex digits.

So now, we can tell that 01 is being called with the arguments 002 and 008.

As we can see in opcodes.h, the value 0x01, or 01, is the opcode for the SET
instruction. Now we know that SET is being called on two values, 002 and 008.

The arguments take a little more work decoding...

Any value below 0x05 means the contents of a register. Register values start at
0x00 and end at 0x04. 0x00 is A, 0x01 is B, and so on until register E, which
is 0x04. This means that the first argument, 002, means the value in
register C.

WARNING: Do not mix up arguments A and B with registers A and B

However, any value above or equal to 0x05 is a literal number. How do we
represent 0 then? Easy. 0x05 is subtracted from any literal to get the number
it represents, so 0x05 means the number 0 and 0x06 means the number 1. This
means that argument B's value is 8 - 5, which is 3.

So the fully decoded instruction is now...

    Set the contents in register C to the value 3.

Or in assembly...

    SET C 3

That's it! Easy once you get the hang of it.


EXTRA:

How does the code do all the decoding?

If we represent the hex instruction as bits, we get a format like this:

    ooooooooaaaaaaaaaaaabbbbbbbbbbbb

(o = opcode, a = argument a, and b = argument b)

To get the opcode we must count the number of bits to the right of the last o,
that is, all the a and b bits. If you are lazy, I will do it for you. The
answer is 24 bits.

We must do something called a bit shift, which moves every bit a certain
number of places in a specified direction and discards the bits that fall off
the end. We want to move them right. If "word" is defined as the hex
instruction above, this would be accomplished with this:

    opcode = word >> 24

There is also a process called masking, which is a bitwise AND: each bit of the
result is 1 only if both input bits are 1. It looks like this:

    ooooooooaaaaaaaaaaaabbbbbbbbbbbb
    00000000000000000000111111111111

This gives us only b. The same could be done with shifts alone, but the next
example becomes easier if this is done with a mask.

The value of 111111111111 is 4095 (0xFFF), so a mask would look like this, also
assuming that "word" is defined as the hex instruction above.

    b = word & 4095

The two processes can be mixed together to get the value of a. The first step is
to remove the b argument by shifting, then remove the opcode by masking.
Let's do that shift now.

    ooooooooaaaaaaaaaaaabbbbbbbbbbbb

We are shifting 12 bits to the right, so that would look like this:

    a = word >> 12

or, in bit form:

    ooooooooaaaaaaaaaaaa

Now to remove the opcode, this can be done by masking:

    ooooooooaaaaaaaaaaaa
    00000000111111111111

so our code to get the a argument now looks like this:

    a = (word >> 12) & 4095

The parentheses are required to make sure the shift and mask happen in the right
order.

So all our instruction decoding code looks like this now:

    opcode = word >> 24
    b = word & 4095
    a = (word >> 12) & 4095

which can be found in the vm_execute function inside of vm.c

Thank you,
AndyRoth


--------------------------------------------------------------------------------
/programs/helloworld.c:
--------------------------------------------------------------------------------
#include "../src/vm.h"

int main(void)
{
    word code[] = {
        0x01000007, /* SET A 2 */
        0x01001007, /* SET B 2 */
        0x02000001, /* ADD A B */
    };
    /* Expected output: A = 4, B = 2, C = 0, D = 0, E = 0 */

    /* Create a new virtual machine */
    vm_state *state = vm_new();

    vm_load(state, code, 3); /* Load the code into memory */
    vm_run(state); /* Run all code in memory */

    vm_close(state); /* Free up the VM and close */

    return EXIT_SUCCESS;
}


--------------------------------------------------------------------------------
/programs/loop.c:
--------------------------------------------------------------------------------
#include "../src/vm.h"

int main(void)
{
    word code[] = {
        0x01000006, /* SET A 1 */
        0x01001006, /* SET B 1 */
        0x02000001, /* ADD A B */
        0x0B000009, /* IFN A 4 */
        0x09007000, /* JMP 2 */
    };
    /* Expected output: A = 4, B = 1, C = 0, D = 0, E = 0 */

    /* Create a new virtual machine */
    vm_state *state = vm_new();

    vm_load(state, code, 5); /* Load the code into memory */
    vm_run(state); /* Run all code in memory */

    vm_close(state); /* Free up the VM and close */

    return EXIT_SUCCESS;
}


--------------------------------------------------------------------------------
/programs/math.c:
--------------------------------------------------------------------------------
#include "../src/vm.h"

int main(void)
{
    word code[] = {
        0x01000009, /* SET A 4 */
        0x01001007, /* SET B 2 */
        0x02000001, /* ADD A B */
        0x01002006, /* SET C 1 */
        0x03000002, /* SUB A C */
        0x04001000 /* MULT B A */
    };
    /* Expected output: A = 5, B = 10 (printed in hex as 0xa), C = 1, D = 0, E = 0 */

    /* Create a new virtual machine */
    vm_state *state = vm_new();

    vm_load(state, code, 6); /* Load the code into memory */
    vm_run(state); /* Run all code in memory */

    vm_close(state); /* Free up the VM and close */

    return EXIT_SUCCESS;
}


--------------------------------------------------------------------------------
/src/opcodes.h:
--------------------------------------------------------------------------------
#ifndef OPCODES_H
#define OPCODES_H

#define OP_SET 0x01 /* Sets a register to a value or to the contents of another register */
#define OP_ADD 0x02 /* Adds two values or register contents */
#define OP_SUB 0x03 /* Subtracts two values or register contents */
#define OP_MULT 0x04 /* Multiplies two values or register contents */
#define OP_DIV 0x05 /* Divides two values or register contents */
#define OP_MOD 0x06 /* Mods two values or register contents */
#define OP_STORE 0x07 /* Stores a value into memory */
#define OP_GET 0x08 /* Get a value from memory */
#define OP_JMP 0x09 /* Jump to another location in memory */
#define OP_IF 0x0A /* Performs the next instruction if the values are equal */
#define OP_IFN 0x0B /* Performs the next instruction if the values are not equal */

#endif /* OPCODES_H */

--------------------------------------------------------------------------------
/src/vm.c:
--------------------------------------------------------------------------------
#include "vm.h"

vm_state *vm_new(void)
{
    /* Allocate space for vm */
    vm_state *state = malloc(sizeof(vm_state));

    /* The struct must exist before its fields can be touched */
    if (!state)
    {
        fprintf(stderr, "ERROR: Not enough free memory\n");
        exit(EXIT_FAILURE);
    }

    /* Allocate space for the registers and the RAM */
    state->registers = calloc(REGISTER_COUNT, sizeof(word));
    state->memory = calloc(MEMORY_WORD_COUNT, sizeof(word));
    state->pc = 0;

    /* Initialize anything else the virtual machine struct needs */
    state->usedMemory = 0;

    /* Make sure everything is allocated. vm_error is not used here because
       it would dump the contents of a half-built state. */
    if (!state->registers || !state->memory)
    {
        fprintf(stderr, "ERROR: Not enough free memory\n");
        free(state->registers); /* free(NULL) is a no-op */
        free(state->memory);
        free(state);
        exit(EXIT_FAILURE);
    }

    return state;
}

void vm_close(vm_state *state)
{
    int i;

    /* Print out the registers */

    printf("\nRegisters:\n");

    for (i = 0; i < REGISTER_COUNT; i++)
        printf("%c: %#x\n", 65 + i, state->registers[i]);

    /* Print out the contents of memory */

    printf("\nMemory:\n");

    for (i = 0; i < MEMORY_WORD_COUNT; i++)
    {
        word *current = &state->memory[i]; /* Get the word from memory */
        printf("%#x ", *current); /* Print it out in hex format */
    }

    printf("\n");

    /* Free any pointers in the struct */
    free(state->registers);
    free(state->memory);

    /* Free the memory taken up by the struct */
    free(state);
}

void vm_error(vm_state *state, char *message, ...)
{
    va_list args;

    fprintf(stderr, "ERROR: ");

    /* Print out formatted string and arguments */
    va_start(args, message);
    vfprintf(stderr, message, args);
    va_end(args);

    fprintf(stderr, "\n");

    /* Close the vm, if there is one */
    if (state)
        vm_close(state);

    /* Exit with error code */
    exit(EXIT_FAILURE);
}

/* Get a pointer to the value of either a register or a literal */
word *get_value(vm_state *state, word aWord, word *sink)
{
    if (aWord < REGISTER_COUNT) /* A register value */
    {
        return &state->registers[aWord]; /* Return a pointer to the register */
    }
    else /* A literal value */
    {
        *sink = aWord - REGISTER_COUNT; /* Fix offset and return the value */
        return sink;
    }
}

void vm_execute(vm_state *state, word *instruction)
{
    /* Opcode is the word right shifted */
    word opcode = *instruction >> 24;
    /* A argument is the word right shifted and masked to 12 bits (0xFFF = 4095) */
    word a = (*instruction >> 12) & 0xFFF;
    /* B argument is the word masked to 12 bits */
    word b = *instruction & 0xFFF;

    word *valA; /* Declared here to satisfy ANSI C requirements */
    word *valB; /* Declared here to satisfy ANSI C requirements */

    word sinkA;
    word sinkB;

    /* Get the values from the arguments */
    valA = get_value(state, a, &sinkA);
    valB = get_value(state, b, &sinkB);

    switch (opcode)
    {
        case OP_SET:
        {
            *valA = *valB;

            printf("SET %#x %#x\n", a, b);

            break;
        }
        case OP_ADD:
        {
            *valA = *valA + *valB;

            printf("ADD %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_SUB:
        {
            *valA = *valA - *valB;

            printf("SUB %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_MULT:
        {
            *valA = *valA * *valB;

            printf("MULT %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_DIV:
        {
            *valA = *valA / *valB;

            printf("DIV %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_MOD:
        {
            *valA = *valA % *valB;

            printf("MOD %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_STORE:
        {
            /* Set the memory contents to argument B's value */
            state->memory[*valA + state->usedMemory] = *valB;

            printf("STORE %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_GET:
        {
            /* Set argument A's value to the memory contents */
            *valA = state->memory[*valB + state->usedMemory];

            printf("GET %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_JMP:
        {
            /* Set PC to a new location */
            state->pc = *valA - 1; /* Subtract 1 to counteract pc increment */

            printf("JMP %#x\n", *valA);

            break;
        }
        case OP_IF:
        {
            if (*valA != *valB) /* Check if the values are different */
                state->pc++; /* Skip the next instruction */

            printf("IF %#x %#x\n", *valA, *valB);

            break;
        }
        case OP_IFN:
        {
            if (*valA == *valB) /* Check if the values are equal */
                state->pc++; /* Skip the next instruction */

            printf("IFN %#x %#x\n", *valA, *valB);

            break;
        }
        default:
        {
            vm_error(state, "Unknown opcode: %#x", opcode);

            break;
        }
    }
}

void vm_load(vm_state *state, word *instrs, int count)
{
    int i;

    for (i = 0; i < count; i++)
    {
        state->memory[i] = instrs[i]; /* Load the program into memory */
        state->usedMemory++; /* Count how much of the memory is taken up */
    }
}

void vm_run(vm_state *state)
{
    /* Run through every word in memory */
    while ((state->pc) < (state->usedMemory))
    {
        vm_execute(state, &state->memory[state->pc]); /* Execute the word */

        state->pc++; /* Increment the program counter */
    }
}

--------------------------------------------------------------------------------
/src/vm.h:
--------------------------------------------------------------------------------
#ifndef VM_H
#define VM_H

#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <stdint.h> /* int32_t */

#include "opcodes.h"

#define REGISTER_COUNT 5 /* The number of registers */
#define MEMORY_WORD_COUNT 20 /* The number of memory words */

/* A word is just a 32 bit integer */
typedef int32_t word;

/* State struct */
typedef struct
{
    word *registers; /* CPU registers */
    word *memory; /* RAM */
    word pc; /* Program counter */
    int usedMemory; /* A counter for the amount of memory used */
} vm_state;

vm_state *vm_new(void); /* Creates a new virtual machine state */
void vm_close(vm_state *state); /* Close a virtual machine instance */

void vm_execute(vm_state *state, word *instruction); /* Executes an instruction */
void vm_load(vm_state *state, word *instrs, int count); /* Loads instructions into memory */
void vm_run(vm_state *state); /* Runs from the beginning of memory */
void vm_error(vm_state *state, char *message, ...); /* Cleans up after an error */

#endif /* VM_H */

--------------------------------------------------------------------------------