├── .gitignore
├── .gitattributes
├── calc.nf
├── README.md
└── nibbleforth.py

/.gitignore:
--------------------------------------------------------------------------------
*.pyc

--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
* text=auto

--------------------------------------------------------------------------------
/calc.nf:
--------------------------------------------------------------------------------
\ simple expression parser and calculator
\
\ expression: ['+'|'-'] term ['+'|'-' term]*
\ term: factor ['*'|'/' factor]*
\ factor: '(' expression ')' | number
\ number: digit [digit]*
\
\ originally from: http://blog.brush.co.nz/2007/11/recursive-decent/

variable _c

: c _c @ ;

: next ( -- )
    key _c ! c emit ;

: digit? ( c -- flag )
    [char] 0 - 10 u< ;

: number ( -- n )
    c digit? 0= abort" digit expected"
    0 begin
        10 * c [char] 0 - + next
        c digit? 0=
    until ;

: factor ( -- n )
    c [char] ( = if
        next expression c [char] ) <> abort" ) expected" next
    else
        number
    then ;

: term ( -- n )
    factor
    begin
        c [char] * = dup c [char] / = or
    while
        next factor swap if * else / then
    repeat drop ;

: expression ( -- n )
    c [char] - = dup c [char] + = or if next then
    term swap if negate then
    begin
        c [char] + = dup c [char] - = or
    while
        next term swap if + else - then
    repeat drop ;

: calc ( -- )
    next expression cr .
;

calc

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
NibbleForth - A very compact stack machine (Forth) bytecode
===========================================================

This is just an idea at this point, and I don't have time to work on it
further right now. Posting my notes on GitHub for the record.

We'd been struggling with code size issues on one of our projects. We're using
a microcontroller with 32KB of flash (program memory), only 15KB of which we
have allocated to code. So it was pretty tight, and gcc, even with `-Os`,
wasn't producing very compact code.

So I started thinking about the smallest possible instruction set. I'm pretty
familiar with
[Forth](http://en.wikipedia.org/wiki/Forth_%28programming_language%29) and
stack-based virtual machines, so that's where my thoughts went. My basic
ideas were:

Use **variable-length instruction opcodes**, and assign the most frequently-used
opcodes the lowest numbers so they can be encoded in the smallest
instructions. Kind of like
[UTF-8](http://en.wikipedia.org/wiki/UTF-8#Description), or the [base 128
varints](https://developers.google.com/protocol-buffers/docs/encoding#varints)
used in Google protocol buffers -- but using nibbles instead of bytes.

Taken to the extreme, this is [Huffman
coding](http://en.wikipedia.org/wiki/Huffman_coding), which uses a variable
number of *bits* to encode each symbol, with the most frequently-used symbols
getting the shortest bit codes. However, I suspect Huffman decoding would be
too slow for an embedded virtual machine.
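
The variable-length opcode idea can be sketched in a few lines of Python 3.
These notes don't pin down an exact bit layout, so the one below is an
assumption: each nibble carries 3 payload bits, and its high bit means "another
nibble follows", by analogy with base 128 varints. The names `encode_opcode`,
`decode_opcode` and `pack` are hypothetical and not part of `nibbleforth.py`.

```python
def encode_opcode(n):
    """Encode a non-negative opcode number as a list of nibbles (0-15).

    Assumed layout: the low 3 bits of each nibble are payload and bit 3 is a
    continuation flag. The least-significant group comes first.
    """
    if n < 0:
        raise ValueError("opcode numbers must be non-negative")
    nibbles = []
    while True:
        group = n & 0x7
        n >>= 3
        if n:
            nibbles.append(group | 0x8)  # more nibbles follow
        else:
            nibbles.append(group)        # last nibble of this opcode
            return nibbles


def decode_opcode(nibbles, pos=0):
    """Decode one opcode starting at nibbles[pos]; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        nib = nibbles[pos]
        pos += 1
        value |= (nib & 0x7) << shift
        shift += 3
        if not nib & 0x8:
            return value, pos


def pack(nibbles):
    """Pack nibbles two per byte, high nibble first, padding with a 0 nibble."""
    padded = nibbles + [0] * (len(nibbles) % 2)
    return bytes((hi << 4) | lo for hi, lo in zip(padded[0::2], padded[1::2]))
```

Under this assumed layout the eight most frequent words cost one nibble each and
the next 56 cost two, which lines up with the "4 bits, then 8 bits" split
described in these notes. One wrinkle is that the padding nibble decodes as
opcode 0, so a real encoder would need to track length or reserve a padding
value.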

My hunch was that the most common instructions are used *way* more often than
the rest, meaning that encoding the most common opcodes in 4 bits and the
slightly less common ones in 8 bits would be a huge gain.

And my hunch was correct -- I analyzed a bunch of Forth programs that come
with [Gforth](http://bernd-paysan.de/gforth.html) using `nibbleforth.py`, and
`exit` is by far the most common instruction in most of them, with `jz` and
`jmp` often close behind; the rest varied from program to program.

Perhaps even more important is to use **Forth-like [token
threading](http://en.wikipedia.org/wiki/Threaded_code#Token_threading)** on top
of this, so it's not just primitive opcodes that can be encoded small, but any
user-defined word too. So instruction 0 might be "return", instruction 1 might
be "jump-if-zero", instruction 2 might be "user-function-1", etc. And there'd
be a tiny VM interpreter that looked up these numbers in a table (of 16-bit
pointers) to get their actual address.

And your compiler would do this frequency tokenization globally on each
program, so for each program you compiled you'd get the best results for the
instructions/words it used.

On top of that, you could **combine common sequences of instructions** into
their own words (i.e., calls to a function). This is pretty much what
dictionary-based compression algorithms like LZW do -- in fact, you might use
the greedy [LZW algorithm](http://en.wikipedia.org/wiki/LZW) to find them.

C compilers do [common subexpression
elimination](http://en.wikipedia.org/wiki/Common_subexpression_elimination),
but it's only ever done within a single function, and we could do it globally,
making it much more powerful and compressive.
You'd have to be careful and use
a few heuristics so you didn't actually make it bigger by factoring too much,
or factor so much that it was too slow.

Note that Forth programmers factor into tiny words in any case, so this may
not gain as much for folks who already program in a heavily-factored style
with tiny words/functions. Have you ever considered that when programmers
factor things into functions, they're basically running a [dictionary
compression](http://en.wikipedia.org/wiki/Dictionary_coder) algorithm
manually?

Also you could **inline any Forth "words" that were only used once**, as it
wouldn't help code size to have them as separate words. C compilers do this,
but only on a file-local basis.

In fact, that's a common pattern with C compilers -- they can only optimize
local to a function, or at most, local to a file. The linker can remove unused
functions, but it can't really do any further optimization.

In any case, it would be a fun project to work on at some stage.
:-)

References
----------

* [Improving Code Density Using Compression Techniques](http://researcher.watson.ibm.com/researcher/files/us-lefurgy/micro30.net.compress.pdf)
  by Lefurgy, Bird, Chen, Mudge -- this one has two similarities to my idea:
  compressing into nibbles, and rolling common sequences of instructions into a
  function call
* [Generation of Fast Interpreters for Huffman Compressed Bytecode](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.156.2546&rep=rep1&type=pdf)
  by Latendresse and Feeley
* [Anton Ertl's papers on Forth interpreters](http://www.complang.tuwien.ac.at/projects/interpreters.html)
  -- I haven't read this stuff, but some of it looks relevant and interesting

--------------------------------------------------------------------------------
/nibbleforth.py:
--------------------------------------------------------------------------------
"""Nibbleforth - the world's most compact stack machine (Forth) bytecode

This is just some source code I started to test some ideas. Don't have time to
work on it further at this point. Posting this on GitHub for the record.

Currently this source code only parses and interprets a very limited subset of
Forth, and is able to print out the frequencies of each word used, including
jump, conditional jump, and literal instructions.

See README.md for more details about how this would actually work.

"""

import collections
import msvcrt
import operator
import os
import re
import sys

stack = []
def push(x): stack.append(x)
def pop(): return stack.pop()

mem = [0] * 100
def fetch(): push(mem[pop()])
def store(): a = pop(); mem[a] = pop()

def dup(): x = pop(); push(x); push(x)
def swap(): x = pop(); y = pop(); push(x); push(y)
def drop(): pop()

def abortq():
    s = pop()
    if pop():
        print >>sys.stderr, s
        sys.exit(1)
def key():
    c = ord(msvcrt.getch())
    if c == 27:
        print >>sys.stderr, 'exiting'
        sys.exit(1)
    push(c)
def emit(): sys.stdout.write(chr(pop()))
def cr(): sys.stdout.write('\n')
def dot(): sys.stdout.write(str(pop()))

def plus(): push(pop() + pop())
def minus(): n = pop(); push(pop() - n)
def star(): push(pop() * pop())
def slash(): n = pop(); push(pop() // n)
def or_(): push(pop() | pop())
def negate(): push(-pop())

def zeroequals(): push(pop() == 0)
def uless(): n = pop(); push(0 <= pop() < n)
def equals(): push(pop() == pop())
def notequals(): push(pop() != pop())

primitives = {
    '@': fetch,
    '!': store,
    'dup': dup,
    'swap': swap,
    'drop': drop,
    'abort"': abortq,
    'key': key,
    'emit': emit,
    'cr': cr,
    '.': dot,
    '+': plus,
    '-': minus,
    '*': star,
    '/': slash,
    'or': or_,
    'negate': negate,
    '0=': zeroequals,
    'u<': uless,
    '=': equals,
    '<>': notequals,
}

def run(wordlist, program):
    pc = 0
    while True:
        op = program[pc]
        pc += 1
        if op == 'exit':
            return
        elif op == 'jz':
            offset = pop()
            if pop() == 0:
                pc += offset
            continue
        elif op == 'jmp':
            offset = pop()
            pc += offset
            continue

        if isinstance(op, int):
            push(op)
        elif isinstance(op, str) and op.startswith('__s"'):
            push(op[4:-1])
        elif op in wordlist:
            run(wordlist, wordlist[op])
        elif op in primitives:
            primitives[op]()
        else:
            raise Exception('unknown op: {0!r}'.format(op))

class CompileError(Exception):
    def __init__(self, msg, filename, line_num):
        self.msg = msg
        self.filename = os.path.split(filename)[1]
        self.line_num = line_num

    def __str__(self):
        return '{0}:{1}: {2}'.format(self.filename, self.line_num, self.msg)

class Compiler(object):
    word_re = re.compile(r'(\s+)')

    def __init__(self, filename):
        self.filename = filename
        self.compiling = False
        self.line_num = 1
        self.definition_name = None
        self.definition = []
        self.wordlist = {}
        self.noname_num = 0
        self.control_stack = []
        self.here = 0

    def parse(self):
        with open(self.filename) as f:
            for line in f:
                self.parse_line(line)
                self.line_num += 1

    def run(self, word):
        run(self.wordlist, self.wordlist[word])

    @classmethod
    def parse_int(cls, s, base=10):
        try:
            return int(s, base)
        except ValueError:
            return None

    def parse_line(self, line):
        words = self.word_re.split(line)
        self.it = iter(words)
        for word in self.it:
            if not word or word.isspace():
                continue
            word = word.lower()
            if word in self.immediates:
                self.immediates[word](self)
            elif self.compiling:
                if word.startswith('$'):
                    int_value = self.parse_int(word[1:], base=16)
                else:
                    int_value = self.parse_int(word)
                word = int_value if int_value is not None else word
                self.compile(word)

    def get_frequencies(self):
        freqs = collections.defaultdict(int)
        for definition in self.wordlist.itervalues():
            for word in definition:
                freqs[word] += 1
        return sorted(freqs.iteritems(), key=operator.itemgetter(1, 0))

    def get_word(self):
        self.it.next()  # eat whitespace "token"
        return self.it.next()

    def get_string(self, delim='"'):
        pieces = []
        space = self.it.next()[1:]  # skip one space
        if space:
            pieces.append(space)
        while True:
            piece = self.it.next()
            delim_pos = piece.find(delim)
            if delim_pos >= 0:
                last_piece = piece[:delim_pos]
                if last_piece:
                    pieces.append(last_piece)
                break
            pieces.append(piece)
        return ''.join(pieces)

    def error(self, msg):
        return CompileError(msg, self.filename, self.line_num)

    def compile(self, word):
        self.definition.append(word)

    def backslash(self):
        for word in self.it:
            pass

    def paren(self):
        for word in self.it:
            if word == ')':
                break

    def colon(self, name=None):
        if self.compiling:
            raise self.error("can't use ':' when already in a colon definition")
        self.compiling = True
        self.definition_name = self.get_word() if name is None else name
        self.definition = []

    def colon_noname(self):
        self.noname_num += 1
        name = 'noname{0}'.format(self.noname_num)
        self.colon(name=name)

    def semicolon(self):
        if not self.compiling:
            raise self.error("can't use ';' outside of a colon definition")
        if self.control_stack:
            raise self.error('control structure mismatch')
        self.compiling = False
        self.compile('exit')
        self.wordlist[self.definition_name] = self.definition
        print ':', self.definition_name
        print ' ', ' '.join(repr(x) for x in self.definition)

    def left_bracket(self):
        if not self.compiling:
            raise self.error("can't use '[' outside of a colon definition")
        self.compiling = False

    def right_bracket(self):
        if self.compiling:
            raise self.error("can't use ']' when compiling")
        self.compiling = True

    def bracket_tick(self):
        name = self.get_word()
        self.compile('&' + name)

    def jump_forward(self, opcode):
        self.compile(None)
        self.compile(opcode)
        self.control_stack.append(('forward', len(self.definition)))

    def resolve_forward(self, stack_index=0):
        if not self.control_stack:
            raise self.error('control structure mismatch')
        direction, offset = self.control_stack.pop(len(self.control_stack) - 1 - stack_index)
        if direction != 'forward':
            raise self.error('control structure mismatch')
        delta = len(self.definition) - offset
        self.definition[offset - 2] = delta

    def mark_reverse(self):
        self.control_stack.append(('reverse', len(self.definition)))

    def resolve_reverse(self, opcode, stack_index=0):
        if not self.control_stack:
            raise self.error('control structure mismatch')
        direction, offset = self.control_stack.pop(len(self.control_stack) - 1 - stack_index)
        if direction != 'reverse':
            raise self.error('control structure mismatch')
        delta = offset - len(self.definition) - 2
        self.compile(delta)
        self.compile(opcode)

    def if_(self):
        self.jump_forward('jz')

    def else_(self):
        self.jump_forward('jmp')
        self.resolve_forward(1)

    def then(self):
        self.resolve_forward()

    def begin(self):
        self.mark_reverse()

    def while_(self):
        if not self.control_stack:
            raise self.error('control structure mismatch')
        self.jump_forward('jz')
        top = self.control_stack.pop()
        second = self.control_stack.pop()
        self.control_stack.append(top)
        self.control_stack.append(second)

    def repeat(self):
        self.resolve_reverse('jmp')
        self.resolve_forward()

    def until(self):
        self.resolve_reverse('jz')

    def again(self):
        self.resolve_reverse('jmp')

    def s_quote(self):
        if not self.compiling:
            return
        s = self.get_string()
        self.compile('__s"{0}"'.format(s))

    def abort_quote(self):
        self.s_quote()
        self.compile('abort"')

    def bracket_char(self):
        ch = self.get_word()
        self.compile(ord(ch))

    def postpone(self):
        self.compile('&' + self.get_word())
        self.compile('compile')

    def variable(self):
        name = self.get_word()
        address = self.here
        def var_address(self):
            self.compile(address)
        self.immediates[name] = var_address
        self.here += 1

    immediates = {
        '\\': backslash,
        '\\g': backslash,
        '(': paren,
        ':': colon,
        ':noname': colon_noname,
        ';': semicolon,
        '[': left_bracket,
        ']': right_bracket,
        "[']": bracket_tick,
        'if': if_,
        'else': else_,
        'then': then,
        'endif': then,
        'begin': begin,
        'while': while_,
        'repeat': repeat,
        'until': until,
        'again': again,
        's"': s_quote,
        'abort"': abort_quote,
        '[char]': bracket_char,
        'postpone': postpone,
        'variable': variable,
    }

if __name__ == '__main__':
    compiler = Compiler(sys.argv[1])
    compiler.parse()

    for word, freq in compiler.get_frequencies():
        if freq == 1:
            continue
        print word, freq
    print '-' * 80

    compiler.run('calc')

--------------------------------------------------------------------------------
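
As a follow-on to the frequency listing that `nibbleforth.py` prints, here is a
minimal Python 3 sketch of how a compiler might turn those counts into token
numbers, with the most frequently used words getting the smallest numbers (and
so the shortest encodings). The helpers `assign_tokens` and `tokenize` and the
two-word `sample` program are hypothetical illustrations, not part of the
repository.

```python
import collections


def assign_tokens(definitions):
    """Map each word to a token number, most frequently used first.

    `definitions` maps a word name to the list of words in its body, in the
    same shape as Compiler.wordlist. Ties are broken by the string form of
    the word so the numbering is deterministic.
    """
    freqs = collections.Counter()
    for body in definitions.values():
        freqs.update(body)
    ranked = sorted(freqs, key=lambda w: (-freqs[w], str(w)))
    return {word: token for token, word in enumerate(ranked)}


def tokenize(definitions, tokens):
    """Rewrite every definition body as a list of token numbers."""
    return {name: [tokens[w] for w in body]
            for name, body in definitions.items()}


# Hypothetical two-word program, for illustration only.
sample = {
    'c': ['_c', '@', 'exit'],
    'next': ['key', '_c', '!', 'c', 'emit', 'exit'],
}
tokens = assign_tokens(sample)
bodies = tokenize(sample, tokens)
```

A token-threaded VM would then keep a table indexed by these numbers that
points at each word's code, as the README describes. Building that table, and
deciding which words are worth factoring out or inlining, is left out of this
sketch.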