├── startup ├── cleanup ├── tessconfig ├── alpine_passwd ├── capture_the_flag ├── .gitignore ├── line_numbers ├── with_chroot ├── reset_container ├── find_sender ├── hylafax_is_broken ├── prep_alpine_chroot ├── service_queue ├── capture_the_flag_inner ├── alpine_group ├── CCode ├── LICENCE ├── config.ttyACM0 └── README.md /startup: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | source /etc/profile 3 | sh -c "$@" 4 | -------------------------------------------------------------------------------- /cleanup: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | rm -rf alpine_root sbin 3 | rm -rf apk-tools-static* 4 | -------------------------------------------------------------------------------- /tessconfig: -------------------------------------------------------------------------------- 1 | load_system_dawg false 2 | load_freq_dawg false 3 | tessedit_write_images true 4 | -------------------------------------------------------------------------------- /alpine_passwd: -------------------------------------------------------------------------------- 1 | root:x:0:0:root:/root:/sbin/nologin 2 | low:x:500:100:low:/work_dir:/bin/ash 3 | -------------------------------------------------------------------------------- /capture_the_flag: -------------------------------------------------------------------------------- 1 | TODO 2 | 3 | Make this file contain something unique on the real host machine 4 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .PKGINFO 2 | alpine_root 3 | *.apk 4 | apk-tools-static* 5 | *.rsa.pub 6 | sbin 7 | outputs 8 | -------------------------------------------------------------------------------- /line_numbers: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env
python3 2 | import sys 3 | fname = sys.argv[1] 4 | lines = open(fname).read().split('\n') 5 | n = len(lines) 6 | width = len(str(n)) 7 | 8 | for i, line in enumerate(lines): 9 | print(f"{i+1: >{width}} {line}") 10 | -------------------------------------------------------------------------------- /with_chroot: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | ROOT=$PWD/alpine_root 4 | MAPUID=$(id -u $OUTERUSER) 5 | MAPGID=$(id -g $OUTERUSER) 6 | unshare -imnpuUCT --map-root --map-users 1100:$MAPUID:1 --map-groups 1100:$MAPGID:1 -R alpine_root -S 1100 --wd /workdir /workdir/startup "$@" 7 | -------------------------------------------------------------------------------- /reset_container: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | AROOT=alpine_root 3 | rm -rf "$AROOT/workdir" 4 | mkdir "$AROOT/workdir" 5 | cp startup "$AROOT/workdir" 6 | cp capture_the_flag_inner "$AROOT/workdir/capture_the_flag" 7 | chown -R root:root "$AROOT" 8 | chown $OUTERUSER:$OUTERUSER "$AROOT/workdir" 9 | -------------------------------------------------------------------------------- /find_sender: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import sys 4 | import re 5 | 6 | with open(sys.argv[1]) as f: 7 | text = f.read(); 8 | for line in text.upper().split('\n'): 9 | m = re.search("REPLY *= *([0-9 ]+)#", line) 10 | if m: 11 | print(m.group(1)) 12 | break 13 | -------------------------------------------------------------------------------- /hylafax_is_broken: -------------------------------------------------------------------------------- 1 | Some problems that I came across when using hylafax on raspbian: 2 | 3 | during setup, using faxaddmodem: ondelay is not able to contact the modem, and hangs 4 | fix option 1: rename ondelay so that faxaddmodem can't find it and uses the builtin code instead 
5 | fix option 2: modify faxaddmodem so that it doesn't even try to use ondelay 6 | fix option 3: manually construct the config file for the modem 7 | 8 | faxrecvd looks for config in /var/spool/hylafax/etc, but it is bind-mounted to /etc/hylafax when the faxgetty service starts 9 | fix: modify the faxrecvd file to look in /etc/hylafax instead 10 | 11 | 12 | -------------------------------------------------------------------------------- /prep_alpine_chroot: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | 3 | set -e 4 | 5 | if [ $# -ne 1 ]; then 6 | echo "Usage: $0 <arch>" 7 | exit 1 8 | fi 9 | 10 | if [ "x$OUTERUSER" == "x" ] ; then 11 | echo "Please set OUTERUSER to the host-side low privileged user name" 12 | exit 1 13 | fi 14 | MIRROR='https://uk.alpinelinux.org/alpine/latest-stable/main' 15 | ARCH=$1 16 | APKPKG='apk-tools-static-2.14.4-r0.apk' 17 | 18 | AROOT=alpine_root 19 | mkdir "$AROOT" 20 | curl -LO "$MIRROR/$ARCH/$APKPKG" 21 | tar -xzf "$APKPKG" 22 | ./sbin/apk.static -X "$MIRROR" -U --allow-untrusted -p "$AROOT" --initdb add alpine-base build-base || true 23 | cp alpine_passwd "$AROOT/etc/passwd" 24 | cp alpine_group "$AROOT/etc/group" 25 | 26 | -------------------------------------------------------------------------------- /service_queue: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | 3 | import os 4 | import sys 5 | import time 6 | import subprocess 7 | d = sys.argv[1] 8 | 9 | def is_in_use(p): 10 | return subprocess.run(['fuser', p]).returncode == 0 11 | 12 | while True: 13 | all_files = [f for f in sorted(os.listdir(d)) if f != 'seqf'] 14 | if len(all_files) > 0: 15 | f = all_files[0] 16 | print(f"Process file {f}") 17 | fullpath = os.path.join(d, f) 18 | while is_in_use(fullpath): 19 | print("waiting for file to close") 20 | time.sleep(1) 21 | subprocess.run(['./build_and_run', fullpath]) 22 | os.unlink(fullpath) 23 | continue
24 | time.sleep(5) 25 | 26 | -------------------------------------------------------------------------------- /capture_the_flag_inner: -------------------------------------------------------------------------------- 1 | Reading this file is step 1 2 | 3 | If you can tell me (Lex) the contents of the file called capture_the_flag from outside this container then you win! 4 | 5 | I'm not sure what you win. I don't have a prize for you. But I would like to know how to secure my container better. 6 | 7 | To save you some time: 8 | - the container runs a near-bare-minimum Alpine Linux userland 9 | - the host is a Raspberry Pi running Raspbian 10 | - the container is running programs as a low-privilege user, not root 11 | - the container is constructed using `unshare` to make a whole bunch of empty kernel namespaces 12 | - the host _is_ connected to the internet, but the only user that's allowed to log in can only do so with an ssh key 13 | - the only other connection to the host is via fax 14 | -------------------------------------------------------------------------------- /alpine_group: -------------------------------------------------------------------------------- 1 | root:x:0:root 2 | bin:x:1:root,bin,daemon 3 | daemon:x:2:root,bin,daemon 4 | sys:x:3:root,bin,adm 5 | adm:x:4:root,adm,daemon 6 | tty:x:5: 7 | disk:x:6:root,adm 8 | lp:x:7:lp 9 | mem:x:8: 10 | kmem:x:9: 11 | wheel:x:10:root 12 | floppy:x:11:root 13 | mail:x:12:mail 14 | news:x:13:news 15 | uucp:x:14:uucp 16 | man:x:15:man 17 | cron:x:16:cron 18 | console:x:17: 19 | audio:x:18: 20 | cdrom:x:19: 21 | dialout:x:20:root 22 | ftp:x:21: 23 | sshd:x:22: 24 | input:x:23: 25 | at:x:25:at 26 | tape:x:26:root 27 | video:x:27:root 28 | netdev:x:28: 29 | readproc:x:30: 30 | squid:x:31:squid 31 | xfs:x:33:xfs 32 | kvm:x:34:kvm 33 | games:x:35: 34 | shadow:x:42: 35 | cdrw:x:80: 36 | www-data:x:82: 37 | usb:x:85: 38 | vpopmail:x:89: 39 | users:x:100:games 40 | ntp:x:123: 41 | nofiles:x:200: 42 | smmsp:x:209:smmsp 43 |
locate:x:245: 44 | abuild:x:300: 45 | utmp:x:406: 46 | low:x:500:low 47 | ping:x:999: 48 | nogroup:x:65533: 49 | nobody:x:65534: 50 | -------------------------------------------------------------------------------- /CCode: -------------------------------------------------------------------------------- 1 | { 2 | } 3 | [ 4 | ] 5 | ( 6 | ) 7 | - 8 | = 9 | + 10 | < 11 | > 12 | # 13 | ! 14 | " 15 | ' 16 | % 17 | ^ 18 | & 19 | * 20 | 0 21 | 1 22 | 2 23 | 3 24 | 4 25 | 5 26 | 6 27 | 7 28 | 8 29 | 9 30 | #include 31 | volatile 32 | char 33 | int 34 | long 35 | unsigned 36 | static 37 | auto 38 | uint8_t 39 | uint16_t 40 | uint32_t 41 | uint64_t 42 | int8_t 43 | int16_t 44 | int32_t 45 | int64_t 46 | for 47 | while 48 | void 49 | switch 50 | case 51 | break 52 | do 53 | if 54 | else 55 | return 56 | main 57 | stdio 58 | math 59 | a 60 | b 61 | c 62 | d 63 | e 64 | f 65 | g 66 | h 67 | i 68 | j 69 | k 70 | l 71 | m 72 | n 73 | o 74 | p 75 | q 76 | r 77 | s 78 | t 79 | u 80 | v 81 | w 82 | x 83 | y 84 | z 85 | REPLY 86 | reply 87 | reply= 88 | REPLY= 89 | // 90 | /* 91 | */ 92 | printf 93 | open 94 | close 95 | read 96 | write 97 | fopen 98 | fclose 99 | fwrite 100 | fread 101 | putc 102 | getc 103 | putchar 104 | getchar 105 | sin 106 | cos 107 | foo 108 | bar 109 | asm 110 | -------------------------------------------------------------------------------- /LICENCE: -------------------------------------------------------------------------------- 1 | Copyright (c) Lex Bailey, All rights reserved. 2 | 3 | Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 4 | 5 | Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 
6 | Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 7 | This software must not be used as training data for Artificial Inteligence (AI) systems, Large Language Models (LLM), or any other Machine Learning (ML) applications. 8 | 9 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 10 | -------------------------------------------------------------------------------- /config.ttyACM0: -------------------------------------------------------------------------------- 1 | # $Id$ 2 | 3 | # 4 | # Configuration for a Rockwell/Conexant K56 Class 1.0 modem using 5 | # the RCV56DPF and similar chipsets. These chipsets are generally 6 | # flash- upgradable and appear on ISA as well as PCI. They 7 | # support "Class 1.0" which means that they have "adaptive receive" 8 | # (AT+FAR=1), but not V.34. 
9 | # 10 | # Comtrol RocketModem II/III/IV 11 | # MultiTech MT5600ZDX 12 | # Zoltrix FMVSP56i3 13 | # 14 | # 15 | CountryCode: 0 16 | AreaCode: 0 17 | FAXNumber: 8 18 | LongDistancePrefix: 0 19 | InternationalPrefix: 0 20 | DialStringRules: etc/dialrules 21 | ServerTracing: 1 22 | SessionTracing: 11 23 | RecvFileMode: 0600 24 | LogFileMode: 0600 25 | DeviceMode: 0600 26 | RingsBeforeAnswer: 1 27 | SpeakerVolume: off 28 | GettyArgs: "-h %l dx_%s" 29 | LocalIdentifier: CompilerFax 30 | TagLineFont: etc/lutRS18.pcf 31 | TagLineFormat: "From %%l|%c|Page %%P of %%T" 32 | MaxRecvPages: 25 33 | # 34 | # 35 | # Modem-related stuff: should reflect modem command interface 36 | # and hardware connection/cabling (e.g. flow control). 37 | # 38 | ModemType: Class1.0 # use this to supply a hint 39 | ModemRate: 19200 # rate for DCE-DTE communication 40 | ModemFlowControl: rtscts # default 41 | # 42 | # With the RocketModem IV (and possibly III) you may need to use a much 43 | # higher ModemRate than 19200 because, apparently, it suffers from potential 44 | # buffer underrun problems. Fortunately, it doesn't have buffer overflow 45 | # issues, and so using 115200 should work (both rtscts and xonxoff test good). 
46 | # 47 | ModemNoFlowCmd: AT&K0 # setup no flow control 48 | ModemHardFlowCmd: AT&K3 # setup hardware flow control 49 | ModemSoftFlowCmd: AT&K4 # setup software flow control 50 | ModemSetupDTRCmd: AT&D2 # setup so DTR drop resets modem 51 | ModemSetupDCDCmd: AT&C1 # setup so DCD reflects carrier (or not) 52 | # 53 | Class1AdaptRecvCmd: AT+FAR=1 # reports carrier detection mismatches 54 | ModemDialCmd: ATX3DP%s 55 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CompilerFax 2 | 3 | You fax it some C code, it compiles it, runs it, faxes back the result 4 | 5 | A quick demo vid: https://youtu.be/pJ-25-pRhpY 6 | 7 | ## If you send me your code 8 | 9 | Just a quick note for anyone sending faxes to any CompilerFax instance that I (Lex Bailey) am running (probably at EMF Camp). If you send me a fax then you are granting me a licence to share the contents of that fax for information/education/entertainment purposes. (I'd love to collect some of the best faxes, and best attempts to break out of the container, and share them as a fun look at how people used this service.) Thanks :) 10 | 11 | ## Tips for users 12 | 13 | The OCR is far from perfect. You need to use a font that is good for OCR. I have had success using Calibri. FreeMono is not too bad, but tends to be a little less predictable. 14 | 15 | Some of the worst problems you will encounter are some characters being misread: 16 | 17 | * O -> 0 18 | * i -> 1 19 | * x -> X 20 | 21 | There are others, of course. This is not a complete list. In my testing I have been avoiding using "i" and "x" as variable names. It seems to be hard for the OCR to read "x" in the correct case. It often wants to upper-case it for some reason. Similarly, avoid "i" if you can. Even with good programming fonts the OCR still struggles. 22 | 23 | Put spaces around things...
24 | 25 | `a+=1;` tends to be seen by the OCR program as a _word_ of sorts, rather than four separate things. Try using `a += 1 ;` 26 | 27 | I also found that `i++` was often misread as `itt`. Adding a space in there (`i ++`) sometimes helped, but the best thing to do was to use `i += 1`, and to avoid the variable name `i` entirely if possible, as mentioned above. 28 | 29 | The OCR program is not trained specifically on code. It is mostly trained on prose, and so using full words will likely produce better results. `int total = 0;` is better than `int t = 0;` 30 | 31 | Most importantly, DO NOT FORGET: you must have the `REPLY = #` text in your program, or you won't get your results faxed back to you. (See the usage section below.) The spaces are optional, but having the spaces helps the OCR. 32 | 33 | If you're really having trouble with some text, then you could try getting creative with the program so that bad OCR has less of an effect. 34 | 35 | For example, consider this program: 36 | 37 | ``` 38 | #include <stdio.h> 39 | 40 | int main ( ) { 41 | printf ( "Hello World\n" ) ; 42 | int i = 1; 43 | while ( i < 11 ) { 44 | i += 1 ; 45 | printf ( "%d\n" , i - 1 ) ; 46 | } 47 | } 48 | ``` 49 | 50 | It has quite a lot of instances of the letter i by itself, and the number 1 by itself or as part of the number 11. There are 9 instances of `i` or `1`. That is 9 chances for the OCR to fail. 51 | 52 | The obvious first thing to do is to rename `i` to something else, like `a`. This would get us down to only having to worry about the 5 instances of `1`, which could each be misread. That's 5 chances for the OCR to fail. But you can also remove all of the 1 digits...
53 | 54 | The program below does exactly what the one above does, but has none of the likely-misread characters: 55 | 56 | ``` 57 | #include <stdio.h> 58 | #define one ( 4 - 3 ) 59 | #define eleven ( 3 + 4 + 4 ) 60 | 61 | int main ( ) { 62 | printf ( "Hello World\n" ) ; 63 | int a = one ; 64 | while ( a < eleven ) { 65 | a += one ; 66 | printf ( "%d\n" , a - one ) ; 67 | } 68 | } 69 | ``` 70 | 71 | The digits 3 and 4 are fairly distinct in most fonts, so they are much less likely to be misread. 72 | 73 | ## Hardware 74 | 75 | - Raspberry Pi running Raspbian Trixie (older versions have a bug in unshare that prevents the container from mapping users correctly) 76 | - Fax modem (I used a StarTech USB modem) 77 | 78 | You don't have to use a raspberry pi. You can use some other system, as long as it can run hylafax and talk to a modem. 79 | 80 | ## Setup 81 | 82 | Sorry, these instructions are not very detailed right now, maybe I'll go into more detail when I'm in less of a rush, but... 83 | 84 | 1. install hylafax on the raspberry pi, and some other deps: `sudo apt install hylafax-server clang-format tesseract-ocr` 85 | 2. configure hylafax for your modem 86 | 3. (optional) swear at hylafax for all the trouble that it gave you in steps 1 and 2 (see `hylafax_is_broken` for details of bugs you might encounter) 87 | 4. clone this repo on to the ras pi (or your system of choice) 88 | 5. create a config file (a sourceme that sets OUTERUSER to `pi`, or whatever username you chose for the pi) and source it. It should also set FAXID to something like "CompilerFax@1234" where 1234 is your phone number 89 | 6. run `sudo ./prep_alpine_chroot aarch64` (or whichever Alpine architecture matches your system; the script requires the architecture as its one argument) 90 | 7. set up the job queue program to run on boot (copy the systemd unit into the systemd config, enable it, start it) TODO add systemd unit file to repo 91 | 92 | ## Usage 93 | 94 | 1. Get a sheet of paper with some C code on it.
Leave your reply phone number in the source code with the text "REPLY = 1234 #" where 1234 is replaced with your phone number. 95 | 2. Fax the C code to CompilerFax 96 | 3. wait for a response from CompilerFax 97 | 98 | ## Troubleshooting 99 | 100 | If you don't get a fax back, maybe your REPLY=# line was not detected correctly, or you used the wrong number 101 | 102 | Alternatively it could be hylafax being bad 103 | 104 | You can read the logs with journalctl 105 | 106 | sudo journalctl -u compilerfax 107 | sudo journalctl -u hylafax 108 | sudo journalctl -u faxq 109 | sudo journalctl -u hfaxd 110 | 111 | You can inspect the logs from the compiler service (see the servicelog directory inside the main working directory, which is the root of the repo) 112 | 113 | You can also inspect the fax queue in /var/spool/hylafax/recvq, which will normally be empty, but might have jobs stuck in it if the compiler service is not working correctly 114 | 115 | ## Architecture 116 | 117 | Hylafax does most of the heavy lifting. It handles all of the modem communication and the incoming and outgoing fax queues. 118 | 119 | The programs that make the compiler work are: 120 | 121 | `service_queue` - reads from the incoming fax queue, launches `build_and_run` for each incoming fax 122 | 123 | `build_and_run` - runs once per fax. This process must complete for each incoming fax before the next fax starts to be processed. It does the OCR on the incoming .tif file, generates a report, and adds it to the outgoing fax queue with `sendfax` 124 | 125 | `build_document` - is called by `build_and_run` to generate a postscript file, which can then be turned into a pdf to fax as a reply 126 | 127 | `reset_container` - is called by `build_and_run` before each compilation and run job to reset the container to the base state (delete any files created by the last job) 128 | 129 | `with_chroot` - this is the main wrapper around the alpine linux container.
It does the `unshare` call to run a contained program in a bunch of empty namespaces. The user OUTERUSER is mapped to the `low` user in the container. root is also mapped, and the program runs as low. This prevents the contained program from modifying most of the container and stops the program from being able to escape the container (hopefully) 130 | 131 | `line_numbers` - adds line numbers to the program listing (for building the report). Fairly boring program is this one. 132 | 133 | Other programs, not used in normal operation of the service, but useful for maintenance: 134 | 135 | `cleanup` - deletes the alpine linux root, and the copy of apk that installed it 136 | 137 | `prep_alpine_chroot` - creates the alpine linux root using apk 138 | 139 | Normally you want to run `./cleanup` and then `./prep_alpine_chroot aarch64` if there's been a significant update to the container system. If in doubt then just run it, it can't do any harm 140 | --------------------------------------------------------------------------------
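A note on the REPLY detection mentioned under Troubleshooting: the sketch below reuses the regular expression from `find_sender` to show roughly which lines would and would not be picked up. The helper name `find_reply_number` is hypothetical, and stripping the spaces out of the number is an assumption made for readability (`find_sender` itself prints the captured text unchanged). Treat it as an illustration, not as part of the service.

```python
import re

# Same pattern that find_sender applies to each upper-cased line of OCR text.
REPLY_PATTERN = re.compile(r"REPLY *= *([0-9 ]+)#")


def find_reply_number(ocr_text):
    """Return the digits of the first REPLY line, or None if none is found.

    Hypothetical helper for illustration only. Removing the spaces is an
    assumption; find_sender prints the captured group as-is.
    """
    for line in ocr_text.upper().split("\n"):
        match = REPLY_PATTERN.search(line)
        if match:
            return match.group(1).replace(" ", "")
    return None


if __name__ == "__main__":
    # Lower case is fine (the text is upper-cased first), and spaces
    # between the digits are tolerated by the pattern.
    print(find_reply_number("// reply = 01234 567890 #"))
    # No trailing '#', so nothing is detected.
    print(find_reply_number("// REPLY = 1234"))
    # A zero misread as the letter O breaks the match.
    print(find_reply_number("// REPLY = O1234 #"))
```

The last two cases are the likely reasons for not getting a fax back: the `#` terminator went missing, or the OCR turned a digit into a letter.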