├── .gitignore ├── LICENSE ├── Makefile ├── README.md ├── bestline.c ├── bestline.h ├── bin ├── footprint.png ├── sectorlisp.bin ├── sectorlisp.gif └── yodawg.png ├── lisp.c ├── lisp.lisp ├── sectorlisp.S ├── sectorlisp.lds └── test ├── .gitignore ├── Makefile ├── README.md ├── eval10.lisp ├── eval15.lisp ├── qemu.sh ├── tcat.c ├── test1.lisp └── test2.lisp /.gitignore: -------------------------------------------------------------------------------- 1 | *.o 2 | *.bin 3 | *.bin.dbg 4 | *.com.dbg 5 | *.elf 6 | .aarch64 7 | /lisp 8 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright 2020 Justine Alexandra Roberts Tunney 2 | Copyright 2021 Alain Greppin 3 | 4 | Permission to use, copy, modify, and/or distribute this software for 5 | any purpose with or without fee is hereby granted, provided that the 6 | above copyright notice and this permission notice appear in all copies. 7 | 8 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 9 | WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED 10 | WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE 11 | AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL 12 | DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR 13 | PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 14 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 15 | PERFORMANCE OF THIS SOFTWARE. 
16 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | CFLAGS = -std=gnu89 -w -O 2 | 3 | CLEANFILES = \ 4 | lisp \ 5 | lisp.o \ 6 | bestline.o \ 7 | sectorlisp.o \ 8 | sectorlisp.bin \ 9 | sectorlisp.bin.dbg 10 | 11 | .PHONY: all 12 | all: lisp \ 13 | sectorlisp.bin \ 14 | sectorlisp.bin.dbg 15 | 16 | .PHONY: clean 17 | clean:; $(RM) lisp lisp.o bestline.o sectorlisp.o sectorlisp.bin sectorlisp.bin.dbg 18 | 19 | lisp: lisp.o bestline.o 20 | lisp.o: lisp.c bestline.h 21 | bestline.o: bestline.c bestline.h 22 | 23 | sectorlisp.o: sectorlisp.S 24 | $(AS) -g -o $@ $< 25 | 26 | sectorlisp.bin.dbg: sectorlisp.o sectorlisp.lds 27 | $(LD) -T sectorlisp.lds -o $@ $< 28 | 29 | sectorlisp.bin: sectorlisp.bin.dbg 30 | objcopy -S -O binary sectorlisp.bin.dbg sectorlisp.bin 31 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # sectorlisp 2 | 3 | sectorlisp is a 512-byte implementation of LISP that's able to bootstrap 4 | John McCarthy's meta-circular evaluator on bare metal. 5 | 6 | ![Yo dawg, I heard you like LISP so I put a LISP in your LISP so you can eval while you eval](bin/yodawg.png) 7 | 8 | ## Overview 9 | 10 | LISP has been described as the [Maxwell's equations of 11 | software](https://michaelnielsen.org/ddi/lisp-as-the-maxwells-equations-of-software/). 12 | Yet there's been very little focus to date on reducing these equations 13 | to their simplest possible form. Even the [original LISP 14 | paper](https://people.cs.umass.edu/~emery/classes/cmpsci691st/readings/PL/LISP.pdf) 15 | from the 1960's defines LISP with nonessential elements, e.g. `LABEL`. 16 | 17 | This project aims to solve that by doing three things: 18 | 19 | 1. 
We provide a LISP implementation that's written in LISP, as a single 20 | pure expression, using only the essential functions of the language. 21 | See [lisp.lisp](lisp.lisp). It's the same meta-circular evaluator in 22 | John McCarthy's paper from the 1960's, except with its bugs fixed, 23 | dependencies included, and syntactic sugar removed. 24 | 25 | 2. We provide a readable portable C reference implementation to show how 26 | the meta-circular evaluator can be natively bootstrapped on POSIX 27 | conforming platforms, with a pleasant readline-like interface. See 28 | [lisp.c](lisp.c). 29 | 30 | 3. We provide a 512-byte i8086 implementation of LISP that boots from 31 | BIOS on personal computers. See [sectorlisp.S](sectorlisp.S). To the 32 | best of our knowledge, this is the tiniest true LISP implementation 33 | to date. 34 | 35 |

36 | Binary Footprint Comparison 37 |

38 | 39 | ## Getting Started 40 | 41 | See [lisp.lisp](lisp.lisp) for code examples that you can copy and paste 42 | into your LISP REPL. 43 | 44 | You can run the C implementation as follows: 45 | 46 | ```sh 47 | $ make 48 | $ ./lisp 49 | ``` 50 | 51 | After running `make` you should see a `sectorlisp.bin` file, which is a 52 | master boot record you can put on a floppy disk and boot from BIOS. If 53 | you would prefer to run it in an emulator, we recommend using 54 | [Das Blinkenlights](https://justine.lol/blinkenlights/). 55 | 56 | ```sh 57 | curl --compressed https://justine.lol/blinkenlights/blinkenlights-latest.com >blinkenlights.com 58 | chmod +x blinkenlights.com 59 | ./blinkenlights.com -rt sectorlisp.bin 60 | ``` 61 | 62 | Alternatively you may use QEMU as follows: 63 | 64 | ```sh 65 | qemu-system-i386 -nographic -fda sectorlisp.bin 66 | ``` 67 | 68 | Further information may be found on [our wiki](https://github.com/jart/sectorlisp/wiki). 69 | 70 | ## Demo 71 | 72 |

73 | 74 | booting sectorlisp in emulator 76 |

77 | 78 | The video above demonstrates how to boot sectorlisp in the blinkenlights 79 | emulator, to bootstrap the meta-circular evaluator, which evaluates a 80 | program for finding the first element in a tree. 81 | 82 | You can [watch the full demo on YouTube](https://youtu.be/hvTHZ6E0Abo). 83 | -------------------------------------------------------------------------------- /bestline.c: -------------------------------------------------------------------------------- 1 | /*-*- mode:c;indent-tabs-mode:nil;c-basic-offset:4;tab-width:8;coding:utf-8 -*-│ 2 | │ vi: set et ft=c ts=4 sts=4 sw=4 fenc=utf-8 :vi │ 3 | ╞══════════════════════════════════════════════════════════════════════════════╡ 4 | │ │ 5 | │ Bestline ── Library for interactive pseudoteletypewriter command │ 6 | │ sessions using ANSI Standard X3.64 control sequences │ 7 | │ │ 8 | │ OVERVIEW │ 9 | │ │ 10 | │ Bestline is a fork of linenoise (a popular readline alternative) │ 11 | │ that fixes its bugs and adds the missing features while reducing │ 12 | │ binary footprint (surprisingly) by removing bloated dependencies │ 13 | │ which means you can finally have a permissively-licensed command │ 14 | │ prompt w/ a 30kb footprint that's nearly as good as gnu readline │ 15 | │ │ 16 | │ EXAMPLE │ 17 | │ │ 18 | │ main() { │ 19 | │ char *line; │ 20 | │ while ((line = bestlineWithHistory("IN> ", "foo"))) { │ 21 | │ fputs("OUT> ", stdout); │ 22 | │ fputs(line, stdout); │ 23 | │ fputs("\n", stdout); │ 24 | │ free(line); │ 25 | │ } │ 26 | │ } │ 27 | │ │ 28 | │ CHANGES │ 29 | │ │ 30 | │ - Remove bell │ 31 | │ - Add kill ring │ 32 | │ - Fix flickering │ 33 | │ - Add UTF-8 editing │ 34 | │ - Add CTRL-R search │ 35 | │ - Support unlimited lines │ 36 | │ - Add parentheses awareness │ 37 | │ - React to terminal resizing │ 38 | │ - Don't generate .data section │ 39 | │ - Support terminal flow control │ 40 | │ - Make history loading 10x faster │ 41 | │ - Make multiline mode the only mode │ 42 | │ - Accommodate O_NONBLOCK file 
descriptors │ 43 | │ - Restore raw mode on process foregrounding │ 44 | │ - Make source code compatible with C++ compilers │ 45 | │ - Fix corruption issues by using generalized parsing │ 46 | │ - Implement nearly all GNU readline editing shortcuts │ 47 | │ - Remove heavyweight dependencies like printf/sprintf │ 48 | │ - Remove ISIG→^C→EAGAIN hack and use ephemeral handlers │ 49 | │ - Support running on Windows in MinTTY or CMD.EXE on Win10+ │ 50 | │ - Support diacritics, русский, Ελληνικά, 漢字, 仮名, 한글 │ 51 | │ │ 52 | │ SHORTCUTS │ 53 | │ │ 54 | │ CTRL-E END │ 55 | │ CTRL-A START │ 56 | │ CTRL-B BACK │ 57 | │ CTRL-F FORWARD │ 58 | │ CTRL-L CLEAR │ 59 | │ CTRL-H BACKSPACE │ 60 | │ CTRL-D DELETE │ 61 | │ CTRL-Y YANK │ 62 | │ CTRL-D EOF (IF EMPTY) │ 63 | │ CTRL-N NEXT HISTORY │ 64 | │ CTRL-P PREVIOUS HISTORY │ 65 | │ CTRL-R SEARCH HISTORY │ 66 | │ CTRL-G CANCEL SEARCH │ 67 | │ CTRL-J INSERT NEWLINE │ 68 | │ ALT-< BEGINNING OF HISTORY │ 69 | │ ALT-> END OF HISTORY │ 70 | │ ALT-F FORWARD WORD │ 71 | │ ALT-B BACKWARD WORD │ 72 | │ CTRL-ALT-F FORWARD EXPR │ 73 | │ CTRL-ALT-B BACKWARD EXPR │ 74 | │ ALT-RIGHT FORWARD EXPR │ 75 | │ ALT-LEFT BACKWARD EXPR │ 76 | │ ALT-SHIFT-B BARF EXPR │ 77 | │ ALT-SHIFT-S SLURP EXPR │ 78 | │ ALT-SHIFT-R RAISE EXPR │ 79 | │ CTRL-K KILL LINE FORWARDS │ 80 | │ CTRL-U KILL LINE BACKWARDS │ 81 | │ ALT-H KILL WORD BACKWARDS │ 82 | │ CTRL-W KILL WORD BACKWARDS │ 83 | │ CTRL-ALT-H KILL WORD BACKWARDS │ 84 | │ ALT-D KILL WORD FORWARDS │ 85 | │ ALT-Y ROTATE KILL RING AND YANK AGAIN │ 86 | │ ALT-\ SQUEEZE ADJACENT WHITESPACE │ 87 | │ CTRL-T TRANSPOSE │ 88 | │ ALT-T TRANSPOSE WORD │ 89 | │ ALT-U UPPERCASE WORD │ 90 | │ ALT-L LOWERCASE WORD │ 91 | │ ALT-C CAPITALIZE WORD │ 92 | │ CTRL-C CTRL-C INTERRUPT PROCESS │ 93 | │ CTRL-Z SUSPEND PROCESS │ 94 | │ CTRL-\ QUIT PROCESS │ 95 | │ CTRL-S PAUSE OUTPUT │ 96 | │ CTRL-Q UNPAUSE OUTPUT (IF PAUSED) │ 97 | │ CTRL-Q ESCAPED INSERT │ 98 | │ CTRL-SPACE SET MARK │ 99 | │ CTRL-X CTRL-X GOTO MARK │ 100 | │ PROTIP REMAP
CAPS LOCK TO CTRL │ 101 | │ │ 102 | ╞══════════════════════════════════════════════════════════════════════════════╡ 103 | │ │ 104 | │ Copyright 2018-2021 Justine Tunney │ 105 | │ Copyright 2010-2016 Salvatore Sanfilippo │ 106 | │ Copyright 2010-2013 Pieter Noordhuis │ 107 | │ │ 108 | │ All rights reserved. │ 109 | │ │ 110 | │ Redistribution and use in source and binary forms, with or without │ 111 | │ modification, are permitted provided that the following conditions are │ 112 | │ met: │ 113 | │ │ 114 | │ * Redistributions of source code must retain the above copyright │ 115 | │ notice, this list of conditions and the following disclaimer. │ 116 | │ │ 117 | │ * Redistributions in binary form must reproduce the above copyright │ 118 | │ notice, this list of conditions and the following disclaimer in the │ 119 | │ documentation and/or other materials provided with the distribution. │ 120 | │ │ 121 | │ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS │ 122 | │ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT │ 123 | │ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR │ 124 | │ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT │ 125 | │ HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, │ 126 | │ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT │ 127 | │ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, │ 128 | │ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY │ 129 | │ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT │ 130 | │ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE │ 131 | │ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
│ 132 | │ │ 133 | ╚─────────────────────────────────────────────────────────────────────────────*/ 134 | #include "bestline.h" 135 | 136 | #define _POSIX_C_SOURCE 1 /* so GCC builds in ANSI mode */ 137 | #define _XOPEN_SOURCE 700 /* so GCC builds in ANSI mode */ 138 | #define _DARWIN_C_SOURCE 1 /* so SIGWINCH / IUTF8 on XNU */ 139 | #include 140 | #include 141 | #include 142 | #include 143 | #include 144 | #include 145 | #include 146 | #include 147 | #include 148 | #include 149 | #include 150 | #include 151 | #include 152 | #include 153 | #include 154 | #include 155 | #include 156 | #include 157 | #ifndef SIGWINCH 158 | #define SIGWINCH 28 /* GNU/Systemd + XNU + FreeBSD + NetBSD + OpenBSD */ 159 | #endif 160 | #ifndef IUTF8 161 | #define IUTF8 0 162 | #endif 163 | 164 | __asm__(".ident\t\"\\n\\n\ 165 | Bestline (BSD-2)\\n\ 166 | Copyright 2018-2020 Justine Tunney \\n\ 167 | Copyright 2010-2016 Salvatore Sanfilippo \\n\ 168 | Copyright 2010-2013 Pieter Noordhuis \""); 169 | 170 | #ifndef BESTLINE_MAX_RING 171 | #define BESTLINE_MAX_RING 8 172 | #endif 173 | 174 | #ifndef BESTLINE_MAX_HISTORY 175 | #define BESTLINE_MAX_HISTORY 1024 176 | #endif 177 | 178 | #define BESTLINE_HISTORY_PREV +1 179 | #define BESTLINE_HISTORY_NEXT -1 180 | 181 | #define Ctrl(C) ((C) ^ 0100) 182 | #define Min(X, Y) ((Y) > (X) ? (X) : (Y)) 183 | #define Max(X, Y) ((Y) < (X) ? 
(X) : (Y)) 184 | #define Case(X, Y) \ 185 | case X: \ 186 | Y; \ 187 | break 188 | #define Read16le(X) ((255 & (X)[0]) << 000 | (255 & (X)[1]) << 010) 189 | #define Read32le(X) \ 190 | ((unsigned)(255 & (X)[0]) << 000 | (unsigned)(255 & (X)[1]) << 010 | \ 191 | (unsigned)(255 & (X)[2]) << 020 | (unsigned)(255 & (X)[3]) << 030) 192 | 193 | struct abuf { 194 | char *b; 195 | unsigned len; 196 | unsigned cap; 197 | }; 198 | 199 | struct rune { 200 | unsigned c; 201 | unsigned n; 202 | }; 203 | 204 | struct bestlineRing { 205 | unsigned i; 206 | char *p[BESTLINE_MAX_RING]; 207 | }; 208 | 209 | /* The bestlineState structure represents the state during line editing. 210 | * We pass this state to functions implementing specific editing 211 | * functionalities. */ 212 | struct bestlineState { 213 | int ifd; /* terminal stdin file descriptor */ 214 | int ofd; /* terminal stdout file descriptor */ 215 | struct winsize ws; /* rows and columns in terminal */ 216 | char *buf; /* edited line buffer */ 217 | const char *prompt; /* prompt to display */ 218 | int hindex; /* history index */ 219 | int rows; /* rows being used */ 220 | int oldpos; /* previous refresh cursor position */ 221 | unsigned buflen; /* edited line buffer size */ 222 | unsigned pos; /* current buffer index */ 223 | unsigned len; /* current edited line length */ 224 | unsigned mark; /* saved cursor position */ 225 | unsigned yi, yj; /* boundaries of last yank */ 226 | char seq[2][16]; /* keystroke history for yanking code */ 227 | char final; /* set to true on last update */ 228 | char dirty; /* if an update was squashed */ 229 | struct abuf full; /* used for multiline mode */ 230 | }; 231 | 232 | static const char *const kUnsupported[] = {"dumb", "cons25", "emacs"}; 233 | 234 | static int gotint; 235 | static int gotcont; 236 | static int gotwinch; 237 | static signed char rawmode; 238 | static char maskmode; 239 | static char emacsmode; 240 | static char llamamode; 241 | static char balancemode; 242 | 
static char ispaused; 243 | static char iscapital; 244 | static unsigned historylen; 245 | static struct bestlineRing ring; 246 | static struct sigaction orig_cont; 247 | static struct sigaction orig_winch; 248 | static struct termios orig_termios; 249 | static char *history[BESTLINE_MAX_HISTORY]; 250 | static bestlineXlatCallback *xlatCallback; 251 | static bestlineHintsCallback *hintsCallback; 252 | static bestlineFreeHintsCallback *freeHintsCallback; 253 | static bestlineCompletionCallback *completionCallback; 254 | 255 | static void bestlineAtExit(void); 256 | static void bestlineRefreshLine(struct bestlineState *); 257 | 258 | static void bestlineOnInt(int sig) { 259 | gotint = sig; 260 | } 261 | 262 | static void bestlineOnCont(int sig) { 263 | gotcont = sig; 264 | } 265 | 266 | static void bestlineOnWinch(int sig) { 267 | gotwinch = sig; 268 | } 269 | 270 | static char IsControl(unsigned c) { 271 | return c <= 0x1F || (0x7F <= c && c <= 0x9F); 272 | } 273 | 274 | static int GetMonospaceCharacterWidth(unsigned c) { 275 | return !IsControl(c) + 276 | (c >= 0x1100 && (c <= 0x115f || c == 0x2329 || c == 0x232a || 277 | (c >= 0x2e80 && c <= 0xa4cf && c != 0x303f) || 278 | (c >= 0xac00 && c <= 0xd7a3) || (c >= 0xf900 && c <= 0xfaff) || 279 | (c >= 0xfe10 && c <= 0xfe19) || (c >= 0xfe30 && c <= 0xfe6f) || 280 | (c >= 0xff00 && c <= 0xff60) || (c >= 0xffe0 && c <= 0xffe6) || 281 | (c >= 0x20000 && c <= 0x2fffd) || (c >= 0x30000 && c <= 0x3fffd))); 282 | } 283 | 284 | /** 285 | * Returns nonzero if 𝑐 isn't alphanumeric. 286 | * 287 | * Line reading interfaces generally define this operation as UNICODE 288 | * characters that aren't in the letter category (Lu, Ll, Lt, Lm, Lo) 289 | * and aren't in the number categories (Nd, Nl, No). We also add a few 290 | * other things like blocks and emoji (So).
291 | */ 292 | char bestlineIsSeparator(unsigned c) { 293 | int m, l, r, n; 294 | if (c < 0200) { 295 | return !(('0' <= c && c <= '9') || ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')); 296 | } 297 | if (c <= 0xffff) { 298 | static const unsigned short kGlyphs[][2] = { 299 | {0x00aa, 0x00aa}, /* 1x English */ 300 | {0x00b2, 0x00b3}, /* 2x English Arabic */ 301 | {0x00b5, 0x00b5}, /* 1x Greek */ 302 | {0x00b9, 0x00ba}, /* 2x English Arabic */ 303 | {0x00bc, 0x00be}, /* 3x Vulgar English Arabic */ 304 | {0x00c0, 0x00d6}, /* 23x Watin */ 305 | {0x00d8, 0x00f6}, /* 31x Watin */ 306 | {0x0100, 0x02c1}, /* 450x Watin-AB,IPA,Spacemod */ 307 | {0x02c6, 0x02d1}, /* 12x Spacemod */ 308 | {0x02e0, 0x02e4}, /* 5x Spacemod */ 309 | {0x02ec, 0x02ec}, /* 1x Spacemod */ 310 | {0x02ee, 0x02ee}, /* 1x Spacemod */ 311 | {0x0370, 0x0374}, /* 5x Greek */ 312 | {0x0376, 0x0377}, /* 2x Greek */ 313 | {0x037a, 0x037d}, /* 4x Greek */ 314 | {0x037f, 0x037f}, /* 1x Greek */ 315 | {0x0386, 0x0386}, /* 1x Greek */ 316 | {0x0388, 0x038a}, /* 3x Greek */ 317 | {0x038c, 0x038c}, /* 1x Greek */ 318 | {0x038e, 0x03a1}, /* 20x Greek */ 319 | {0x03a3, 0x03f5}, /* 83x Greek */ 320 | {0x03f7, 0x0481}, /* 139x Greek */ 321 | {0x048a, 0x052f}, /* 166x Cyrillic */ 322 | {0x0531, 0x0556}, /* 38x Armenian */ 323 | {0x0560, 0x0588}, /* 41x Armenian */ 324 | {0x05d0, 0x05ea}, /* 27x Hebrew */ 325 | {0x0620, 0x064a}, /* 43x Arabic */ 326 | {0x0660, 0x0669}, /* 10x Arabic */ 327 | {0x0671, 0x06d3}, /* 99x Arabic */ 328 | {0x06ee, 0x06fc}, /* 15x Arabic */ 329 | {0x0712, 0x072f}, /* 30x Syriac */ 330 | {0x074d, 0x07a5}, /* 89x Syriac,Arabic2,Thaana */ 331 | {0x07c0, 0x07ea}, /* 43x NKo */ 332 | {0x0800, 0x0815}, /* 22x Samaritan */ 333 | {0x0840, 0x0858}, /* 25x Mandaic */ 334 | {0x0904, 0x0939}, /* 54x Devanagari */ 335 | {0x0993, 0x09a8}, /* 22x Bengali */ 336 | {0x09e6, 0x09f1}, /* 12x Bengali */ 337 | {0x0a13, 0x0a28}, /* 22x Gurmukhi */ 338 | {0x0a66, 0x0a6f}, /* 10x Gurmukhi */ 339 | {0x0a93, 0x0aa8}, 
/* 22x Gujarati */ 340 | {0x0b13, 0x0b28}, /* 22x Oriya */ 341 | {0x0c92, 0x0ca8}, /* 23x Kannada */ 342 | {0x0caa, 0x0cb3}, /* 10x Kannada */ 343 | {0x0ce6, 0x0cef}, /* 10x Kannada */ 344 | {0x0d12, 0x0d3a}, /* 41x Malayalam */ 345 | {0x0d85, 0x0d96}, /* 18x Sinhala */ 346 | {0x0d9a, 0x0db1}, /* 24x Sinhala */ 347 | {0x0de6, 0x0def}, /* 10x Sinhala */ 348 | {0x0e01, 0x0e30}, /* 48x Thai */ 349 | {0x0e8c, 0x0ea3}, /* 24x Lao */ 350 | {0x0f20, 0x0f33}, /* 20x Tibetan */ 351 | {0x0f49, 0x0f6c}, /* 36x Tibetan */ 352 | {0x109e, 0x10c5}, /* 40x Myanmar,Georgian */ 353 | {0x10d0, 0x10fa}, /* 43x Georgian */ 354 | {0x10fc, 0x1248}, /* 333x Georgian,Hangul,Ethiopic */ 355 | {0x13a0, 0x13f5}, /* 86x Cherokee */ 356 | {0x1401, 0x166d}, /* 621x Aboriginal */ 357 | {0x16a0, 0x16ea}, /* 75x Runic */ 358 | {0x1700, 0x170c}, /* 13x Tagalog */ 359 | {0x1780, 0x17b3}, /* 52x Khmer */ 360 | {0x1820, 0x1878}, /* 89x Mongolian */ 361 | {0x1a00, 0x1a16}, /* 23x Buginese */ 362 | {0x1a20, 0x1a54}, /* 53x Tai Tham */ 363 | {0x1a80, 0x1a89}, /* 10x Tai Tham */ 364 | {0x1a90, 0x1a99}, /* 10x Tai Tham */ 365 | {0x1b05, 0x1b33}, /* 47x Balinese */ 366 | {0x1b50, 0x1b59}, /* 10x Balinese */ 367 | {0x1b83, 0x1ba0}, /* 30x Sundanese */ 368 | {0x1bae, 0x1be5}, /* 56x Sundanese */ 369 | {0x1c90, 0x1cba}, /* 43x Georgian2 */ 370 | {0x1cbd, 0x1cbf}, /* 3x Georgian2 */ 371 | {0x1e00, 0x1f15}, /* 278x Watin-C,Greek2 */ 372 | {0x2070, 0x2071}, /* 2x Supersub */ 373 | {0x2074, 0x2079}, /* 6x Supersub */ 374 | {0x207f, 0x2089}, /* 11x Supersub */ 375 | {0x2090, 0x209c}, /* 13x Supersub */ 376 | {0x2100, 0x2117}, /* 24x Letterlike */ 377 | {0x2119, 0x213f}, /* 39x Letterlike */ 378 | {0x2145, 0x214a}, /* 6x Letterlike */ 379 | {0x214c, 0x218b}, /* 64x Letterlike,Numbery */ 380 | {0x21af, 0x21cd}, /* 31x Arrows */ 381 | {0x21d5, 0x21f3}, /* 31x Arrows */ 382 | {0x230c, 0x231f}, /* 20x Technical */ 383 | {0x232b, 0x237b}, /* 81x Technical */ 384 | {0x237d, 0x239a}, /* 30x Technical */ 385 | {0x23b4, 
0x23db}, /* 40x Technical */ 386 | {0x23e2, 0x2426}, /* 69x Technical,ControlPictures */ 387 | {0x2460, 0x25b6}, /* 343x Enclosed,Boxes,Blocks,Shapes */ 388 | {0x25c2, 0x25f7}, /* 54x Shapes */ 389 | {0x2600, 0x266e}, /* 111x Symbols */ 390 | {0x2670, 0x2767}, /* 248x Symbols,Dingbats */ 391 | {0x2776, 0x27bf}, /* 74x Dingbats */ 392 | {0x2800, 0x28ff}, /* 256x Braille */ 393 | {0x2c00, 0x2c2e}, /* 47x Glagolitic */ 394 | {0x2c30, 0x2c5e}, /* 47x Glagolitic */ 395 | {0x2c60, 0x2ce4}, /* 133x Watin-D */ 396 | {0x2d00, 0x2d25}, /* 38x Georgian2 */ 397 | {0x2d30, 0x2d67}, /* 56x Tifinagh */ 398 | {0x2d80, 0x2d96}, /* 23x Ethiopic2 */ 399 | {0x2e2f, 0x2e2f}, /* 1x Punctuation2 */ 400 | {0x3005, 0x3007}, /* 3x CJK Symbols & Punctuation */ 401 | {0x3021, 0x3029}, /* 9x CJK Symbols & Punctuation */ 402 | {0x3031, 0x3035}, /* 5x CJK Symbols & Punctuation */ 403 | {0x3038, 0x303c}, /* 5x CJK Symbols & Punctuation */ 404 | {0x3041, 0x3096}, /* 86x Hiragana */ 405 | {0x30a1, 0x30fa}, /* 90x Katakana */ 406 | {0x3105, 0x312f}, /* 43x Bopomofo */ 407 | {0x3131, 0x318e}, /* 94x Hangul Compatibility Jamo */ 408 | {0x31a0, 0x31ba}, /* 27x Bopomofo Extended */ 409 | {0x31f0, 0x31ff}, /* 16x Katakana Phonetic Extensions */ 410 | {0x3220, 0x3229}, /* 10x Enclosed CJK Letters & Months */ 411 | {0x3248, 0x324f}, /* 8x Enclosed CJK Letters & Months */ 412 | {0x3251, 0x325f}, /* 15x Enclosed CJK Letters & Months */ 413 | {0x3280, 0x3289}, /* 10x Enclosed CJK Letters & Months */ 414 | {0x32b1, 0x32bf}, /* 15x Enclosed CJK Letters & Months */ 415 | {0x3400, 0x4db5}, /* 6582x CJK Unified Ideographs Extension A */ 416 | {0x4dc0, 0x9fef}, /* 21040x Yijing Hexagram, CJK Unified Ideographs */ 417 | {0xa000, 0xa48c}, /* 1165x Yi Syllables */ 418 | {0xa4d0, 0xa4fd}, /* 46x Lisu */ 419 | {0xa500, 0xa60c}, /* 269x Vai */ 420 | {0xa610, 0xa62b}, /* 28x Vai */ 421 | {0xa6a0, 0xa6ef}, /* 80x Bamum */ 422 | {0xa80c, 0xa822}, /* 23x Syloti Nagri */ 423 | {0xa840, 0xa873}, /* 52x Phags-pa */ 424 | 
{0xa882, 0xa8b3}, /* 50x Saurashtra */ 425 | {0xa8d0, 0xa8d9}, /* 10x Saurashtra */ 426 | {0xa900, 0xa925}, /* 38x Kayah Li */ 427 | {0xa930, 0xa946}, /* 23x Rejang */ 428 | {0xa960, 0xa97c}, /* 29x Hangul Jamo Extended-A */ 429 | {0xa984, 0xa9b2}, /* 47x Javanese */ 430 | {0xa9cf, 0xa9d9}, /* 11x Javanese */ 431 | {0xaa00, 0xaa28}, /* 41x Cham */ 432 | {0xaa50, 0xaa59}, /* 10x Cham */ 433 | {0xabf0, 0xabf9}, /* 10x Meetei Mayek */ 434 | {0xac00, 0xd7a3}, /* 11172x Hangul Syllables */ 435 | {0xf900, 0xfa6d}, /* 366x CJK Compatibility Ideographs */ 436 | {0xfa70, 0xfad9}, /* 106x CJK Compatibility Ideographs */ 437 | {0xfb1f, 0xfb28}, /* 10x Alphabetic Presentation Forms */ 438 | {0xfb2a, 0xfb36}, /* 13x Alphabetic Presentation Forms */ 439 | {0xfb46, 0xfbb1}, /* 108x Alphabetic Presentation Forms */ 440 | {0xfbd3, 0xfd3d}, /* 363x Arabic Presentation Forms-A */ 441 | {0xfe76, 0xfefc}, /* 135x Arabic Presentation Forms-B */ 442 | {0xff10, 0xff19}, /* 10x Dubs */ 443 | {0xff21, 0xff3a}, /* 26x Dubs */ 444 | {0xff41, 0xff5a}, /* 26x Dubs */ 445 | {0xff66, 0xffbe}, /* 89x Dubs */ 446 | {0xffc2, 0xffc7}, /* 6x Dubs */ 447 | {0xffca, 0xffcf}, /* 6x Dubs */ 448 | {0xffd2, 0xffd7}, /* 6x Dubs */ 449 | {0xffda, 0xffdc}, /* 3x Dubs */ 450 | }; 451 | l = 0; 452 | r = n = sizeof(kGlyphs) / sizeof(kGlyphs[0]); 453 | while (l < r) { 454 | m = (l + r) >> 1; 455 | if (kGlyphs[m][1] < c) { 456 | l = m + 1; 457 | } else { 458 | r = m; 459 | } 460 | } 461 | return !(l < n && kGlyphs[l][0] <= c && c <= kGlyphs[l][1]); 462 | } else { 463 | static const unsigned kAstralGlyphs[][2] = { 464 | {0x10107, 0x10133}, /* 45x Aegean */ 465 | {0x10140, 0x10178}, /* 57x Ancient Greek Numbers */ 466 | {0x1018a, 0x1018b}, /* 2x Ancient Greek Numbers */ 467 | {0x10280, 0x1029c}, /* 29x Lycian */ 468 | {0x102a0, 0x102d0}, /* 49x Carian */ 469 | {0x102e1, 0x102fb}, /* 27x Coptic Epact Numbers */ 470 | {0x10300, 0x10323}, /* 36x Old Italic */ 471 | {0x1032d, 0x1034a}, /* 30x Old Italic, Gothic */ 472 | 
{0x10350, 0x10375}, /* 38x Old Permic */ 473 | {0x10380, 0x1039d}, /* 30x Ugaritic */ 474 | {0x103a0, 0x103c3}, /* 36x Old Persian */ 475 | {0x103c8, 0x103cf}, /* 8x Old Persian */ 476 | {0x103d1, 0x103d5}, /* 5x Old Persian */ 477 | {0x10400, 0x1049d}, /* 158x Deseret, Shavian, Osmanya */ 478 | {0x104b0, 0x104d3}, /* 36x Osage */ 479 | {0x104d8, 0x104fb}, /* 36x Osage */ 480 | {0x10500, 0x10527}, /* 40x Elbasan */ 481 | {0x10530, 0x10563}, /* 52x Caucasian Albanian */ 482 | {0x10600, 0x10736}, /* 311x Linear A */ 483 | {0x10800, 0x10805}, /* 6x Cypriot Syllabary */ 484 | {0x1080a, 0x10835}, /* 44x Cypriot Syllabary */ 485 | {0x10837, 0x10838}, /* 2x Cypriot Syllabary */ 486 | {0x1083f, 0x1089e}, /* 86x Cypriot,ImperialAramaic,Palmyrene,Nabataean */ 487 | {0x108e0, 0x108f2}, /* 19x Hatran */ 488 | {0x108f4, 0x108f5}, /* 2x Hatran */ 489 | {0x108fb, 0x1091b}, /* 33x Hatran */ 490 | {0x10920, 0x10939}, /* 26x Lydian */ 491 | {0x10980, 0x109b7}, /* 56x Meroitic Hieroglyphs */ 492 | {0x109bc, 0x109cf}, /* 20x Meroitic Cursive */ 493 | {0x109d2, 0x10a00}, /* 47x Meroitic Cursive */ 494 | {0x10a10, 0x10a13}, /* 4x Kharoshthi */ 495 | {0x10a15, 0x10a17}, /* 3x Kharoshthi */ 496 | {0x10a19, 0x10a35}, /* 29x Kharoshthi */ 497 | {0x10a40, 0x10a48}, /* 9x Kharoshthi */ 498 | {0x10a60, 0x10a7e}, /* 31x Old South Arabian */ 499 | {0x10a80, 0x10a9f}, /* 32x Old North Arabian */ 500 | {0x10ac0, 0x10ac7}, /* 8x Manichaean */ 501 | {0x10ac9, 0x10ae4}, /* 28x Manichaean */ 502 | {0x10aeb, 0x10aef}, /* 5x Manichaean */ 503 | {0x10b00, 0x10b35}, /* 54x Avestan */ 504 | {0x10b40, 0x10b55}, /* 22x Inscriptional Parthian */ 505 | {0x10b58, 0x10b72}, /* 27x Inscriptional Parthian and Pahlavi */ 506 | {0x10b78, 0x10b91}, /* 26x Inscriptional Pahlavi, Psalter Pahlavi */ 507 | {0x10c00, 0x10c48}, /* 73x Old Turkic */ 508 | {0x10c80, 0x10cb2}, /* 51x Old Hungarian */ 509 | {0x10cc0, 0x10cf2}, /* 51x Old Hungarian */ 510 | {0x10cfa, 0x10d23}, /* 42x Old Hungarian, Hanifi Rohingya */ 511 | 
{0x10d30, 0x10d39}, /* 10x Hanifi Rohingya */ 512 | {0x10e60, 0x10e7e}, /* 31x Rumi Numeral Symbols */ 513 | {0x10f00, 0x10f27}, /* 40x Old Sogdian */ 514 | {0x10f30, 0x10f45}, /* 22x Sogdian */ 515 | {0x10f51, 0x10f54}, /* 4x Sogdian */ 516 | {0x10fe0, 0x10ff6}, /* 23x Elymaic */ 517 | {0x11003, 0x11037}, /* 53x Brahmi */ 518 | {0x11052, 0x1106f}, /* 30x Brahmi */ 519 | {0x11083, 0x110af}, /* 45x Kaithi */ 520 | {0x110d0, 0x110e8}, /* 25x Sora Sompeng */ 521 | {0x110f0, 0x110f9}, /* 10x Sora Sompeng */ 522 | {0x11103, 0x11126}, /* 36x Chakma */ 523 | {0x11136, 0x1113f}, /* 10x Chakma */ 524 | {0x11144, 0x11144}, /* 1x Chakma */ 525 | {0x11150, 0x11172}, /* 35x Mahajani */ 526 | {0x11176, 0x11176}, /* 1x Mahajani */ 527 | {0x11183, 0x111b2}, /* 48x Sharada */ 528 | {0x111c1, 0x111c4}, /* 4x Sharada */ 529 | {0x111d0, 0x111da}, /* 11x Sharada */ 530 | {0x111dc, 0x111dc}, /* 1x Sharada */ 531 | {0x111e1, 0x111f4}, /* 20x Sinhala Archaic Numbers */ 532 | {0x11200, 0x11211}, /* 18x Khojki */ 533 | {0x11213, 0x1122b}, /* 25x Khojki */ 534 | {0x11280, 0x11286}, /* 7x Multani */ 535 | {0x11288, 0x11288}, /* 1x Multani */ 536 | {0x1128a, 0x1128d}, /* 4x Multani */ 537 | {0x1128f, 0x1129d}, /* 15x Multani */ 538 | {0x1129f, 0x112a8}, /* 10x Multani */ 539 | {0x112b0, 0x112de}, /* 47x Khudawadi */ 540 | {0x112f0, 0x112f9}, /* 10x Khudawadi */ 541 | {0x11305, 0x1130c}, /* 8x Grantha */ 542 | {0x1130f, 0x11310}, /* 2x Grantha */ 543 | {0x11313, 0x11328}, /* 22x Grantha */ 544 | {0x1132a, 0x11330}, /* 7x Grantha */ 545 | {0x11332, 0x11333}, /* 2x Grantha */ 546 | {0x11335, 0x11339}, /* 5x Grantha */ 547 | {0x1133d, 0x1133d}, /* 1x Grantha */ 548 | {0x11350, 0x11350}, /* 1x Grantha */ 549 | {0x1135d, 0x11361}, /* 5x Grantha */ 550 | {0x11400, 0x11434}, /* 53x Newa */ 551 | {0x11447, 0x1144a}, /* 4x Newa */ 552 | {0x11450, 0x11459}, /* 10x Newa */ 553 | {0x1145f, 0x1145f}, /* 1x Newa */ 554 | {0x11480, 0x114af}, /* 48x Tirhuta */ 555 | {0x114c4, 0x114c5}, /* 2x Tirhuta */ 556 | 
{0x114c7, 0x114c7}, /* 1x Tirhuta */ 557 | {0x114d0, 0x114d9}, /* 10x Tirhuta */ 558 | {0x11580, 0x115ae}, /* 47x Siddham */ 559 | {0x115d8, 0x115db}, /* 4x Siddham */ 560 | {0x11600, 0x1162f}, /* 48x Modi */ 561 | {0x11644, 0x11644}, /* 1x Modi */ 562 | {0x11650, 0x11659}, /* 10x Modi */ 563 | {0x11680, 0x116aa}, /* 43x Takri */ 564 | {0x116b8, 0x116b8}, /* 1x Takri */ 565 | {0x116c0, 0x116c9}, /* 10x Takri */ 566 | {0x11700, 0x1171a}, /* 27x Ahom */ 567 | {0x11730, 0x1173b}, /* 12x Ahom */ 568 | {0x11800, 0x1182b}, /* 44x Dogra */ 569 | {0x118a0, 0x118f2}, /* 83x Warang Citi */ 570 | {0x118ff, 0x118ff}, /* 1x Warang Citi */ 571 | {0x119a0, 0x119a7}, /* 8x Nandinagari */ 572 | {0x119aa, 0x119d0}, /* 39x Nandinagari */ 573 | {0x119e1, 0x119e1}, /* 1x Nandinagari */ 574 | {0x119e3, 0x119e3}, /* 1x Nandinagari */ 575 | {0x11a00, 0x11a00}, /* 1x Zanabazar Square */ 576 | {0x11a0b, 0x11a32}, /* 40x Zanabazar Square */ 577 | {0x11a3a, 0x11a3a}, /* 1x Zanabazar Square */ 578 | {0x11a50, 0x11a50}, /* 1x Soyombo */ 579 | {0x11a5c, 0x11a89}, /* 46x Soyombo */ 580 | {0x11a9d, 0x11a9d}, /* 1x Soyombo */ 581 | {0x11ac0, 0x11af8}, /* 57x Pau Cin Hau */ 582 | {0x11c00, 0x11c08}, /* 9x Bhaiksuki */ 583 | {0x11c0a, 0x11c2e}, /* 37x Bhaiksuki */ 584 | {0x11c40, 0x11c40}, /* 1x Bhaiksuki */ 585 | {0x11c50, 0x11c6c}, /* 29x Bhaiksuki */ 586 | {0x11c72, 0x11c8f}, /* 30x Marchen */ 587 | {0x11d00, 0x11d06}, /* 7x Masaram Gondi */ 588 | {0x11d08, 0x11d09}, /* 2x Masaram Gondi */ 589 | {0x11d0b, 0x11d30}, /* 38x Masaram Gondi */ 590 | {0x11d46, 0x11d46}, /* 1x Masaram Gondi */ 591 | {0x11d50, 0x11d59}, /* 10x Masaram Gondi */ 592 | {0x11d60, 0x11d65}, /* 6x Gunjala Gondi */ 593 | {0x11d67, 0x11d68}, /* 2x Gunjala Gondi */ 594 | {0x11d6a, 0x11d89}, /* 32x Gunjala Gondi */ 595 | {0x11d98, 0x11d98}, /* 1x Gunjala Gondi */ 596 | {0x11da0, 0x11da9}, /* 10x Gunjala Gondi */ 597 | {0x11ee0, 0x11ef2}, /* 19x Makasar */ 598 | {0x11fc0, 0x11fd4}, /* 21x Tamil Supplement */ 599 | {0x12000, 
0x12399}, /* 922x Cuneiform */ 600 | {0x12400, 0x1246e}, /* 111x Cuneiform Numbers & Punctuation */ 601 | {0x12480, 0x12543}, /* 196x Early Dynastic Cuneiform */ 602 | {0x13000, 0x1342e}, /* 1071x Egyptian Hieroglyphs */ 603 | {0x14400, 0x14646}, /* 583x Anatolian Hieroglyphs */ 604 | {0x16800, 0x16a38}, /* 569x Bamum Supplement */ 605 | {0x16a40, 0x16a5e}, /* 31x Mro */ 606 | {0x16a60, 0x16a69}, /* 10x Mro */ 607 | {0x16ad0, 0x16aed}, /* 30x Bassa Vah */ 608 | {0x16b00, 0x16b2f}, /* 48x Pahawh Hmong */ 609 | {0x16b40, 0x16b43}, /* 4x Pahawh Hmong */ 610 | {0x16b50, 0x16b59}, /* 10x Pahawh Hmong */ 611 | {0x16b5b, 0x16b61}, /* 7x Pahawh Hmong */ 612 | {0x16b63, 0x16b77}, /* 21x Pahawh Hmong */ 613 | {0x16b7d, 0x16b8f}, /* 19x Pahawh Hmong */ 614 | {0x16e40, 0x16e96}, /* 87x Medefaidrin */ 615 | {0x16f00, 0x16f4a}, /* 75x Miao */ 616 | {0x16f50, 0x16f50}, /* 1x Miao */ 617 | {0x16f93, 0x16f9f}, /* 13x Miao */ 618 | {0x16fe0, 0x16fe1}, /* 2x Ideographic Symbols & Punctuation */ 619 | {0x16fe3, 0x16fe3}, /* 1x Ideographic Symbols & Punctuation */ 620 | {0x17000, 0x187f7}, /* 6136x Tangut */ 621 | {0x18800, 0x18af2}, /* 755x Tangut Components */ 622 | {0x1b000, 0x1b11e}, /* 287x Kana Supplement */ 623 | {0x1b150, 0x1b152}, /* 3x Small Kana Extension */ 624 | {0x1b164, 0x1b167}, /* 4x Small Kana Extension */ 625 | {0x1b170, 0x1b2fb}, /* 396x Nushu */ 626 | {0x1bc00, 0x1bc6a}, /* 107x Duployan */ 627 | {0x1bc70, 0x1bc7c}, /* 13x Duployan */ 628 | {0x1bc80, 0x1bc88}, /* 9x Duployan */ 629 | {0x1bc90, 0x1bc99}, /* 10x Duployan */ 630 | {0x1d2e0, 0x1d2f3}, /* 20x Mayan Numerals */ 631 | {0x1d360, 0x1d378}, /* 25x Counting Rod Numerals */ 632 | {0x1d400, 0x1d454}, /* 85x 𝐀..𝑔 Math */ 633 | {0x1d456, 0x1d49c}, /* 71x 𝑖..𝒜 Math */ 634 | {0x1d49e, 0x1d49f}, /* 2x 𝒞..𝒟 Math */ 635 | {0x1d4a2, 0x1d4a2}, /* 1x 𝒢..𝒢 Math */ 636 | {0x1d4a5, 0x1d4a6}, /* 2x 𝒥..𝒦 Math */ 637 | {0x1d4a9, 0x1d4ac}, /* 4x 𝒩..𝒬 Math */ 638 | {0x1d4ae, 0x1d4b9}, /* 12x 𝒮..𝒹 Math */ 639 | {0x1d4bb, 
0x1d4bb}, /* 1x 𝒻..𝒻 Math */ 640 | {0x1d4bd, 0x1d4c3}, /* 7x 𝒽..𝓃 Math */ 641 | {0x1d4c5, 0x1d505}, /* 65x 𝓅..𝔅 Math */ 642 | {0x1d507, 0x1d50a}, /* 4x 𝔇..𝔊 Math */ 643 | {0x1d50d, 0x1d514}, /* 8x 𝔍..𝔔 Math */ 644 | {0x1d516, 0x1d51c}, /* 7x 𝔖..𝔜 Math */ 645 | {0x1d51e, 0x1d539}, /* 28x 𝔞..𝔹 Math */ 646 | {0x1d53b, 0x1d53e}, /* 4x 𝔻..𝔾 Math */ 647 | {0x1d540, 0x1d544}, /* 5x 𝕀..𝕄 Math */ 648 | {0x1d546, 0x1d546}, /* 1x 𝕆..𝕆 Math */ 649 | {0x1d54a, 0x1d550}, /* 7x 𝕊..𝕐 Math */ 650 | {0x1d552, 0x1d6a5}, /* 340x 𝕒..𝚥 Math */ 651 | {0x1d6a8, 0x1d6c0}, /* 25x 𝚨..𝛀 Math */ 652 | {0x1d6c2, 0x1d6da}, /* 25x 𝛂..𝛚 Math */ 653 | {0x1d6dc, 0x1d6fa}, /* 31x 𝛜..𝛺 Math */ 654 | {0x1d6fc, 0x1d714}, /* 25x 𝛼..𝜔 Math */ 655 | {0x1d716, 0x1d734}, /* 31x 𝜖..𝜴 Math */ 656 | {0x1d736, 0x1d74e}, /* 25x 𝜶..𝝎 Math */ 657 | {0x1d750, 0x1d76e}, /* 31x 𝝐..𝝮 Math */ 658 | {0x1d770, 0x1d788}, /* 25x 𝝰..𝞈 Math */ 659 | {0x1d78a, 0x1d7a8}, /* 31x 𝞊..𝞨 Math */ 660 | {0x1d7aa, 0x1d7c2}, /* 25x 𝞪..𝟂 Math */ 661 | {0x1d7c4, 0x1d7cb}, /* 8x 𝟄..𝟋 Math */ 662 | {0x1d7ce, 0x1d9ff}, /* 562x Math, Sutton SignWriting */ 663 | {0x1f100, 0x1f10c}, /* 13x Enclosed Alphanumeric Supplement */ 664 | {0x20000, 0x2a6d6}, /* 42711x CJK Unified Ideographs Extension B */ 665 | {0x2a700, 0x2b734}, /* 4149x CJK Unified Ideographs Extension C */ 666 | {0x2b740, 0x2b81d}, /* 222x CJK Unified Ideographs Extension D */ 667 | {0x2b820, 0x2cea1}, /* 5762x CJK Unified Ideographs Extension E */ 668 | {0x2ceb0, 0x2ebe0}, /* 7473x CJK Unified Ideographs Extension F */ 669 | {0x2f800, 0x2fa1d}, /* 542x CJK Compatibility Ideographs Supplement */ 670 | }; 671 | l = 0; 672 | r = n = sizeof(kAstralGlyphs) / sizeof(kAstralGlyphs[0]); 673 | while (l < r) { 674 | m = (l + r) >> 1; 675 | if (kAstralGlyphs[m][1] < c) { 676 | l = m + 1; 677 | } else { 678 | r = m; 679 | } 680 | } 681 | return !(l < n && kAstralGlyphs[l][0] <= c && c <= kAstralGlyphs[l][1]); 682 | } 683 | } 684 | 685 | unsigned bestlineLowercase(unsigned c) { 686 | int m, l, 
r, n; 687 | if (c < 0200) { 688 | if ('A' <= c && c <= 'Z') { 689 | return c + 32; 690 | } else { 691 | return c; 692 | } 693 | } else if (c <= 0xffff) { 694 | if ((0x0100 <= c && c <= 0x0176) || /* 60x Ā..ā → ā..ŵ Watin-A */ 695 | (0x01de <= c && c <= 0x01ee) || /* 9x Ǟ..Ǯ → ǟ..ǯ Watin-B */ 696 | (0x01f8 <= c && c <= 0x021e) || /* 20x Ǹ..Ȟ → ǹ..ȟ Watin-B */ 697 | (0x0222 <= c && c <= 0x0232) || /* 9x Ȣ..Ȳ → ȣ..ȳ Watin-B */ 698 | (0x1e00 <= c && c <= 0x1eff)) { /*256x Ḁ..Ỿ → ḁ..ỿ Watin-C */ 699 | if (c == 0x0130) 700 | return c - 199; 701 | if (c == 0x1e9e) 702 | return c; 703 | return c + (~c & 1); 704 | } else if (0x01cf <= c && c <= 0x01db) { 705 | return c + (c & 1); /* 7x Ǐ..Ǜ → ǐ..ǜ Watin-B */ 706 | } else if (0x13a0 <= c && c <= 0x13ef) { 707 | return c + 38864; /* 80x Ꭰ ..Ꮿ → ꭰ ..ꮿ Cherokee */ 708 | } else { 709 | static const struct { 710 | unsigned short a; 711 | unsigned short b; 712 | short d; 713 | } kLower[] = { 714 | {0x00c0, 0x00d6, +32}, /* 23x À ..Ö → à ..ö Watin */ 715 | {0x00d8, 0x00de, +32}, /* 7x Ø ..Þ → ø ..þ Watin */ 716 | {0x0178, 0x0178, -121}, /* 1x Ÿ ..Ÿ → ÿ ..ÿ Watin-A */ 717 | {0x0179, 0x0179, +1}, /* 1x Ź ..Ź → ź ..ź Watin-A */ 718 | {0x017b, 0x017b, +1}, /* 1x Ż ..Ż → ż ..ż Watin-A */ 719 | {0x017d, 0x017d, +1}, /* 1x Ž ..Ž → ž ..ž Watin-A */ 720 | {0x0181, 0x0181, +210}, /* 1x Ɓ ..Ɓ → ɓ ..ɓ Watin-B */ 721 | {0x0182, 0x0182, +1}, /* 1x Ƃ ..Ƃ → ƃ ..ƃ Watin-B */ 722 | {0x0184, 0x0184, +1}, /* 1x Ƅ ..Ƅ → ƅ ..ƅ Watin-B */ 723 | {0x0186, 0x0186, +206}, /* 1x Ɔ ..Ɔ → ɔ ..ɔ Watin-B */ 724 | {0x0187, 0x0187, +1}, /* 1x Ƈ ..Ƈ → ƈ ..ƈ Watin-B */ 725 | {0x0189, 0x018a, +205}, /* 2x Ɖ ..Ɗ → ɖ ..ɗ Watin-B */ 726 | {0x018b, 0x018b, +1}, /* 1x Ƌ ..Ƌ → ƌ ..ƌ Watin-B */ 727 | {0x018e, 0x018e, +79}, /* 1x Ǝ ..Ǝ → ǝ ..ǝ Watin-B */ 728 | {0x018f, 0x018f, +202}, /* 1x Ə ..Ə → ə ..ə Watin-B */ 729 | {0x0190, 0x0190, +203}, /* 1x Ɛ ..Ɛ → ɛ ..ɛ Watin-B */ 730 | {0x0191, 0x0191, +1}, /* 1x Ƒ ..Ƒ → ƒ ..ƒ Watin-B */ 731 | {0x0193, 0x0193, +205}, /* 1x Ɠ ..Ɠ → 
ɠ ..ɠ Watin-B */ 732 | {0x0194, 0x0194, +207}, /* 1x Ɣ ..Ɣ → ɣ ..ɣ Watin-B */ 733 | {0x0196, 0x0196, +211}, /* 1x Ɩ ..Ɩ → ɩ ..ɩ Watin-B */ 734 | {0x0197, 0x0197, +209}, /* 1x Ɨ ..Ɨ → ɨ ..ɨ Watin-B */ 735 | {0x0198, 0x0198, +1}, /* 1x Ƙ ..Ƙ → ƙ ..ƙ Watin-B */ 736 | {0x019c, 0x019c, +211}, /* 1x Ɯ ..Ɯ → ɯ ..ɯ Watin-B */ 737 | {0x019d, 0x019d, +213}, /* 1x Ɲ ..Ɲ → ɲ ..ɲ Watin-B */ 738 | {0x019f, 0x019f, +214}, /* 1x Ɵ ..Ɵ → ɵ ..ɵ Watin-B */ 739 | {0x01a0, 0x01a0, +1}, /* 1x Ơ ..Ơ → ơ ..ơ Watin-B */ 740 | {0x01a2, 0x01a2, +1}, /* 1x Ƣ ..Ƣ → ƣ ..ƣ Watin-B */ 741 | {0x01a4, 0x01a4, +1}, /* 1x Ƥ ..Ƥ → ƥ ..ƥ Watin-B */ 742 | {0x01a6, 0x01a6, +218}, /* 1x Ʀ ..Ʀ → ʀ ..ʀ Watin-B */ 743 | {0x01a7, 0x01a7, +1}, /* 1x Ƨ ..Ƨ → ƨ ..ƨ Watin-B */ 744 | {0x01a9, 0x01a9, +218}, /* 1x Ʃ ..Ʃ → ʃ ..ʃ Watin-B */ 745 | {0x01ac, 0x01ac, +1}, /* 1x Ƭ ..Ƭ → ƭ ..ƭ Watin-B */ 746 | {0x01ae, 0x01ae, +218}, /* 1x Ʈ ..Ʈ → ʈ ..ʈ Watin-B */ 747 | {0x01af, 0x01af, +1}, /* 1x Ư ..Ư → ư ..ư Watin-B */ 748 | {0x01b1, 0x01b2, +217}, /* 2x Ʊ ..Ʋ → ʊ ..ʋ Watin-B */ 749 | {0x01b3, 0x01b3, +1}, /* 1x Ƴ ..Ƴ → ƴ ..ƴ Watin-B */ 750 | {0x01b5, 0x01b5, +1}, /* 1x Ƶ ..Ƶ → ƶ ..ƶ Watin-B */ 751 | {0x01b7, 0x01b7, +219}, /* 1x Ʒ ..Ʒ → ʒ ..ʒ Watin-B */ 752 | {0x01b8, 0x01b8, +1}, /* 1x Ƹ ..Ƹ → ƹ ..ƹ Watin-B */ 753 | {0x01bc, 0x01bc, +1}, /* 1x Ƽ ..Ƽ → ƽ ..ƽ Watin-B */ 754 | {0x01c4, 0x01c4, +2}, /* 1x DŽ ..DŽ → dž ..dž Watin-B */ 755 | {0x01c5, 0x01c5, +1}, /* 1x Dž ..Dž → dž ..dž Watin-B */ 756 | {0x01c7, 0x01c7, +2}, /* 1x LJ ..LJ → lj ..lj Watin-B */ 757 | {0x01c8, 0x01c8, +1}, /* 1x Lj ..Lj → lj ..lj Watin-B */ 758 | {0x01ca, 0x01ca, +2}, /* 1x NJ ..NJ → nj ..nj Watin-B */ 759 | {0x01cb, 0x01cb, +1}, /* 1x Nj ..Nj → nj ..nj Watin-B */ 760 | {0x01cd, 0x01cd, +1}, /* 1x Ǎ ..Ǎ → ǎ ..ǎ Watin-B */ 761 | {0x01f1, 0x01f1, +2}, /* 1x DZ ..DZ → dz ..dz Watin-B */ 762 | {0x01f2, 0x01f2, +1}, /* 1x Dz ..Dz → dz ..dz Watin-B */ 763 | {0x01f4, 0x01f4, +1}, /* 1x Ǵ ..Ǵ → ǵ ..ǵ Watin-B */ 764 | {0x01f6, 0x01f6, -97}, /* 1x Ƕ ..Ƕ 
→ ƕ ..ƕ Watin-B */ 765 | {0x01f7, 0x01f7, -56}, /* 1x Ƿ ..Ƿ → ƿ ..ƿ Watin-B */ 766 | {0x0220, 0x0220, -130}, /* 1x Ƞ ..Ƞ → ƞ ..ƞ Watin-B */ 767 | {0x023b, 0x023b, +1}, /* 1x Ȼ ..Ȼ → ȼ ..ȼ Watin-B */ 768 | {0x023d, 0x023d, -163}, /* 1x Ƚ ..Ƚ → ƚ ..ƚ Watin-B */ 769 | {0x0241, 0x0241, +1}, /* 1x Ɂ ..Ɂ → ɂ ..ɂ Watin-B */ 770 | {0x0243, 0x0243, -195}, /* 1x Ƀ ..Ƀ → ƀ ..ƀ Watin-B */ 771 | {0x0244, 0x0244, +69}, /* 1x Ʉ ..Ʉ → ʉ ..ʉ Watin-B */ 772 | {0x0245, 0x0245, +71}, /* 1x Ʌ ..Ʌ → ʌ ..ʌ Watin-B */ 773 | {0x0246, 0x0246, +1}, /* 1x Ɇ ..Ɇ → ɇ ..ɇ Watin-B */ 774 | {0x0248, 0x0248, +1}, /* 1x Ɉ ..Ɉ → ɉ ..ɉ Watin-B */ 775 | {0x024a, 0x024a, +1}, /* 1x Ɋ ..Ɋ → ɋ ..ɋ Watin-B */ 776 | {0x024c, 0x024c, +1}, /* 1x Ɍ ..Ɍ → ɍ ..ɍ Watin-B */ 777 | {0x024e, 0x024e, +1}, /* 1x Ɏ ..Ɏ → ɏ ..ɏ Watin-B */ 778 | {0x0386, 0x0386, +38}, /* 1x Ά ..Ά → ά ..ά Greek */ 779 | {0x0388, 0x038a, +37}, /* 3x Έ ..Ί → έ ..ί Greek */ 780 | {0x038c, 0x038c, +64}, /* 1x Ό ..Ό → ό ..ό Greek */ 781 | {0x038e, 0x038f, +63}, /* 2x Ύ ..Ώ → ύ ..ώ Greek */ 782 | {0x0391, 0x03a1, +32}, /* 17x Α ..Ρ → α ..ρ Greek */ 783 | {0x03a3, 0x03ab, +32}, /* 9x Σ ..Ϋ → σ ..ϋ Greek */ 784 | {0x03dc, 0x03dc, +1}, /* 1x Ϝ ..Ϝ → ϝ ..ϝ Greek */ 785 | {0x03f4, 0x03f4, -60}, /* 1x ϴ ..ϴ → θ ..θ Greek */ 786 | {0x0400, 0x040f, +80}, /* 16x Ѐ ..Џ → ѐ ..џ Cyrillic */ 787 | {0x0410, 0x042f, +32}, /* 32x А ..Я → а ..я Cyrillic */ 788 | {0x0460, 0x0460, +1}, /* 1x Ѡ ..Ѡ → ѡ ..ѡ Cyrillic */ 789 | {0x0462, 0x0462, +1}, /* 1x Ѣ ..Ѣ → ѣ ..ѣ Cyrillic */ 790 | {0x0464, 0x0464, +1}, /* 1x Ѥ ..Ѥ → ѥ ..ѥ Cyrillic */ 791 | {0x0472, 0x0472, +1}, /* 1x Ѳ ..Ѳ → ѳ ..ѳ Cyrillic */ 792 | {0x0490, 0x0490, +1}, /* 1x Ґ ..Ґ → ґ ..ґ Cyrillic */ 793 | {0x0498, 0x0498, +1}, /* 1x Ҙ ..Ҙ → ҙ ..ҙ Cyrillic */ 794 | {0x049a, 0x049a, +1}, /* 1x Қ ..Қ → қ ..қ Cyrillic */ 795 | {0x0531, 0x0556, +48}, /* 38x Ա ..Ֆ → ա ..ֆ Armenian */ 796 | {0x10a0, 0x10c5, +7264}, /* 38x Ⴀ ..Ⴥ → ⴀ ..ⴥ Georgian */ 797 | {0x10c7, 0x10c7, +7264}, /* 1x Ⴧ ..Ⴧ → ⴧ ..ⴧ Georgian */ 798 | 
{0x10cd, 0x10cd, +7264}, /* 1x Ⴭ ..Ⴭ → ⴭ ..ⴭ Georgian */ 799 | {0x13f0, 0x13f5, +8}, /* 6x Ᏸ ..Ᏽ → ᏸ ..ᏽ Cherokee */ 800 | {0x1c90, 0x1cba, -3008}, /* 43x Ა ..Ჺ → ა ..ჺ Georgian2 */ 801 | {0x1cbd, 0x1cbf, -3008}, /* 3x Ჽ ..Ჿ → ჽ ..ჿ Georgian2 */ 802 | {0x1f08, 0x1f0f, -8}, /* 8x Ἀ ..Ἇ → ἀ ..ἇ Greek2 */ 803 | {0x1f18, 0x1f1d, -8}, /* 6x Ἐ ..Ἕ → ἐ ..ἕ Greek2 */ 804 | {0x1f28, 0x1f2f, -8}, /* 8x Ἠ ..Ἧ → ἠ ..ἧ Greek2 */ 805 | {0x1f38, 0x1f3f, -8}, /* 8x Ἰ ..Ἷ → ἰ ..ἷ Greek2 */ 806 | {0x1f48, 0x1f4d, -8}, /* 6x Ὀ ..Ὅ → ὀ ..ὅ Greek2 */ 807 | {0x1f59, 0x1f59, -8}, /* 1x Ὑ ..Ὑ → ὑ ..ὑ Greek2 */ 808 | {0x1f5b, 0x1f5b, -8}, /* 1x Ὓ ..Ὓ → ὓ ..ὓ Greek2 */ 809 | {0x1f5d, 0x1f5d, -8}, /* 1x Ὕ ..Ὕ → ὕ ..ὕ Greek2 */ 810 | {0x1f5f, 0x1f5f, -8}, /* 1x Ὗ ..Ὗ → ὗ ..ὗ Greek2 */ 811 | {0x1f68, 0x1f6f, -8}, /* 8x Ὠ ..Ὧ → ὠ ..ὧ Greek2 */ 812 | {0x1f88, 0x1f8f, -8}, /* 8x ᾈ ..ᾏ → ᾀ ..ᾇ Greek2 */ 813 | {0x1f98, 0x1f9f, -8}, /* 8x ᾘ ..ᾟ → ᾐ ..ᾗ Greek2 */ 814 | {0x1fa8, 0x1faf, -8}, /* 8x ᾨ ..ᾯ → ᾠ ..ᾧ Greek2 */ 815 | {0x1fb8, 0x1fb9, -8}, /* 2x Ᾰ ..Ᾱ → ᾰ ..ᾱ Greek2 */ 816 | {0x1fba, 0x1fbb, -74}, /* 2x Ὰ ..Ά → ὰ ..ά Greek2 */ 817 | {0x1fbc, 0x1fbc, -9}, /* 1x ᾼ ..ᾼ → ᾳ ..ᾳ Greek2 */ 818 | {0x1fc8, 0x1fcb, -86}, /* 4x Ὲ ..Ή → ὲ ..ή Greek2 */ 819 | {0x1fcc, 0x1fcc, -9}, /* 1x ῌ ..ῌ → ῃ ..ῃ Greek2 */ 820 | {0x1fd8, 0x1fd9, -8}, /* 2x Ῐ ..Ῑ → ῐ ..ῑ Greek2 */ 821 | {0x1fda, 0x1fdb, -100}, /* 2x Ὶ ..Ί → ὶ ..ί Greek2 */ 822 | {0x1fe8, 0x1fe9, -8}, /* 2x Ῠ ..Ῡ → ῠ ..ῡ Greek2 */ 823 | {0x1fea, 0x1feb, -112}, /* 2x Ὺ ..Ύ → ὺ ..ύ Greek2 */ 824 | {0x1fec, 0x1fec, -7}, /* 1x Ῥ ..Ῥ → ῥ ..ῥ Greek2 */ 825 | {0x1ff8, 0x1ff9, -128}, /* 2x Ὸ ..Ό → ὸ ..ό Greek2 */ 826 | {0x1ffa, 0x1ffb, -126}, /* 2x Ὼ ..Ώ → ὼ ..ώ Greek2 */ 827 | {0x1ffc, 0x1ffc, -9}, /* 1x ῼ ..ῼ → ῳ ..ῳ Greek2 */ 828 | {0x2126, 0x2126, -7517}, /* 1x Ω ..Ω → ω ..ω Letterlike */ 829 | {0x212a, 0x212a, -8383}, /* 1x K ..K → k ..k Letterlike */ 830 | {0x212b, 0x212b, -8262}, /* 1x Å ..Å → å ..å Letterlike */ 831 | {0x2132, 0x2132, +28}, /* 1x Ⅎ ..Ⅎ 
→ ⅎ ..ⅎ Letterlike */ 832 | {0x2160, 0x216f, +16}, /* 16x Ⅰ ..Ⅿ → ⅰ ..ⅿ Numbery */ 833 | {0x2183, 0x2183, +1}, /* 1x Ↄ ..Ↄ → ↄ ..ↄ Numbery */ 834 | {0x24b6, 0x24cf, +26}, /* 26x Ⓐ ..Ⓩ → ⓐ ..ⓩ Enclosed */ 835 | {0x2c00, 0x2c2e, +48}, /* 47x Ⰰ ..Ⱞ → ⰰ ..ⱞ Glagolitic */ 836 | {0xff21, 0xff3a, +32}, /* 26x A..Z → a..z Dubs */ 837 | }; 838 | l = 0; 839 | r = n = sizeof(kLower) / sizeof(kLower[0]); 840 | while (l < r) { 841 | m = (l + r) >> 1; 842 | if (kLower[m].b < c) { 843 | l = m + 1; 844 | } else { 845 | r = m; 846 | } 847 | } 848 | if (l < n && kLower[l].a <= c && c <= kLower[l].b) { 849 | return c + kLower[l].d; 850 | } else { 851 | return c; 852 | } 853 | } 854 | } else { 855 | static struct { 856 | unsigned a; 857 | unsigned b; 858 | short d; 859 | } kAstralLower[] = { 860 | {0x10400, 0x10427, +40}, /* 40x 𐐀 ..𐐧 → 𐐨 ..𐑏 Deseret */ 861 | {0x104b0, 0x104d3, +40}, /* 36x 𐒰 ..𐓓 → 𐓘 ..𐓻 Osage */ 862 | {0x1d400, 0x1d419, +26}, /* 26x 𝐀 ..𝐙 → 𝐚 ..𝐳 Math */ 863 | {0x1d43c, 0x1d44d, +26}, /* 18x 𝐼 ..𝑍 → 𝑖 ..𝑧 Math */ 864 | {0x1d468, 0x1d481, +26}, /* 26x 𝑨 ..𝒁 → 𝒂 ..𝒛 Math */ 865 | {0x1d4ae, 0x1d4b5, +26}, /* 8x 𝒮 ..𝒵 → 𝓈 ..𝓏 Math */ 866 | {0x1d4d0, 0x1d4e9, +26}, /* 26x 𝓐 ..𝓩 → 𝓪 ..𝔃 Math */ 867 | {0x1d50d, 0x1d514, +26}, /* 8x 𝔍 ..𝔔 → 𝔧 ..𝔮 Math */ 868 | {0x1d56c, 0x1d585, +26}, /* 26x 𝕬 ..𝖅 → 𝖆 ..𝖟 Math */ 869 | {0x1d5a0, 0x1d5b9, +26}, /* 26x 𝖠 ..𝖹 → 𝖺 ..𝗓 Math */ 870 | {0x1d5d4, 0x1d5ed, +26}, /* 26x 𝗔 ..𝗭 → 𝗮 ..𝘇 Math */ 871 | {0x1d608, 0x1d621, +26}, /* 26x 𝘈 ..𝘡 → 𝘢 ..𝘻 Math */ 872 | {0x1d63c, 0x1d655, -442}, /* 26x 𝘼 ..𝙕 → 𝒂 ..𝒛 Math */ 873 | {0x1d670, 0x1d689, +26}, /* 26x 𝙰 ..𝚉 → 𝚊 ..𝚣 Math */ 874 | {0x1d6a8, 0x1d6b8, +26}, /* 17x 𝚨 ..𝚸 → 𝛂 ..𝛒 Math */ 875 | {0x1d6e2, 0x1d6f2, +26}, /* 17x 𝛢 ..𝛲 → 𝛼 ..𝜌 Math */ 876 | {0x1d71c, 0x1d72c, +26}, /* 17x 𝜜 ..𝜬 → 𝜶 ..𝝆 Math */ 877 | {0x1d756, 0x1d766, +26}, /* 17x 𝝖 ..𝝦 → 𝝰 ..𝞀 Math */ 878 | {0x1d790, 0x1d7a0, -90}, /* 17x 𝞐 ..𝞠 → 𝜶 ..𝝆 Math */ 879 | }; 880 | l = 0; 881 | r = n = sizeof(kAstralLower) / 
sizeof(kAstralLower[0]); 882 | while (l < r) { 883 | m = (l + r) >> 1; 884 | if (kAstralLower[m].b < c) { 885 | l = m + 1; 886 | } else { 887 | r = m; 888 | } 889 | } 890 | if (l < n && kAstralLower[l].a <= c && c <= kAstralLower[l].b) { 891 | return c + kAstralLower[l].d; 892 | } else { 893 | return c; 894 | } 895 | } 896 | } 897 | 898 | unsigned bestlineUppercase(unsigned c) { 899 | int m, l, r, n; 900 | if (c < 0200) { 901 | if ('a' <= c && c <= 'z') { 902 | return c - 32; 903 | } else { 904 | return c; 905 | } 906 | } else if (c <= 0xffff) { 907 | if ((0x0101 <= c && c <= 0x0177) || /* 60x ā..ŵ → Ā..ā Watin-A */ 908 | (0x01df <= c && c <= 0x01ef) || /* 9x ǟ..ǯ → Ǟ..Ǯ Watin-B */ 909 | (0x01f8 <= c && c <= 0x021e) || /* 20x ǹ..ȟ → Ǹ..Ȟ Watin-B */ 910 | (0x0222 <= c && c <= 0x0232) || /* 9x ȣ..ȳ → Ȣ..Ȳ Watin-B */ 911 | (0x1e01 <= c && c <= 0x1eff)) { /*256x ḁ..ỿ → Ḁ..Ỿ Watin-C */ 912 | if (c == 0x0131) 913 | return c + 232; 914 | if (c == 0x1e9e) 915 | return c; 916 | return c - (c & 1); 917 | } else if (0x01d0 <= c && c <= 0x01dc) { 918 | return c - (~c & 1); /* 7x ǐ..ǜ → Ǐ..Ǜ Watin-B */ 919 | } else if (0xab70 <= c && c <= 0xabbf) { 920 | return c - 38864; /* 80x ꭰ ..ꮿ → Ꭰ ..Ꮿ Cherokee Supplement */ 921 | } else { 922 | static const struct { 923 | unsigned short a; 924 | unsigned short b; 925 | short d; 926 | } kUpper[] = { 927 | {0x00b5, 0x00b5, +743}, /* 1x µ ..µ → Μ ..Μ Watin */ 928 | {0x00e0, 0x00f6, -32}, /* 23x à ..ö → À ..Ö Watin */ 929 | {0x00f8, 0x00fe, -32}, /* 7x ø ..þ → Ø ..Þ Watin */ 930 | {0x00ff, 0x00ff, +121}, /* 1x ÿ ..ÿ → Ÿ ..Ÿ Watin */ 931 | {0x017a, 0x017a, -1}, /* 1x ź ..ź → Ź ..Ź Watin-A */ 932 | {0x017c, 0x017c, -1}, /* 1x ż ..ż → Ż ..Ż Watin-A */ 933 | {0x017e, 0x017e, -1}, /* 1x ž ..ž → Ž ..Ž Watin-A */ 934 | {0x017f, 0x017f, -300}, /* 1x ſ ..ſ → S ..S Watin-A */ 935 | {0x0180, 0x0180, +195}, /* 1x ƀ ..ƀ → Ƀ ..Ƀ Watin-B */ 936 | {0x0183, 0x0183, -1}, /* 1x ƃ ..ƃ → Ƃ ..Ƃ Watin-B */ 937 | {0x0185, 0x0185, -1}, /* 1x ƅ ..ƅ → Ƅ ..Ƅ Watin-B 
*/ 938 | {0x0188, 0x0188, -1}, /* 1x ƈ ..ƈ → Ƈ ..Ƈ Watin-B */ 939 | {0x018c, 0x018c, -1}, /* 1x ƌ ..ƌ → Ƌ ..Ƌ Watin-B */ 940 | {0x0192, 0x0192, -1}, /* 1x ƒ ..ƒ → Ƒ ..Ƒ Watin-B */ 941 | {0x0195, 0x0195, +97}, /* 1x ƕ ..ƕ → Ƕ ..Ƕ Watin-B */ 942 | {0x0199, 0x0199, -1}, /* 1x ƙ ..ƙ → Ƙ ..Ƙ Watin-B */ 943 | {0x019a, 0x019a, +163}, /* 1x ƚ ..ƚ → Ƚ ..Ƚ Watin-B */ 944 | {0x019e, 0x019e, +130}, /* 1x ƞ ..ƞ → Ƞ ..Ƞ Watin-B */ 945 | {0x01a1, 0x01a1, -1}, /* 1x ơ ..ơ → Ơ ..Ơ Watin-B */ 946 | {0x01a3, 0x01a3, -1}, /* 1x ƣ ..ƣ → Ƣ ..Ƣ Watin-B */ 947 | {0x01a5, 0x01a5, -1}, /* 1x ƥ ..ƥ → Ƥ ..Ƥ Watin-B */ 948 | {0x01a8, 0x01a8, -1}, /* 1x ƨ ..ƨ → Ƨ ..Ƨ Watin-B */ 949 | {0x01ad, 0x01ad, -1}, /* 1x ƭ ..ƭ → Ƭ ..Ƭ Watin-B */ 950 | {0x01b0, 0x01b0, -1}, /* 1x ư ..ư → Ư ..Ư Watin-B */ 951 | {0x01b4, 0x01b4, -1}, /* 1x ƴ ..ƴ → Ƴ ..Ƴ Watin-B */ 952 | {0x01b6, 0x01b6, -1}, /* 1x ƶ ..ƶ → Ƶ ..Ƶ Watin-B */ 953 | {0x01b9, 0x01b9, -1}, /* 1x ƹ ..ƹ → Ƹ ..Ƹ Watin-B */ 954 | {0x01bd, 0x01bd, -1}, /* 1x ƽ ..ƽ → Ƽ ..Ƽ Watin-B */ 955 | {0x01bf, 0x01bf, +56}, /* 1x ƿ ..ƿ → Ƿ ..Ƿ Watin-B */ 956 | {0x01c5, 0x01c5, -1}, /* 1x Dž ..Dž → DŽ ..DŽ Watin-B */ 957 | {0x01c6, 0x01c6, -2}, /* 1x dž ..dž → DŽ ..DŽ Watin-B */ 958 | {0x01c8, 0x01c8, -1}, /* 1x Lj ..Lj → LJ ..LJ Watin-B */ 959 | {0x01c9, 0x01c9, -2}, /* 1x lj ..lj → LJ ..LJ Watin-B */ 960 | {0x01cb, 0x01cb, -1}, /* 1x Nj ..Nj → NJ ..NJ Watin-B */ 961 | {0x01cc, 0x01cc, -2}, /* 1x nj ..nj → NJ ..NJ Watin-B */ 962 | {0x01ce, 0x01ce, -1}, /* 1x ǎ ..ǎ → Ǎ ..Ǎ Watin-B */ 963 | {0x01dd, 0x01dd, -79}, /* 1x ǝ ..ǝ → Ǝ ..Ǝ Watin-B */ 964 | {0x01f2, 0x01f2, -1}, /* 1x Dz ..Dz → DZ ..DZ Watin-B */ 965 | {0x01f3, 0x01f3, -2}, /* 1x dz ..dz → DZ ..DZ Watin-B */ 966 | {0x01f5, 0x01f5, -1}, /* 1x ǵ ..ǵ → Ǵ ..Ǵ Watin-B */ 967 | {0x023c, 0x023c, -1}, /* 1x ȼ ..ȼ → Ȼ ..Ȼ Watin-B */ 968 | {0x023f, 0x0240, +10815}, /* 2x ȿ ..ɀ → Ȿ ..Ɀ Watin-B */ 969 | {0x0242, 0x0242, -1}, /* 1x ɂ ..ɂ → Ɂ ..Ɂ Watin-B */ 970 | {0x0247, 0x0247, -1}, /* 1x ɇ ..ɇ → Ɇ ..Ɇ Watin-B */ 971 | 
{0x0249, 0x0249, -1}, /* 1x ɉ ..ɉ → Ɉ ..Ɉ Watin-B */ 972 | {0x024b, 0x024b, -1}, /* 1x ɋ ..ɋ → Ɋ ..Ɋ Watin-B */ 973 | {0x024d, 0x024d, -1}, /* 1x ɍ ..ɍ → Ɍ ..Ɍ Watin-B */ 974 | {0x024f, 0x024f, -1}, /* 1x ɏ ..ɏ → Ɏ ..Ɏ Watin-B */ 975 | {0x037b, 0x037d, +130}, /* 3x ͻ ..ͽ → Ͻ ..Ͽ Greek */ 976 | {0x03ac, 0x03ac, -38}, /* 1x ά ..ά → Ά ..Ά Greek */ 977 | {0x03ad, 0x03af, -37}, /* 3x έ ..ί → Έ ..Ί Greek */ 978 | {0x03b1, 0x03c1, -32}, /* 17x α ..ρ → Α ..Ρ Greek */ 979 | {0x03c2, 0x03c2, -31}, /* 1x ς ..ς → Σ ..Σ Greek */ 980 | {0x03c3, 0x03cb, -32}, /* 9x σ ..ϋ → Σ ..Ϋ Greek */ 981 | {0x03cc, 0x03cc, -64}, /* 1x ό ..ό → Ό ..Ό Greek */ 982 | {0x03cd, 0x03ce, -63}, /* 2x ύ ..ώ → Ύ ..Ώ Greek */ 983 | {0x03d0, 0x03d0, -62}, /* 1x ϐ ..ϐ → Β ..Β Greek */ 984 | {0x03d1, 0x03d1, -57}, /* 1x ϑ ..ϑ → Θ ..Θ Greek */ 985 | {0x03d5, 0x03d5, -47}, /* 1x ϕ ..ϕ → Φ ..Φ Greek */ 986 | {0x03d6, 0x03d6, -54}, /* 1x ϖ ..ϖ → Π ..Π Greek */ 987 | {0x03dd, 0x03dd, -1}, /* 1x ϝ ..ϝ → Ϝ ..Ϝ Greek */ 988 | {0x03f0, 0x03f0, -86}, /* 1x ϰ ..ϰ → Κ ..Κ Greek */ 989 | {0x03f1, 0x03f1, -80}, /* 1x ϱ ..ϱ → Ρ ..Ρ Greek */ 990 | {0x03f5, 0x03f5, -96}, /* 1x ϵ ..ϵ → Ε ..Ε Greek */ 991 | {0x0430, 0x044f, -32}, /* 32x а ..я → А ..Я Cyrillic */ 992 | {0x0450, 0x045f, -80}, /* 16x ѐ ..џ → Ѐ ..Џ Cyrillic */ 993 | {0x0461, 0x0461, -1}, /* 1x ѡ ..ѡ → Ѡ ..Ѡ Cyrillic */ 994 | {0x0463, 0x0463, -1}, /* 1x ѣ ..ѣ → Ѣ ..Ѣ Cyrillic */ 995 | {0x0465, 0x0465, -1}, /* 1x ѥ ..ѥ → Ѥ ..Ѥ Cyrillic */ 996 | {0x0473, 0x0473, -1}, /* 1x ѳ ..ѳ → Ѳ ..Ѳ Cyrillic */ 997 | {0x0491, 0x0491, -1}, /* 1x ґ ..ґ → Ґ ..Ґ Cyrillic */ 998 | {0x0499, 0x0499, -1}, /* 1x ҙ ..ҙ → Ҙ ..Ҙ Cyrillic */ 999 | {0x049b, 0x049b, -1}, /* 1x қ ..қ → Қ ..Қ Cyrillic */ 1000 | {0x0561, 0x0586, -48}, /* 38x ա ..ֆ → Ա ..Ֆ Armenian */ 1001 | {0x10d0, 0x10fa, +3008}, /* 43x ა ..ჺ → Ა ..Ჺ Georgian */ 1002 | {0x10fd, 0x10ff, +3008}, /* 3x ჽ ..ჿ → Ჽ ..Ჿ Georgian */ 1003 | {0x13f8, 0x13fd, -8}, /* 6x ᏸ ..ᏽ → Ᏸ ..Ᏽ Cherokee */ 1004 | {0x214e, 0x214e, -28}, /* 1x ⅎ ..ⅎ → 
Ⅎ ..Ⅎ Letterlike */ 1005 | {0x2170, 0x217f, -16}, /* 16x ⅰ ..ⅿ → Ⅰ ..Ⅿ Numbery */ 1006 | {0x2184, 0x2184, -1}, /* 1x ↄ ..ↄ → Ↄ ..Ↄ Numbery */ 1007 | {0x24d0, 0x24e9, -26}, /* 26x ⓐ ..ⓩ → Ⓐ ..Ⓩ Enclosed */ 1008 | {0x2c30, 0x2c5e, -48}, /* 47x ⰰ ..ⱞ → Ⰰ ..Ⱞ Glagolitic */ 1009 | {0x2d00, 0x2d25, -7264}, /* 38x ⴀ ..ⴥ → Ⴀ ..Ⴥ Georgian2 */ 1010 | {0x2d27, 0x2d27, -7264}, /* 1x ⴧ ..ⴧ → Ⴧ ..Ⴧ Georgian2 */ 1011 | {0x2d2d, 0x2d2d, -7264}, /* 1x ⴭ ..ⴭ → Ⴭ ..Ⴭ Georgian2 */ 1012 | {0xff41, 0xff5a, -32}, /* 26x a..z → A..Z Dubs */ 1013 | }; 1014 | l = 0; 1015 | r = n = sizeof(kUpper) / sizeof(kUpper[0]); 1016 | while (l < r) { 1017 | m = (l + r) >> 1; 1018 | if (kUpper[m].b < c) { 1019 | l = m + 1; 1020 | } else { 1021 | r = m; 1022 | } 1023 | } 1024 | if (l < n && kUpper[l].a <= c && c <= kUpper[l].b) { 1025 | return c + kUpper[l].d; 1026 | } else { 1027 | return c; 1028 | } 1029 | } 1030 | } else { 1031 | static const struct { 1032 | unsigned a; 1033 | unsigned b; 1034 | short d; 1035 | } kAstralUpper[] = { 1036 | {0x10428, 0x1044f, -40}, /* 40x 𐐨..𐑏 → 𐐀..𐐧 Deseret */ 1037 | {0x104d8, 0x104fb, -40}, /* 36x 𐓘..𐓻 → 𐒰..𐓓 Osage */ 1038 | {0x1d41a, 0x1d433, -26}, /* 26x 𝐚..𝐳 → 𝐀..𝐙 Math */ 1039 | {0x1d456, 0x1d467, -26}, /* 18x 𝑖..𝑧 → 𝐼..𝑍 Math */ 1040 | {0x1d482, 0x1d49b, -26}, /* 26x 𝒂..𝒛 → 𝑨..𝒁 Math */ 1041 | {0x1d4c8, 0x1d4cf, -26}, /* 8x 𝓈..𝓏 → 𝒮..𝒵 Math */ 1042 | {0x1d4ea, 0x1d503, -26}, /* 26x 𝓪..𝔃 → 𝓐..𝓩 Math */ 1043 | {0x1d527, 0x1d52e, -26}, /* 8x 𝔧..𝔮 → 𝔍..𝔔 Math */ 1044 | {0x1d586, 0x1d59f, -26}, /* 26x 𝖆..𝖟 → 𝕬..𝖅 Math */ 1045 | {0x1d5ba, 0x1d5d3, -26}, /* 26x 𝖺..𝗓 → 𝖠..𝖹 Math */ 1046 | {0x1d5ee, 0x1d607, -26}, /* 26x 𝗮..𝘇 → 𝗔..𝗭 Math */ 1047 | {0x1d622, 0x1d63b, -26}, /* 26x 𝘢..𝘻 → 𝘈..𝘡 Math */ 1048 | {0x1d68a, 0x1d6a3, +442}, /* 26x 𝒂..𝒛 → 𝘼..𝙕 Math */ 1049 | {0x1d6c2, 0x1d6d2, -26}, /* 26x 𝚊..𝚣 → 𝙰..𝚉 Math */ 1050 | {0x1d6fc, 0x1d70c, -26}, /* 17x 𝛂..𝛒 → 𝚨..𝚸 Math */ 1051 | {0x1d736, 0x1d746, -26}, /* 17x 𝛼..𝜌 → 𝛢..𝛲 Math */ 1052 | {0x1d770, 0x1d780, -26}, /* 17x 
𝜶..𝝆 → 𝜜..𝜬 Math */ 1053 | {0x1d770, 0x1d756, -26}, /* 17x 𝝰..𝞀 → 𝝖..𝝦 Math */ 1054 | {0x1d736, 0x1d790, -90}, /* 17x 𝜶..𝝆 → 𝞐..𝞠 Math */ 1055 | }; 1056 | l = 0; 1057 | r = n = sizeof(kAstralUpper) / sizeof(kAstralUpper[0]); 1058 | while (l < r) { 1059 | m = (l + r) >> 1; 1060 | if (kAstralUpper[m].b < c) { 1061 | l = m + 1; 1062 | } else { 1063 | r = m; 1064 | } 1065 | } 1066 | if (l < n && kAstralUpper[l].a <= c && c <= kAstralUpper[l].b) { 1067 | return c + kAstralUpper[l].d; 1068 | } else { 1069 | return c; 1070 | } 1071 | } 1072 | } 1073 | 1074 | char bestlineNotSeparator(unsigned c) { 1075 | return !bestlineIsSeparator(c); 1076 | } 1077 | 1078 | static unsigned GetMirror(const unsigned short A[][2], size_t n, unsigned c) { 1079 | int l, m, r; 1080 | l = 0; 1081 | r = n - 1; 1082 | while (l <= r) { 1083 | m = (l + r) >> 1; 1084 | if (A[m][0] < c) { 1085 | l = m + 1; 1086 | } else if (A[m][0] > c) { 1087 | r = m - 1; 1088 | } else { 1089 | return A[m][1]; 1090 | } 1091 | } 1092 | return 0; 1093 | } 1094 | 1095 | unsigned bestlineMirrorLeft(unsigned c) { 1096 | static const unsigned short kMirrorRight[][2] = { 1097 | {L')', L'('}, {L']', L'['}, {L'}', L'{'}, {L'⁆', L'⁅'}, {L'⁾', L'⁽'}, 1098 | {L'₎', L'₍'}, {L'⌉', L'⌈'}, {L'⌋', L'⌊'}, {L'〉', L'〈'}, {L'❩', L'❨'}, 1099 | {L'❫', L'❪'}, {L'❭', L'❬'}, {L'❯', L'❮'}, {L'❱', L'❰'}, {L'❳', L'❲'}, 1100 | {L'❵', L'❴'}, {L'⟆', L'⟅'}, {L'⟧', L'⟦'}, {L'⟩', L'⟨'}, {L'⟫', L'⟪'}, 1101 | {L'⟭', L'⟬'}, {L'⟯', L'⟮'}, {L'⦄', L'⦃'}, {L'⦆', L'⦅'}, {L'⦈', L'⦇'}, 1102 | {L'⦊', L'⦉'}, {L'⦌', L'⦋'}, {L'⦎', L'⦏'}, {L'⦐', L'⦍'}, {L'⦒', L'⦑'}, 1103 | {L'⦔', L'⦓'}, {L'⦘', L'⦗'}, {L'⧙', L'⧘'}, {L'⧛', L'⧚'}, {L'⧽', L'⧼'}, 1104 | {L'﹚', L'﹙'}, {L'﹜', L'﹛'}, {L'﹞', L'﹝'}, {L'）', L'（'}, {L'］', L'［'}, 1105 | {L'｝', L'｛'}, {L'｣', L'｢'}, 1106 | }; 1107 | return GetMirror(kMirrorRight, sizeof(kMirrorRight) / sizeof(kMirrorRight[0]), c); 1108 | } 1109 | 1110 | unsigned bestlineMirrorRight(unsigned c) { 1111 | static const unsigned short kMirrorLeft[][2] 
= { 1112 | {L'(', L')'}, {L'[', L']'}, {L'{', L'}'}, {L'⁅', L'⁆'}, {L'⁽', L'⁾'}, 1113 | {L'₍', L'₎'}, {L'⌈', L'⌉'}, {L'⌊', L'⌋'}, {L'〈', L'〉'}, {L'❨', L'❩'}, 1114 | {L'❪', L'❫'}, {L'❬', L'❭'}, {L'❮', L'❯'}, {L'❰', L'❱'}, {L'❲', L'❳'}, 1115 | {L'❴', L'❵'}, {L'⟅', L'⟆'}, {L'⟦', L'⟧'}, {L'⟨', L'⟩'}, {L'⟪', L'⟫'}, 1116 | {L'⟬', L'⟭'}, {L'⟮', L'⟯'}, {L'⦃', L'⦄'}, {L'⦅', L'⦆'}, {L'⦇', L'⦈'}, 1117 | {L'⦉', L'⦊'}, {L'⦋', L'⦌'}, {L'⦍', L'⦐'}, {L'⦏', L'⦎'}, {L'⦑', L'⦒'}, 1118 | {L'⦓', L'⦔'}, {L'⦗', L'⦘'}, {L'⧘', L'⧙'}, {L'⧚', L'⧛'}, {L'⧼', L'⧽'}, 1119 | {L'﹙', L'﹚'}, {L'﹛', L'﹜'}, {L'﹝', L'﹞'}, {L'（', L'）'}, {L'［', L'］'}, 1120 | {L'｛', L'｝'}, {L'｢', L'｣'}, 1121 | }; 1122 | return GetMirror(kMirrorLeft, sizeof(kMirrorLeft) / sizeof(kMirrorLeft[0]), c); 1123 | } 1124 | 1125 | static char StartsWith(const char *s, const char *prefix) { 1126 | for (;;) { 1127 | if (!*prefix) 1128 | return 1; 1129 | if (!*s) 1130 | return 0; 1131 | if (*s++ != *prefix++) 1132 | return 0; 1133 | } 1134 | } 1135 | 1136 | static char EndsWith(const char *s, const char *suffix) { 1137 | size_t n, m; 1138 | n = strlen(s); 1139 | m = strlen(suffix); 1140 | if (m > n) 1141 | return 0; 1142 | return !memcmp(s + n - m, suffix, m); 1143 | } 1144 | 1145 | char bestlineIsXeparator(unsigned c) { 1146 | return (bestlineIsSeparator(c) && !bestlineMirrorLeft(c) && !bestlineMirrorRight(c)); 1147 | } 1148 | 1149 | static unsigned Capitalize(unsigned c) { 1150 | if (!iscapital) { 1151 | c = bestlineUppercase(c); 1152 | iscapital = 1; 1153 | } 1154 | return c; 1155 | } 1156 | 1157 | static inline int Bsr(unsigned long long x) { 1158 | #if defined(__GNUC__) && !defined(__STRICT_ANSI__) 1159 | int b; 1160 | b = __builtin_clzll(x); 1161 | b ^= sizeof(unsigned long long) * CHAR_BIT - 1; 1162 | return b; 1163 | #else 1164 | static const char kDebruijn[64] = { 1165 | 0, 47, 1, 56, 48, 27, 2, 60, 57, 49, 41, 37, 28, 16, 3, 61, 54, 58, 35, 52, 50, 42, 1166 | 21, 44, 38, 32, 29, 23, 17, 11, 4, 62, 46, 55, 26, 59, 40, 36, 15, 
53, 34, 51, 20, 43, 1167 | 31, 22, 10, 45, 25, 39, 14, 33, 19, 30, 9, 24, 13, 18, 8, 12, 7, 6, 5, 63, 1168 | }; 1169 | x |= x >> 1; 1170 | x |= x >> 2; 1171 | x |= x >> 4; 1172 | x |= x >> 8; 1173 | x |= x >> 16; 1174 | x |= x >> 32; 1175 | return kDebruijn[(x * 0x03f79d71b4cb0a89) >> 58]; 1176 | #endif 1177 | } 1178 | 1179 | static struct rune DecodeUtf8(int c) { 1180 | struct rune r; 1181 | if (c < 252) { 1182 | r.n = Bsr(255 & ~c); 1183 | r.c = c & (((1 << r.n) - 1) | 3); 1184 | r.n = 6 - r.n; 1185 | } else { 1186 | r.c = c & 3; 1187 | r.n = 5; 1188 | } 1189 | return r; 1190 | } 1191 | 1192 | static unsigned long long EncodeUtf8(unsigned c) { 1193 | static const unsigned short kTpEnc[32 - 7] = { 1194 | 1 | 0300 << 8, 1 | 0300 << 8, 1 | 0300 << 8, 1 | 0300 << 8, 2 | 0340 << 8, 1195 | 2 | 0340 << 8, 2 | 0340 << 8, 2 | 0340 << 8, 2 | 0340 << 8, 3 | 0360 << 8, 1196 | 3 | 0360 << 8, 3 | 0360 << 8, 3 | 0360 << 8, 3 | 0360 << 8, 4 | 0370 << 8, 1197 | 4 | 0370 << 8, 4 | 0370 << 8, 4 | 0370 << 8, 4 | 0370 << 8, 5 | 0374 << 8, 1198 | 5 | 0374 << 8, 5 | 0374 << 8, 5 | 0374 << 8, 5 | 0374 << 8, 5 | 0374 << 8, 1199 | }; 1200 | int e, n; 1201 | unsigned long long w; 1202 | if (c < 0200) 1203 | return c; 1204 | e = kTpEnc[Bsr(c) - 7]; 1205 | n = e & 0xff; 1206 | w = 0; 1207 | do { 1208 | w |= 0200 | (c & 077); 1209 | w <<= 8; 1210 | c >>= 6; 1211 | } while (--n); 1212 | return c | w | e >> 8; 1213 | } 1214 | 1215 | static struct rune GetUtf8(const char *p, size_t n) { 1216 | struct rune r; 1217 | if ((r.n = r.c = 0) < n && (r.c = p[r.n++] & 255) >= 0300) { 1218 | r.c = DecodeUtf8(r.c).c; 1219 | while (r.n < n && (p[r.n] & 0300) == 0200) { 1220 | r.c = r.c << 6 | (p[r.n++] & 077); 1221 | } 1222 | } 1223 | return r; 1224 | } 1225 | 1226 | static char *FormatUnsigned(char *p, unsigned x) { 1227 | char t; 1228 | size_t i, a, b; 1229 | i = 0; 1230 | do { 1231 | p[i++] = x % 10 + '0'; 1232 | x = x / 10; 1233 | } while (x > 0); 1234 | p[i] = '\0'; 1235 | if (i) { 1236 | for (a = 0, b 
= i - 1; a < b; ++a, --b) { 1237 | t = p[a]; 1238 | p[a] = p[b]; 1239 | p[b] = t; 1240 | } 1241 | } 1242 | return p + i; 1243 | } 1244 | 1245 | static void abInit(struct abuf *a) { 1246 | a->len = 0; 1247 | a->cap = 16; 1248 | a->b = (char *)malloc(a->cap); 1249 | a->b[0] = 0; 1250 | } 1251 | 1252 | static char abGrow(struct abuf *a, int need) { 1253 | int cap; 1254 | char *b; 1255 | cap = a->cap; 1256 | do 1257 | cap += cap / 2; 1258 | while (cap < need); 1259 | if (!(b = (char *)realloc(a->b, cap * sizeof(*a->b)))) 1260 | return 0; 1261 | a->cap = cap; 1262 | a->b = b; 1263 | return 1; 1264 | } 1265 | 1266 | static void abAppendw(struct abuf *a, unsigned long long w) { 1267 | char *p; 1268 | if (a->len + 8 > a->cap && !abGrow(a, a->len + 8)) 1269 | return; 1270 | p = a->b + a->len; 1271 | p[0] = (0x00000000000000FF & w) >> 000; 1272 | p[1] = (0x000000000000FF00 & w) >> 010; 1273 | p[2] = (0x0000000000FF0000 & w) >> 020; 1274 | p[3] = (0x00000000FF000000 & w) >> 030; 1275 | p[4] = (0x000000FF00000000 & w) >> 040; 1276 | p[5] = (0x0000FF0000000000 & w) >> 050; 1277 | p[6] = (0x00FF000000000000 & w) >> 060; 1278 | p[7] = (0xFF00000000000000 & w) >> 070; 1279 | a->len += w ? 
(Bsr(w) >> 3) + 1 : 1; 1280 | } 1281 | 1282 | static void abAppend(struct abuf *a, const char *s, int len) { 1283 | if (a->len + len + 1 > a->cap && !abGrow(a, a->len + len + 1)) 1284 | return; 1285 | memcpy(a->b + a->len, s, len); 1286 | a->b[a->len + len] = 0; 1287 | a->len += len; 1288 | } 1289 | 1290 | static void abAppends(struct abuf *a, const char *s) { 1291 | abAppend(a, s, strlen(s)); 1292 | } 1293 | 1294 | static void abAppendu(struct abuf *a, unsigned u) { 1295 | char b[11]; 1296 | abAppend(a, b, FormatUnsigned(b, u) - b); 1297 | } 1298 | 1299 | static void abFree(struct abuf *a) { 1300 | free(a->b); 1301 | a->b = 0; 1302 | } 1303 | 1304 | static size_t GetFdSize(int fd) { 1305 | struct stat st; 1306 | st.st_size = 0; 1307 | fstat(fd, &st); 1308 | return st.st_size; 1309 | } 1310 | 1311 | static char IsCharDev(int fd) { 1312 | struct stat st; 1313 | st.st_mode = 0; 1314 | fstat(fd, &st); 1315 | return (st.st_mode & S_IFMT) == S_IFCHR; 1316 | } 1317 | 1318 | static int MyRead(int fd, void *c, int); 1319 | static int MyWrite(int fd, const void *c, int); 1320 | static int MyPoll(int fd, int events, int to); 1321 | 1322 | static int (*_MyRead)(int fd, void *c, int n) = MyRead; 1323 | static int (*_MyWrite)(int fd, const void *c, int n) = MyWrite; 1324 | static int (*_MyPoll)(int fd, int events, int to) = MyPoll; 1325 | 1326 | static int WaitUntilReady(int fd, int events) { 1327 | return _MyPoll(fd, events, -1); 1328 | } 1329 | 1330 | static char HasPendingInput(int fd) { 1331 | return _MyPoll(fd, POLLIN, 0) == 1; 1332 | } 1333 | 1334 | static char *GetLineBlock(FILE *f) { 1335 | ssize_t rc; 1336 | char *p = 0; 1337 | size_t n, c = 0; 1338 | if ((rc = getdelim(&p, &c, '\n', f)) != EOF) { 1339 | for (n = rc; n; --n) { 1340 | if (p[n - 1] == '\r' || p[n - 1] == '\n') { 1341 | p[n - 1] = 0; 1342 | } else { 1343 | break; 1344 | } 1345 | } 1346 | return p; 1347 | } else { 1348 | free(p); 1349 | return 0; 1350 | } 1351 | } 1352 | 1353 | long 
bestlineReadCharacter(int fd, char *p, unsigned long n) { 1354 | int e; 1355 | size_t i; 1356 | ssize_t rc; 1357 | struct rune r; 1358 | unsigned char c; 1359 | enum { kAscii, kUtf8, kEsc, kCsi1, kCsi2, kSs, kNf, kStr, kStr2, kDone } t; 1360 | i = 0; 1361 | r.c = 0; 1362 | r.n = 0; 1363 | e = errno; 1364 | t = kAscii; 1365 | if (n) 1366 | p[0] = 0; 1367 | do { 1368 | for (;;) { 1369 | if (gotint) { 1370 | errno = EINTR; 1371 | return -1; 1372 | } 1373 | if (n) { 1374 | rc = _MyRead(fd, &c, 1); 1375 | } else { 1376 | rc = _MyRead(fd, 0, 0); 1377 | } 1378 | if (rc == -1 && errno == EINTR) { 1379 | if (!i) { 1380 | return -1; 1381 | } 1382 | } else if (rc == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) { 1383 | if (WaitUntilReady(fd, POLLIN) == -1) { 1384 | if (rc == -1 && errno == EINTR) { 1385 | if (!i) { 1386 | return -1; 1387 | } 1388 | } else { 1389 | return -1; 1390 | } 1391 | } 1392 | } else if (rc == -1) { 1393 | return -1; 1394 | } else if (!rc) { 1395 | if (!i) { 1396 | errno = e; 1397 | return 0; 1398 | } else { 1399 | errno = EILSEQ; 1400 | return -1; 1401 | } 1402 | } else { 1403 | break; 1404 | } 1405 | } 1406 | if (i + 1 < n) { 1407 | p[i] = c; 1408 | p[i + 1] = 0; 1409 | } else if (i < n) { 1410 | p[i] = 0; 1411 | } 1412 | ++i; 1413 | switch (t) { 1414 | Whoopsie: 1415 | if (n) 1416 | p[0] = c; 1417 | t = kAscii; 1418 | i = 1; 1419 | /* fallthrough */ 1420 | case kAscii: 1421 | if (c < 0200) { 1422 | if (c == 033) { 1423 | t = kEsc; 1424 | } else { 1425 | t = kDone; 1426 | } 1427 | } else if (c >= 0300) { 1428 | t = kUtf8; 1429 | r = DecodeUtf8(c); 1430 | } else { 1431 | /* ignore overlong sequences */ 1432 | } 1433 | break; 1434 | case kUtf8: 1435 | if ((c & 0300) == 0200) { 1436 | r.c <<= 6; 1437 | r.c |= c & 077; 1438 | if (!--r.n) { 1439 | switch (r.c) { 1440 | case 033: 1441 | t = kEsc; /* parsed but not canonicalized */ 1442 | break; 1443 | case 0x9b: 1444 | t = kCsi1; /* unusual but legal */ 1445 | break; 1446 | case 0x8e: /* SS2 (Single 
Shift Two) */ 1447 | case 0x8f: /* SS3 (Single Shift Three) */ 1448 | t = kSs; 1449 | break; 1450 | case 0x90: /* DCS (Device Control String) */ 1451 | case 0x98: /* SOS (Start of String) */ 1452 | case 0x9d: /* OSC (Operating System Command) */ 1453 | case 0x9e: /* PM (Privacy Message) */ 1454 | case 0x9f: /* APC (Application Program Command) */ 1455 | t = kStr; 1456 | break; 1457 | default: 1458 | t = kDone; 1459 | break; 1460 | } 1461 | } 1462 | } else { 1463 | goto Whoopsie; /* ignore underlong sequences if not eof */ 1464 | } 1465 | break; 1466 | case kEsc: 1467 | if (0x20 <= c && c <= 0x2f) { /* Nf */ 1468 | /* 1469 | * Almost no one uses ANSI Nf sequences 1470 | * They overlap with alt+graphic keystrokes 1471 | * We care more about being able to type alt-/ 1472 | */ 1473 | if (c == ' ' || c == '#') { 1474 | t = kNf; 1475 | } else { 1476 | t = kDone; 1477 | } 1478 | } else if (0x30 <= c && c <= 0x3f) { /* Fp */ 1479 | t = kDone; 1480 | } else if (0x20 <= c && c <= 0x5F) { /* Fe */ 1481 | switch (c) { 1482 | case '[': 1483 | t = kCsi1; 1484 | break; 1485 | case 'N': /* SS2 (Single Shift Two) */ 1486 | case 'O': /* SS3 (Single Shift Three) */ 1487 | t = kSs; 1488 | break; 1489 | case 'P': /* DCS (Device Control String) */ 1490 | case 'X': /* SOS (Start of String) */ 1491 | case ']': /* OSC (Operating System Command) */ 1492 | case '^': /* PM (Privacy Message) */ 1493 | case '_': /* APC (Application Program Command) */ 1494 | t = kStr; 1495 | break; 1496 | default: 1497 | t = kDone; 1498 | break; 1499 | } 1500 | } else if (0x60 <= c && c <= 0x7e) { /* Fs */ 1501 | t = kDone; 1502 | } else if (c == 033) { 1503 | if (i < 3) { 1504 | /* alt chording */ 1505 | } else { 1506 | t = kDone; /* esc mashing */ 1507 | i = 1; 1508 | } 1509 | } else { 1510 | t = kDone; 1511 | } 1512 | break; 1513 | case kSs: 1514 | t = kDone; 1515 | break; 1516 | case kNf: 1517 | if (0x30 <= c && c <= 0x7e) { 1518 | t = kDone; 1519 | } else if (!(0x20 <= c && c <= 0x2f)) { 1520 | goto 
Whoopsie;
            }
            break;
        case kCsi1:
            if (0x20 <= c && c <= 0x2f) {
                t = kCsi2;
            } else if (c == '[' && ((i == 3) || (i == 4 && p[1] == 033))) {
                /* linux function keys */
            } else if (0x40 <= c && c <= 0x7e) {
                t = kDone;
            } else if (!(0x30 <= c && c <= 0x3f)) {
                goto Whoopsie;
            }
            break;
        case kCsi2:
            if (0x40 <= c && c <= 0x7e) {
                t = kDone;
            } else if (!(0x20 <= c && c <= 0x2f)) {
                goto Whoopsie;
            }
            break;
        case kStr:
            switch (c) {
            case '\a':
                t = kDone;
                break;
            case 0033: /* ESC */
            case 0302: /* C1 (UTF-8) */
                t = kStr2;
                break;
            default:
                break;
            }
            break;
        case kStr2:
            switch (c) {
            case '\a':
            case '\\': /* ST (ASCII) */
            case 0234: /* ST (UTF-8) */
                t = kDone;
                break;
            default:
                t = kStr;
                break;
            }
            break;
        default:
            assert(0);
        }
    } while (t != kDone);
    errno = e;
    return i;
}

static char *GetLineChar(int fin, int fout) {
    size_t got;
    ssize_t rc;
    char seq[16];
    struct abuf a;
    struct sigaction sa[3];
    abInit(&a);
    gotint = 0;
    sigemptyset(&sa->sa_mask);
    sa->sa_flags = 0;
    sa->sa_handler = bestlineOnInt;
    sigaction(SIGINT, sa, sa + 1);
    sigaction(SIGQUIT, sa, sa + 2);
    for (;;) {
        if (gotint) {
            rc = -1;
            break;
        }
        if ((rc = bestlineReadCharacter(fin, seq, sizeof(seq))) == -1) {
            if (errno == EAGAIN || errno == EWOULDBLOCK) {
                if (WaitUntilReady(fin, POLLIN) > 0) {
                    continue;
                }
            }
            if (errno == EINTR) {
                continue;
            } else {
                break;
            }
        }
        if (!(got = rc)) {
            if (a.len) {
                break;
            } else {
                rc = -1;
                break;
            }
        }
        if (seq[0] == '\r') {
            if (HasPendingInput(fin)) {
                if ((rc = bestlineReadCharacter(fin, seq + 1, sizeof(seq) - 1)) > 0) {
                    /* the byte after CR was read into seq[1]; seq[0] is still CR */
                    if (seq[1] == '\n') {
                        break;
                    }
                } else {
                    rc = -1;
                    break;
                }
            } else {
                _MyWrite(fout, "\n", 1);
                break;
            }
        } else if (seq[0] == Ctrl('D')) {
            break;
        } else if (seq[0] == '\n') {
            break;
        } else if (seq[0] == '\b') {
            while (a.len && (a.b[a.len - 1] & 0300) == 0200)
                --a.len;
            if (a.len)
                --a.len;
        }
        if (!IsControl(seq[0])) {
            abAppend(&a, seq, got);
        }
    }
    sigaction(SIGQUIT, sa + 2, 0);
    sigaction(SIGINT, sa + 1, 0);
    if (gotint) {
        abFree(&a);
        raise(gotint);
        errno = EINTR;
        rc = -1;
    }
    if (rc != -1) {
        return a.b;
    } else {
        abFree(&a);
        return 0;
    }
}

static char *GetLine(FILE *in, FILE *out) {
    if (!IsCharDev(fileno(in))) {
        return GetLineBlock(in);
    } else {
        return GetLineChar(fileno(in), fileno(out));
    }
}

static char *Copy(char *d, const char *s, size_t n) {
    memcpy(d, s, n);
    return d + n;
}

static int CompareStrings(const char *a, const char *b) {
    size_t i;
    int x, y, c;
    for (i = 0;; ++i) {
        x = bestlineLowercase(a[i] & 255);
        y = bestlineLowercase(b[i] & 255);
        if ((c = x - y) || !x) {
            return c;
        }
    }
}

static const char *FindSubstringReverse(const char *p, size_t n, const char *q, size_t m) {
    size_t i;
    if (m <= n) {
        n -= m;
        do {
            for (i = 0; i < m; ++i) {
                if (p[n + i] != q[i]) {
                    break;
                }
            }
            if (i == m) {
                return p + n;
            }
        } while (n--);
    }
    return 0;
}

static int
ParseUnsigned(const char *s, void *e) {
    int c, x;
    for (x = 0; (c = *s++);) {
        if ('0' <= c && c <= '9') {
            x = Min(c - '0' + x * 10, 32767);
        } else {
            break;
        }
    }
    /*
     * the loop always advances one past the byte that stopped it, so
     * step back so *e points at the first unparsed byte (e.g. the ';'
     * that GetTerminalSize() checks for in a cursor position report)
     */
    if (e)
        *(const char **)e = s - 1;
    return x;
}

/**
 * Returns UNICODE CJK Monospace Width of string.
 *
 * Control codes and ANSI sequences have a width of zero. We only parse
 * a limited subset of ANSI here since we don't store ANSI codes in the
 * bestlineState::buf, but we do encourage CSI color codes in prompts.
 */
static size_t GetMonospaceWidth(const char *p, size_t n, char *out_haswides) {
    int c, d;
    size_t i, w;
    struct rune r;
    char haswides;
    enum { kAscii, kUtf8, kEsc, kCsi1, kCsi2 } t;
    for (haswides = r.c = r.n = w = i = 0, t = kAscii; i < n; ++i) {
        c = p[i] & 255;
        switch (t) {
        Whoopsie:
            t = kAscii;
            /* fallthrough */
        case kAscii:
            if (c < 0200) {
                if (c == 033) {
                    t = kEsc;
                } else {
                    ++w;
                }
            } else if (c >= 0300) {
                t = kUtf8;
                r = DecodeUtf8(c);
            }
            break;
        case kUtf8:
            if ((c & 0300) == 0200) {
                r.c <<= 6;
                r.c |= c & 077;
                if (!--r.n) {
                    d = GetMonospaceCharacterWidth(r.c);
                    d = Max(0, d);
                    w += d;
                    haswides |= d > 1;
                    t = kAscii;
                    break;
                }
            } else {
                goto Whoopsie;
            }
            break;
        case kEsc:
            if (c == '[') {
                t = kCsi1;
            } else {
                t = kAscii;
            }
            break;
        case kCsi1:
            if (0x20 <= c && c <= 0x2f) {
                t = kCsi2;
            } else if (0x40 <= c && c <= 0x7e) {
                t = kAscii;
            } else if (!(0x30 <= c && c <= 0x3f)) {
                goto Whoopsie;
            }
            break;
        case kCsi2:
            if (0x40 <= c && c <= 0x7e) {
                t = kAscii;
            } else if (!(0x20 <=
c && c <= 0x2f)) {
                goto Whoopsie;
            }
            break;
        default:
            assert(0);
        }
    }
    if (out_haswides) {
        *out_haswides = haswides;
    }
    return w;
}

static int bestlineIsUnsupportedTerm(void) {
    size_t i;
    char *term;
    static char once, res;
    if (!once) {
        if ((term = getenv("TERM"))) {
            for (i = 0; i < sizeof(kUnsupported) / sizeof(*kUnsupported); i++) {
                if (!CompareStrings(term, kUnsupported[i])) {
                    res = 1;
                    break;
                }
            }
        }
        once = 1;
    }
    return res;
}

static int enableRawMode(int fd) {
    struct termios raw;
    struct sigaction sa;
    if (tcgetattr(fd, &orig_termios) != -1) {
        raw = orig_termios;
        raw.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
        raw.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
        raw.c_iflag |= IUTF8;
        raw.c_cflag |= CS8;
        raw.c_cc[VMIN] = 1;
        raw.c_cc[VTIME] = 0;
        if (tcsetattr(fd, TCSANOW, &raw) != -1) {
            sa.sa_flags = 0;
            sa.sa_handler = bestlineOnCont;
            sigemptyset(&sa.sa_mask);
            sigaction(SIGCONT, &sa, &orig_cont);
            sa.sa_handler = bestlineOnWinch;
            sigaction(SIGWINCH, &sa, &orig_winch);
            rawmode = fd;
            gotwinch = 0;
            gotcont = 0;
            return 0;
        }
    }
    errno = ENOTTY;
    return -1;
}

static void bestlineUnpause(int fd) {
    if (ispaused) {
        tcflow(fd, TCOON);
        ispaused = 0;
    }
}

void bestlineDisableRawMode(void) {
    if (rawmode != -1) {
        bestlineUnpause(rawmode);
        sigaction(SIGCONT, &orig_cont, 0);
        sigaction(SIGWINCH, &orig_winch, 0);
        tcsetattr(rawmode, TCSANOW, &orig_termios);
        rawmode = -1;
    }
}

static int bestlineWrite(int fd, const void *p, size_t n) {
    ssize_t rc;
    size_t wrote;
    do {
        for (;;) {
            if (gotint) {
                errno = EINTR;
                return -1;
            }
            if (ispaused) {
                return 0;
            }
            rc = _MyWrite(fd, p, n);
            if (rc == -1 && errno == EINTR) {
                continue;
            } else if (rc == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
                if (WaitUntilReady(fd, POLLOUT) == -1) {
                    if (errno == EINTR) {
                        continue;
                    } else {
                        return -1;
                    }
                }
            } else {
                break;
            }
        }
        if (rc != -1) {
            wrote = rc;
            n -= wrote;
            p = (char *)p + wrote;
        } else {
            return -1;
        }
    } while (n);
    return 0;
}

static int bestlineWriteStr(int fd, const char *p) {
    return bestlineWrite(fd, p, strlen(p));
}

static ssize_t bestlineRead(int fd, char *buf, size_t size, struct bestlineState *l) {
    size_t got;
    ssize_t rc;
    int refreshme;
    do {
        refreshme = 0;
        if (gotint) {
            errno = EINTR;
            return -1;
        }
        if (gotcont && rawmode != -1) {
            enableRawMode(rawmode);
            if (l)
                refreshme = 1;
        }
        if (gotwinch && l) {
            refreshme = 1;
        }
        if (refreshme)
            bestlineRefreshLine(l);
        rc = bestlineReadCharacter(fd, buf, size);
    } while (rc == -1 && errno == EINTR);
    if (rc != -1) {
        got = rc;
        if (got > 0 && l) {
            memcpy(l->seq[1], l->seq[0], sizeof(l->seq[0]));
            memset(l->seq[0], 0, sizeof(l->seq[0]));
            memcpy(l->seq[0], buf, Min(Min(size, got), sizeof(l->seq[0]) - 1));
        }
    }
    return rc;
}

/**
 * Returns size of current terminal window.
 *
 * 1. Tries asking termios (works for pseudoteletypewriters)
 * 2. Checks the ROWS and COLUMNS environment variables (COLUMNS is set by Emacs)
 * 3. Falls back to inband signalling (works w/ pipe or serial)
 * 4.
Otherwise we conservatively assume 80 columns and 24 rows
 *
 * @param ws should be initialized by caller to zero before first call
 * @param ifd is input file descriptor
 * @param ofd is output file descriptor
 * @return window size
 */
static struct winsize GetTerminalSize(struct winsize ws, int ifd, int ofd) {
    int x;
    ssize_t n;
    char *p, *s, b[16];
    ioctl(ofd, TIOCGWINSZ, &ws);
    if ((!ws.ws_row && (s = getenv("ROWS")) && (x = ParseUnsigned(s, 0)))) {
        ws.ws_row = x;
    }
    if ((!ws.ws_col && (s = getenv("COLUMNS")) && (x = ParseUnsigned(s, 0)))) {
        ws.ws_col = x;
    }
    if (((!ws.ws_col || !ws.ws_row) && bestlineRead(ifd, 0, 0, 0) != -1 &&
         bestlineWriteStr(ofd, "\0337"           /* save position */
                               "\033[9979;9979H" /* move cursor to bottom right corner */
                               "\033[6n"         /* report position */
                               "\0338") != -1 && /* restore position */
         (n = bestlineRead(ifd, b, sizeof(b), 0)) != -1 &&
         n && b[0] == 033 && b[1] == '[' && b[n - 1] == 'R')) {
        p = b + 2;
        if ((x = ParseUnsigned(p, &p)))
            ws.ws_row = x;
        if (*p++ == ';' && (x = ParseUnsigned(p, 0)))
            ws.ws_col = x;
    }
    if (!ws.ws_col)
        ws.ws_col = 80;
    if (!ws.ws_row)
        ws.ws_row = 24;
    return ws;
}

/* Clear the screen.
Used to handle ctrl+l */
void bestlineClearScreen(int fd) {
    bestlineWriteStr(fd, "\033[H"    /* move cursor to top left corner */
                         "\033[2J"); /* erase display */
}

static void bestlineBeep(void) {
    /* THE TERMINAL BELL IS DEAD - HISTORY HAS KILLED IT */
}

static char bestlineGrow(struct bestlineState *ls, size_t n) {
    char *p;
    size_t m;
    m = ls->buflen;
    if (m >= n)
        return 1;
    do
        m += m >> 1;
    while (m < n);
    if (!(p = (char *)realloc(ls->buf, m * sizeof(*ls->buf))))
        return 0;
    ls->buf = p;
    ls->buflen = m;
    return 1;
}

/* This is a helper function for bestlineEdit() and is called when the
 * user types the TAB key in order to complete the string currently in the
 * input.
 *
 * The state of the editing is encapsulated into the pointed bestlineState
 * structure as described in the structure definition.
*/
static ssize_t bestlineCompleteLine(struct bestlineState *ls, char *seq, int size) {
    ssize_t nread;
    size_t i, n, stop;
    bestlineCompletions lc;
    struct bestlineState original, saved;
    nread = 0;
    memset(&lc, 0, sizeof(lc));
    completionCallback(ls->buf, ls->pos, &lc);
    if (!lc.len) {
        bestlineBeep();
    } else {
        i = 0;
        stop = 0;
        original = *ls;
        while (!stop) {
            /* Show completion or original buffer */
            if (i < lc.len) {
                saved = *ls;
                ls->len = strlen(lc.cvec[i]);
                ls->pos = original.pos + ls->len - original.len;
                ls->buf = lc.cvec[i];
                bestlineRefreshLine(ls);
                ls->len = saved.len;
                ls->pos = saved.pos;
                ls->buf = saved.buf;
                if (lc.len == 1) {
                    nread = 0;
                    goto FinishQuickly;
                }
            } else {
                bestlineRefreshLine(ls);
            }
            if ((nread = bestlineRead(ls->ifd, seq, size, ls)) <= 0) {
                bestlineFreeCompletions(&lc);
                return -1;
            }
            switch (seq[0]) {
            case '\t':
                i = (i + 1) % (lc.len + 1);
                if (i == lc.len) {
                    bestlineBeep();
                }
                break;
            default:
                if (i < lc.len) {
                FinishQuickly:
                    n = strlen(lc.cvec[i]);
                    if (bestlineGrow(ls, n + 1)) {
                        memcpy(ls->buf, lc.cvec[i], n + 1);
                        ls->len = n;
                        ls->pos = original.pos + n - original.len;
                    }
                }
                stop = 1;
                break;
            }
        }
    }
    bestlineFreeCompletions(&lc);
    return nread;
}

static void bestlineEditHistoryGoto(struct bestlineState *l, unsigned i) {
    size_t n;
    if (historylen <= 1)
        return;
    if (i > historylen - 1)
        return;
    i = Max(Min(i, historylen - 1), 0);
    free(history[historylen - 1 - l->hindex]);
    history[historylen - 1 - l->hindex] = strdup(l->buf);
    l->hindex = i;
    n =
strlen(history[historylen - 1 - l->hindex]);
    bestlineGrow(l, n + 1);
    n = Min(n, l->buflen - 1);
    memcpy(l->buf, history[historylen - 1 - l->hindex], n);
    l->buf[n] = 0;
    l->len = l->pos = n;
    bestlineRefreshLine(l);
}

static void bestlineEditHistoryMove(struct bestlineState *l, int dx) {
    bestlineEditHistoryGoto(l, l->hindex + dx);
}

static char *bestlineMakeSearchPrompt(struct abuf *ab, int fail, const char *s, int n) {
    ab->len = 0;
    abAppendw(ab, '(');
    if (fail)
        abAppends(ab, "failed ");
    abAppends(ab, "reverse-i-search `\033[4m");
    abAppend(ab, s, n);
    abAppends(ab, "\033[24m");
    abAppends(ab, s + n);
    abAppendw(ab, Read32le("') "));
    return ab->b;
}

static int bestlineSearch(struct bestlineState *l, char *seq, int size) {
    char *p;
    char isstale;
    struct abuf ab;
    struct abuf prompt;
    unsigned i, j, k, matlen;
    const char *oldprompt, *q;
    int rc, fail, added, oldpos, oldindex;
    if (historylen <= 1)
        return 0;
    abInit(&ab);
    abInit(&prompt);
    oldpos = l->pos;
    oldprompt = l->prompt;
    oldindex = l->hindex;
    for (fail = matlen = 0;;) {
        l->prompt = bestlineMakeSearchPrompt(&prompt, fail, ab.b, matlen);
        bestlineRefreshLine(l);
        fail = 1;
        added = 0;
        j = l->pos;
        i = l->hindex;
        rc = bestlineRead(l->ifd, seq, size, l);
        if (rc > 0) {
            if (seq[0] == Ctrl('?') || seq[0] == Ctrl('H')) {
                if (ab.len) {
                    --ab.len;
                    matlen = Min(matlen, ab.len);
                }
            } else if (seq[0] == Ctrl('R')) {
                if (j) {
                    --j;
                } else if (i + 1 < historylen) {
                    ++i;
                    j = strlen(history[historylen - 1 - i]);
                }
            } else if (seq[0] == Ctrl('G')) {
                bestlineEditHistoryGoto(l, oldindex);
                l->pos = oldpos;
                rc = 0;
                break;
            } else if (IsControl(seq[0])) { /* only sees canonical c0 */
                break;
            } else {
                abAppend(&ab, seq, rc);
                added = rc;
            }
        } else {
            break;
        }
        isstale = 0;
        while (i < historylen) {
            p = history[historylen - 1 - i];
            k = strlen(p);
            if (!isstale) {
                j = Min(k, j + ab.len);
            } else {
                isstale = 0;
                j = k;
            }
            if ((q = FindSubstringReverse(p, j, ab.b, ab.len))) {
                bestlineEditHistoryGoto(l, i);
                l->pos = q - p;
                fail = 0;
                if (added) {
                    matlen += added;
                    added = 0;
                }
                break;
            } else {
                isstale = 1;
                ++i;
            }
        }
    }
    l->prompt = oldprompt;
    bestlineRefreshLine(l);
    abFree(&prompt);
    abFree(&ab);
    bestlineRefreshLine(l);
    return rc;
}

static void bestlineRingFree(void) {
    size_t i;
    for (i = 0; i < BESTLINE_MAX_RING; ++i) {
        if (ring.p[i]) {
            free(ring.p[i]);
            ring.p[i] = 0;
        }
    }
}

static void bestlineRingPush(const char *p, size_t n) {
    char *q;
    if (!n)
        return;
    if (!(q = (char *)malloc(n + 1)))
        return;
    ring.i = (ring.i + 1) % BESTLINE_MAX_RING;
    free(ring.p[ring.i]);
    ring.p[ring.i] = (char *)memcpy(q, p, n);
    ring.p[ring.i][n] = 0;
}

static void bestlineRingRotate(void) {
    size_t i;
    for (i = 0; i < BESTLINE_MAX_RING; ++i) {
        ring.i = (ring.i - 1) % BESTLINE_MAX_RING;
        if (ring.p[ring.i])
            break;
    }
}

static char *bestlineRefreshHints(struct bestlineState *l) {
    char *hint;
    struct abuf ab;
    const char *ansi1 = "\033[90m", *ansi2 = "\033[39m";
    if (!hintsCallback)
        return 0;
    if (!(hint = hintsCallback(l->buf, &ansi1, &ansi2)))
        return 0;
    abInit(&ab);
    if (ansi1)
        abAppends(&ab, ansi1);
    abAppends(&ab, hint);
    if (ansi2)
        abAppends(&ab, ansi2);
    if (freeHintsCallback)
        freeHintsCallback(hint);
    return ab.b;
}

static size_t Backward(struct bestlineState *l, size_t pos) {
    if (pos) {
        do
            --pos;
        while (pos && (l->buf[pos] & 0300) == 0200);
    }
    return pos;
}

static int bestlineEditMirrorLeft(struct bestlineState *l, int res[2]) {
    unsigned c, pos, left, right, depth, index;
    if ((pos = Backward(l, l->pos))) {
        right = GetUtf8(l->buf + pos, l->len - pos).c;
        if ((left = bestlineMirrorLeft(right))) {
            depth = 0;
            index = pos;
            do {
                pos = Backward(l, pos);
                c = GetUtf8(l->buf + pos, l->len - pos).c;
                if (c == right) {
                    ++depth;
                } else if (c == left) {
                    if (depth) {
                        --depth;
                    } else {
                        res[0] = pos;
                        res[1] = index;
                        return 0;
                    }
                }
            } while (pos);
        }
    }
    return -1;
}

static int bestlineEditMirrorRight(struct bestlineState *l, int res[2]) {
    struct rune rune;
    unsigned pos, left, right, depth, index;
    pos = l->pos;
    rune = GetUtf8(l->buf + pos, l->len - pos);
    left = rune.c;
    if ((right = bestlineMirrorRight(left))) {
        depth = 0;
        index = pos;
        do {
            pos += rune.n;
            rune = GetUtf8(l->buf + pos, l->len - pos);
            if (rune.c == left) {
                ++depth;
            } else if (rune.c == right) {
                if (depth) {
                    --depth;
                } else {
                    res[0] = index;
                    res[1] = pos;
                    return 0;
                }
            }
        } while (pos + rune.n < l->len);
    }
    return -1;
}

static int bestlineEditMirror(struct bestlineState *l, int res[2]) {
    int rc;
    rc = bestlineEditMirrorLeft(l, res);
    if (rc == -1)
        rc = bestlineEditMirrorRight(l, res);
    return rc;
}

static void bestlineRefreshLineImpl(struct bestlineState *l, int force) {
    char *hint;
    char flipit;
    char hasflip;
    char haswides;
    struct abuf ab;
    const char *buf;
    struct rune rune;
    struct winsize oldsize;
    int fd, plen, rows, len, pos;
    unsigned x, xn, yn, width, pwidth;
    int i, t, cx, cy, tn, resized, flip[2];

    /*
     * synchronize the i/o state
     */
    if (ispaused) {
        if (force) {
            bestlineUnpause(l->ofd);
        } else {
            return;
        }
    }
    if (!force && HasPendingInput(l->ifd)) {
        l->dirty = 1;
        return;
    }
    oldsize = l->ws;
    if ((resized = gotwinch) && rawmode != -1) {
        gotwinch = 0;
        l->ws = GetTerminalSize(l->ws, l->ifd, l->ofd);
    }
    hasflip = !l->final && !bestlineEditMirror(l, flip);

StartOver:
    fd = l->ofd;
    buf = l->buf;
    pos = l->pos;
    len = l->len;
    xn = l->ws.ws_col;
    yn = l->ws.ws_row;
    plen = strlen(l->prompt);
    pwidth = GetMonospaceWidth(l->prompt, plen, 0);
    width = GetMonospaceWidth(buf, len, &haswides);

    /*
     * handle the case where the line is larger than the whole display
     * gnu readline actually isn't able to deal with this situation!!!
     * we kludge xn to address the edge case of wide chars on the edge
     */
    for (tn = xn - haswides * 2;;) {
        if (pwidth + width + 1 < tn * yn)
            break; /* we're fine */
        if (!len || width < 2)
            break; /* we can't do anything */
        if (pwidth + 2 > tn * yn)
            break; /* we can't do anything */
        if (pos > len / 2) {
            /* hide content on the left if we're editing on the right */
            rune = GetUtf8(buf, len);
            buf += rune.n;
            len -= rune.n;
            pos -= rune.n;
        } else {
            /* hide content on the right if we're editing on left */
            t = len;
            while (len && (buf[len - 1] & 0300) == 0200)
                --len;
            if (len)
                --len;
            rune = GetUtf8(buf + len, t - len);
        }
        if ((t = GetMonospaceCharacterWidth(rune.c)) > 0) {
            width -= t;
        }
    }
    pos = Max(0, Min(pos, len));

    /*
     * now generate the terminal codes to update the line
     *
     * since we support unlimited lines it's important that we don't
     * clear the screen before we draw the screen. doing that causes
     * flickering. the key with terminals is to overwrite cells, and
     * then use \e[K and \e[J to clear everything else.
     *
     * we make the assumption that prompts and hints may contain ansi
     * sequences, but the buffer does not.
     *
     * we need to handle the edge case where a wide character like 度
     * might be at the edge of the window, when there's one cell left.
     * so we can't use division based on string width to compute the
     * coordinates and have to track it as we go.
     */
    cy = -1;
    cx = -1;
    rows = 1;
    abInit(&ab);
    abAppendw(&ab, '\r'); /* start of line */
    if (l->rows - l->oldpos - 1 > 0) {
        abAppends(&ab, "\033[");
        abAppendu(&ab, l->rows - l->oldpos - 1);
        abAppendw(&ab, 'A'); /* cursor up clamped */
    }
    abAppends(&ab, l->prompt);
    x = pwidth;
    for (i = 0; i < len; i += rune.n) {
        rune = GetUtf8(buf + i, len - i);
        if (x && x + rune.n > xn) {
            if (cy >= 0)
                ++cy;
            if (x < xn) {
                abAppends(&ab, "\033[K"); /* clear line forward */
            }
            abAppends(&ab, "\r"   /* start of line */
                           "\n"); /* cursor down unclamped */
            ++rows;
            x = 0;
        }
        if (i == pos) {
            cy = 0;
            cx = x;
        }
        if (maskmode) {
            abAppendw(&ab, '*');
        } else {
            flipit = hasflip && (i == flip[0] || i == flip[1]);
            if (flipit)
                abAppends(&ab, "\033[1m");
            abAppendw(&ab, EncodeUtf8(rune.c));
            if (flipit)
                abAppends(&ab, "\033[22m");
        }
        t = GetMonospaceCharacterWidth(rune.c);
        t = Max(0, t);
        x += t;
    }
    if (!l->final && (hint = bestlineRefreshHints(l))) {
        if (GetMonospaceWidth(hint, strlen(hint), 0) < xn - x) {
            if (cx < 0) {
                cx = x;
            }
            abAppends(&ab, hint);
        }
        free(hint);
    }
    abAppendw(&ab, Read32le("\033[J")); /* erase display forwards */

    /*
     * if we are at the very end of the screen with our prompt, we need
     * to emit a newline and move the prompt to the first column.
     */
    if (pos && pos == len && x >= xn) {
        abAppendw(&ab, Read32le("\n\r\0"));
        ++rows;
    }

    /*
     * move cursor to right position
     */
    if (cy > 0) {
        abAppends(&ab, "\033[");
        abAppendu(&ab, cy);
        abAppendw(&ab, 'A'); /* cursor up */
    }
    if (cx > 0) {
        abAppendw(&ab, Read32le("\r\033["));
        abAppendu(&ab, cx);
        abAppendw(&ab, 'C'); /* cursor right */
    } else if (!cx) {
        abAppendw(&ab, '\r'); /* start */
    }

    /*
     * now get ready to progress state
     * we use a mostly correct kludge when the tty resizes
     */
    l->rows = rows;
    if (resized && oldsize.ws_col > l->ws.ws_col) {
        resized = 0;
        abFree(&ab);
        goto StartOver;
    }
    l->dirty = 0;
    l->oldpos = Max(0, cy);

    /*
     * send codes to terminal
     */
    bestlineWrite(fd, ab.b, ab.len);
    abFree(&ab);
}

static void bestlineRefreshLine(struct bestlineState *l) {
    bestlineRefreshLineImpl(l, 0);
}

static void bestlineRefreshLineForce(struct bestlineState *l) {
    bestlineRefreshLineImpl(l, 1);
}

static void bestlineEditInsert(struct bestlineState *l, const char *p, size_t n) {
    if (!bestlineGrow(l, l->len + n + 1))
        return;
    memmove(l->buf + l->pos + n, l->buf + l->pos, l->len - l->pos);
    memcpy(l->buf + l->pos, p, n);
    l->pos += n;
    l->len += n;
    l->buf[l->len] = 0;
    bestlineRefreshLine(l);
}

static void bestlineEditHome(struct bestlineState *l) {
    l->pos = 0;
    bestlineRefreshLine(l);
}

static void bestlineEditEnd(struct bestlineState *l) {
    l->pos = l->len;
    bestlineRefreshLine(l);
}

static void bestlineEditUp(struct bestlineState *l) {
    bestlineEditHistoryMove(l, BESTLINE_HISTORY_PREV);
}

static void bestlineEditDown(struct bestlineState *l) {
    bestlineEditHistoryMove(l, BESTLINE_HISTORY_NEXT);
}

static void bestlineEditBof(struct bestlineState *l) {
    bestlineEditHistoryGoto(l, historylen - 1);
}

static void bestlineEditEof(struct bestlineState *l) {
    bestlineEditHistoryGoto(l, 0);
}

static void bestlineEditRefresh(struct bestlineState *l) {
    bestlineClearScreen(l->ofd);
    bestlineRefreshLine(l);
}

static size_t Forward(struct bestlineState *l, size_t pos) {
    return pos + GetUtf8(l->buf + pos, l->len - pos).n;
}

static size_t Backwards(struct bestlineState *l, size_t pos, char pred(unsigned)) {
    size_t i;
    struct rune r;
    while (pos) {
        i = Backward(l, pos);
        r = GetUtf8(l->buf + i, l->len - i);
        if (pred(r.c)) {
            pos = i;
        } else {
            break;
        }
    }
    return pos;
}

static size_t Forwards(struct bestlineState *l, size_t pos, char pred(unsigned)) {
    struct rune r;
    while (pos < l->len) {
        r = GetUtf8(l->buf + pos, l->len - pos);
        if (pred(r.c)) {
            pos += r.n;
        } else {
            break;
        }
    }
    return pos;
}

static size_t ForwardWord(struct bestlineState *l, size_t pos) {
    pos = Forwards(l, pos, bestlineIsSeparator);
    pos = Forwards(l, pos, bestlineNotSeparator);
    return pos;
}

static size_t BackwardWord(struct bestlineState *l, size_t pos) {
    pos = Backwards(l, pos, bestlineIsSeparator);
    pos = Backwards(l, pos, bestlineNotSeparator);
    return pos;
}

static size_t EscapeWord(struct bestlineState *l, size_t i) {
    size_t j;
    struct rune r;
    for (; i && i < l->len; i += r.n) {
        if (i < l->len) {
            r = GetUtf8(l->buf + i, l->len - i);
            if (bestlineIsSeparator(r.c))
                break;
        }
        if ((j = i)) {
            do
                --j;
            while (j && (l->buf[j] & 0300) == 0200);
            r = GetUtf8(l->buf + j, l->len - j);
            if (bestlineIsSeparator(r.c))
                break;
        }
    }
    return i;
}

static void bestlineEditLeft(struct bestlineState *l) {
    l->pos = Backward(l, l->pos);
    bestlineRefreshLine(l);
}

static void bestlineEditRight(struct bestlineState *l) {
    if (l->pos == l->len)
        return;
    do
        l->pos++;
    while (l->pos < l->len && (l->buf[l->pos] & 0300) == 0200);
    bestlineRefreshLine(l);
}

static void bestlineEditLeftWord(struct bestlineState *l) {
    l->pos = BackwardWord(l, l->pos);
    bestlineRefreshLine(l);
}

static void bestlineEditRightWord(struct bestlineState *l) {
    l->pos = ForwardWord(l, l->pos);
    bestlineRefreshLine(l);
}

static void bestlineEditLeftExpr(struct bestlineState *l) {
    int mark[2];
    l->pos = Backwards(l, l->pos, bestlineIsXeparator);
    if (!bestlineEditMirrorLeft(l, mark)) {
        l->pos = mark[0];
    } else {
        l->pos = Backwards(l, l->pos, bestlineNotSeparator);
    }
    bestlineRefreshLine(l);
}

static void bestlineEditRightExpr(struct bestlineState *l) {
    int mark[2];
    l->pos = Forwards(l, l->pos, bestlineIsXeparator);
    if (!bestlineEditMirrorRight(l, mark)) {
        l->pos = Forward(l, mark[1]);
    } else {
        l->pos = Forwards(l, l->pos, bestlineNotSeparator);
    }
    bestlineRefreshLine(l);
}

static void bestlineEditDelete(struct bestlineState *l) {
    size_t i;
    if (l->pos == l->len)
        return;
    i = Forward(l, l->pos);
    memmove(l->buf + l->pos, l->buf + i, l->len - i + 1);
    l->len -= i - l->pos;
    bestlineRefreshLine(l);
}

static void bestlineEditRubout(struct bestlineState *l) {
    size_t i;
    if (!l->pos)
        return;
    i = Backward(l, l->pos);
    memmove(l->buf + i, l->buf + l->pos, l->len - l->pos + 1);
    l->len -= l->pos - i;
    l->pos = i;
    bestlineRefreshLine(l);
}

static void bestlineEditDeleteWord(struct bestlineState *l) {
    size_t i;
    if (l->pos == l->len)
        return;
    i = ForwardWord(l, l->pos);
    bestlineRingPush(l->buf + l->pos, i - l->pos);
    memmove(l->buf + l->pos, l->buf + i, l->len - i + 1);
    l->len -= i - l->pos;
    bestlineRefreshLine(l);
}

static void bestlineEditRuboutWord(struct bestlineState *l) {
    size_t i;
    if (!l->pos)
        return;
    i = BackwardWord(l, l->pos);
    bestlineRingPush(l->buf + i, l->pos - i);
    memmove(l->buf + i, l->buf + l->pos, l->len - l->pos + 1);
    l->len -= l->pos - i;
    l->pos = i;
    bestlineRefreshLine(l);
}

static void bestlineEditXlatWord(struct bestlineState *l, unsigned xlat(unsigned)) {
    unsigned c;
    size_t i, j;
    struct rune r;
    struct abuf ab;
    abInit(&ab);
    i = Forwards(l, l->pos, bestlineIsSeparator);
    for (j = i; j < l->len; j += r.n) {
        r = GetUtf8(l->buf + j, l->len - j);
        if (bestlineIsSeparator(r.c))
            break;
        if ((c = xlat(r.c)) != r.c) {
            abAppendw(&ab, EncodeUtf8(c));
        } else { /* avoid canonicalization */
            abAppend(&ab, l->buf + j, r.n);
        }
    }
    if (ab.len && bestlineGrow(l, i + ab.len + l->len - j + 1)) {
        l->pos = i + ab.len;
        abAppend(&ab, l->buf + j, l->len - j);
        l->len = i + ab.len;
        memcpy(l->buf + i, ab.b, ab.len + 1);
        bestlineRefreshLine(l);
    }
    abFree(&ab);
}

static void bestlineEditLowercaseWord(struct bestlineState *l) {
bestlineEditXlatWord(l, bestlineLowercase); 2737 | } 2738 | 2739 | static void bestlineEditUppercaseWord(struct bestlineState *l) { 2740 | bestlineEditXlatWord(l, bestlineUppercase); 2741 | } 2742 | 2743 | static void bestlineEditCapitalizeWord(struct bestlineState *l) { 2744 | iscapital = 0; 2745 | bestlineEditXlatWord(l, Capitalize); 2746 | } 2747 | 2748 | static void bestlineEditKillLeft(struct bestlineState *l) { 2749 | size_t diff, old_pos; 2750 | bestlineRingPush(l->buf, l->pos); 2751 | old_pos = l->pos; 2752 | l->pos = 0; 2753 | diff = old_pos - l->pos; 2754 | memmove(l->buf + l->pos, l->buf + old_pos, l->len - old_pos + 1); 2755 | l->len -= diff; 2756 | bestlineRefreshLine(l); 2757 | } 2758 | 2759 | static void bestlineEditKillRight(struct bestlineState *l) { 2760 | bestlineRingPush(l->buf + l->pos, l->len - l->pos); 2761 | l->buf[l->pos] = '\0'; 2762 | l->len = l->pos; 2763 | bestlineRefreshLine(l); 2764 | } 2765 | 2766 | static void bestlineEditYank(struct bestlineState *l) { 2767 | char *p; 2768 | size_t n; 2769 | if (!ring.p[ring.i]) 2770 | return; 2771 | n = strlen(ring.p[ring.i]); 2772 | if (!bestlineGrow(l, l->len + n + 1)) 2773 | return; 2774 | if (!(p = (char *)malloc(l->len - l->pos + 1))) 2775 | return; 2776 | memcpy(p, l->buf + l->pos, l->len - l->pos + 1); 2777 | memcpy(l->buf + l->pos, ring.p[ring.i], n); 2778 | memcpy(l->buf + l->pos + n, p, l->len - l->pos + 1); 2779 | free(p); 2780 | l->yi = l->pos; 2781 | l->yj = l->pos + n; 2782 | l->pos += n; 2783 | l->len += n; 2784 | bestlineRefreshLine(l); 2785 | } 2786 | 2787 | static void bestlineEditRotate(struct bestlineState *l) { 2788 | if ((l->seq[1][0] == Ctrl('Y') || (l->seq[1][0] == 033 && l->seq[1][1] == 'y'))) { 2789 | if (l->yi < l->len && l->yj <= l->len) { 2790 | memmove(l->buf + l->yi, l->buf + l->yj, l->len - l->yj + 1); 2791 | l->len -= l->yj - l->yi; 2792 | l->pos -= l->yj - l->yi; 2793 | } 2794 | bestlineRingRotate(); 2795 | bestlineEditYank(l); 2796 | } 2797 | } 2798 | 2799 | 
static void bestlineEditTranspose(struct bestlineState *l) { 2800 | char *q, *p; 2801 | size_t a, b, c; 2802 | b = l->pos; 2803 | if (b == l->len) 2804 | --b; 2805 | a = Backward(l, b); 2806 | c = Forward(l, b); 2807 | if (!(a < b && b < c)) 2808 | return; 2809 | p = q = (char *)malloc(c - a); 2810 | p = Copy(p, l->buf + b, c - b); 2811 | p = Copy(p, l->buf + a, b - a); 2812 | assert((size_t)(p - q) == c - a); 2813 | memcpy(l->buf + a, q, p - q); 2814 | l->pos = c; 2815 | free(q); 2816 | bestlineRefreshLine(l); 2817 | } 2818 | 2819 | static void bestlineEditTransposeWords(struct bestlineState *l) { 2820 | char *q, *p; 2821 | size_t i, pi, xi, xj, yi, yj; 2822 | i = l->pos; 2823 | if (i == l->len) { 2824 | i = Backwards(l, i, bestlineIsSeparator); 2825 | i = Backwards(l, i, bestlineNotSeparator); 2826 | } 2827 | pi = EscapeWord(l, i); 2828 | xj = Backwards(l, pi, bestlineIsSeparator); 2829 | xi = Backwards(l, xj, bestlineNotSeparator); 2830 | yi = Forwards(l, pi, bestlineIsSeparator); 2831 | yj = Forwards(l, yi, bestlineNotSeparator); 2832 | if (!(xi < xj && xj < yi && yi < yj)) 2833 | return; 2834 | p = q = (char *)malloc(yj - xi); 2835 | p = Copy(p, l->buf + yi, yj - yi); 2836 | p = Copy(p, l->buf + xj, yi - xj); 2837 | p = Copy(p, l->buf + xi, xj - xi); 2838 | assert((size_t)(p - q) == yj - xi); 2839 | memcpy(l->buf + xi, q, p - q); 2840 | l->pos = yj; 2841 | free(q); 2842 | bestlineRefreshLine(l); 2843 | } 2844 | 2845 | static void bestlineEditSqueeze(struct bestlineState *l) { 2846 | size_t i, j; 2847 | i = Backwards(l, l->pos, bestlineIsSeparator); 2848 | j = Forwards(l, l->pos, bestlineIsSeparator); 2849 | if (!(i < j)) 2850 | return; 2851 | memmove(l->buf + i, l->buf + j, l->len - j + 1); 2852 | l->len -= j - i; 2853 | l->pos = i; 2854 | bestlineRefreshLine(l); 2855 | } 2856 | 2857 | static void bestlineEditMark(struct bestlineState *l) { 2858 | l->mark = l->pos; 2859 | } 2860 | 2861 | static void bestlineEditGoto(struct bestlineState *l) { 2862 | if 
(l->mark > l->len) 2863 | return; 2864 | l->pos = Min(l->mark, l->len); 2865 | bestlineRefreshLine(l); 2866 | } 2867 | 2868 | static size_t bestlineEscape(char *d, const char *s, size_t n) { 2869 | char *p; 2870 | size_t i; 2871 | unsigned c, w, l; 2872 | for (p = d, l = i = 0; i < n; ++i) { 2873 | switch ((c = s[i] & 255)) { 2874 | Case('\a', w = Read16le("\\a")); 2875 | Case('\b', w = Read16le("\\b")); 2876 | Case('\t', w = Read16le("\\t")); 2877 | Case('\n', w = Read16le("\\n")); 2878 | Case('\v', w = Read16le("\\v")); 2879 | Case('\f', w = Read16le("\\f")); 2880 | Case('\r', w = Read16le("\\r")); 2881 | Case('"', w = Read16le("\\\"")); 2882 | Case('\'', w = Read16le("\\\'")); 2883 | Case('\\', w = Read16le("\\\\")); 2884 | default: 2885 | if (c <= 0x1F || c == 0x7F || (c == '?' && l == '?')) { 2886 | w = Read16le("\\x"); 2887 | w |= "0123456789abcdef"[(c & 0xF0) >> 4] << 020; 2888 | w |= "0123456789abcdef"[(c & 0x0F) >> 0] << 030; 2889 | } else { 2890 | w = c; 2891 | } 2892 | break; 2893 | } 2894 | p[0] = (w & 0x000000ff) >> 000; 2895 | p[1] = (w & 0x0000ff00) >> 010; 2896 | p[2] = (w & 0x00ff0000) >> 020; 2897 | p[3] = (w & 0xff000000) >> 030; 2898 | p += (Bsr(w) >> 3) + 1; 2899 | l = w; 2900 | } 2901 | return p - d; 2902 | } 2903 | 2904 | static void bestlineEditInsertEscape(struct bestlineState *l) { 2905 | size_t m; 2906 | ssize_t n; 2907 | char seq[16]; 2908 | char esc[sizeof(seq) * 4]; 2909 | if ((n = bestlineRead(l->ifd, seq, sizeof(seq), l)) > 0) { 2910 | m = bestlineEscape(esc, seq, n); 2911 | bestlineEditInsert(l, esc, m); 2912 | } 2913 | } 2914 | 2915 | static void bestlineEditInterrupt(void) { 2916 | gotint = SIGINT; 2917 | } 2918 | 2919 | static void bestlineEditQuit(void) { 2920 | gotint = SIGQUIT; 2921 | } 2922 | 2923 | static void bestlineEditSuspend(void) { 2924 | raise(SIGSTOP); 2925 | } 2926 | 2927 | static void bestlineEditPause(struct bestlineState *l) { 2928 | tcflow(l->ofd, TCOOFF); 2929 | ispaused = 1; 2930 | } 2931 | 2932 | static void 
bestlineEditCtrlq(struct bestlineState *l) { 2933 | if (ispaused) { 2934 | bestlineUnpause(l->ofd); 2935 | bestlineRefreshLineForce(l); 2936 | } else { 2937 | bestlineEditInsertEscape(l); 2938 | } 2939 | } 2940 | 2941 | /** 2942 | * Moves last item inside current s-expression to outside, e.g. 2943 | * 2944 | * (a| b c) 2945 | * (a| b) c 2946 | * 2947 | * The cursor position changes only if a paren is moved before it: 2948 | * 2949 | * (a b c |) 2950 | * (a b) c | 2951 | * 2952 | * To accommodate non-LISP languages we connect unspaced outer symbols: 2953 | * 2954 | * f(a,| b, g()) 2955 | * f(a,| b), g() 2956 | * 2957 | * Our standard keybinding is ALT-SHIFT-B. 2958 | */ 2959 | static void bestlineEditBarf(struct bestlineState *l) { 2960 | struct rune r; 2961 | unsigned long w; 2962 | size_t i, pos, depth = 0; 2963 | unsigned lhs, rhs, end, *stack = 0; 2964 | /* go as far right within current s-expr as possible */ 2965 | for (pos = l->pos;; pos += r.n) { 2966 | if (pos == l->len) 2967 | goto Finish; 2968 | r = GetUtf8(l->buf + pos, l->len - pos); 2969 | if (depth) { 2970 | if (r.c == stack[depth - 1]) { 2971 | --depth; 2972 | } 2973 | } else { 2974 | if ((rhs = bestlineMirrorRight(r.c))) { 2975 | stack = (unsigned *)realloc(stack, ++depth * sizeof(*stack)); 2976 | stack[depth - 1] = rhs; 2977 | } else if (bestlineMirrorLeft(r.c)) { 2978 | end = pos; 2979 | break; 2980 | } 2981 | } 2982 | } 2983 | /* go back one item */ 2984 | pos = Backwards(l, pos, bestlineIsXeparator); 2985 | for (;; pos = i) { 2986 | if (!pos) 2987 | goto Finish; 2988 | i = Backward(l, pos); 2989 | r = GetUtf8(l->buf + i, l->len - i); 2990 | if (depth) { 2991 | if (r.c == stack[depth - 1]) { 2992 | --depth; 2993 | } 2994 | } else { 2995 | if ((lhs = bestlineMirrorLeft(r.c))) { 2996 | stack = (unsigned *)realloc(stack, ++depth * sizeof(*stack)); 2997 | stack[depth - 1] = lhs; 2998 | } else if (bestlineIsSeparator(r.c)) { 2999 | break; 3000 | } 3001 | } 3002 | } 3003 | pos = Backwards(l, pos, 
bestlineIsXeparator); 3004 | /* now move the text */ 3005 | r = GetUtf8(l->buf + end, l->len - end); 3006 | memmove(l->buf + pos + r.n, l->buf + pos, end - pos); 3007 | w = EncodeUtf8(r.c); 3008 | for (i = 0; i < r.n; ++i) { 3009 | l->buf[pos + i] = w; 3010 | w >>= 8; 3011 | } 3012 | if (l->pos > pos) { 3013 | l->pos += r.n; 3014 | } 3015 | bestlineRefreshLine(l); 3016 | Finish: 3017 | free(stack); 3018 | } 3019 | 3020 | /** 3021 | * Moves first item outside current s-expression to inside, e.g. 3022 | * 3023 | * (a| b) c d 3024 | * (a| b c) d 3025 | * 3026 | * To accommodate non-LISP languages we connect unspaced outer symbols: 3027 | * 3028 | * f(a,| b), g() 3029 | * f(a,| b, g()) 3030 | * 3031 | * Our standard keybinding is ALT-SHIFT-S. 3032 | */ 3033 | static void bestlineEditSlurp(struct bestlineState *l) { 3034 | char rp[6]; 3035 | struct rune r; 3036 | size_t pos, depth = 0; 3037 | unsigned rhs, point = 0, start = 0, *stack = 0; 3038 | /* go to outside edge of current s-expr */ 3039 | for (pos = l->pos; pos < l->len; pos += r.n) { 3040 | r = GetUtf8(l->buf + pos, l->len - pos); 3041 | if (depth) { 3042 | if (r.c == stack[depth - 1]) { 3043 | --depth; 3044 | } 3045 | } else { 3046 | if ((rhs = bestlineMirrorRight(r.c))) { 3047 | stack = (unsigned *)realloc(stack, ++depth * sizeof(*stack)); 3048 | stack[depth - 1] = rhs; 3049 | } else if (bestlineMirrorLeft(r.c)) { 3050 | point = pos; 3051 | pos += r.n; 3052 | start = pos; 3053 | break; 3054 | } 3055 | } 3056 | } 3057 | /* go forward one item */ 3058 | pos = Forwards(l, pos, bestlineIsXeparator); 3059 | for (; pos < l->len; pos += r.n) { 3060 | r = GetUtf8(l->buf + pos, l->len - pos); 3061 | if (depth) { 3062 | if (r.c == stack[depth - 1]) { 3063 | --depth; 3064 | } 3065 | } else { 3066 | if ((rhs = bestlineMirrorRight(r.c))) { 3067 | stack = (unsigned *)realloc(stack, ++depth * sizeof(*stack)); 3068 | stack[depth - 1] = rhs; 3069 | } else if (bestlineIsSeparator(r.c)) { 3070 | break; 3071 | } 3072 | } 3073 | } 
3074 | /* now move the text */ 3075 | memcpy(rp, l->buf + point, start - point); 3076 | memmove(l->buf + point, l->buf + start, pos - start); 3077 | memcpy(l->buf + pos - (start - point), rp, start - point); 3078 | bestlineRefreshLine(l); 3079 | free(stack); 3080 | } 3081 | 3082 | static void bestlineEditRaise(struct bestlineState *l) { 3083 | (void)l; 3084 | } 3085 | 3086 | static char IsBalanced(struct abuf *buf) { 3087 | unsigned i, d; 3088 | for (d = i = 0; i < buf->len; ++i) { 3089 | if (buf->b[i] == '(') 3090 | ++d; 3091 | else if (d > 0 && buf->b[i] == ')') 3092 | --d; 3093 | } 3094 | return d == 0; 3095 | } 3096 | 3097 | /** 3098 | * Runs bestline engine. 3099 | * 3100 | * This function is the core of the line editing capability of bestline. 3101 | * It expects 'stdin_fd' to be already in "raw mode" so that every key 3102 | * pressed will be returned ASAP to read(). 3103 | * 3104 | * The resulting string is put into '*obuf' when the user types enter, or 3105 | * when ctrl+d is typed. 3106 | * 3107 | * Returns chomped character count in buf >=0 or -1 on eof / error 3108 | */ 3109 | static ssize_t bestlineEdit(int stdin_fd, int stdout_fd, const char *prompt, const char *init, 3110 | char **obuf) { 3111 | ssize_t rc; 3112 | char seq[16]; 3113 | const char *promptnotnull, *promptlastnl; 3114 | size_t nread; 3115 | int pastemode; 3116 | struct rune rune; 3117 | unsigned long long w; 3118 | struct bestlineState l; 3119 | pastemode = 0; 3120 | memset(&l, 0, sizeof(l)); 3121 | if (!(l.buf = (char *)malloc((l.buflen = 32)))) 3122 | return -1; 3123 | l.buf[0] = 0; 3124 | l.ifd = stdin_fd; 3125 | l.ofd = stdout_fd; 3126 | promptnotnull = prompt ? prompt : ""; 3127 | promptlastnl = strrchr(promptnotnull, '\n'); 3128 | l.prompt = promptlastnl ? promptlastnl + 1 : promptnotnull; 3129 | l.ws = GetTerminalSize(l.ws, l.ifd, l.ofd); 3130 | abInit(&l.full); 3131 | bestlineHistoryAdd(""); 3132 | bestlineWriteStr(l.ofd, promptnotnull); 3133 | init = init ?
init : ""; 3134 | bestlineEditInsert(&l, init, strlen(init)); 3135 | while (1) { 3136 | if (l.dirty) 3137 | bestlineRefreshLineForce(&l); 3138 | rc = bestlineRead(l.ifd, seq, sizeof(seq), &l); 3139 | if (rc > 0) { 3140 | if (seq[0] == Ctrl('R')) { 3141 | rc = bestlineSearch(&l, seq, sizeof(seq)); 3142 | if (!rc) 3143 | continue; 3144 | } else if (seq[0] == '\t' && completionCallback) { 3145 | rc = bestlineCompleteLine(&l, seq, sizeof(seq)); 3146 | if (!rc) 3147 | continue; 3148 | } 3149 | } 3150 | if (rc > 0) { 3151 | nread = rc; 3152 | } else if (!rc && l.len) { 3153 | nread = 1; 3154 | seq[0] = '\r'; 3155 | seq[1] = 0; 3156 | } else { 3157 | if (historylen) { 3158 | free(history[--historylen]); 3159 | history[historylen] = 0; 3160 | } 3161 | free(l.buf); 3162 | abFree(&l.full); 3163 | return -1; 3164 | } 3165 | switch (seq[0]) { 3166 | Case(Ctrl('P'), bestlineEditUp(&l)); 3167 | Case(Ctrl('E'), bestlineEditEnd(&l)); 3168 | Case(Ctrl('N'), bestlineEditDown(&l)); 3169 | Case(Ctrl('A'), bestlineEditHome(&l)); 3170 | Case(Ctrl('B'), bestlineEditLeft(&l)); 3171 | Case(Ctrl('@'), bestlineEditMark(&l)); 3172 | Case(Ctrl('Y'), bestlineEditYank(&l)); 3173 | Case(Ctrl('Q'), bestlineEditCtrlq(&l)); 3174 | Case(Ctrl('F'), bestlineEditRight(&l)); 3175 | Case(Ctrl('\\'), bestlineEditQuit()); 3176 | Case(Ctrl('S'), bestlineEditPause(&l)); 3177 | Case(Ctrl('?'), bestlineEditRubout(&l)); 3178 | Case(Ctrl('H'), bestlineEditRubout(&l)); 3179 | Case(Ctrl('L'), bestlineEditRefresh(&l)); 3180 | Case(Ctrl('Z'), bestlineEditSuspend()); 3181 | Case(Ctrl('U'), bestlineEditKillLeft(&l)); 3182 | Case(Ctrl('T'), bestlineEditTranspose(&l)); 3183 | Case(Ctrl('K'), bestlineEditKillRight(&l)); 3184 | Case(Ctrl('W'), bestlineEditRuboutWord(&l)); 3185 | case Ctrl('C'): 3186 | if (emacsmode) { 3187 | if (bestlineRead(l.ifd, seq, sizeof(seq), &l) != 1) 3188 | break; 3189 | switch (seq[0]) { 3190 | Case(Ctrl('C'), bestlineEditInterrupt()); 3191 | Case(Ctrl('B'), bestlineEditBarf(&l)); 3192 | 
Case(Ctrl('S'), bestlineEditSlurp(&l)); 3193 | Case(Ctrl('R'), bestlineEditRaise(&l)); 3194 | default: 3195 | break; 3196 | } 3197 | } else { 3198 | bestlineEditInterrupt(); 3199 | } 3200 | break; 3201 | case Ctrl('X'): 3202 | if (l.seq[1][0] == Ctrl('X')) { 3203 | bestlineEditGoto(&l); 3204 | } 3205 | break; 3206 | case Ctrl('D'): 3207 | if (l.len) { 3208 | bestlineEditDelete(&l); 3209 | } else { 3210 | if (historylen) { 3211 | free(history[--historylen]); 3212 | history[historylen] = 0; 3213 | } 3214 | free(l.buf); 3215 | abFree(&l.full); 3216 | return -1; 3217 | } 3218 | break; 3219 | case '\n': 3220 | l.final = 1; 3221 | bestlineEditEnd(&l); 3222 | bestlineRefreshLineForce(&l); 3223 | l.final = 0; 3224 | abAppend(&l.full, l.buf, l.len); 3225 | l.prompt = "... "; 3226 | abAppends(&l.full, "\n"); 3227 | l.len = 0; 3228 | l.pos = 0; 3229 | bestlineWriteStr(stdout_fd, "\r\n"); 3230 | bestlineRefreshLineForce(&l); 3231 | break; 3232 | case '\r': { 3233 | char is_finished = 1; 3234 | char needs_strip = 0; 3235 | if (historylen) { 3236 | free(history[--historylen]); 3237 | history[historylen] = 0; 3238 | } 3239 | l.final = 1; 3240 | bestlineEditEnd(&l); 3241 | bestlineRefreshLineForce(&l); 3242 | l.final = 0; 3243 | abAppend(&l.full, l.buf, l.len); 3244 | if (pastemode) 3245 | is_finished = 0; 3246 | if (balancemode) 3247 | if (!IsBalanced(&l.full)) 3248 | is_finished = 0; 3249 | if (llamamode) 3250 | if (StartsWith(l.full.b, "\"\"\"")) 3251 | needs_strip = is_finished = l.full.len > 6 && EndsWith(l.full.b, "\"\"\""); 3252 | if (is_finished) { 3253 | if (needs_strip) { 3254 | int len = l.full.len - 6; 3255 | *obuf = strndup(l.full.b + 3, len); 3256 | abFree(&l.full); 3257 | free(l.buf); 3258 | return len; 3259 | } else { 3260 | *obuf = l.full.b; 3261 | free(l.buf); 3262 | return l.full.len; 3263 | } 3264 | } else { 3265 | l.prompt = "... 
"; 3266 | abAppends(&l.full, "\n"); 3267 | l.len = 0; 3268 | l.pos = 0; 3269 | bestlineWriteStr(stdout_fd, "\r\n"); 3270 | bestlineRefreshLineForce(&l); 3271 | } 3272 | break; 3273 | } 3274 | case 033: 3275 | if (nread < 2) 3276 | break; 3277 | switch (seq[1]) { 3278 | Case('<', bestlineEditBof(&l)); 3279 | Case('>', bestlineEditEof(&l)); 3280 | Case('B', bestlineEditBarf(&l)); 3281 | Case('S', bestlineEditSlurp(&l)); 3282 | Case('R', bestlineEditRaise(&l)); 3283 | Case('y', bestlineEditRotate(&l)); 3284 | Case('\\', bestlineEditSqueeze(&l)); 3285 | Case('b', bestlineEditLeftWord(&l)); 3286 | Case('f', bestlineEditRightWord(&l)); 3287 | Case('h', bestlineEditRuboutWord(&l)); 3288 | Case('d', bestlineEditDeleteWord(&l)); 3289 | Case('l', bestlineEditLowercaseWord(&l)); 3290 | Case('u', bestlineEditUppercaseWord(&l)); 3291 | Case('c', bestlineEditCapitalizeWord(&l)); 3292 | Case('t', bestlineEditTransposeWords(&l)); 3293 | Case(Ctrl('B'), bestlineEditLeftExpr(&l)); 3294 | Case(Ctrl('F'), bestlineEditRightExpr(&l)); 3295 | Case(Ctrl('H'), bestlineEditRuboutWord(&l)); 3296 | case '[': 3297 | if (nread == 6 && !memcmp(seq, "\033[200~", 6)) { 3298 | pastemode = 1; 3299 | break; 3300 | } 3301 | if (nread == 6 && !memcmp(seq, "\033[201~", 6)) { 3302 | pastemode = 0; 3303 | break; 3304 | } 3305 | if (nread < 3) 3306 | break; 3307 | if (seq[2] >= '0' && seq[2] <= '9') { 3308 | if (nread < 4) 3309 | break; 3310 | if (seq[3] == '~') { 3311 | switch (seq[2]) { 3312 | Case('1', bestlineEditHome(&l)); /* \e[1~ */ 3313 | Case('3', bestlineEditDelete(&l)); /* \e[3~ */ 3314 | Case('4', bestlineEditEnd(&l)); /* \e[4~ */ 3315 | default: 3316 | break; 3317 | } 3318 | } 3319 | } else { 3320 | switch (seq[2]) { 3321 | Case('A', bestlineEditUp(&l)); 3322 | Case('B', bestlineEditDown(&l)); 3323 | Case('C', bestlineEditRight(&l)); 3324 | Case('D', bestlineEditLeft(&l)); 3325 | Case('H', bestlineEditHome(&l)); 3326 | Case('F', bestlineEditEnd(&l)); 3327 | default: 3328 | break; 3329 | } 3330 
| } 3331 | break; 3332 | case 'O': 3333 | if (nread < 3) 3334 | break; 3335 | switch (seq[2]) { 3336 | Case('A', bestlineEditUp(&l)); 3337 | Case('B', bestlineEditDown(&l)); 3338 | Case('C', bestlineEditRight(&l)); 3339 | Case('D', bestlineEditLeft(&l)); 3340 | Case('H', bestlineEditHome(&l)); 3341 | Case('F', bestlineEditEnd(&l)); 3342 | default: 3343 | break; 3344 | } 3345 | break; 3346 | case 033: 3347 | if (nread < 3) 3348 | break; 3349 | switch (seq[2]) { 3350 | case '[': 3351 | if (nread < 4) 3352 | break; 3353 | switch (seq[3]) { 3354 | Case('C', bestlineEditRightExpr(&l)); /* \e\e[C alt-right */ 3355 | Case('D', bestlineEditLeftExpr(&l)); /* \e\e[D alt-left */ 3356 | default: 3357 | break; 3358 | } 3359 | break; 3360 | case 'O': 3361 | if (nread < 4) 3362 | break; 3363 | switch (seq[3]) { 3364 | Case('C', bestlineEditRightExpr(&l)); /* \e\eOC alt-right */ 3365 | Case('D', bestlineEditLeftExpr(&l)); /* \e\eOD alt-left */ 3366 | default: 3367 | break; 3368 | } 3369 | break; 3370 | default: 3371 | break; 3372 | } 3373 | break; 3374 | default: 3375 | break; 3376 | } 3377 | break; 3378 | default: 3379 | if (!IsControl(seq[0])) { /* only sees canonical c0 */ 3380 | if (xlatCallback) { 3381 | rune = GetUtf8(seq, nread); 3382 | w = EncodeUtf8(xlatCallback(rune.c)); 3383 | nread = 0; 3384 | do { 3385 | seq[nread++] = w; 3386 | } while ((w >>= 8)); 3387 | } 3388 | bestlineEditInsert(&l, seq, nread); 3389 | } 3390 | break; 3391 | } 3392 | } 3393 | } 3394 | 3395 | void bestlineFree(void *ptr) { 3396 | free(ptr); 3397 | } 3398 | 3399 | void bestlineHistoryFree(void) { 3400 | size_t i; 3401 | for (i = 0; i < BESTLINE_MAX_HISTORY; i++) { 3402 | if (history[i]) { 3403 | free(history[i]); 3404 | history[i] = 0; 3405 | } 3406 | } 3407 | historylen = 0; 3408 | } 3409 | 3410 | static void bestlineAtExit(void) { 3411 | bestlineDisableRawMode(); 3412 | bestlineHistoryFree(); 3413 | bestlineRingFree(); 3414 | } 3415 | 3416 | int bestlineHistoryAdd(const char *line) { 3417 | char 
*linecopy; 3418 | if (!BESTLINE_MAX_HISTORY) 3419 | return 0; 3420 | if (historylen && !strcmp(history[historylen - 1], line)) 3421 | return 0; 3422 | if (!(linecopy = strdup(line))) 3423 | return 0; 3424 | if (historylen == BESTLINE_MAX_HISTORY) { 3425 | free(history[0]); 3426 | memmove(history, history + 1, sizeof(char *) * (BESTLINE_MAX_HISTORY - 1)); 3427 | historylen--; 3428 | } 3429 | history[historylen++] = linecopy; 3430 | return 1; 3431 | } 3432 | 3433 | /** 3434 | * Saves line editing history to file. 3435 | * 3436 | * @return 0 on success, or -1 w/ errno 3437 | */ 3438 | int bestlineHistorySave(const char *filename) { 3439 | FILE *fp; 3440 | unsigned j; 3441 | mode_t old_umask; 3442 | old_umask = umask(S_IXUSR | S_IRWXG | S_IRWXO); 3443 | fp = fopen(filename, "w"); 3444 | umask(old_umask); 3445 | if (!fp) 3446 | return -1; 3447 | chmod(filename, S_IRUSR | S_IWUSR); 3448 | for (j = 0; j < historylen; j++) { 3449 | fputs(history[j], fp); 3450 | fputc('\n', fp); 3451 | } 3452 | fclose(fp); 3453 | return 0; 3454 | } 3455 | 3456 | /** 3457 | * Loads history from the specified file. 3458 | * 3459 | * If the file doesn't exist, zero is returned and this will do nothing. 3460 | * If the file does exist and the operation succeeds, zero is returned; 3461 | * otherwise, on error, -1 is returned.
3462 | * 3463 | * @return 0 on success, or -1 w/ errno 3464 | */ 3465 | int bestlineHistoryLoad(const char *filename) { 3466 | char **h; 3467 | int rc, fd, err; 3468 | size_t i, j, k, n, t; 3469 | char *m, *e, *p, *q, *f, *s; 3470 | err = errno, rc = 0; 3471 | if (!BESTLINE_MAX_HISTORY) 3472 | return 0; 3473 | if (!(h = (char **)calloc(2 * BESTLINE_MAX_HISTORY, sizeof(char *)))) 3474 | return -1; 3475 | if ((fd = open(filename, O_RDONLY)) != -1) { 3476 | if ((n = GetFdSize(fd))) { 3477 | if ((m = (char *)mmap(0, n, PROT_READ, MAP_SHARED, fd, 0)) != MAP_FAILED) { 3478 | for (i = 0, e = (p = m) + n; p < e; p = f + 1) { 3479 | if (!(q = (char *)memchr(p, '\n', e - p))) 3480 | q = e; 3481 | for (f = q; q > p; --q) { 3482 | if (q[-1] != '\n' && q[-1] != '\r') 3483 | break; 3484 | } 3485 | if (q > p) { 3486 | h[i * 2 + 0] = p; 3487 | h[i * 2 + 1] = q; 3488 | i = (i + 1) % BESTLINE_MAX_HISTORY; 3489 | } 3490 | } 3491 | bestlineHistoryFree(); 3492 | for (j = 0; j < BESTLINE_MAX_HISTORY; ++j) { 3493 | if (h[(k = (i + j) % BESTLINE_MAX_HISTORY) * 2]) { 3494 | if ((s = (char *)malloc((t = h[k * 2 + 1] - h[k * 2]) + 1))) { 3495 | memcpy(s, h[k * 2], t), s[t] = 0; 3496 | history[historylen++] = s; 3497 | } 3498 | } 3499 | } 3500 | munmap(m, n); 3501 | } else { 3502 | rc = -1; 3503 | } 3504 | } 3505 | close(fd); 3506 | } else if (errno == ENOENT) { 3507 | errno = err; 3508 | } else { 3509 | rc = -1; 3510 | } 3511 | free(h); 3512 | return rc; 3513 | } 3514 | 3515 | /** 3516 | * Like bestlineRaw, but with the additional parameter init used as the buffer 3517 | * initial value. 
3518 | */ 3519 | char *bestlineRawInit(const char *prompt, const char *init, int infd, int outfd) { 3520 | char *buf; 3521 | ssize_t rc; 3522 | static char once; 3523 | struct sigaction sa[3]; 3524 | if (!once) 3525 | atexit(bestlineAtExit), once = 1; 3526 | if (enableRawMode(infd) == -1) 3527 | return 0; 3528 | buf = 0; 3529 | gotint = 0; 3530 | sigemptyset(&sa->sa_mask); 3531 | sa->sa_flags = 0; 3532 | sa->sa_handler = bestlineOnInt; 3533 | sigaction(SIGINT, sa, sa + 1); 3534 | sigaction(SIGQUIT, sa, sa + 2); 3535 | bestlineWriteStr(outfd, "\033[?2004h"); // enable bracketed paste mode 3536 | rc = bestlineEdit(infd, outfd, prompt, init, &buf); 3537 | bestlineWriteStr(outfd, "\033[?2004l"); // disable bracketed paste mode 3538 | bestlineDisableRawMode(); 3539 | sigaction(SIGQUIT, sa + 2, 0); 3540 | sigaction(SIGINT, sa + 1, 0); 3541 | if (gotint) { 3542 | free(buf); 3543 | buf = 0; 3544 | raise(gotint); 3545 | errno = EINTR; 3546 | rc = -1; 3547 | } 3548 | bestlineWriteStr(outfd, "\r\n"); 3549 | if (rc != -1) { 3550 | return buf; 3551 | } else { 3552 | free(buf); 3553 | return 0; 3554 | } 3555 | } 3556 | 3557 | /** 3558 | * Reads line interactively. 3559 | * 3560 | * This function can be used instead of bestline() in cases where we 3561 | * know for certain we're dealing with a terminal, which means we can 3562 | * avoid linking any stdio code. 3563 | * 3564 | * @return chomped allocated string of read line or null on eof/error 3565 | */ 3566 | char *bestlineRaw(const char *prompt, int infd, int outfd) { 3567 | return bestlineRawInit(prompt, "", infd, outfd); 3568 | } 3569 | 3570 | /** 3571 | * Like bestline, but with the additional parameter init used as the buffer 3572 | * initial value. The init parameter is only used if the terminal has basic 3573 | * capabilities.
3574 | */ 3575 | char *bestlineInit(const char *prompt, const char *init) { 3576 | if (prompt && *prompt && (strchr(prompt, '\t') || strchr(prompt + 1, '\r'))) { 3577 | errno = EINVAL; 3578 | return 0; 3579 | } 3580 | if ((!isatty(fileno(stdin)) || !isatty(fileno(stdout)))) { 3581 | if (prompt && *prompt && (IsCharDev(fileno(stdin)) && IsCharDev(fileno(stdout)))) { 3582 | fputs(prompt, stdout); 3583 | fflush(stdout); 3584 | } 3585 | return GetLine(stdin, stdout); 3586 | } else if (bestlineIsUnsupportedTerm()) { 3587 | if (prompt && *prompt) { 3588 | fputs(prompt, stdout); 3589 | fflush(stdout); 3590 | } 3591 | return GetLine(stdin, stdout); 3592 | } else { 3593 | fflush(stdout); 3594 | return bestlineRawInit(prompt, init, fileno(stdin), fileno(stdout)); 3595 | } 3596 | } 3597 | 3598 | /** 3599 | * Reads line intelligently. 3600 | * 3601 | * The high level function that is the main API of the bestline library. 3602 | * This function checks if the terminal has basic capabilities, just checking 3603 | * for a blacklist of inarticulate terminals, and later either calls the line 3604 | * editing function or uses dummy fgets() so that you will be able to type 3605 | * something even in the most desperate of the conditions. 3606 | * 3607 | * @param prompt is printed before asking for input if we have a term 3608 | * and this may be set to empty or null to disable and prompt may 3609 | * contain ansi escape sequences, color, utf8, etc. 3610 | * @return chomped allocated string of read line or null on eof/error 3611 | */ 3612 | char *bestline(const char *prompt) { 3613 | return bestlineInit(prompt, ""); 3614 | } 3615 | 3616 | /** 3617 | * Reads line intelligently w/ history, e.g. 
3618 | * 3619 | * // see ~/.foo_history 3620 | * main() { 3621 | * char *line; 3622 | * while ((line = bestlineWithHistory("IN> ", "foo"))) { 3623 | * printf("OUT> %s\n", line); 3624 | * free(line); 3625 | * } 3626 | * } 3627 | * 3628 | * @param prompt is printed before asking for input if we have a term 3629 | * and this may be set to empty or null to disable and prompt may 3630 | * contain ansi escape sequences, color, utf8, etc. 3631 | * @param prog is the name of your app, used to generate history filename 3632 | * however if it contains a slash or dot then we'll assume prog is 3633 | * the history filename, which is determined by the caller 3634 | * @return chomped allocated string of read line or null on eof/error 3635 | */ 3636 | char *bestlineWithHistory(const char *prompt, const char *prog) { 3637 | char *line; 3638 | struct abuf path; 3639 | const char *a, *b; 3640 | abInit(&path); 3641 | if (prog) { 3642 | if (strchr(prog, '/') || strchr(prog, '.')) { 3643 | abAppends(&path, prog); 3644 | } else { 3645 | b = ""; 3646 | if (!(a = getenv("HOME"))) { 3647 | if (!(a = getenv("HOMEDRIVE")) || !(b = getenv("HOMEPATH"))) { 3648 | a = ""; 3649 | } 3650 | } 3651 | if (*a) { 3652 | abAppends(&path, a); 3653 | abAppends(&path, b); 3654 | abAppendw(&path, '/'); 3655 | } 3656 | abAppendw(&path, '.'); 3657 | abAppends(&path, prog); 3658 | abAppends(&path, "_history"); 3659 | } 3660 | } 3661 | if (path.len) { 3662 | bestlineHistoryLoad(path.b); 3663 | } 3664 | line = bestline(prompt); 3665 | if (path.len && line && *line) { 3666 | /* history here is inefficient but helpful when the user has multiple 3667 | * repls open at the same time, so history propagates between them */ 3668 | bestlineHistoryLoad(path.b); 3669 | bestlineHistoryAdd(line); 3670 | bestlineHistorySave(path.b); 3671 | } 3672 | abFree(&path); 3673 | return line; 3674 | } 3675 | 3676 | /** 3677 | * Registers tab completion callback.
3678 | */ 3679 | void bestlineSetCompletionCallback(bestlineCompletionCallback *fn) { 3680 | completionCallback = fn; 3681 | } 3682 | 3683 | /** 3684 | * Registers hints callback. 3685 | * 3686 | * Registers a hints function to be called to show hints to the user at 3687 | * the right of the prompt. 3688 | */ 3689 | void bestlineSetHintsCallback(bestlineHintsCallback *fn) { 3690 | hintsCallback = fn; 3691 | } 3692 | 3693 | /** 3694 | * Sets free hints callback. 3695 | * 3696 | * This registers a function to free the hints returned by the hints 3697 | * callback registered with bestlineSetHintsCallback(). 3698 | */ 3699 | void bestlineSetFreeHintsCallback(bestlineFreeHintsCallback *fn) { 3700 | freeHintsCallback = fn; 3701 | } 3702 | 3703 | /** 3704 | * Sets character translation callback. 3705 | */ 3706 | void bestlineSetXlatCallback(bestlineXlatCallback *fn) { 3707 | xlatCallback = fn; 3708 | } 3709 | 3710 | /** 3711 | * Adds completion. 3712 | * 3713 | * This function is used by the callback function registered by the user 3714 | * in order to add completion options given the input string when the 3715 | * user typed <tab>. See the example.c source code for a very easy to 3716 | * understand example. 3717 | */ 3718 | void bestlineAddCompletion(bestlineCompletions *lc, const char *str) { 3719 | size_t len; 3720 | char *copy, **cvec; 3721 | if ((copy = (char *)malloc((len = strlen(str)) + 1))) { 3722 | memcpy(copy, str, len + 1); 3723 | if ((cvec = (char **)realloc(lc->cvec, (lc->len + 1) * sizeof(*lc->cvec)))) { 3724 | lc->cvec = cvec; 3725 | lc->cvec[lc->len++] = copy; 3726 | } else { 3727 | free(copy); 3728 | } 3729 | } 3730 | } 3731 | 3732 | /** 3733 | * Frees list of completion options populated by bestlineAddCompletion().
3734 | */ 3735 | void bestlineFreeCompletions(bestlineCompletions *lc) { 3736 | size_t i; 3737 | for (i = 0; i < lc->len; i++) 3738 | free(lc->cvec[i]); 3739 | if (lc->cvec) 3740 | free(lc->cvec); 3741 | } 3742 | 3743 | /** 3744 | * Enables "mask mode". 3745 | * 3746 | * When it is enabled, instead of the input that the user is typing, the 3747 | * terminal will just display a corresponding number of asterisks, like 3748 | * "****". This is useful for passwords and other secrets that should 3749 | * not be displayed. 3750 | * 3751 | * @see bestlineMaskModeDisable() 3752 | */ 3753 | void bestlineMaskModeEnable(void) { 3754 | maskmode = 1; 3755 | } 3756 | 3757 | /** 3758 | * Disables "mask mode". 3759 | */ 3760 | void bestlineMaskModeDisable(void) { 3761 | maskmode = 0; 3762 | } 3763 | 3764 | /** 3765 | * Enables or disables "balance mode". 3766 | * 3767 | * When it is enabled, bestline() will block until parentheses are 3768 | * balanced. This is useful for code but not for free text. 3769 | */ 3770 | void bestlineBalanceMode(char mode) { 3771 | balancemode = mode; 3772 | } 3773 | 3774 | /** 3775 | * Enables or disables "ollama mode". 3776 | * 3777 | * This enables you to type multiline input by putting triple quotes at 3778 | * the beginning and end. For example: 3779 | * 3780 | * >>> """ 3781 | * ... second line 3782 | * ... third line 3783 | * ... """ 3784 | * 3785 | * Would yield the string `"\nsecond line\nthird line\n"`. 3786 | * 3787 | * @param mode is 1 to enable, or 0 to disable 3788 | */ 3789 | void bestlineLlamaMode(char mode) { 3790 | llamamode = mode; 3791 | } 3792 | 3793 | /** 3794 | * Enables Emacs mode. 3795 | * 3796 | * This mode remaps CTRL-C so you can use additional shortcuts, like C-c 3797 | * C-s for slurp. By default, CTRL-C raises SIGINT for exiting programs. 
3798 | */ 3799 | void bestlineEmacsMode(char mode) { 3800 | emacsmode = mode; 3801 | } 3802 | 3803 | /** 3804 | * Allows implementation of user functions for read, write, and poll 3805 | * with the intention of polling for background I/O. 3806 | */ 3807 | 3808 | static int MyRead(int fd, void *c, int n) { 3809 | return read(fd, c, n); 3810 | } 3811 | 3812 | static int MyWrite(int fd, const void *c, int n) { 3813 | return write(fd, c, n); 3814 | } 3815 | 3816 | static int MyPoll(int fd, int events, int to) { 3817 | struct pollfd p[1]; 3818 | p[0].fd = fd; 3819 | p[0].events = events; 3820 | return poll(p, 1, to); 3821 | } 3822 | 3823 | void bestlineUserIO(int (*userReadFn)(int, void *, int), int (*userWriteFn)(int, const void *, int), 3824 | int (*userPollFn)(int, int, int)) { 3825 | if (userReadFn) 3826 | _MyRead = userReadFn; 3827 | else 3828 | _MyRead = MyRead; 3829 | if (userWriteFn) 3830 | _MyWrite = userWriteFn; 3831 | else 3832 | _MyWrite = MyWrite; 3833 | if (userPollFn) 3834 | _MyPoll = userPollFn; 3835 | else 3836 | _MyPoll = MyPoll; 3837 | } 3838 | -------------------------------------------------------------------------------- /bestline.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | #ifdef __cplusplus 3 | extern "C" { 4 | #endif 5 | 6 | typedef struct bestlineCompletions { 7 | unsigned long len; 8 | char **cvec; 9 | } bestlineCompletions; 10 | 11 | typedef void(bestlineCompletionCallback)(const char *, int, 12 | bestlineCompletions *); 13 | typedef char *(bestlineHintsCallback)(const char *, const char **, const char **); 14 | typedef void(bestlineFreeHintsCallback)(void *); 15 | typedef unsigned(bestlineXlatCallback)(unsigned); 16 | 17 | void bestlineSetCompletionCallback(bestlineCompletionCallback *); 18 | void bestlineSetHintsCallback(bestlineHintsCallback *); 19 | void bestlineSetFreeHintsCallback(bestlineFreeHintsCallback *); 20 | void bestlineAddCompletion(bestlineCompletions *, const char 
*); 21 | void bestlineSetXlatCallback(bestlineXlatCallback *); 22 | 23 | char *bestline(const char *); 24 | char *bestlineInit(const char *, const char *); 25 | char *bestlineRaw(const char *, int, int); 26 | char *bestlineRawInit(const char *, const char *, int, int); 27 | char *bestlineWithHistory(const char *, const char *); 28 | int bestlineHistoryAdd(const char *); 29 | int bestlineHistoryLoad(const char *); 30 | int bestlineHistorySave(const char *); 31 | void bestlineBalanceMode(char); 32 | void bestlineEmacsMode(char); 33 | void bestlineClearScreen(int); 34 | void bestlineDisableRawMode(void); 35 | void bestlineFree(void *); 36 | void bestlineFreeCompletions(bestlineCompletions *); 37 | void bestlineHistoryFree(void); 38 | void bestlineLlamaMode(char); 39 | void bestlineMaskModeDisable(void); 40 | void bestlineMaskModeEnable(void); 41 | 42 | void bestlineUserIO(int (*)(int, void *, int), int (*)(int, const void *, int), 43 | int (*)(int, int, int)); 44 | 45 | char bestlineIsSeparator(unsigned); 46 | char bestlineNotSeparator(unsigned); 47 | char bestlineIsXeparator(unsigned); 48 | unsigned bestlineUppercase(unsigned); 49 | unsigned bestlineLowercase(unsigned); 50 | long bestlineReadCharacter(int, char *, unsigned long); 51 | 52 | #ifdef __cplusplus 53 | } 54 | #endif 55 | -------------------------------------------------------------------------------- /bin/footprint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jart/sectorlisp/fc6fa331d23807cae685a1cfbcb3e2955e2aa08f/bin/footprint.png -------------------------------------------------------------------------------- /bin/sectorlisp.bin: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jart/sectorlisp/fc6fa331d23807cae685a1cfbcb3e2955e2aa08f/bin/sectorlisp.bin -------------------------------------------------------------------------------- /bin/sectorlisp.gif: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/jart/sectorlisp/fc6fa331d23807cae685a1cfbcb3e2955e2aa08f/bin/sectorlisp.gif -------------------------------------------------------------------------------- /bin/yodawg.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jart/sectorlisp/fc6fa331d23807cae685a1cfbcb3e2955e2aa08f/bin/yodawg.png -------------------------------------------------------------------------------- /lisp.c: -------------------------------------------------------------------------------- 1 | /*-*- mode:c;indent-tabs-mode:nil;c-basic-offset:2;tab-width:8;coding:utf-8 -*-│ 2 | │ vi: set et ft=c ts=2 sts=2 sw=2 fenc=utf-8 :vi │ 3 | ╞══════════════════════════════════════════════════════════════════════════════╡ 4 | │ Copyright 2020 Justine Alexandra Roberts Tunney │ 5 | │ │ 6 | │ Permission to use, copy, modify, and/or distribute this software for │ 7 | │ any purpose with or without fee is hereby granted, provided that the │ 8 | │ above copyright notice and this permission notice appear in all copies. │ 9 | │ │ 10 | │ THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL │ 11 | │ WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED │ 12 | │ WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE │ 13 | │ AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL │ 14 | │ DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR │ 15 | │ PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER │ 16 | │ TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR │ 17 | │ PERFORMANCE OF THIS SOFTWARE. 
│ 18 | ╚─────────────────────────────────────────────────────────────────────────────*/ 19 | #include "bestline.h" 20 | 21 | #include 22 | #include 23 | #include 24 | #include 25 | #include 26 | #include 27 | 28 | /*───────────────────────────────────────────────────────────────────────────│─╗ 29 | │ The LISP Challenge § LISP Machine ─╬─│┼ 30 | ╚────────────────────────────────────────────────────────────────────────────│*/ 31 | 32 | #define kT 4 33 | #define kQuote 6 34 | #define kCond 12 35 | #define kRead 17 36 | #define kPrint 22 37 | #define kAtom 28 38 | #define kCar 33 39 | #define kCdr 37 40 | #define kCons 41 41 | #define kEq 46 42 | 43 | #define M (RAM + sizeof(RAM) / sizeof(RAM[0]) / 2) 44 | #define S "NIL\0T\0QUOTE\0COND\0READ\0PRINT\0ATOM\0CAR\0CDR\0CONS\0EQ" 45 | 46 | int cx; /* stores negative memory use */ 47 | int dx; /* stores lookahead character */ 48 | int RAM[0100000]; /* your own ibm7090 */ 49 | 50 | Intern() { 51 | int i, j, x; 52 | for (i = 0; (x = M[i++]);) { 53 | for (j = 0;; ++j) { 54 | if (x != RAM[j]) break; 55 | if (!x) return i - j - 1; 56 | x = M[i++]; 57 | } 58 | while (x) 59 | x = M[i++]; 60 | } 61 | j = 0; 62 | x = --i; 63 | while ((M[i++] = RAM[j++])); 64 | return x; 65 | } 66 | 67 | GetChar() { 68 | int c, t; 69 | static char *l, *p; 70 | if (l || (l = p = bestlineWithHistory("* ", "sectorlisp"))) { 71 | if (*p) { 72 | c = *p++ & 255; 73 | } else { 74 | free(l); 75 | l = p = 0; 76 | c = '\n'; 77 | } 78 | t = dx; 79 | dx = c; 80 | return t; 81 | } else { 82 | PrintChar('\n'); 83 | exit(0); 84 | } 85 | } 86 | 87 | PrintChar(b) { 88 | fputwc(b, stdout); 89 | } 90 | 91 | GetToken() { 92 | int c, i = 0; 93 | do if ((c = GetChar()) > ' ') RAM[i++] = c; 94 | while (c <= ' ' || (c > ')' && dx > ')')); 95 | RAM[i] = 0; 96 | return c; 97 | } 98 | 99 | AddList(x) { 100 | return Cons(x, GetList()); 101 | } 102 | 103 | GetList() { 104 | int c = GetToken(); 105 | if (c == ')') return 0; 106 | return AddList(GetObject(c)); 107 | } 108 | 109 | 
GetObject(c) { 110 | if (c == '(') return GetList(); 111 | return Intern(); 112 | } 113 | 114 | Read() { 115 | return GetObject(GetToken()); 116 | } 117 | 118 | PrintAtom(x) { 119 | int c; 120 | for (;;) { 121 | if (!(c = M[x++])) break; 122 | PrintChar(c); 123 | } 124 | } 125 | 126 | PrintList(x) { 127 | PrintChar('('); 128 | PrintObject(Car(x)); 129 | while ((x = Cdr(x))) { 130 | if (x < 0) { 131 | PrintChar(' '); 132 | PrintObject(Car(x)); 133 | } else { 134 | PrintChar(L'∙'); 135 | PrintObject(x); 136 | break; 137 | } 138 | } 139 | PrintChar(')'); 140 | } 141 | 142 | PrintObject(x) { 143 | if (x < 0) { 144 | PrintList(x); 145 | } else { 146 | PrintAtom(x); 147 | } 148 | } 149 | 150 | Print(e) { 151 | PrintObject(e); 152 | } 153 | 154 | PrintNewLine() { 155 | PrintChar('\n'); 156 | } 157 | 158 | /*───────────────────────────────────────────────────────────────────────────│─╗ 159 | │ The LISP Challenge § Bootstrap John McCarthy's Metacircular Evaluator ─╬─│┼ 160 | ╚────────────────────────────────────────────────────────────────────────────│*/ 161 | 162 | Car(x) { 163 | return M[x]; 164 | } 165 | 166 | Cdr(x) { 167 | return M[x + 1]; 168 | } 169 | 170 | Cons(car, cdr) { 171 | M[--cx] = cdr; 172 | M[--cx] = car; 173 | return cx; 174 | } 175 | 176 | Gc(x, m, k) { 177 | return x < m ? Cons(Gc(Car(x), m, k), 178 | Gc(Cdr(x), m, k)) + k : x; 179 | } 180 | 181 | Evlis(m, a) { 182 | if (m) { 183 | int x = Eval(Car(m), a); 184 | return Cons(x, Evlis(Cdr(m), a)); 185 | } else { 186 | return 0; 187 | } 188 | } 189 | 190 | Pairlis(x, y, a) { 191 | return x ? 
Cons(Cons(Car(x), Car(y)), 192 | Pairlis(Cdr(x), Cdr(y), a)) : a; 193 | } 194 | 195 | Assoc(x, y) { 196 | if (!y) return 0; 197 | if (x == Car(Car(y))) return Cdr(Car(y)); 198 | return Assoc(x, Cdr(y)); 199 | } 200 | 201 | Evcon(c, a) { 202 | if (Eval(Car(Car(c)), a)) { 203 | return Eval(Car(Cdr(Car(c))), a); 204 | } else { 205 | return Evcon(Cdr(c), a); 206 | } 207 | } 208 | 209 | Apply(f, x, a) { 210 | if (f < 0) return Eval(Car(Cdr(Cdr(f))), Pairlis(Car(Cdr(f)), x, a)); 211 | if (f > kEq) return Apply(Eval(f, a), x, a); 212 | if (f == kEq) return Car(x) == Car(Cdr(x)) ? kT : 0; 213 | if (f == kCons) return Cons(Car(x), Car(Cdr(x))); 214 | if (f == kAtom) return Car(x) < 0 ? 0 : kT; 215 | if (f == kCar) return Car(Car(x)); 216 | if (f == kCdr) return Cdr(Car(x)); 217 | if (f == kRead) return Read(); 218 | if (f == kPrint) return (x ? Print(Car(x)) : PrintNewLine()), 0; 219 | } 220 | 221 | Eval(e, a) { 222 | int A, B, C; 223 | if (e >= 0) 224 | return Assoc(e, a); 225 | if (Car(e) == kQuote) 226 | return Car(Cdr(e)); 227 | A = cx; 228 | if (Car(e) == kCond) { 229 | e = Evcon(Cdr(e), a); 230 | } else { 231 | e = Apply(Car(e), Evlis(Cdr(e), a), a); 232 | } 233 | B = cx; 234 | e = Gc(e, A, A - B); 235 | C = cx; 236 | while (C < B) 237 | M[--A] = M[--B]; 238 | cx = A; 239 | return e; 240 | } 241 | 242 | /*───────────────────────────────────────────────────────────────────────────│─╗ 243 | │ The LISP Challenge § User Interface ─╬─│┼ 244 | ╚────────────────────────────────────────────────────────────────────────────│*/ 245 | 246 | main() { 247 | int i; 248 | setlocale(LC_ALL, ""); 249 | bestlineSetXlatCallback(bestlineUppercase); 250 | for(i = 0; i < sizeof(S); ++i) M[i] = S[i]; 251 | for (;;) { 252 | cx = 0; 253 | Print(Eval(Read(), 0)); 254 | PrintNewLine(); 255 | } 256 | } 257 | -------------------------------------------------------------------------------- /lisp.lisp: -------------------------------------------------------------------------------- 1 | ;; (setq 
lisp-indent-function 'common-lisp-indent-function) 2 | ;; (paredit-mode) 3 | 4 | ;; ________ 5 | ;; /_ __/ /_ ___ 6 | ;; / / / __ \/ _ \ 7 | ;; / / / / / / __/ 8 | ;; /_/ /_/ /_/\___/ 9 | ;; __ _________ ____ ________ ____ 10 | ;; / / / _/ ___// __ \ / ____/ /_ ____ _/ / /__ ____ ____ ____ 11 | ;; / / / / \__ \/ /_/ / / / / __ \/ __ `/ / / _ \/ __ \/ __ `/ _ \ 12 | ;; / /____/ / ___/ / ____/ / /___/ / / / /_/ / / / __/ / / / /_/ / __/ 13 | ;; /_____/___//____/_/ \____/_/ /_/\__,_/_/_/\___/_/ /_/\__, /\___/ 14 | ;; /____/ 15 | ;; 16 | ;; The LISP Challenge 17 | ;; 18 | ;; Pick your favorite programming language 19 | ;; Implement the tiniest possible LISP machine that 20 | ;; Bootstraps John McCarthy's metacircular evaluator below 21 | ;; Winning is defined by lines of code for scripting languages 22 | ;; Winning is defined by binary footprint for compiled languages 23 | ;; 24 | ;; Listed Projects 25 | ;; 26 | ;; - 512 bytes: https://github.com/jart/sectorlisp 27 | ;; - 13 kilobytes: https://t3x.org/klisp/ 28 | ;; - 47 kilobytes: https://github.com/matp/tiny-lisp 29 | ;; - 150 kilobytes: https://github.com/JeffBezanson/femtolisp 30 | ;; - Send pull request to be listed here 31 | ;; 32 | ;; @see LISP From Nothing; Nils M. Holm; Lulu Press, Inc. 2020 33 | ;; @see Recursive Functions of Symbolic Expressions and Their 34 | ;; Computation By Machine, Part I; John McCarthy, Massachusetts 35 | ;; Institute of Technology, Cambridge, Mass. 
April 1960 36 | 37 | ;; NIL ATOM 38 | ;; ABSENCE OF VALUE AND TRUTH 39 | NIL 40 | 41 | ;; CONS CELL 42 | ;; BUILDING BLOCK OF DATA STRUCTURES 43 | (CONS NIL NIL) 44 | (CONS (QUOTE X) (QUOTE Y)) 45 | 46 | ;; REFLECTION 47 | ;; EVERYTHING IS AN ATOM OR NOT AN ATOM 48 | (ATOM NIL) 49 | (ATOM (CONS NIL NIL)) 50 | 51 | ;; QUOTING 52 | ;; CODE IS DATA AND DATA IS CODE 53 | (QUOTE (CONS NIL NIL)) 54 | (CONS (QUOTE CONS) (CONS NIL (CONS NIL NIL))) 55 | 56 | ;; LOGIC 57 | ;; BY WAY OF STRING INTERNING 58 | (EQ (QUOTE A) (QUOTE A)) 59 | (EQ (QUOTE T) (QUOTE F)) 60 | 61 | ;; FIND FIRST ATOM IN TREE 62 | ;; CORRECT RESULT OF EXPRESSION IS `A` 63 | ;; RECURSIVE CONDITIONAL FUNCTION BINDING 64 | ((LAMBDA (FF X) (FF X)) 65 | (QUOTE (LAMBDA (X) 66 | (COND ((ATOM X) X) 67 | ((QUOTE T) (FF (CAR X)))))) 68 | (QUOTE ((A) B C))) 69 | 70 | ;; LISP IMPLEMENTED IN LISP 71 | ;; WITHOUT ANY SUBJECTIVE SYNTACTIC SUGAR 72 | ;; RUNS "FIND FIRST ATOM IN TREE" PROGRAM 73 | ;; CORRECT RESULT OF EXPRESSION IS STILL `A` 74 | ;; REQUIRES CONS CAR CDR QUOTE ATOM EQ LAMBDA COND 75 | ;; SIMPLIFIED BUG FIXED VERSION OF JOHN MCCARTHY PAPER 76 | ;; NOTE: ((EQ (CAR E) ()) (QUOTE *UNDEFINED)) CAN HELP 77 | ;; NOTE: ((EQ (CAR E) (QUOTE LAMBDA)) E) IS NICE 78 | ((LAMBDA (ASSOC EVCON PAIRLIS EVLIS APPLY EVAL) 79 | (EVAL (QUOTE ((LAMBDA (FF X) (FF X)) 80 | (QUOTE (LAMBDA (X) 81 | (COND ((ATOM X) X) 82 | ((QUOTE T) (FF (CAR X)))))) 83 | (QUOTE ((A) B C)))) 84 | ())) 85 | (QUOTE (LAMBDA (X Y) 86 | (COND ((EQ Y ()) ()) 87 | ((EQ X (CAR (CAR Y))) 88 | (CDR (CAR Y))) 89 | ((QUOTE T) 90 | (ASSOC X (CDR Y)))))) 91 | (QUOTE (LAMBDA (C A) 92 | (COND ((EVAL (CAR (CAR C)) A) 93 | (EVAL (CAR (CDR (CAR C))) A)) 94 | ((QUOTE T) (EVCON (CDR C) A))))) 95 | (QUOTE (LAMBDA (X Y A) 96 | (COND ((EQ X ()) A) 97 | ((QUOTE T) (CONS (CONS (CAR X) (CAR Y)) 98 | (PAIRLIS (CDR X) (CDR Y) A)))))) 99 | (QUOTE (LAMBDA (M A) 100 | (COND ((EQ M ()) ()) 101 | ((QUOTE T) (CONS (EVAL (CAR M) A) 102 | (EVLIS (CDR M) A)))))) 103 | (QUOTE (LAMBDA 
(FN X A) 104 | (COND 105 | ((ATOM FN) 106 | (COND ((EQ FN (QUOTE CAR)) (CAR (CAR X))) 107 | ((EQ FN (QUOTE CDR)) (CDR (CAR X))) 108 | ((EQ FN (QUOTE ATOM)) (ATOM (CAR X))) 109 | ((EQ FN (QUOTE CONS)) (CONS (CAR X) (CAR (CDR X)))) 110 | ((EQ FN (QUOTE EQ)) (EQ (CAR X) (CAR (CDR X)))) 111 | ((QUOTE T) (APPLY (EVAL FN A) X A)))) 112 | ((EQ (CAR FN) (QUOTE LAMBDA)) 113 | (EVAL (CAR (CDR (CDR FN))) 114 | (PAIRLIS (CAR (CDR FN)) X A)))))) 115 | (QUOTE (LAMBDA (E A) 116 | (COND 117 | ((ATOM E) (ASSOC E A)) 118 | ((ATOM (CAR E)) 119 | (COND ((EQ (CAR E) (QUOTE QUOTE)) (CAR (CDR E))) 120 | ((EQ (CAR E) (QUOTE COND)) (EVCON (CDR E) A)) 121 | ((QUOTE T) (APPLY (CAR E) (EVLIS (CDR E) A) A)))) 122 | ((QUOTE T) (APPLY (CAR E) (EVLIS (CDR E) A) A)))))) 123 | -------------------------------------------------------------------------------- /sectorlisp.S: -------------------------------------------------------------------------------- 1 | /*-*- mode:unix-assembly; indent-tabs-mode:t; tab-width:8; coding:utf-8 -*-│ 2 | │ vi: set noet ft=asm ts=8 tw=8 fenc=utf-8 :vi │ 3 | ╞══════════════════════════════════════════════════════════════════════════════╡ 4 | │ Copyright 2020 Justine Alexandra Roberts Tunney │ 5 | │ Copyright 2021 Alain Greppin │ 6 | │ Some size optimisations by Peter Ferrie │ 7 | │ Copyright 2022 Hikaru Ikuta │ 8 | │ │ 9 | │ Permission to use, copy, modify, and/or distribute this software for │ 10 | │ any purpose with or without fee is hereby granted, provided that the │ 11 | │ above copyright notice and this permission notice appear in all copies. │ 12 | │ │ 13 | │ THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL │ 14 | │ WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED │ 15 | │ WARRANTIES OF MERCHANTABILITY AND FITNESS. 
IN NO EVENT SHALL THE │ 16 | │ AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL │ 17 | │ DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR │ 18 | │ PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER │ 19 | │ TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR │ 20 | │ PERFORMANCE OF THIS SOFTWARE. │ 21 | ╚─────────────────────────────────────────────────────────────────────────────*/ 22 | 23 | // LISP meta-circular evaluator in a MBR 24 | // Compatible with the original hardware 25 | 26 | .code16 27 | .globl _start 28 | _start: .asciz "NIL" # dec %si ; dec %cx ; dec %sp 29 | kT: .asciz "T" # add %dl,(%si) boot A:\ DL=0 30 | start: ljmp $0x7c00>>4,$begin # cs = 0x7c00 is boot address 31 | .asciz "" # interned strings 32 | kQuote: .asciz "QUOTE" # builtin for eval 33 | kCond: .asciz "COND" # builtin for eval 34 | kRead: .asciz "READ" # builtin to apply 35 | kPrint: .asciz "PRINT" # builtin to apply 36 | kCar: .asciz "CAR" # builtin to apply 37 | kCdr: .asciz "CDR" # ordering matters 38 | kCons: .asciz "CONS" # must be 3rd last 39 | kEq: .asciz "EQ" # must be 2nd last 40 | kAtom: .asciz "ATOM" # needs to be last 41 | 42 | .set .partition, 1 # set to one (1) to build with 43 | # partition table, for maximum 44 | # compatibility with hardware; 45 | # else zero for build to be as 46 | # small as possible 47 | 48 | begin: mov $0x8000,%sp # uses higher address as stack 49 | # and set independently of SS! 50 | # 8088 doesn't stop interrupts 51 | # after SS is set, and PC BIOS 52 | # sets SP to a value that will 53 | # damage our code if int fires 54 | # between it setting SS and SP 55 | push %cs # that means ss = ds = es = cs 56 | pop %ds # noting ljmp set cs to 0x7c00 57 | push %cs # that's the bios load address 58 | pop %es # therefore NULL points to NUL 59 | push %cs # terminated NIL string above! 
60 | pop %ss # errata exists but don't care 61 | mov $2,%bx 62 | main: mov %sp,%cx 63 | mov $'\r',%al 64 | call PutChar # call first to initialize %dx 65 | call Read 66 | call Eval 67 | xchg %si,%ax 68 | call PrintObject 69 | jmp main 70 | 71 | GetToken: # GetToken():al, dl is g_look 72 | mov %cx,%di 73 | 1: mov %dl,%al 74 | cmp $' ',%al 75 | jbe 2f 76 | stosb 77 | xchg %ax,%si 78 | 2: call GetChar # exchanges dx and ax 79 | cmp $' ',%al 80 | jbe 1b 81 | cmp $')',%al 82 | jbe 3f 83 | cmp $')',%dl # dl = g_look 84 | ja 1b 85 | 3: mov %bh,(%di) # bh is zero 86 | xchg %si,%ax 87 | ret 88 | 89 | .PrintList: 90 | mov $'(',%al 91 | 2: push (%bx,%si) 92 | mov (%si),%si 93 | call .PutObject 94 | mov $' ',%al 95 | pop %si # restore 1 96 | test %si,%si 97 | js 2b # jump if cons 98 | jz 4f # jump if nil 99 | mov $249,%al # bullet (A∙B) 100 | call .PutObject 101 | 4: mov $')',%al 102 | jmp PutChar 103 | 104 | .ifPrint: 105 | xchg %di,%si # Print(x:si) 106 | test %di,%di 107 | jnz PrintObject # print newline for empty args 108 | mov $'\r',%al 109 | .PutObject: # .PutObject(c:al,x:si) 110 | .PrintString: # nul-terminated in si 111 | call PutChar # preserves si 112 | PrintObject: # PrintObject(x:si) 113 | test %si,%si # set sf=1 if cons 114 | js .PrintList # jump if not cons 115 | .PrintAtom: 116 | lodsb 117 | test %al,%al # test for nul terminator 118 | jnz .PrintString # -> ret 119 | ret 120 | 121 | .ifRead:mov %bp,%dx # get cached character 122 | Read: call GetToken 123 | # jmp GetObject 124 | 125 | GetObject: # called just after GetToken 126 | cmp $'(',%al 127 | je GetList 128 | # jmp Intern 129 | 130 | Intern: push %cx # Intern(cx,di): ax 131 | mov %di,%bp 132 | sub %cx,%bp 133 | inc %bp 134 | xor %di,%di 135 | 1: pop %si 136 | push %si 137 | mov %bp,%cx 138 | mov %di,%ax 139 | cmp %bh,(%di) 140 | je 8f 141 | rep cmpsb # memcmp(di,si,cx) 142 | je 9f 143 | dec %di 144 | xor %ax,%ax 145 | 2: scasb # rawmemchr(di,al) 146 | jne 2b 147 | jmp 1b 148 | 8: rep movsb # 
memcpy(di,si,cx) 149 | 9: pop %cx 150 | ret 151 | 152 | GetChar:xor %ax,%ax # GetChar→al:dl 153 | int $0x16 # get keystroke 154 | mov %ax,%bp # used for READ 155 | PutChar:mov $0x0e,%ah # prints CP-437 156 | int $0x10 # vidya service 157 | cmp $'\r',%al # don't clobber 158 | jne .RetDx # look xchg ret 159 | mov $'\n',%al 160 | jmp PutChar 161 | .RetDx: xchg %dx,%ax 162 | ret 163 | 164 | //////////////////////////////////////////////////////////////////////////////// 165 | 166 | Evlis: test %di,%di # Evlis(m:di,a:dx):ax 167 | jz .RetDi # jump if nil 168 | push (%bx,%di) # save 1 Cdr(m) 169 | mov (%di),%ax 170 | call Eval 171 | pop %di # restore 1 172 | push %ax # save 2 173 | call Evlis 174 | # jmp xCons 175 | 176 | xCons: pop %di # restore 2 177 | Cons: xchg %di,%cx # Cons(m:di,a:ax):ax 178 | mov %cx,(%di) # must preserve si 179 | mov %ax,(%bx,%di) 180 | lea 4(%di),%cx 181 | .RetDi: xchg %di,%ax 182 | ret 183 | 184 | Builtin:cmp $kAtom,%ax # atom: last builtin atom 185 | ja .resolv # ah is zero if not above 186 | mov (%si),%di # di = Car(x) 187 | je .ifAtom 188 | cmp $kPrint,%al 189 | je .ifPrint 190 | cmp $kRead,%al 191 | je .ifRead 192 | .ifCar: cmp $kCar,%al 193 | je Car 194 | cmp $kCons,%al 195 | jb Cdr 196 | .ifCons:mov (%bx,%si),%si # si = Cdr(x) 197 | lodsw # si = Cadr(x) 198 | je Cons 199 | .isEq: xor %di,%ax 200 | jne .retF 201 | .retT: mov $kT,%al 202 | ret 203 | 204 | GetList:call GetToken 205 | cmp $')',%al 206 | je .retF 207 | call GetObject 208 | push %ax # popped by xCons 209 | call GetList 210 | jmp xCons 211 | 212 | Gc: cmp %dx,%di # Gc(x:di,A:dx,B:si):ax 213 | jb .RetDi # we assume immutable cells 214 | push (%bx,%di) # mark prevents negative gc 215 | mov (%di),%di 216 | call Gc 217 | pop %di 218 | push %ax 219 | call Gc 220 | pop %di 221 | call Cons 222 | sub %si,%ax 223 | add %dx,%ax 224 | ret 225 | 226 | .resolv:push %si 227 | call Assoc # do (fn si) → ((λ ...) 
si) 228 | pop %si 229 | Apply: test %ax,%ax # Apply(fn:ax,x:si:a:dx):ax 230 | jns Builtin # jump if atom 231 | xchg %ax,%di # di = fn 232 | .lambda:mov (%bx,%di),%di # di = Cdr(fn) 233 | push %di # for .EvCadr 234 | mov (%di),%di # di = Cadr(fn) 235 | Pairlis:test %di,%di # Pairlis(x:di,y:si,a:dx):dx 236 | jz .EvCadr # return if x is nil 237 | lodsw # ax = Car(y) 238 | push (%bx,%di) # push Cdr(x) 239 | mov (%di),%di # di = Car(x) 240 | mov (%si),%si # si = Cdr(y) 241 | call Cons # Cons(Car(x),Car(y)) 242 | xchg %ax,%di 243 | xchg %dx,%ax 244 | call Cons # Cons(Cons(Car(x),Car(y)),a) 245 | xchg %ax,%dx # a = new list 246 | pop %di # grab Cdr(x) 247 | jmp Pairlis 248 | .ifAtom:test %di,%di # test if atom 249 | jns .retT 250 | .retF: xor %ax,%ax # ax = nil 251 | ret 252 | 253 | Assoc: mov %dx,%si # Assoc(x:ax,y:dx):ax 254 | 1: mov (%si),%di 255 | mov (%bx,%si),%si 256 | scasw 257 | jne 1b 258 | .byte 0xA9 # shifted ip; reads as test, cmp 259 | Cadr: mov (%bx,%di),%di # contents of decrement register 260 | .byte 0x3C # cmp §scasw,%al (nop next byte) 261 | Cdr: scasw # increments our data index by 2 262 | Car: mov (%di),%ax # contents of address register!! 263 | ret 264 | 265 | 1: mov (%bx,%di),%di # di = Cdr(c) 266 | Evcon: push %di # save c 267 | mov (%di),%si # di = Car(c) 268 | lodsw # ax = Caar(c) 269 | call Eval 270 | pop %di # restore c 271 | test %ax,%ax # nil test 272 | jz 1b 273 | push (%di) # push Car(c) 274 | .EvCadr:pop %di 275 | call Cadr # ax = Cadar(c) 276 | # jmp Eval 277 | 278 | Eval: test %ax,%ax # Eval(e:ax,a:dx):ax 279 | jz 1f 280 | jns Assoc # lookup val if atom 281 | xchg %ax,%si # di = e 282 | lodsw # ax = Car(e) 283 | cmp $kQuote,%ax # maybe CONS 284 | mov (%si),%di # di = Cdr(e) 285 | je Car 286 | cmp $kCond,%ax 287 | je Evcon # ABC Garbage Collector 288 | push %dx # save a 289 | push %cx # save A 290 | push %ax 291 | call Evlis 292 | xchg %ax,%si 293 | pop %ax 294 | call Apply 295 | 296 | .if .partition 297 | .fill 0x1BE - (. 
- _start), 1, 0x90 # to have this boot from a USB 298 | # drive on a modern PC, make a 299 | # degenerate "partition table" 300 | # where this sector starts the 301 | # bootable partition; inactive 302 | # partition table entries must 303 | # also be empty, or have valid 304 | # starting sector LBA numbers! 305 | 306 | # * 1st partition entry * 307 | .byte 0x00 # - bootable indicator 308 | .byte 0b11010010 # reads as add %dl,%dl 309 | .endif 310 | pop %dx # restore A 311 | mov %cx,%si # si = B 312 | xchg %ax,%di 313 | call Gc 314 | mov %dx,%di # di = A 315 | .if .partition 316 | .byte 0x00 # - hi8(c₀*Cₙ + h₀*Hₙ + s₀*Sₙ) 317 | .byte 0b11010010 # reads as add %dl,%dl 318 | .endif 319 | sub %si,%cx # cx = C - B 320 | .if .partition 321 | .byte 0x3C # cmp $0,%al 322 | # * 2nd partition entry * 323 | .byte 0x00 # - bootable indicator 324 | .endif 325 | rep movsb 326 | mov %di,%cx # cx = A + (C - B) 327 | pop %dx # restore a 328 | 1: ret 329 | 330 | .if .partition 331 | .fill 0x1CE + 0x8 - (. - _start), 1, 0xce 332 | .long 0 # - c₀*Cₙ + h₀*Hₙ + s₀*Sₙ 333 | 334 | .fill 0x1DE - (. - _start), 1, 0xce # * 3rd partition entry * 335 | .byte 0x80 # - bootable indicator 336 | .byte 0, 1, 0 # - h₀, s₀ (& c₀ hi bits), c₀ 337 | .byte 0x7F # - OS or filesystem indicator 338 | .byte 0xFF, 0xFF, 0xFF # - h₉, s₉ (& c₉ hi bits), c₉ 339 | .long 0 # - c₀*Cₙ + h₀*Hₙ + s₀*Sₙ 340 | 341 | .fill 0x1EE - (. - _start), 1, 0xce # * 4th partition entry * 342 | .byte 0x00 # - bootable indicator 343 | .endif 344 | .sig: .fill 0x200 - (2f - 1f) - (. 
- _start), 1, 0xce 345 | 1: .ascii "SECTORLISP" 346 | .byte 0 # - hi8(c₀*Cₙ + h₀*Hₙ + s₀*Sₙ) 347 | .ascii " v2 " 348 | .word 0xAA55 349 | 2: .type .sig,@object 350 | .type kQuote,@object 351 | .type kCond,@object 352 | .type kRead,@object 353 | .type kPrint,@object 354 | .type kAtom,@object 355 | .type kCar,@object 356 | .type kCdr,@object 357 | .type kCons,@object 358 | .type kEq,@object 359 | -------------------------------------------------------------------------------- /sectorlisp.lds: -------------------------------------------------------------------------------- 1 | ENTRY(_start) 2 | 3 | SECTIONS { 4 | . = 0; 5 | .text : { 6 | *(.text) 7 | *(.rodata .rodata.*) 8 | } 9 | /DISCARD/ : { 10 | *(.*) 11 | } 12 | } 13 | -------------------------------------------------------------------------------- /test/.gitignore: -------------------------------------------------------------------------------- 1 | /tcat 2 | -------------------------------------------------------------------------------- /test/Makefile: -------------------------------------------------------------------------------- 1 | test1: test1.lisp qemu.sh tcat 2 | sh qemu.sh test1.lisp 3 | test2: test2.lisp qemu.sh tcat 4 | sh qemu.sh test2.lisp 5 | eval10: eval10.lisp qemu.sh tcat 6 | sh qemu.sh eval10.lisp 7 | eval15: eval15.lisp qemu.sh tcat 8 | sh qemu.sh eval15.lisp 9 | tcat: tcat.c 10 | $(CC) -o $@ $< -Wall 11 | 12 | .PHONY: test1 test2 eval10 eval15 13 | -------------------------------------------------------------------------------- /test/README.md: -------------------------------------------------------------------------------- 1 | # sectorlisp test scripts 2 | 3 | For best results, please resize your terminal to 80x25. 4 | 5 | You can launch a test with the following command: 6 | 7 | make test1 8 | 9 | _This is tested on Linux. 
The qemu.sh script requires qemu,cc,wc & nc._ 10 | 11 | ## files 12 | 13 | - test1.lisp contains basic tests 14 | - eval10.lisp evaluator from [eval.c as of commit 1058c95][1] 15 | - eval15.lisp evaluator from [eval.c as of commit 3b26982 (latest)][2] 16 | 17 | [//]: links 18 | [1]: https://github.com/jart/sectorlisp/blob/1058c959d80b7103514cd7e959dbd67b38f4400b/lisp.c 19 | [2]: https://github.com/jart/sectorlisp/blob/3b26982d9c06cd43760604b6364df197a782333e/lisp.c 20 | -------------------------------------------------------------------------------- /test/eval10.lisp: -------------------------------------------------------------------------------- 1 | ((LAMBDA (ASSOC EVCON BIND EVAL) 2 | (EVAL (QUOTE ((LAMBDA (FF X) (FF X)) 3 | (QUOTE (LAMBDA (X) 4 | (COND ((ATOM X) X) 5 | ((QUOTE T) (FF (CAR X)))))) 6 | (QUOTE ((A) B C)))) 7 | NIL)) 8 | (QUOTE (LAMBDA (X E) 9 | (COND ((EQ E NIL) NIL) 10 | ((EQ X (CAR (CAR E))) (CDR (CAR E))) 11 | ((QUOTE T) (ASSOC X (CDR E)))))) 12 | (QUOTE (LAMBDA (C E) 13 | (COND ((EVAL (CAR (CAR C)) E) (EVAL (CAR (CDR (CAR C))) E)) 14 | ((QUOTE T) (EVCON (CDR C) E))))) 15 | (QUOTE (LAMBDA (V A E) 16 | (COND ((EQ V NIL) E) 17 | ((QUOTE T) (CONS (CONS (CAR V) (EVAL (CAR A) E)) 18 | (BIND (CDR V) (CDR A) E)))))) 19 | (QUOTE (LAMBDA (E A) 20 | (COND 21 | ((ATOM E) (ASSOC E A)) 22 | ((ATOM (CAR E)) 23 | (COND 24 | ((EQ (CAR E) NIL) (QUOTE *UNDEFINED)) 25 | ((EQ (CAR E) (QUOTE QUOTE)) (CAR (CDR E))) 26 | ((EQ (CAR E) (QUOTE ATOM)) (ATOM (EVAL (CAR (CDR E)) A))) 27 | ((EQ (CAR E) (QUOTE EQ)) (EQ (EVAL (CAR (CDR E)) A) 28 | (EVAL (CAR (CDR (CDR E))) A))) 29 | ((EQ (CAR E) (QUOTE CAR)) (CAR (EVAL (CAR (CDR E)) A))) 30 | ((EQ (CAR E) (QUOTE CDR)) (CDR (EVAL (CAR (CDR E)) A))) 31 | ((EQ (CAR E) (QUOTE CONS)) (CONS (EVAL (CAR (CDR E)) A) 32 | (EVAL (CAR (CDR (CDR E))) A))) 33 | ((EQ (CAR E) (QUOTE COND)) (EVCON (CDR E) A)) 34 | ((EQ (CAR E) (QUOTE LAMBDA)) E) 35 | ((QUOTE T) (EVAL (CONS (ASSOC (CAR E) A) (CDR E)) A)))) 36 | ((EQ (CAR (CAR E)) (QUOTE 
LAMBDA)) 37 | (EVAL (CAR (CDR (CDR (CAR E)))) 38 | (BIND (CAR (CDR (CAR E))) (CDR E) A))))))) 39 | -------------------------------------------------------------------------------- /test/eval15.lisp: -------------------------------------------------------------------------------- 1 | ((LAMBDA (ASSOC EVCON PAIRLIS EVLIS APPLY EVAL) 2 | (EVAL (QUOTE ((LAMBDA (FF X) (FF X)) 3 | (QUOTE (LAMBDA (X) 4 | (COND ((ATOM X) X) 5 | ((QUOTE T) (FF (CAR X)))))) 6 | (QUOTE ((A) B C)))) 7 | NIL)) 8 | (QUOTE (LAMBDA (X E) 9 | (COND ((EQ E NIL) NIL) 10 | ((EQ X (CAR (CAR E))) (CDR (CAR E))) 11 | ((QUOTE T) (ASSOC X (CDR E)))))) 12 | (QUOTE (LAMBDA (C E) 13 | (COND ((EVAL (CAR (CAR C)) E) (EVAL (CAR (CDR (CAR C))) E)) 14 | ((QUOTE T) (EVCON (CDR C) E))))) 15 | (QUOTE (LAMBDA (X Y A) 16 | (COND ((EQ X NIL) A) 17 | ((QUOTE T) (CONS (CONS (CAR X) (CAR Y)) 18 | (PAIRLIS (CDR X) (CDR Y) A)))))) 19 | (QUOTE (LAMBDA (M A) 20 | (COND ((EQ M NIL) NIL) 21 | ((QUOTE T) (CONS (EVAL (CAR M) A) (EVLIS (CDR M) A)))))) 22 | (QUOTE (LAMBDA (FN X A) 23 | (COND ((ATOM FN) 24 | (COND ((EQ FN (QUOTE CAR)) (CAR (CAR X))) 25 | ((EQ FN (QUOTE CDR)) (CDR (CAR X))) 26 | ((EQ FN (QUOTE CONS)) (CONS (CAR X) (CAR (CDR X)))) 27 | ((EQ FN (QUOTE ATOM)) (ATOM (CAR X))) 28 | ((EQ FN (QUOTE EQ)) (EQ (CAR X) (CAR (CDR X)))) 29 | ((QUOTE T) (APPLY (EVAL FN A) X A)))) 30 | ((EQ (CAR FN) (QUOTE LAMBDA)) 31 | (EVAL (CAR (CDR (CDR FN))) (PAIRLIS (CAR (CDR FN)) X A))) 32 | ((QUOTE T) NIL)))) 33 | (QUOTE (LAMBDA (E A) 34 | (COND ((ATOM E) (ASSOC E A)) 35 | ((ATOM (CAR E)) 36 | (COND 37 | ((EQ (CAR E) (QUOTE QUOTE)) (CAR (CDR E))) 38 | ((EQ (CAR E) (QUOTE COND)) (EVCON (CDR E) A)) 39 | ((QUOTE T) (APPLY (CAR E) (EVLIS (CDR E) A) A)))) 40 | ((QUOTE T) (APPLY (CAR E) (EVLIS (CDR E) A) A)))))) 41 | -------------------------------------------------------------------------------- /test/qemu.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | set -e 3 | FILE=$1 4 | [ -z "$FILE" ] && 
FILE=test1.lisp 5 | [ -r "$FILE" ] || (echo "cannot read file: $FILE"; exit 1) 6 | SIZE=$(wc -c "$FILE" | cut -d' ' -f1) 7 | QEMU="qemu-system-x86_64" 8 | QIMG="-drive file=../bin/sectorlisp.bin,index=0,if=floppy,format=raw -boot a" 9 | QMON="-monitor tcp:127.0.0.1:55555,server,nowait" 10 | 11 | trap 'echo quit | nc -N 127.0.0.1 55555' EXIT 12 | cat "$FILE" | tr '\n' '\r' | ./tcat | \ 13 | $QEMU -display curses -net none $QMON $QIMG & 14 | PID=$! 15 | SECS=$((1 + SIZE * 40 / 1000)) 16 | sleep $SECS 17 | -------------------------------------------------------------------------------- /test/tcat.c: -------------------------------------------------------------------------------- 1 | #include <unistd.h> 2 | 3 | int main() 4 | { 5 | int ret; 6 | char c; 7 | usleep(350 * 1000); 8 | while ((ret = read(0, &c, 1)) > 0) { 9 | usleep(35 * 1000); 10 | if ((ret = write(1, &c, 1)) <= 0) 11 | break; 12 | } 13 | return ret; 14 | } 15 | -------------------------------------------------------------------------------- /test/test1.lisp: -------------------------------------------------------------------------------- 1 | (ATOM NIL) 2 | (CONS () ()) 3 | (QUOTE ((A) B)) 4 | (EQ (QUOTE A) (QUOTE B)) 5 | (EQ (QUOTE A) (QUOTE A)) 6 | (CONS (QUOTE A) (QUOTE B)) 7 | (CONS (QUOTE A) (CONS (QUOTE B) NIL)) 8 | (CAR (CONS (QUOTE A) (QUOTE B))) 9 | (CDR (CONS (QUOTE A) (QUOTE B))) 10 | (CDR (CONS (QUOTE A) (CONS (QUOTE B) NIL))) 11 | (COND ((QUOTE T) (QUOTE A))) 12 | (COND ((QUOTE NIL) (QUOTE A)) ((QUOTE T) (QUOTE B))) 13 | ((LAMBDA (Z) Z) (QUOTE ZZZ)) 14 | ((LAMBDA (Z) (CAR Z)) (CONS (QUOTE A) (QUOTE B))) 15 | 16 | -------------------------------------------------------------------------------- /test/test2.lisp: -------------------------------------------------------------------------------- 1 | (READ)AAA 2 | (READ)(1 (2 3) 4) 3 | (READ) 4 | 5 | AAA 6 | (READ) 7 | 8 | (1 (2 3) 4) 9 | (CAR (READ))(1 (2 3) 4) 10 | (CDR (READ))(1 (2 3) 4) 11 | (CONS (READ) (CONS (QUOTE B) NIL))B 12 | (CONS 
(READ) (CONS (QUOTE 
A) NIL))(1 (2 3) 4) 13 | (ATOM (READ))A 14 | (ATOM (READ))(1 2) 15 | (EQ (QUOTE A) (READ))A 16 | (EQ (QUOTE B) (READ))A 17 | (PRINT (QUOTE A)) 18 | (PRINT (QUOTE (1 2))) 19 | ((LAMBDA () ()) 20 | (PRINT (QUOTE A)) 21 | (PRINT (QUOTE B)) 22 | (PRINT) 23 | (PRINT (QUOTE C)) 24 | (PRINT (QUOTE (1 2 3))) 25 | (PRINT)) 26 | (PRINT (READ))AAA 27 | (PRINT (READ))(1 (2 3) 4) 28 | (PRINT) 29 | (PRINT (PRINT)) 30 | (PRINT (PRINT (QUOTE A))) 31 | ((LAMBDA (LOOP) (LOOP LOOP)) 32 | (QUOTE (LAMBDA (LOOP) 33 | ((LAMBDA () ()) 34 | (PRINT (QUOTE >)) 35 | (PRINT (CONS (QUOTE INPUT) (CONS (READ) NIL))) 36 | (PRINT) 37 | (LOOP LOOP))))) 38 | A 39 | B 40 | C 41 | (1 2) 42 | (1 (2 3) 4) 43 | --------------------------------------------------------------------------------
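The cell layout that `lisp.c` relies on can be sketched in isolation. The block below is not code from the repository: the names `heap`, `mid`, `top`, `cell_cons`, `cell_car`, `cell_cdr`, and `is_atom` are hypothetical, and the array size is arbitrary. It only assumes the convention visible in `lisp.c`, where atoms are non-negative offsets and `Cons` hands out negative indices that grow downward from the middle of a single array.

```c
#include <assert.h>

#define HEAP_WORDS 1024 /* arbitrary size chosen for this sketch */

static int heap[HEAP_WORDS];                   /* one array holds everything */
static int *const mid = heap + HEAP_WORDS / 2; /* index 0 lives here, like M in lisp.c */
static int top = 0;                            /* last allocated cell, like cx in lisp.c */

/* Allocate a pair just below the previous one and return its negative index. */
static int cell_cons(int car, int cdr) {
  assert(top - 2 >= -(HEAP_WORDS / 2)); /* refuse to run off the array */
  mid[--top] = cdr;
  mid[--top] = car;
  return top;
}

/* A pair stores its car at index x and its cdr at index x + 1. */
static int cell_car(int x) { return mid[x]; }
static int cell_cdr(int x) { return mid[x + 1]; }

/* Non-negative values denote atoms; 0 plays the role of NIL. */
static int is_atom(int x) { return x >= 0; }
```

On a fresh heap, `cell_cons(4, 0)` returns `-2` and `cell_car(-2)` reads back `4`. The sign of a value alone separates atoms from pairs, which is the same test `PrintObject` makes with `x < 0`.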