├── .gitignore ├── LICENSE.txt ├── README.md ├── config.toml ├── content ├── code │ ├── bool-listp.el │ ├── closure_test.go │ ├── eval │ │ ├── eval.go │ │ └── eval_amd64.s │ └── mv-lib.el └── post │ ├── c-broken-defaults.md │ ├── call-go-from-jit.md │ ├── cgo-funcall.md │ ├── disassembling-go-avx512.md │ ├── dumbing-down-go-interfaces.md │ ├── elisp-multi-return-values.md │ ├── faq.md │ ├── gen-map.md │ ├── go-asm-complementary-reference.md │ ├── go-asm-dispatch-tables.md │ ├── go-avx512.md │ ├── go-nested-functions-and-static-locals.md │ ├── go_ssa_rules.md │ ├── gogrep.md │ ├── goism-compilation-modes.md │ ├── goism-objects-layout-mode.md │ ├── log-fatal-vs-log-panic.md │ ├── naive-ssa-alternative.md │ ├── pathfinding.md │ ├── pratt-parsers-go.md │ ├── profile-guided-gogrep.md │ ├── riscv32-custom-instruction-and-its-simulation.md │ ├── ruleguard-modules.md │ ├── ruleguard.md │ ├── single-exit.md │ ├── step-pattern.md │ ├── writing-emacs-lisp-compiler-intrinsics.md │ └── yaml5.md ├── hugo_hints.txt ├── layouts ├── _default │ ├── single.html │ ├── summary.html │ └── terms.html └── partials │ ├── head.html │ ├── js.html │ └── navigation.html ├── scripts ├── deploy └── run_server ├── static ├── css │ ├── concatenated.css │ ├── font-awesome.css │ ├── normalize.css │ └── screen.css ├── favicon.ico ├── favicon_old.ico ├── files │ ├── go_x86_aliases.txt │ └── x86_2.csv ├── fonts │ ├── FontAwesome.otf │ ├── fontawesome-webfont.eot │ ├── fontawesome-webfont.svg │ ├── fontawesome-webfont.ttf │ ├── fontawesome-webfont.woff │ └── fontawesome-webfont.woff2 ├── highlight.pack.js ├── hljs-themes │ ├── hybrid.css │ └── wombat.css ├── img │ ├── avatar.jpg │ ├── genmap1.png │ ├── genmap2.png │ ├── genmap3.png │ ├── genmap4.png │ ├── github_watch.png │ ├── jit_call1.png │ ├── jit_call2.png │ ├── pathing_bithack.png │ ├── pathing_comparison.png │ ├── pathing_deltas.png │ ├── pathing_map.png │ ├── pathing_mathbits.png │ ├── pathing_pathmem.png │ ├── pathing_stonks.png │ ├── 
reg_table.png │ └── zeroalloc.png ├── jquery.min.js └── style.css └── themes └── hugo-steam-theme ├── CHANGELOG.md ├── LICENSE.md ├── README.md ├── archetypes └── default.md ├── exampleSite ├── .gitignore ├── config.toml ├── content │ ├── about.md │ └── post │ │ ├── creating-a-new-theme.md │ │ ├── goisforlovers.md │ │ ├── hugoisforlovers.md │ │ └── migrate-from-jekyll.md └── static │ └── .gitkeep ├── images ├── screenshot.png └── tn.png ├── layouts ├── 404.html ├── _default │ ├── baseof.html │ ├── list.html │ ├── single.html │ ├── summary.html │ └── terms.html ├── index.html └── partials │ ├── author.html │ ├── footer.html │ ├── head.html │ ├── header.html │ ├── js.html │ ├── navigation.html │ ├── pagination.html │ ├── share.html │ ├── social.html │ └── themes │ ├── blue-theme.html │ ├── custom-theme.html │ ├── green-theme.html │ ├── orange-theme.html │ └── red-theme.html ├── static ├── css │ ├── github.css │ └── screen.css ├── favicon.ico ├── fonts │ ├── icons.eot │ ├── icons.svg │ ├── icons.ttf │ └── icons.woff ├── img │ ├── appletouchicon.png │ ├── avatar.jpg │ └── favicon.ico └── js │ ├── index.js │ └── smooth-scroll.min.js └── theme.toml /.gitignore: -------------------------------------------------------------------------------- 1 | /public/ 2 | .hugo_build.lock 3 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2016-2017 Iskander Sharipov 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 
| 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Sources that are used to build https://github.com/quasilyte/quasilyte.github.io. 2 | The site itself can be found here: https://quasilyte.dev/blog/. 3 | 4 | Some information: 5 | - [Hugo](https://github.com/gohugoio/hugo) static site generator 6 | - [Steam](https://themes.gohugo.io/steam/) Hugo theme 7 | 8 | `./hugo_hints.txt` - memo for `hugo` commands. 9 | 10 | `./scripts/run_server` - run server on `127.0.0.1:1313`. 11 | 12 | `./scripts/deploy` - build script. 13 | 14 | Install steps: 15 | ```bash 16 | # 1. Install hugo. 17 | 18 | # 2. Install theme. 19 | mkdir -p themes 20 | cd themes 21 | git clone https://github.com/digitalcraftsman/hugo-steam-theme.git 22 | 23 | # 3. Check installation. 
24 | ./scripts/run_server 25 | ``` 26 | -------------------------------------------------------------------------------- /config.toml: -------------------------------------------------------------------------------- 1 | baseURL = "https://quasilyte.dev/blog/" 2 | languageCode = "en-us" 3 | 4 | title = "Iskander (Alex) Sharipov technical blog" 5 | theme = "hugo-steam-theme" 6 | 7 | disqusShortname = "" 8 | googleAnalytics = "" 9 | 10 | # Number of posts per page 11 | paginate = 10 12 | 13 | [markup.goldmark.renderer] 14 | unsafe=true 15 | 16 | [[menu.main]] 17 | name = "[Posts by tags]" 18 | weight = 10 19 | identifier = "tags" 20 | url = "/tags/" 21 | 22 | [[menu.main]] 23 | name = "[Subscribe]" 24 | weight = 10 25 | identifier = "subscribe" 26 | url = "/post/faq/#subscribe" 27 | 28 | [[menu.main]] 29 | name = "[Report an issue]" 30 | weight = 10 31 | identifier = "fire_issue" 32 | url = "/post/faq/#report-an-issue" 33 | 34 | [params] 35 | title = "quasilyte blog" 36 | subtitle = "Technical blog about systems programming and related topics" 37 | copyright = "Released under the MIT license." 38 | 39 | # You can choose between green, orange, red and blue. 40 | themecolor = "green" 41 | 42 | # Link custom assets relative to /static 43 | favicon = "favicon.ico" 44 | customCSS = [] 45 | customJS = [] 46 | 47 | # To provide some metadata for search engines and the about section in the footer 48 | # feel free to add a few information about you and your website. 
49 | name = "Iskander Sharipov" 50 | bio = "Lisper that got lost in a gophers land" 51 | description = "Technical blog about systems programming and related topics" 52 | 53 | # Link your social networks (optional) 54 | location = "" 55 | twitter = "quasilyte" 56 | linkedin = "quasilyte" 57 | googleplus = "" 58 | facebook = "" 59 | instagram = "" 60 | github = "quasilyte" 61 | gitlab = "" 62 | bitbucket = "" 63 | 64 | # Customize or translate the strings 65 | keepReadingStr = "" 66 | backtotopStr = "Back to top" 67 | shareStr = "Share" 68 | pageNotFoundTitle = "404 - Page not found" 69 | -------------------------------------------------------------------------------- /content/code/bool-listp.el: -------------------------------------------------------------------------------- 1 | ;;; -*- lexical-binding: t -*- 2 | 3 | ;; Requires `%return' intrinsic that is described in 4 | ;; https://quasilyte.github.io/blog/post/writing-emacs-lisp-compiler-intrinsics/ 5 | 6 | (defun bool-listp/ret (xs) 7 | (dolist (x xs) 8 | (unless (booleanp x) 9 | (%return nil))) 10 | t) 11 | 12 | (disassemble 'bool-listp/ret) 13 | ;; 0 dup 14 | ;; 1:1 dup 15 | ;; 2 goto-if-nil 3 16 | ;; 5 dup 17 | ;; 6 car 18 | ;; 7 constant booleanp 19 | ;; 8 stack-ref 1 20 | ;; 9 call 1 21 | ;; 10 goto-if-not-nil 2 22 | ;; 13 constant nil 23 | ;; 14 return 24 | ;; 15:2 stack-ref 1 25 | ;; 16 cdr 26 | ;; 17 discardN-preserve-tos 2 27 | ;; 19 goto 1 28 | ;; 22:3 discard 29 | ;; 23 constant t 30 | ;; 24 return 31 | 32 | 33 | (defun bool-listp (xs) 34 | (let ((x nil) 35 | (ret t)) 36 | (while (and (setq x (pop xs)) 37 | ret) 38 | (unless (booleanp x) 39 | (setq ret nil))) 40 | ret)) 41 | 42 | (disassemble 'bool-listp) 43 | ;; 0 constant nil 44 | ;; 1 constant t 45 | ;; 2:1 stack-ref 2 46 | ;; 3 dup 47 | ;; 4 cdr 48 | ;; 5 stack-set 4 49 | ;; 7 car-safe 50 | ;; 8 dup 51 | ;; 9 stack-set 3 52 | ;; 11 goto-if-nil 2 53 | ;; 14 dup 54 | ;; 15 goto-if-nil 2 55 | ;; 18 constant booleanp 56 | ;; 19 stack-ref 2 57 | ;; 20 
call 1 58 | ;; 21 goto-if-not-nil 1 59 | ;; 24 constant nil 60 | ;; 25 stack-set 1 61 | ;; 27 goto 1 62 | ;; 30:2 return 63 | -------------------------------------------------------------------------------- /content/code/closure_test.go: -------------------------------------------------------------------------------- 1 | package bench 2 | 3 | import ( 4 | "fmt" 5 | "regexp" 6 | "strings" 7 | "testing" 8 | ) 9 | 10 | var vowels = map[rune]bool{ 11 | 'a': true, 'e': true, 'i': true, 12 | 'o': true, 'u': true, 'y': true, 13 | } 14 | var rxGolang = regexp.MustCompile(`[Gg]o|[Gg]golang`) 15 | 16 | func hasVowel(s string) bool { 17 | for _, c := range s { 18 | if vowels[c] { 19 | return true 20 | } 21 | } 22 | return false 23 | } 24 | func describeString1(s string) string { 25 | var attrs []string 26 | if hasVowel(s) { 27 | attrs = append(attrs, "has vowel letter") 28 | } 29 | if rxGolang.MatchString(s) { 30 | attrs = append(attrs, "may be about Go language") 31 | } 32 | attrs = append(attrs, fmt.Sprintf("has length of %d", len(s))) 33 | return strings.Join(attrs, "; ") 34 | } 35 | 36 | var describeString2 = func() func(name string) string { 37 | vowels := map[rune]bool{ 38 | 'a': true, 'e': true, 'i': true, 39 | 'o': true, 'u': true, 'y': true, 40 | } 41 | rxGolang := regexp.MustCompile(`[Gg]o|[Gg]golang`) 42 | hasVowel := func(s string) bool { 43 | for _, c := range s { 44 | if vowels[c] { 45 | return true 46 | } 47 | } 48 | return false 49 | } 50 | 51 | return func(s string) string { 52 | var attrs []string 53 | if hasVowel(s) { 54 | attrs = append(attrs, "has vowel letter") 55 | } 56 | if rxGolang.MatchString(s) { 57 | attrs = append(attrs, "may be about Go language") 58 | } 59 | attrs = append(attrs, fmt.Sprintf("has length of %d", len(s))) 60 | return strings.Join(attrs, "; ") 61 | } 62 | }() 63 | 64 | var discardResult string 65 | 66 | func BenchmarkNormalFunc(b *testing.B) { 67 | for i := 0; i < b.N; i++ { 68 | for _, s := range input { 69 | discardResult = 
describeString1(s) 70 | } 71 | } 72 | } 73 | 74 | func BenchmarkClosure(b *testing.B) { 75 | for i := 0; i < b.N; i++ { 76 | for _, s := range input { 77 | discardResult = describeString2(s) 78 | } 79 | } 80 | } 81 | 82 | var input = []string{ 83 | "4th Dimension/4D", 84 | "ABAP", 85 | "ABC", 86 | "ActionScript", 87 | "Ada", 88 | "Agilent VEE", 89 | "Algol", 90 | "Alice", 91 | "Angelscript", 92 | "Apex", 93 | "APL", 94 | "AppleScript", 95 | "Arc", 96 | "Arduino", 97 | "ASP", 98 | "AspectJ", 99 | "Assembly", 100 | "ATLAS", 101 | "Augeas", 102 | "AutoHotkey", 103 | "AutoIt", 104 | "AutoLISP", 105 | "Automator", 106 | "Avenue", 107 | "Awk", 108 | "Bash", 109 | "(Visual) Basic", 110 | "bc", 111 | "BCPL", 112 | "BETA", 113 | "BlitzMax", 114 | "Boo", 115 | "Bourne Shell", 116 | "Bro", 117 | "C", 118 | "C Shell", 119 | "C#", 120 | "C++", 121 | "C++/CLI", 122 | "C-Omega", 123 | "Caml", 124 | "Ceylon", 125 | "CFML", 126 | "cg", 127 | "Ch", 128 | "CHILL", 129 | "CIL", 130 | "CL (OS/400)", 131 | "Clarion", 132 | "Clean", 133 | "Clipper", 134 | "Clojure", 135 | "CLU", 136 | "COBOL", 137 | "Cobra", 138 | "CoffeeScript", 139 | "ColdFusion", 140 | "COMAL", 141 | "Common Lisp", 142 | "Coq", 143 | "cT", 144 | "Curl", 145 | "D", 146 | "Dart", 147 | "DCL", 148 | "DCPU-16 ASM", 149 | "Delphi/Object Pascal", 150 | "DiBOL", 151 | "Dylan", 152 | "E", 153 | "eC", 154 | "Ecl", 155 | "ECMAScript", 156 | "EGL", 157 | "Eiffel", 158 | "Elixir", 159 | "Emacs Lisp", 160 | "Erlang", 161 | "Etoys", 162 | "Euphoria", 163 | "EXEC", 164 | "F#", 165 | "Factor", 166 | "Falcon", 167 | "Fancy", 168 | "Fantom", 169 | "Felix", 170 | "Forth", 171 | "Fortran", 172 | "Fortress", 173 | "(Visual) FoxPro", 174 | "Gambas", 175 | "GNU Octave", 176 | "Go", 177 | "Google AppsScript", 178 | "Gosu", 179 | "Groovy", 180 | "Haskell", 181 | "haXe", 182 | "Heron", 183 | } 184 | 185 | -------------------------------------------------------------------------------- /content/code/eval/eval.go: 
-------------------------------------------------------------------------------- 1 | package main 2 | 3 | import "fmt" 4 | 5 | func eval(opbytes *byte) int64 6 | 7 | func main() { 8 | const ( 9 | opExit = iota 10 | opAdd1 11 | opSub1 12 | opZero 13 | ) 14 | prog := []byte{ 15 | opZero, // start with 0 16 | opAdd1, // 0+1 = 1 17 | opAdd1, // 1+1 = 2 18 | opSub1, // 2-1 = 1 19 | opAdd1, // 1+1 = 2 20 | opExit, // result is 2 21 | } 22 | fmt.Println(eval(&prog[0])) // Should print 2 23 | } 24 | -------------------------------------------------------------------------------- /content/code/eval/eval_amd64.s: -------------------------------------------------------------------------------- 1 | #include "textflag.h" 2 | 3 | DATA op_labels<>+0(SB)/8, $op_exit(SB) 4 | DATA op_labels<>+8(SB)/8, $op_add1(SB) 5 | DATA op_labels<>+16(SB)/8, $op_sub1(SB) 6 | DATA op_labels<>+24(SB)/8, $op_zero(SB) 7 | GLOBL op_labels<>(SB), (RODATA|NOPTR), $32 8 | 9 | #define next_op \ 10 | MOVBQZX (CX), DX \ 11 | ADDQ $1, CX \ 12 | MOVQ $op_labels<>(SB), DI \ 13 | JMP (DI)(DX*8) 14 | 15 | TEXT ·eval(SB), NOSPLIT, $0-16 16 | MOVQ opbytes+0(FP), CX //; Set up program counter (PC) 17 | next_op //; Start the evaluation 18 | 19 | TEXT op_exit(SB), NOSPLIT, $0-0 20 | MOVQ AX, ret+8(FP) 21 | RET 22 | 23 | TEXT op_add1(SB), NOSPLIT, $0-0 24 | ADDQ $1, AX 25 | next_op 26 | 27 | TEXT op_sub1(SB), NOSPLIT, $0-0 28 | SUBQ $1, AX 29 | next_op 30 | 31 | TEXT op_zero(SB), NOSPLIT, $0-0 32 | XORQ AX, AX 33 | next_op 34 | -------------------------------------------------------------------------------- /content/code/mv-lib.el: -------------------------------------------------------------------------------- 1 | ;;; -*- lexical-binding: t -*- 2 | 3 | ;; MIT License 4 | ;; Copyright (c) 2017 Iskander Sharipov 5 | ;; Permission is hereby granted, free of charge, to any person obtaining a copy 6 | ;; of this software and associated documentation files (the "Software"), to deal 7 | ;; in the Software without 
restriction, including without limitation the rights 8 | ;; to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | ;; copies of the Software, and to permit persons to whom the Software is 10 | ;; furnished to do so, subject to the following conditions: 11 | ;; The above copyright notice and this permission notice shall be included in all 12 | ;; copies or substantial portions of the Software. 13 | ;; THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 14 | ;; IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 15 | ;; FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 16 | ;; AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 17 | ;; LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 18 | ;; OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 19 | ;; SOFTWARE. 20 | 21 | (defconst mv--max-count 10) ;; Arbitrary limit 22 | 23 | (defun mv--var (index) 24 | "Get return value variable symbol by INDEX" 25 | (when (>= index mv--max-count) 26 | (error "Index %d is too high (%d is max)" index (1- mv--max-count))) 27 | (intern (format "mv--%d" index))) 28 | 29 | (dotimes (i mv--max-count) 30 | (eval `(defvar ,(mv--var i) nil))) 31 | 32 | (defmacro mv-ret (&rest xs) 33 | "Return multiple values from a function. 34 | Results can be used using `mv-let' macro." 35 | (let ((forms nil) 36 | (values (cdr xs)) 37 | (i 0)) 38 | (dolist (value values) 39 | (push `(setq ,(mv--var i) ,value) forms) 40 | (setq i (1+ i))) 41 | `(progn 42 | ,@(nreverse forms) 43 | ,(car xs)))) 44 | 45 | (defmacro mv-let (name-list mv-expr &rest body) 46 | "Call MV-EXPR and bind each returned value to the corresponding 47 | symbol in NAME-LIST. Bound variables are visible for each form inside BODY." 48 | (declare (indent 2)) 49 | (let ((forms nil) 50 | (i 0)) 51 | ;; We can not ignore first expression even if it is bound to "_". 
52 | (push `(,(pop name-list) ,mv-expr) forms) 53 | (dolist (name name-list) 54 | (unless (eq name '_) 55 | (push `(,name ,(mv--var i)) forms)) 56 | (setq i (1+ i))) 57 | `(let ,(nreverse forms) 58 | ,@body))) 59 | 60 | ;; (defun test-3 (a b c) 61 | ;; (mv-ret c b a)) 62 | 63 | ;; (mv-let (a b c) (test-3 1 2 3) 64 | ;; (format "%d %d %d" a b c)) ;; => "3 2 1" 65 | 66 | ;; (let ((lexical-binding t)) 67 | ;; (benchmark-run-compiled 1000000 68 | ;; (mv-let (a b c) (mv-ret 1 2 3) 69 | ;; (ignore a b c)))) 70 | 71 | -------------------------------------------------------------------------------- /content/post/c-broken-defaults.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2016-12-26" 3 | title = "C broken defaults" 4 | tags = [ 5 | "[c]", 6 | "[language design]", 7 | "[rants]", 8 | ] 9 | description = "Trying to enumerate which C defaults are wrong." 10 | draft = false 11 | +++ 12 | 13 | ## State of the C 14 | 15 | C fits its niche quite well. 16 | If you want a relatively simple, ubiquitous and efficient language, 17 | there is not much to choose from. 18 | 19 | It "evolves so slowly" because it is already quite complete. 20 | Most of the parts that can be improved without making C 21 | yet another bloated language require breaking changes. 22 | 23 | > C could be designed better if we accepted invalidating older code. 24 | 25 | ...but in reality that is impossible to achieve. 26 | If you are using C, you must know many of its quirks, 27 | use external static code analyzers and carefully read 28 | a lot of 29 | [safe coding standards](https://www.securecoding.cert.org/confluence/display/c/SEI+CERT+C+Coding+Standard). 30 | 31 | This post describes subjects that I believe should be 32 | changed in order to get a better language. 33 | Note that C is mostly unsafe by design; 34 | it trusts the programmer nearly as much as assemblers do. 
35 | The main target is not to make C higher level, but rather to 36 | reconsider the defaults and make best practices easier to enforce. 37 | 38 | ## Mutability defaults 39 | 40 | A programming language should force you to think about 41 | your code more thoroughly. Whenever there is a choice, 42 | the safest and strictest option should be the default. 43 | 44 | Mutable state must have an explicit eye-catcher. 45 | We generally should care more about marking potentially 46 | tricky code than const-correct code 47 | ([Rust language](https://www.rust-lang.org) also takes this approach). 48 | 49 | > All variables and aggregate type members 50 | > should be immutable by default. 51 | 52 | ```c 53 | typedef struct Str Str; 54 | struct Str { 55 | mut char* data; 56 | mut size_t len; 57 | }; 58 | 59 | // Both arguments are "const Str*" 60 | bool str_eq(Str*, Str*); 61 | 62 | // Mutable pointers are marked explicitly: only dst is written to 63 | void str_copy(mut Str* dst, Str* src); 64 | ``` 65 | 66 | The compiler should warn if a variable is marked as mutable 67 | but does not need to be. 68 | 69 | ## Tag names 70 | 71 | There could be a rationale for a separate "namespace" for user-defined 72 | types like structs, unions and enums. 73 | C has no real namespaces, so if we put everything into a single 74 | symbol table it will bloat and compilation time can increase 75 | marginally. 76 | 77 | Everything is fine except that 90% of people instantly typedef 78 | anonymous structs. Or, if they want to be able to forward 79 | declare the type inside another header, 80 | [smarter typedef is done](http://www.embedded.com/electronics-blogs/programming-pointers/4024450/Tag-vs-Type-Names). 81 | This leads us to registering the same symbol inside two tables. 82 | Not only is this inconvenient, it is also inefficient. 83 | 84 | > Tag symbols for type names = mistake. 85 | 86 | ## Multiple variable declarations 87 | 88 | Multiple declarations on the same line do not improve code readability. 
89 | C is not about typing fewer keywords => no real gain in using this syntax. 90 | They can also be a source of confusion for amateurs (when they define 91 | both pointer and non-pointer variables of type T). 92 | 93 | > Declaring multiple variables in one statement should be forbidden. 94 | 95 | ## Builtin types 96 | 97 | More builtin primitive types would be convenient. 98 | The language would feel more coherent if things like size_t, 99 | int32_t and bool were builtin. 100 | Currently, we must include at least 3 headers to have the most 101 | useful primitive types: "stddef.h" for size_t, 102 | "stdint.h" for fixed width types and "stdbool.h" to 103 | avoid the ugly _Bool. A predefined NULL of a special type 104 | would be great as well, but this is a C++-ism. 105 | 106 | Talking about breaking existing code, I prefer int32 as a type name 107 | as opposed to int32_t. 108 | 109 | > Most useful primitive types should be builtin. 110 | 111 | ## Array decay 112 | 113 | If you want to pass an "array" of known length, 114 | [C provides no help for that](http://www.drdobbs.com/architecture-and-design/cs-biggest-mistake/228701625). 115 | You can try something like [Cello](http://libcello.org/learn/a-fat-pointer-library) 116 | to fix this, but then you lose the ability 117 | to pass sub-arrays without copying (address plus offset). 118 | 119 | Most projects I have ever seen define some kind of "fat pointer" 120 | structure. That is, a structure of `{void*, size_t}`. 121 | The problem is: this structure is vital, universal and useful, 122 | but it is missing from the standard library => 123 | every project defines its own fat pointer. 124 | I demand "stdarray.h". 125 | 126 | Every homebrew array is incompatible with someone else's array. 127 | We end up with two kinds of APIs as a result: 128 | one which expects two separate arguments 129 | for data and its length and another which exposes a custom array type. 
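To make the "fat pointer" idea concrete, here is a minimal sketch of the kind of type projects keep reinventing. The `IntSlice` name and both helpers are hypothetical, made up for this illustration; they do not come from any standard header or from a real library. The sketch assumes an `int` element type for simplicity and shows that a sub-array view is just address plus offset, with no copying.

```c
#include <stddef.h>

// Hypothetical fat pointer for int arrays: data pointer plus length.
// Illustration only; no standard header provides this type.
typedef struct IntSlice {
    int *data;
    size_t len;
} IntSlice;

// Returns a view of s covering [from, to) without copying any elements.
// Out-of-range bounds are clamped to the slice length.
static IntSlice int_slice_sub(IntSlice s, size_t from, size_t to) {
    if (to > s.len) to = s.len;
    if (from > to) from = to;
    IntSlice sub = { s.data + from, to - from };
    return sub;
}

// Sums all elements; the length travels together with the pointer,
// so the callee never needs a separate length argument.
static long int_slice_sum(IntSlice s) {
    long total = 0;
    for (size_t i = 0; i < s.len; ++i) {
        total += s.data[i];
    }
    return total;
}
```

With something like this, a caller could wrap `int xs[5]` as `IntSlice all = { xs, 5 }` and pass `int_slice_sub(all, 1, 4)` to any function expecting an `IntSlice`. The catch stays the same: another project's equivalent type would still be incompatible with this one.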
130 | 131 | In addition, you will most likely need a `{void*, size_t, size_t}` 132 | structure to express a fixed-size container that is partially filled. 133 | This is essential to build extendable arrays (C++ calls them vectors). 134 | There are many useful fundamental data structures, but we need 135 | to start from something. Array seems like a good and easy first step. 136 | 137 | > Arrays with length must be better supported by the language. 138 | 139 | ## Aliasing defaults 140 | 141 | An additional pointer qualifier is needed to make aliasing 142 | possible only with an explicit marker. 143 | If a scope has more than one non-const `T*`, each of them 144 | should be marked either `alias` or `restrict`. 145 | Absence of a qualifier is an error. 146 | 147 | If pointers have different types, `restrict` is 148 | implied, but this can be overridden by an explicit `alias`. 149 | This is needed to avoid breaking the 150 | [strict aliasing](http://blog.regehr.org/archives/1307) rules. 151 | 152 | Alias takes one or more arguments that specify which 153 | pointers could be aliased. If `a` aliases `b`, then 154 | `b` gets an implicit `alias(a)` qualifier. 155 | 156 | > There should be more "restrict" 157 | > and const pointers than mutable and/or aliased pointers. 158 | 159 | ```c 160 | void copy(char* restrict dst, char* restrict src); 161 | void move(char* alias(src) dst, char* src); 162 | ``` 163 | 164 | ```c 165 | // If we specify src as const, no need to mark 166 | // other pointer as restrict or alias. 167 | void copy(char* dst, const char* src); 168 | // But in case of move we want to pass aliased 169 | // pointers sometimes. 170 | void move(alias(src) char* dst, const char* src); 171 | ``` 172 | 173 | `alias` is chosen as a keyword because GCC already has a 174 | similar attribute 175 | [may_alias](https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Type-Attributes.html). 176 | 177 | ## Statement-orientation 178 | 179 | Expression-oriented languages are, simply put, more expressive. 
180 | There is no runtime cost because the compiler can easily determine 181 | whether a particular construct is used in a context that needs its value. 182 | 183 | > Expression-oriented is better than statement-oriented. 184 | 185 | But there is an important exception: 186 | 187 | > Assignments should be statements, not expressions. 188 | 189 | ```c 190 | void f(ErrorCode code) { 191 | puts(switch_expr (code) { 192 | case E_FOO: "foo error!"; 193 | case E_BAR: "bar error!"; 194 | default: "unknown error!"; 195 | }); 196 | } 197 | ``` 198 | 199 | One can argue that you can define a separate function which 200 | uses the same switch, but returns the necessary value. 201 | This helps to avoid the ugly "break", but introduces a new function. 202 | Another solution is to use the 203 | [conditional operator](https://en.wikipedia.org/wiki/%3F). 204 | When formatted properly, it emulates a "case" expression well. 205 | Too bad I have yet to see a compiler that checks whether the controlling 206 | expressions are in sequential order (like enum constants) 207 | to perform optimizations akin to switch. 208 | 209 | ## Constness erasure 210 | 211 | In modern code, casting away a cv-qualifier is almost always a bad idea. 212 | Potentially, it can lead to undefined behavior. 213 | As long as const can be cast away, the compiler cannot make 214 | strong assumptions about it. Again, this affects both hypothetical 215 | performance and overall language safety. 216 | 217 | > It should be impossible to cast away const qualifiers. 218 | 219 | ## Missing features 220 | 221 | This section briefly describes controversial features 222 | from my wishlist. Completely optional things. 223 | 224 | Strict/strong typedefs have been proposed for C++ more than once. 225 | Check [this document](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3515.pdf). 226 | C could benefit from type-checked typedefs, 227 | but they can also lead to code pollution with casts if 228 | used wildly. 
If you are interested in making C code more 229 | reliable via types, try the 230 | [CQual](http://www.cs.umd.edu/~jfoster/cqual/) tool. 231 | 232 | ## To be continued 233 | 234 | I have not yet covered: 235 | 236 | * dumb preprocessor; 237 | * ambiguous and clumsy syntax; 238 | * inability to initialize global const data in a non-trivial way 239 | at compile time; 240 | * permitted duplicates in enum values; 241 | 242 | ...and some other things I dislike in C. 243 | 244 | Updates are not promised, but possible. 245 | 246 | -------------------------------------------------------------------------------- /content/post/cgo-funcall.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2017-08-18" 3 | title = "Path to convenient C FFI in Go" 4 | tags = [ 5 | "[go]", 6 | "[cgo]", 7 | "[c]", 8 | "[ffi]", 9 | "[reflection]", 10 | ] 11 | description = "Almost useful and flexible C FFI for Go." 12 | draft = false 13 | +++ 14 | 15 | ## DWIM-style FFI 16 | 17 | [CGo](https://golang.org/cmd/cgo/) is a widely adopted way of calling C functions from Go: 18 | a low level [C FFI](https://en.wikipedia.org/wiki/Foreign_function_interface) which 19 | does many things behind the scenes, but exposes only the minimal functionality 20 | that is required to interact with C. 21 | 22 | CGo does no implicit conversions or data copying during C function calls; 23 | even `int` and `C.int` are not compatible. 24 | Code that uses this mechanism without wrappers ([bindings](https://en.wikipedia.org/wiki/Language_binding)) will be polluted with 25 | explicit slice/array copies and type conversions. 26 | 27 | That is perfectly fine for default behavior and as an FFI foundation, 28 | but sometimes we do not require this amount of control. 29 | In these cases, we want the programming environment to [do what we mean](https://en.wikipedia.org/wiki/DWIM). 30 | 31 | Imagine we have this C code: 32 | 33 | ```c 34 | // Let's assume this function is very important. 
35 | int sum(int *xs, int xs_len) { 36 | int ret = 0; 37 | for (int i = 0; i < xs_len; ++i) { 38 | ret += xs[i]; 39 | } 40 | return ret; 41 | } 42 | ``` 43 | 44 | And you wish to call it from Go. 45 | There is an `[]int` of valuable payload which must be 46 | aggregated with the `sum` function. 47 | 48 | ```go 49 | xs := []int{1, 2, 3} // Payload 50 | // a. This is how you may want to call that function: 51 | sum := cffi.Func(C.sum) 52 | sum(xs) 53 | // b. This is how you actually can call that function: 54 | ys := append([]int{}, xs...) // Make a copy (for safety) 55 | C.sum(unsafe.Pointer(&ys[0]), C.int(len(ys))) 56 | ``` 57 | 58 | This article tries to reach an **a**-like API, 59 | or at least get as close to it as possible. 60 | 61 | > Note that it is not always a desired behavior to 62 | > pass a slice as 2 separate {data, len} arguments, 63 | > but I have selected this strategy to show something 64 | > worthwhile in the final section of this post. 65 | 66 | ## Universal {Go}->{C} value mapping 67 | 68 | If type mapping is the most boilerplate-full part, let's 69 | write a simple library that does it for us. 70 | 71 | Almost all primitive types have obvious C counterparts. 72 | For other types we can define conversion rules and apply 73 | them consistently. 74 | 75 | For starters, the `Go2C` function should handle 1 type: integers. 76 | It takes an arbitrary Go value as `interface{}` and 77 | returns the inferred C value boxed into `interface{}`. 78 | 79 | ```go 80 | package cffi 81 | import "C" 82 | func Go2C(x interface{}) interface{} { 83 | switch x := x.(type) { 84 | case int: 85 | return C.int(x) 86 | default: 87 | panic("todo: implement more types") 88 | } 89 | } 90 | ``` 91 | 92 | So far it looks fine. 93 | Try to use it via a client package and you may be surprised. 94 | 95 | ```go 96 | package main 97 | import "C" 98 | import "cffi" 99 | func main() { 100 | x := int(10) // Clearly, an int 101 | y := cffi.Go2C(x) // Dynamic type=C.int 102 | z := y.(C.int) // Panics! 
103 | println(z) 104 | } 105 | ``` 106 | 107 | Exact error message may vary, but it reads like: "panic: interface 108 | conversion: interface {} is cffi._Ctype_int, not main._Ctype_int". 109 | 110 | This is a [known issue](https://github.com/golang/go/issues/13467). 111 | Practically speaking, we can not implement `Go2C` this way properly. 112 | 113 | This also makes it impossible to match result types and 114 | do reversal, `C2Go` mapping. 115 | 116 | > The only way to define a converter function is to 117 | > delegate that task to the client code. 118 | > Package that imports "C" does implement the conversion rules. 119 | 120 | ## CGo function metadata 121 | 122 | To define function like `func Call(fn , args ...) ` we need 123 | to have signature info of `fn` argument. 124 | 125 | How much information is provided by CGo? 126 | What kind of value `C.` yields, given that `` is a function? 127 | 128 | ```go 129 | package main 130 | // int foo(void) { return 0; } 131 | import "C" 132 | import "fmt" 133 | func main() { fmt.Printf("%T\n", C.foo) } 134 | ``` 135 | 136 | The answer is `unsafe.Pointer`. 137 | Well, this is bad for two reasons: 138 | 139 | 1. We can not wrap it into `reflect.Value`; 140 | 2. `unsafe.Pointer` gives zero type information; 141 | 142 | More experiments will reveal a cheesy way to get what we need. 143 | 144 | **Step1**: discover CGo name mangling scheme. 145 | 146 | ```go 147 | package main 148 | // void foo(void) {} 149 | import "C" 150 | func main() { C.foo(1, 2) } 151 | ``` 152 | 153 | Go will kindly reply that you called function with wrong number 154 | of arguments: 155 | 156 | ```text 157 | main.go:6: too many arguments in call to _Cfunc_foo 158 | have (number, number) 159 | want () 160 | ``` 161 | 162 | See that `_Cfunc_foo`? I think you get the pattern. 163 | 164 | **Step2**: examine mangled symbol directly. 
165 | 166 | ```go 167 | package main 168 | // void foo(void) {} 169 | import "C" 170 | import "fmt" 171 | func main() { fmt.Printf("%#v\n", _Cfunc_foo) } 172 | ``` 173 | 174 | Go rejects your code: `main.go:5: undefined: _Cfunc_foo`. 175 | This is easy to fix. 176 | 177 | **Step3**: figure out a fix to error above. 178 | 179 | ```go 180 | package main 181 | // void foo(void) {} 182 | import "C" 183 | import "fmt" 184 | func main() { 185 | _ = C.foo() // Use "foo" 186 | fmt.Printf("%#v\n", _Cfunc_foo) 187 | } 188 | ``` 189 | 190 | This snippet actually gets us closer to the solution. 191 | Expression `_Cfunc_foo` is not equivalent to `C.foo` as it 192 | gives us `func() main._Ctype_void` type. 193 | 194 | ## Implementation overview 195 | 196 | Implementation requirements: 197 | 198 | 1. Function is callable via its symbol, like `f()`, where `f` is a symbol; 199 | 2. Ingoing and outgoing arguments are automatically converted; 200 | 201 | First requirement can be fulfilled only by global (possibly dot-imported) function 202 | or closure variable. 203 | Second requirement, due to restrictions outlined above, is possible 204 | with a help of external state. This state can be global or captured (with closures). 205 | 206 | ```go 207 | package main 208 | // int add1(int x) { return x + 1; } 209 | import "C" 210 | import "cffi" 211 | var add1 cffi.Func 212 | func init() { 213 | _ = C.add1(0) // [I] 214 | 215 | invoker := cffi.NewInvoker( // [II] 216 | // Go -> C 217 | func(x interface{}) interface{} { 218 | return C.int(x.(int)) 219 | }, 220 | // C -> Go 221 | func (x interface{}) interface{} { 222 | return int(x.(C.int)) 223 | } 224 | ) 225 | 226 | add1 = cffi.Wrap(invoker, _Cfunc_add1) // [III] 227 | } 228 | 229 | func main() { 230 | println(add1(50)) // [IV] 231 | } 232 | ``` 233 | 234 | **(I)** is needed if `add1` is never called via `C.add1` symbol. 235 | We will not get `_Cfunc_add1` without it. 
236 | 237 | **(II)** invoker instance should take care of value conversions 238 | and universal call evaluation. 239 | 240 | **(III)** actual function pointer is wrapped into a closure 241 | that holds the invoker. 242 | 243 | **(IV)** prepared closure can be used in the desired way. 244 | 245 | Invoker can handle 1->N value mapping. 246 | For example, it can be legal to return `[]interface{}` for 247 | Go values that should be unwrapped into 2 C function arguments. 248 | Slices are one such example (we ignore `cap` on purpose). 249 | 250 | ```go 251 | func (x interface{}) interface{} { 252 | switch x := x.(type) { 253 | case []int: 254 | y := make([]C.int, len(x)) 255 | for i := range x { 256 | y[i] = C.int(x[i]) 257 | } 258 | ptr := (*C.int)(unsafe.Pointer(&y[0])) 259 | return []interface{}{ptr, C.int(len(x))} 260 | 261 | // ... handle other types 262 | } 263 | } 264 | ``` 265 | 266 | If you want to see the implementation sources, inspect the [cffi](https://github.com/Quasilyte/cffi) library. 267 | 268 | ## Warnings and closing notes 269 | 270 | **Performance**. 271 | 272 | The `cffi` library solution adds a lot of overhead compared to 273 | a simple CGo call, which itself is far more expensive than a normal Go 274 | function call. 275 | 276 | **Portability**. 277 | 278 | Using CGo alone can hurt application portability. 279 | Granted, using CGo oddities like name mangled function 280 | objects slaughters program portability completely. 281 | It may break with newer Go1 releases. -------------------------------------------------------------------------------- /content/post/disassembling-go-avx512.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Fri Jun 8 13:29:13 MSK 2018" 3 | title = "Disassembling Go AVX-512" 4 | tags = [ 5 | "[go]", 6 | "[asm]", 7 | "[avx-512]", 8 | "[Intel XED]", 9 | ] 10 | description = "objdump that is distributed with Go can't handle AVX-512 yet. This article describes workarounds." 
11 | draft = false 12 | +++ 13 | 14 | ## The problem 15 | 16 | Go 1.11 got an updated assembler that supports AVX-512, but the disassembler was left unchanged. 17 | In other words, `go tool asm` speaks AVX-512, `go tool objdump` does not. 18 | 19 | Suppose we have this `avx.s` file: 20 | 21 | ```x86asm 22 | TEXT ·avxCheck(SB), 0, $0 23 | VPOR X0, X1, X2 //; AVX1 24 | VPOR Y0, Y1, Y2 //; AVX2 25 | VPORD.BCST (DX), Z1, K2, Z2 //; AVX-512 26 | RET 27 | ``` 28 | 29 | You will be surprised after an assemble+disassemble attempt: 30 | 31 | ```bash 32 | $ go tool asm avx.s 33 | $ go tool objdump avx.o 34 | 35 | TEXT ·avxCheck(SB) gofile..$GOROOT/avx.s 36 | avx.s:2 0xb7 c5f1ebd0 JMP 0x8b 37 | avx.s:3 0xbb c5f5ebd0 JMP 0x8f 38 | avx.s:4 0xbf 62 ? 39 | avx.s:4 0xc0 f1 ICEBP 40 | avx.s:4 0xc1 755a JNE 0x11d 41 | avx.s:4 0xc3 eb12 JMP 0xd7 42 | avx.s:5 0xc5 c3 RET 43 | ``` 44 | 45 | The rest of this article describes how to work around this. 46 | 47 | ## 1. System objdump over binary 48 | 49 | System [objdump](https://linux.die.net/man/1/objdump) can't handle Go object 50 | files as they are not ELF relocatable files (`e_type=1`) but rather use a format internal to Go. 51 | 52 | ```bash 53 | $ objdump -D avx.o 54 | 55 | objdump: avx.o: File format not recognized 56 | ``` 57 | 58 | We can make it work though. 59 | To do so, we need to build an executable that `objdump` can understand. 60 | 61 | First off, we add `main.go`: 62 | 63 | ```go 64 | package main 65 | 66 | func avxCheck() 67 | 68 | func main() { 69 | avxCheck() 70 | } 71 | ``` 72 | 73 | It's now possible to build `avxCheck` along with the main package. 74 | 75 | ```bash 76 | $ go build -o avxcheck . 77 | 78 | $ objdump -D avxcheck | sed '/:/,/^$/!d' 79 | 80 | 000000000044e580 : 81 | 44e580: c5 f1 eb d0 vpor %xmm0,%xmm1,%xmm2 82 | 44e584: c5 f5 eb d0 vpor %ymm0,%ymm1,%ymm2 83 | 44e588: 62 f1 75 5a eb 12 vpord (%rdx){1to16},%zmm1,%zmm2{%k2} 84 | 44e58e: c3 retq 85 | 44e58f: cc int3 86 | ``` 87 | 88 | ## 2. 
System objdump over shellcode 89 | 90 | Assembling `avx.s` with the `-S` flag almost yields the wanted result: 91 | 92 | ```bash 93 | $ go tool asm -S avx.s 94 | 95 | avxCheck STEXT nosplit size=15 args=0xffffffff80000000 locals=0x0 96 | 0x0000 00000 (avx.s:1) TEXT avxCheck(SB), NOSPLIT, $0 97 | 0x0000 00000 (avx.s:2) VPOR X0, X1, X2 98 | 0x0004 00004 (avx.s:3) VPOR Y0, Y1, Y2 99 | 0x0008 00008 (avx.s:4) VPORD.BCST (DX), Z1, K2, Z2 100 | 0x000e 00014 (avx.s:5) RET 101 | 0x0000 c5 f1 eb d0 c5 f5 eb d0 62 f1 75 5a eb 12 c3 ........b.uZ... 102 | go.info.avxCheck SDWARFINFO size=34 103 | 0x0000 02 61 76 78 43 68 65 63 6b 00 00 00 00 00 00 00 .avxCheck....... 104 | 0x0010 00 00 00 00 00 00 00 00 00 00 01 9c 00 00 00 00 ................ 105 | 0x0020 01 00 106 | ``` 107 | 108 | Function body is `c5 f1 eb d0 c5 f5 eb d0 62 f1 75 5a eb 12 c3`. 109 | 110 | These bytes definitely include the 4 instructions from the `avxCheck` function, 111 | but it's hard to associate octets with the instructions they encode. 112 | They're all intermixed. 113 | 114 | `objdump` does support raw shellcode input format. 115 | All we need to do is to turn hex octets into that. 116 | 117 | ```bash 118 | $ echo 'c5 f1 eb d0 c5 f5 eb d0 62 f1 75 5a eb 12 c3' | 119 | xxd -r -p > code.bin 120 | $ objdump -b binary -m i386 -D code.bin 121 | 122 | Disassembly of section .data: 123 | 124 | 00000000 <.data>: 125 | 0: c5 f1 eb d0 vpor %xmm0,%xmm1,%xmm2 126 | 4: c5 f5 eb d0 vpor %ymm0,%ymm1,%ymm2 127 | 8: 62 f1 75 5a eb 12 vpord (%edx){1to16},%zmm1,%zmm2{%k2} 128 | e: c3 ret 129 | ``` 130 | 131 | ## 3. Intel XED CLI 132 | 133 | [Intel XED](https://github.com/intelxed/xed) includes several useful [command-line tools](https://intelxed.github.io/ref-manual/group__EXAMPLES.html). 134 | 135 | One of them is called `xed`. It's capable of encoding and decoding x86 instructions. 
136 | 137 | ```bash 138 | $ echo 'c5 f1 eb d0 c5 f5 eb d0 62 f1 75 5a eb 12 c3' > code.txt 139 | $ xed -64 -A -ih code.txt 140 | 141 | 00: LOGICAL AVX C5F1EBD0 vpor %xmm0, %xmm1, %xmm2 142 | 04: LOGICAL AVX2 C5F5EBD0 vpor %ymm0, %ymm1, %ymm2 143 | 08: LOGICAL AVX512EVEX 62F1755AEB12 vpordl (%rdx){1to16}, %zmm1, %zmm2{%k2} 144 | 0e: RET BASE C3 retq 145 | ``` 146 | 147 | Decoding single instruction is even simpler: 148 | 149 | ```bash 150 | $ xed -64 -A -d '62 f1 75 5a eb 12' 151 | 152 | 62F1755AEB12 153 | ICLASS: VPORD CATEGORY: LOGICAL EXTENSION: AVX512EVEX IFORM: VPORD_ZMMu32_MASKmskw_ZMMu32_MEMu32_AVX512 ISA_SET: AVX512F_512 154 | SHORT: vpordl (%rdx){1to16}, %zmm1, %zmm2{%k2} 155 | ``` 156 | 157 | Without `-A` flag, it will print instructions in Intel syntax: 158 | 159 | ```bash 160 | $ xed -64 -d '62 f1 75 5a eb 12' 161 | 162 | 62F1755AEB12 163 | ICLASS: VPORD CATEGORY: LOGICAL EXTENSION: AVX512EVEX IFORM: VPORD_ZMMu32_MASKmskw_ZMMu32_MEMu32_AVX512 ISA_SET: AVX512F_512 164 | SHORT: vpord zmm2{k2}, zmm1, dword ptr [rdx]{1to16} 165 | ``` 166 | 167 | In addition, there is also `xed-ex4`, which prints many interesting details about instruction being decoded: 168 | 169 | ```bash 170 | $ xed-ex4 -64 C5 F1 EB D0 171 | 172 | PARSING BYTES: c5 f1 eb d0 173 | VPOR VPOR_XMMdq_XMMdq_XMMdq 174 | EASZ:3, 175 | EOSZ:2, 176 | HAS_MODRM:1, 177 | LZCNT, 178 | MAP:1, 179 | MAX_BYTES:4, 180 | MOD:3, 181 | MODE:2, 182 | MODRM_BYTE:208, 183 | NOMINAL_OPCODE:235, 184 | OUTREG:XMM0, 185 | P4, 186 | POS_MODRM:3, 187 | POS_NOMINAL_OPCODE:2, 188 | REG:2, 189 | REG0:XMM2, 190 | REG1:XMM1, 191 | REG2:XMM0, 192 | SMODE:2, 193 | TZCNT, 194 | VEXDEST210:6, 195 | VEXDEST3, 196 | VEXVALID:1, 197 | VEX_PREFIX:1 198 | 0 REG0/W/DQ/EXPLICIT/NT_LOOKUP_FN/XMM_R 199 | 1 REG1/R/DQ/EXPLICIT/NT_LOOKUP_FN/XMM_N 200 | 2 REG2/R/DQ/EXPLICIT/NT_LOOKUP_FN/XMM_B 201 | YDIS: vpor xmm2, xmm1, xmm0 202 | ATT syntax: vpor %xmm0, %xmm1, %xmm2 203 | INTEL syntax: vpor xmm2, xmm1, xmm0 204 | ``` 205 | 206 | Its 
output requires XED knowledge in order to be fully understood, but if you're 207 | interested, I suggest reading the documentation and/or sources to achieve enlightenment. 208 | 209 | ## Prefix-only disassembling 210 | 211 | If, for whatever reason, you only want to inspect prefix details, there is the [vexdump](https://github.com/Quasilyte/tools/tree/master/src/vexdump) 212 | utility which can be used to do just that. 213 | 214 | Dump single instruction prefix info: 215 | 216 | ```bash 217 | $ vexdump 6272fd098ae8 218 | 219 | EVEX rxbR00mm Wvvvv1pp zLlbVaaa opcode modrm fields 220 | 62 01110010 11111101 00001001 8A 11101000 EVEX.128.66.0F38.W1 221 | ``` 222 | 223 | Dump multiple instructions (most probably for comparison): 224 | 225 | ```bash 226 | $ vexdump 6272FD098AE8 '62 72 fd 09 8a c5' 227 | 228 | EVEX rxbR00mm Wvvvv1pp zLlbVaaa opcode modrm fields 229 | 62 01110010 11111101 00001001 8A 11101000 EVEX.128.66.0F38.W1 230 | 62 01110010 11111101 00001001 8A 11000101 EVEX.128.66.0F38.W1 231 | ``` 232 | 233 | It can also dump mixed prefixes: 234 | 235 | ```bash 236 | $ vexdump 6272fd098ae8 6272fd098ac5 c4e1315813 c5b15813 c5f877 237 | 238 | VEX2 rvvvvlpp opcode modrm fields 239 | C5 10110001 58 00010011 VEX.128.66.0F.W0 240 | C5 11111000 77 00000000 VEX.128.0F.W0 241 | VEX3 rxbmmmmm Wvvvvlpp opcode modrm fields 242 | C4 11100001 00110001 58 00010011 VEX.128.66.0F.W0 243 | EVEX rxbR00mm Wvvvv1pp zLlbVaaa opcode modrm fields 244 | 62 01110010 11111101 00001001 8A 11101000 EVEX.128.66.0F38.W1 245 | 62 01110010 11111101 00001001 8A 11000101 EVEX.128.66.0F38.W1 246 | ``` 247 | 248 | That tool can be especially helpful for encoder validation. 
249 | -------------------------------------------------------------------------------- /content/post/dumbing-down-go-interfaces.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2017-04-26" 3 | title = "Dumbed-down Go interfaces" 4 | tags = [ 5 | "[go]", 6 | "[golang]", 7 | "[error handling]", 8 | ] 9 | description = "Making verbose code simpler by wrapping interface methods." 10 | draft = true 11 | +++ 12 | 13 | ## io.Writer 14 | 15 | If you write Go, odds are high that you are familiar with 16 | the `io.Writer#Write` signature. 17 | Write returns both the number of bytes written and an error. 18 | Every time you call this method, 19 | the error should be checked before you 20 | try to write to that writer again. 21 | 22 | This interface is inconvenient to 23 | use when you want to call `Write` multiple times and get 24 | the total number of bytes pushed during these invocations. 25 | You also do not want to lose an error if one ever occurs. 26 | 27 | ```go 28 | func f1(w io.Writer, s fmt.Stringer, parts [][]byte) (int, error) { 29 | written, err := w.Write([]byte(s.String())) 30 | if err != nil { 31 | return written, err 32 | } 33 | for _, part := range parts { 34 | n, err := w.Write(part) 35 | written += n 36 | if err != nil { 37 | return written, err 38 | } 39 | } 40 | return written, nil 41 | } 42 | ``` 43 | 44 | The boilerplate of error propagation can be annoying. 45 | Go has nothing like [try!](https://doc.rust-lang.org/std/macro.try.html) 46 | in Rust, but that does not mean there are no solutions. 47 | 48 | ## Dumbing it down 49 | 50 | Code presented above can be considered idiomatic. 51 | Error checking is explicit and in-place (near error occurrence). 52 | It is OK when you have a small number of `f1`-like functions; 53 | there is nothing wrong with a little code duplication. 54 | When you reach your limit of DRY violation, consider 55 | possible changes. 
56 | 57 | Besides writing, we do two things in that function: 58 | counting bytes and checking for errors. 59 | To dumb down the interface we need to wrap it in a type 60 | that does the additional job for us. 61 | 62 | I will present two examples that simplify the `f1` function. 63 | The first, `f2`, does not sacrifice anything, but the improvement is 64 | slight. The second, `f3`, is more expressive, 65 | but may not always be appropriate. 66 | 67 | ## Without manual counting 68 | 69 | If errors are more important than counting, we may wish to hide 70 | the obscuring code to concentrate on error handling. 71 | 72 | ```go 73 | type countingWriter struct { 74 | written int // Accumulates 1st return value of Write. 75 | dst io.Writer // Wrapped writer. 76 | } 77 | 78 | func (cw *countingWriter) WriteAndCount(p []byte) error { 79 | n, err := cw.dst.Write(p) 80 | cw.written += n 81 | return err 82 | } 83 | ``` 84 | 85 | The benefits are: 86 | fewer variables in scope (no need for temporary `n` 87 | and `written` accumulator), and a single return value 88 | that makes it easier to compose calls. 89 | 90 | ```go 91 | func f2(w io.Writer, s fmt.Stringer, parts [][]byte) (int, error) { 92 | cw := countingWriter{dst: w} 93 | err := cw.WriteAndCount([]byte(s.String())) 94 | if err != nil { 95 | return cw.written, err 96 | } 97 | for _, part := range parts { 98 | err := cw.WriteAndCount(part) 99 | if err != nil { 100 | return cw.written, err 101 | } 102 | } 103 | return cw.written, nil 104 | } 105 | ``` 106 | 107 | ## Without eager error handling 108 | 109 | Explicit error handling in `f1`, 110 | after all, is not important. 111 | That function does not try to fix error conditions, 112 | it just passes them back to the caller. 113 | In the end, `f1` does only one important thing: 114 | it writes. 115 | It is possible to make a write method that returns... nothing. 116 | 117 | ```go 118 | type safeWriter struct { 119 | err error // Error that occurred during writing. 
120 | written int // Bytes written before the error occurred. 121 | dst io.Writer // Wrapped writer. 122 | } 123 | 124 | func (w *safeWriter) SafeWrite(p []byte) { 125 | if w.err != nil { 126 | return 127 | } 128 | n, err := w.dst.Write(p) 129 | w.err = err 130 | w.written += n 131 | } 132 | ``` 133 | 134 | Note that even if we do not return an error, 135 | we store it in the safeWriter. 136 | When the first error occurs, safeWriter continues to look the same 137 | from the outside, but subsequent write calls are ignored. 138 | After all operations we wish to execute are made, 139 | we return an error (which can be nil) and 140 | a total number of bytes written. 141 | 142 | ```go 143 | func f3(w io.Writer, s fmt.Stringer, parts [][]byte) (int, error) { 144 | sw := safeWriter{dst: w} 145 | sw.SafeWrite([]byte(s.String())) 146 | for _, part := range parts { 147 | sw.SafeWrite(part) 148 | } 149 | return sw.written, sw.err 150 | } 151 | ``` 152 | 153 | > Not all errors require immediate handling, we use that fact in 154 | > safeWriter. 155 | 156 | ## Not only io.Writer 157 | 158 | `io.Writer` was selected to show this wrapping technique 159 | on a concrete example, but it can be applied to any method 160 | that clutters code because of the contract it establishes. 161 | 162 | For `f1` it could be better to define `writeAll` that 163 | takes `[][]byte` and does the looping inside. 164 | It works, but the solution is less generic than `f3` 165 | and probably less composable. The balance between 166 | **too concrete** and **over-abstract** code is hard to find, so the decision 167 | is always halfway subjective. 
168 | -------------------------------------------------------------------------------- /content/post/elisp-multi-return-values.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2017-05-19" 3 | title = "Emacs Lisp multi return values" 4 | tags = [ 5 | "[emacs lisp]", 6 | "[goism project]", 7 | "[performance]", 8 | ] 9 | description = "Efficient Go style multi return values in Emacs Lisp." 10 | draft = false 11 | +++ 12 | 13 | ## The missing feature 14 | 15 | Have you ever written a function in **Emacs Lisp** that should return 16 | more than one result? 17 | 18 | Emacs Lisp has no native support for multiple return values, 19 | but provides `cl-lib` that emulates it in a **Common Lisp** style. 20 | 21 | In this article I will show that `cl-values` is suboptimal and 22 | can be replaced without sacrificing convenience. 23 | 24 | ## Naive solution 25 | 26 | `cl-lib` implements `cl-values` in terms of `list`. 27 | This approach is naive because each time you return with that, 28 | an allocation is involved. GC will trigger more frequently 29 | and performance will degrade. 30 | 31 | ```lisp 32 | (let ((lexical-binding t)) 33 | (benchmark-run-compiled 1000000 34 | (cl-multiple-value-bind (a b c) (cl-values 1 2 3) 35 | (ignore a b c)))) ;; => (0.8493319750000001 59 0.7827748330000008) 36 | ;; ...more than 50 garbage collections. 37 | ``` 38 | 39 | We see the bottleneck now: proposed solutions should 40 | avoid memory allocations. 41 | In other words: no `list`, `cons`, `vector`, ... 42 | 43 | ## No allocations with preallocations 44 | 45 | We can still use lists and vectors if preallocation is done. 46 | Multiple return values mostly consist of 2-4 elements, so 47 | the set of required containers is fixed and known beforehand. 48 | 49 | ```lisp 50 | (defvar mv--2 (make-vector 2 nil)) 51 | (defvar mv--3 (make-vector 3 nil)) 52 | ;; ... as many as we need. 
53 | ``` 54 | 55 | When a 2-value tuple must be returned, the `mv--2` vector 56 | is populated with the corresponding values. 57 | For a 3-value tuple, `mv--3` is used. 58 | The filled vector is returned to the caller. 59 | A special macro can be used to extract vector elements 60 | into the specified bindings. 61 | 62 | This brings us close to `cl-lib`, but without allocations. 63 | 64 | Emacs Lisp has no real multithreading, so it is safe to 65 | store results inside a private global variable. 66 | 67 | ## List vs vector 68 | 69 | The choice between `vector` and `list` is not easy, 70 | especially if you know 71 | [Emacs bytecode](https://www.emacswiki.org/emacs/ByteCodeEngineering). 72 | 73 | The first operation we care about is **return efficiency**. 74 | To make a multi value return, the preallocated list/vector 75 | must be filled with data. 76 | 77 | ```text 78 | f(x, y) = x, y+1 79 | 80 | (defvar mv--2 '(nil . nil)) (defvar mv--2 [nil nil]) 81 | constants=[mv--2] maxStack=5 constants=[mv--2 0 1] maxStack=6 82 | | 83 | add1 | add1 84 | varref 0 | varref 0 85 | dup | dup 86 | stack-ref 3 + constant 1 87 | setcar + stack-ref 4 88 | dup + aset 89 | stack-ref 2 + dup 90 | setcdr + constant 2 91 | ret + stack-ref 3 92 | + aset 93 | | ret 94 | ``` 95 | 96 | The left block shows the list implementation. The right block is for the vector. 97 | 98 | As you may see, for the `N=2` case a cons cell is better than a vector in many ways: 99 | 100 | * Bytecode is shorter 101 | * Less stack space is used 102 | * Smaller constant vector (no need for indexes) 103 | 104 | The second operation is **receiving the return values**. 105 | 106 | ```text 107 | let x, y = f(...) 108 | 109 | call ... | call ... 110 | dup | dup 111 | car + constant X 112 | stack-ref 1 + aget 113 | cdr + stack-ref 1 114 | + constant Y 115 | + aget 116 | ``` 117 | 118 | What about 3 or more return values? 119 | The general algorithm for lists is: 120 | 121 | 1. 
For `N` return values use dedicated preallocated list 122 | 2. First value bound with `setcar` 123 | 3. 
Last value bound with `setcdr` 124 | 4. Values in between set with `setcar` AND perform `cdr` 125 | 126 | Note that the list used is not a **proper list**: the last `cdr` is not `nil`. 127 | 128 | At `N=3` vector and list are nearly equal in efficiency, `N=4` favors vectors. 129 | List becomes less and less efficient as `N` grows. 130 | In my experience 2-value returns cover 90% of cases. 131 | This means that list is a winner here. 132 | 133 | Another thing worth considering is the ability to discard some 134 | of the return values. In **Go** you can do `a, _, c := f()` 135 | which assigns 1st and 3rd returned values; 2nd value is ignored. 136 | Generally, lists are slower here because you still need to 137 | traverse ignored elements. 138 | 139 | The next section describes another implementation option which 140 | is a good compromise between list and vector in terms of 141 | {return/assign/discard} operations performance. 142 | 143 | ## Neither list, nor vector? 144 | 145 | It is possible to avoid lists and vectors completely. 146 | 147 | For each **additional** return value it is possible to use 148 | a single global variable. 149 | 150 | The first value is returned as usual, while others 151 | use `varset` (setq) to bind additional data. 152 | On the caller side, the function result is bound to 153 | the first variable; other variables are read from 154 | the corresponding global variables. 155 | 156 | ```lisp 157 | ;; Return "a", "b", "c": 158 | (progn 159 | (setq mv--3 "c") 160 | (setq mv--2 "b") 161 | "a") 162 | ;; Bind results: 163 | (let ((x1 (f ...)) 164 | (x2 mv--2) 165 | (x3 mv--3)) 166 | ...) 167 | ``` 168 | 169 | This gives us very compact bytecode. Performance 170 | depends on many factors, but it can 171 | match the implementation based on preallocated lists. 172 | 173 | Let's use this idea to create `mv-lib`. 174 | 175 | ## mv-lib 176 | 177 | The minimal `mv-lib` should consist of at least two macros: 178 | 179 | 1. `mv-ret` - yield a multi value 180 | 2. 
`mv-let` - bind multi value to local variables 181 | 182 | Like with other solutions, predefined globals are required. 183 | For simplicity, they have 0-based suffixes. 184 | That is, the second return value is stored inside `mv--0` (not in `mv--2`). 185 | 186 | ```lisp 187 | (defconst mv--max-count 10) ;; Arbitrary limit 188 | 189 | (defun mv--var (index) 190 | "Get return value variable symbol by INDEX" 191 | (when (>= index mv--max-count) 192 | (error "Index %d is too high (%d is max)" index (1- mv--max-count))) 193 | (intern (format "mv--%d" index))) 194 | 195 | (dotimes (i mv--max-count) 196 | (eval `(defvar ,(mv--var i) nil))) 197 | ``` 198 | 199 | `mv-ret` and `mv-let` are convenience wrappers for the code that is 200 | presented in the previous section. 201 | 202 | [Full mv-lib implementation](/blog/code/mv-lib.el). 203 | 204 | ```lisp 205 | ;; Bind multiple values with some of them being ignored. 206 | (mv-let (a _ b _) (mv-ret 1 2 3 4) 207 | (+ a b)) ;; => 4 208 | 209 | ;; Compare with `cl-lib'. 210 | (let ((lexical-binding t)) 211 | (benchmark-run-compiled 1000000 212 | (mv-let (a b c) (mv-ret 1 2 3) 213 | (ignore a b c)))) ;; => (0.174552687 0 0.0) 214 | ;; 0 GC runs! 215 | ``` 216 | 217 | > Multiple value return with zero allocations achieved 218 | 219 | A `_` binding does not produce any code; 220 | it truly ignores the result. 221 | The important exception is the first binding. It can not 222 | discard the bound expression because it has the side effect 223 | of setting the rest of the return values. 224 | 225 | ## Why I prefer goism 226 | 227 | Macros can help a lot with many features, but what about 228 | packages or namespaces? 229 | 230 | It is tedious and ugly to use prefixed identifiers for **everything**. 231 | Even **C** has better modularity and encapsulation with 232 | internal linkage and opaque pointers. 233 | 234 | Everyone understands the complications that arise with modules 235 | for Emacs. Luckily, there is another way. 
236 | Some languages already have modules. 237 | With [goism](https://github.com/Quasilyte/goism) it is possible 238 | to write **Go** code that is translated into **Emacs Lisp**. 239 | 240 | As a bonus, when **goism** is complete, we will be able to 241 | use **Go** libraries inside Emacs. 242 | -------------------------------------------------------------------------------- /content/post/faq.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2016-12-01" 3 | title = "FAQ" 4 | tags = [] 5 | description = "This blog FAQ." 6 | draft = false 7 | +++ 8 | 9 | ## Report an issue 10 | 11 | If you: 12 | 13 | * Found a typo; 14 | * Think some material is inappropriate or incorrect; 15 | * Want to propose a new article; 16 | 17 | ... or you are looking for a way to contact the author, 18 | 19 | just [open a new GitHub issue](https://github.com/Quasilyte/blog-src/issues/new). 20 | 21 | Try to be polite and precise in your words, please. 22 | 23 | ## Subscribe 24 | 25 | Currently, the only way to "subscribe" to blog updates is to `watch` 26 | the [blog-src](https://github.com/Quasilyte/blog-src) repository. 27 | 28 | 29 | 30 | Go to the linked repository and locate the `Watch` button in the top-right 31 | part of the window. 32 | -------------------------------------------------------------------------------- /content/post/gen-map.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2023-09-22" 3 | title = "Generations-based array" 4 | tags = [ 5 | "[go]", 6 | "[performance]", 7 | "[shortread]", 8 | "[data-structure]", 9 | "[roboden]", 10 | ] 11 | description = "A faster sparse-dense array, but without iteration." 12 | draft = false 13 | +++ 14 | 15 | ## Intro 16 | 17 | I was intrigued by the sparse map/set described in [Russ Cox's article](https://research.swtch.com/sparse). 18 | 19 | And I'm not the only one: this exact implementation is used in Go source code more than once! 
The compiler uses it for many ID-like maps and sets; the regexp package uses it for a [queue](https://github.com/golang/go/blob/795414d1c628f763defa43199ab51ea3dc3241d8/src/regexp/exec.go#L17). 20 | 21 | But there is one thing that is still bugging me: it's hard to make it very efficient. All operations I care about are O(1), but `get` and `set` operations clearly become slower in comparison with a straightforward slice approach. 22 | 23 | In fact, if your arrays are not that big (less than 0xffff bytes?), you might be better off using a slice with an O(n) clear operation. If you do many `get`+`set`, the increased overhead may be too much. 24 | 25 | In this article, I'll propose a different data structure that can replace a sparse-dense map (and set) if you don't need the iteration over the elements. 26 | 27 | > This discussion is not Go-specific, but I'll use Go in the examples. 28 | 29 | ## The Problem 30 | 31 | Let me start with a problem that we're trying to address. 32 | 33 | Imagine that you need a mapping structure that you can re-use. Something like a `map[uint16]T`, but with a more predictable allocation pattern. 34 | 35 | Your function may look like this: 36 | 37 | ```go 38 | func doWork(s *state) result { 39 | s.m.Reset() // You want the memory to be re-used 40 | 41 | // Do the work using the map. 42 | // Only get+set operations are used here. 43 | } 44 | ``` 45 | 46 | If your "map" can re-use the memory properly, this code may become zero-alloc. 47 | 48 | Our requirements can be described as follows: 49 | 50 | | Operation | Complexity | 51 | |---|---| 52 | | Set | O(1) | 53 | | Get | O(1) | 54 | | Reset | O(1) | 55 | 56 | We want it all, plus the efficient memory re-use. 57 | 58 | We'll analyze these choices today: 59 | 60 | * `map[uint16]T` 61 | * `[]T` 62 | * `sparseMap` 63 | * `genMap` 64 | 65 | The slice and map solutions do not fit our requirements, but we'll use them for a comparison. 
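To make the `[]T` baseline concrete, here is a minimal sketch of what such a slice-backed "map" may look like. The `sliceMap` name and its methods are hypothetical (they are not taken from the benchmark code); the point is only that `Get` and `Set` are plain indexing, while `Reset` has to touch every element.

```go
package main

import "fmt"

// sliceMap is a hypothetical fixed-size "map" backed by a plain slice.
// Keys are indexes; an unset key reads as the zero value of T.
type sliceMap[T any] struct {
	elems []T
}

func newSliceMap[T any](n int) *sliceMap[T] {
	return &sliceMap[T]{elems: make([]T, n)}
}

// Set stores v under k; out-of-range keys are ignored.
func (m *sliceMap[T]) Set(k uint, v T) {
	if k < uint(len(m.elems)) {
		m.elems[k] = v
	}
}

// Get returns the value stored under k or the zero value.
func (m *sliceMap[T]) Get(k uint) T {
	if k < uint(len(m.elems)) {
		return m.elems[k]
	}
	var zero T
	return zero
}

// Reset is O(n): every element has to be zeroed,
// which is the part that violates our requirements.
func (m *sliceMap[T]) Reset() {
	var zero T
	for i := range m.elems {
		m.elems[i] = zero
	}
}

func main() {
	m := newSliceMap[int](8)
	m.Set(3, 42)
	fmt.Println(m.Get(3), m.Get(4))
	m.Reset()
	fmt.Println(m.Get(3))
}
```

The memory is re-used between `Reset` calls, so this version is allocation-free after construction; only the linear-time reset keeps it from meeting all three requirements.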
66 | 67 | ## Benchmark Results 68 | 69 | Let's start by comparing the raw performance. 70 | 71 | | Data Structure | Set | Get | Reset | 72 | |-------------|------:|------:|-----:| 73 | | map | (x17.9) 47802 | (x28.6) 36922 | 1801 | 74 | | slice | 2665 | 1289 | 6450 | 75 | | sparse | (x6.7) 17859 | (x1.89) 2435 | 16 | 76 | | generations | (x1.1) 3068 | (x1.04) 1349 | 26 | 77 | 78 | Observations: 79 | 80 | * Map is heavily outclassed 81 | * Both sparse and generation maps have a crazy-fast reset 82 | * Even with 5000 elements (8*5000=40000 bytes), a slice reset takes noticeable time 83 | * `sparse.set()` operation is ~7 times slower than slice! 84 | * `sparse.get()` operation is ~2 times slower than slice 85 | * Generations map is almost as fast as a slice, but reset is much faster 86 | 87 | The sparse and generations maps do not zero their data during the `reset` operation. Therefore, avoid storing pointers in there. These pointers will be "held" by the container for a potentially long period of time, causing memory leaks. I would only recommend using both sparse and generations-based data structures with simple pointer-free element types. 88 | 89 | You can find the exact benchmark code [here](https://gist.github.com/quasilyte/a64bd66093c20c5e146b60e2cf3f3191). 90 | 91 | Some benchmark notes: 92 | 93 | * I used a [real-world](https://github.com/golang/go/blob/795414d1c628f763defa43199ab51ea3dc3241d8/src/cmd/compile/internal/ssa/sparsemap.go) sparse-dense implementation 94 | * Every `get`/`set` goes through a noinline wrapper to avoid the unwanted optimizations 95 | * Every `get`/`set` test runs the operation 5000 times 96 | * Every benchmark is using 5000 elements (it's important for slice reset) 97 | * The measurements above are divided by 10 for an easier interpretation 98 | * The value type used is `int` (8 bytes on my x86-64 machine) 99 | 100 | Now, you should be cautious about random benchmarks posted on the internet. 
But no matter how you write and/or run these, generations map will always be faster than a sparse-dense map (or set). It's almost as fast as a slice solution while still having a very fast O(1) reset. 101 | 102 | There are reasons for it to be faster. Let's talk about them. 103 | 104 | ## Sparse Map Issues 105 | 106 | Why is the `sparse.set()` operation so slow? 107 | 108 | When it comes to insertion of a new value, the sparse map has to do two memory writes. One for the `sparse` and one for the `dense`. Updating the existing value only writes to `dense`. 109 | 110 | ```go 111 | func (s *sparseMap[T]) Set(k int32, v T) { 112 | i := s.sparse[k] 113 | if i < int32(len(s.dense)) && s.dense[i].key == k { 114 | s.dense[i].val = v 115 | return 116 | } 117 | s.dense = append(s.dense, sparseEntry[T]{k, v}) 118 | s.sparse[k] = int32(len(s.dense)) - 1 119 | } 120 | ``` 121 | 122 | Another issue is that two slices mean twice as many bounds checks that can occur. And while you can be careful and use uint keys and check for the bounds yourself to stop the compiler from generating an implicit bounds check, you'll still pay for these if statements. 123 | 124 | The `sparse.get()` operation also suffers from a double memory read. 125 | 126 | ## Generations Map 127 | 128 | It's possible to use some of the ideas behind the sparse-dense map and create an even more specialized data structure. 129 | 130 | ```go 131 | type genMapElem[T any] struct { 132 | seq uint32 133 | val T 134 | } 135 | 136 | type genMap[T any] struct { 137 | elems []genMapElem[T] 138 | seq uint32 139 | } 140 | 141 | func newGenMap[T any](n int) *genMap[T] { 142 | return &genMap[T]{ 143 | elems: make([]genMapElem[T], n), 144 | seq: 1, 145 | } 146 | } 147 | ``` 148 | 149 | Every element will have a generation counter (seq). The container itself will have its own counter. The container's counter starts with 1, while elements start with 0. 
150 | 151 | 152 | 153 | Both `get` and `set` operations look very similar to the slice version, but with a `seq` check. 154 | 155 | ```go 156 | func (m *genMap[T]) Set(k uint, v T) { 157 | if k < uint(len(m.elems)) { 158 | m.elems[k] = genMapElem[T]{val: v, seq: m.seq} 159 | } 160 | } 161 | ``` 162 | 163 | Setting the element means updating the element's counter to the container's counter along with the value. 164 | 165 | 166 | 167 | ```go 168 | func (m *genMap[T]) Get(k uint) T { 169 | if k < uint(len(m.elems)) { 170 | el := m.elems[k] 171 | if el.seq == m.seq { 172 | return el.val 173 | } 174 | } 175 | var zero T 176 | return zero 177 | } 178 | ``` 179 | 180 | If `seq` of the element is identical to the container's counter, then this element is defined. Otherwise, it doesn't matter what are the contents of this element. 181 | 182 | 183 | 184 | 185 | 186 | 187 | 188 |
189 | 190 | You can probably already guess how `Reset` will look like: 191 | 192 | ```go 193 | func (m *genMap[T]) Reset() { 194 | m.seq++ 195 | } 196 | ``` 197 | 198 | Well, this is good enough for the most use cases, but there is a small chance that our `uint32` will overflow, making some undefined elements defined. Increasing the `seq` size to `uint64` could help, but it will increase the per-element size overhead. Instead, we can do a real clear operation once in `MaxUint32` resets. 199 | 200 | ```go 201 | func (m *genMap[T]) Reset() { 202 | if m.seq == math.MaxUint32 { 203 | m.seq = 1 204 | clear(m.elems) 205 | } else { 206 | m.seq++ 207 | } 208 | } 209 | ``` 210 | 211 | It's definitely possible to use `uint8` or `uint16` for the `seq` field. That would mean less per-element size overhead at the price of a more frequent data clear. 212 | 213 | * The generations map does exactly 1 memory read and write 214 | * It's easier to get rid of all implicit boundchecks 215 | * Its memory consumption is comparable to the sparse-dense array 216 | * The `Reset` complexity is constant (amortized) 217 | * Arguably, it's even easier to implement and understand than a sparse-dense map 218 | 219 | It's possible to make a generations-based set too. The `get` operation can be turned into `contains` with ease. With sets, only the counters are needed. 220 | 221 | ```go 222 | func (m *genSet) Contains(k uint) bool { 223 | if k < uint(len(m.counters)) { 224 | return m.counters[k] == m.seq 225 | } 226 | return false 227 | } 228 | ``` 229 | 230 | It may sound fascinating, right? Well, you can't use this data structure as a drop-in replacement for a sparse-dense. For instance, a generations-based map can't be iterated efficiently. 231 | 232 | You can add a length counter if you really need it, but that will add some extra overhead to the `set` operation. I would advise you not to do so. The main reason this structure can be so fast is its simplicity. 
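
The set variant mentioned above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the code from any particular library: the `genSet` layout, `newGenSet`, and `Add` are assumed names (only `Contains` appears in the text), and `uint8` counters are chosen here only to make the wrap-around branch of `Reset` easy to reach.

```go
package main

import (
	"fmt"
	"math"
)

// genSet is a hypothetical generations-based set of small integer keys.
// A key is a member only if its counter equals the set's current seq.
type genSet struct {
	counters []uint8
	seq      uint8
}

// newGenSet allocates room for keys in [0, n). The set's seq starts at 1
// while all counters start at 0, so the new set is empty.
func newGenSet(n int) *genSet {
	return &genSet{counters: make([]uint8, n), seq: 1}
}

// Add marks k as present; out-of-range keys are silently ignored.
func (s *genSet) Add(k uint) {
	if k < uint(len(s.counters)) {
		s.counters[k] = s.seq
	}
}

// Contains reports whether k was added since the last Reset.
func (s *genSet) Contains(k uint) bool {
	if k < uint(len(s.counters)) {
		return s.counters[k] == s.seq
	}
	return false
}

// Reset empties the set. It is O(1) except once every 255 calls,
// when the counter would overflow and a real clear is needed.
func (s *genSet) Reset() {
	if s.seq == math.MaxUint8 {
		s.seq = 1
		// Same effect as clear(s.counters) in Go 1.21+.
		for i := range s.counters {
			s.counters[i] = 0
		}
	} else {
		s.seq++
	}
}

func main() {
	s := newGenSet(8)
	s.Add(3)
	fmt.Println(s.Contains(3), s.Contains(4)) // true false
	s.Reset()
	fmt.Println(s.Contains(3)) // false
}
```

With `uint8` counters the per-key overhead is a single byte, at the cost of a full clear on every 255th reset; a wider counter shifts that trade-off the other way.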
233 | 234 | The average memory usage will be higher, since a referenced sparse-dense implementation doesn't allocate `n` elements for the `dense` right away; it only allocates the entire `sparse` storage. So, if you only ever populate the array up to n/2, the approximate size usage would be 1.5n instead of a worst-case 2n scenario. The generations-based array would require the entire slice to be allocated right away, leading to a 2n memory usage scenario. 235 | 236 | ## Conclusion 237 | 238 | I used this data structure in my [pathfinding](https://github.com/quasilyte/pathing/) library for Go. The results were great: 5-8% speedup just for a simple data structure change. Keep in mind that this library is already heavily optimized, so every couple of percentages count. 239 | 240 | In turn, this pathfinding library was used in a game I released on Steam: [Roboden](https://store.steampowered.com/app/2416030/Roboden/). 241 | 242 | Therefore, I would consider this data structure to be production-ready. 243 | -------------------------------------------------------------------------------- /content/post/go-asm-dispatch-tables.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Thu May 31 01:39:59 MSK 2018" 3 | title = "Dispatch tables in Go asm" 4 | tags = [ 5 | "[go]", 6 | "[asm]", 7 | ] 8 | description = "Bytecode interpreter in Go asm using direct threading for dispatching." 9 | draft = false 10 | +++ 11 | 12 | ## Dispatch tables 13 | 14 | When you want to execute particular code path depending on some kind 15 | of tag/opcode or other integer value that can be easily mapped into index, 16 | dispatch tables can speed things up compared to the sequence of 17 | comparisons and conditional jumps. 18 | 19 | In interpreters, this technique is often used as an alternative to switch-based dispatch. 20 | It's called direct threading in that domain. 
Each opcode corresponds to table index that contains machine 21 | code address that can execute operation specified by the opcode. 22 | 23 | > Note that a few `CMP` and jumps can perform better than small dispatch tables. 24 | > With big N, tables win consistently. 25 | 26 | ## Threaded code in Intel syntax 27 | 28 | Suppose we're implementing some virtual machine for a toy programming language. 29 | 30 | Here is it's specification: 31 | 32 | * Has one implicit operand: accumulator register. Mapped to `AX` (`rax`). 33 | * Bytecode pointer stored in `CX` (`rcx`). It's a program counter. 34 | * Supported operations are: `add1`, `sub1`, and `zero`. 35 | 36 | With [nasm](https://www.nasm.us/) and Intel syntax, our code could look like this: 37 | 38 | ```x86asm 39 | ;; Dispatch table itself. 40 | $op_labels: 41 | dq op_exit ;; Stop the evaluation 42 | dq op_add1 ;; Add 1 to RAX 43 | dq op_sub1 ;; Sub 1 from RAX 44 | dq op_zero ;; Set RAX to 0 45 | 46 | ;; Instructions required to fetch and "call" next opcode. 47 | %macro next_op 0 48 | movzx rdx, byte [rcx] ;; Fetch opcode 49 | add rcx, 1 ;; Advance PC (inc instruction is OK here too) 50 | jmp [$op_labels + (rdx * 8)] ;; Execute the operation 51 | %endmacro 52 | 53 | ;; Evaluation entry point. 54 | eval: 55 | next_op 56 | 57 | op_exit: 58 | ret 59 | 60 | op_add1: 61 | add rax, 1 ;; Or `inc rax` 62 | next_op 63 | 64 | op_sub1: 65 | sub rax, 1 ;; Or `dec rax` 66 | next_op 67 | 68 | op_zero: 69 | xor rax, rax ;; Or `mov rax, 0` 70 | next_op 71 | ``` 72 | 73 | Now, the question is: how to do exactly the same thing in Go assembly? 74 | 75 | ## Go implementation 76 | 77 | In Go assembly, it's not possible to have global labels. 78 | It's also not possible to store label address into anything. 79 | `TEXT` blocks are our replacements here. 80 | 81 | ```x86asm 82 | //; The $sym syntax is required to get symbol address, "literal value". 83 | //; op_exit, op_add1 and op_sub1 are declared as TEXT blocks, like normal functions. 
84 | DATA op_labels<>+0(SB)/8, $op_exit(SB) 85 | DATA op_labels<>+8(SB)/8, $op_add1(SB) 86 | DATA op_labels<>+16(SB)/8, $op_sub1(SB) 87 | DATA op_labels<>+24(SB)/8, $op_zero(SB) 88 | //; 4 table entries, size is 4*8. 89 | GLOBL op_labels<>(SB), (RODATA|NOPTR), $32 90 | ``` 91 | 92 | Macros are akin to C. 93 | Multiline macros require newline escapes. 94 | 95 | ```x86asm 96 | #define next_op \ 97 | MOVBQZX (CX), DX \ 98 | ADDQ $1, CX \ 99 | MOVQ $op_labels<>(SB), DI \ 100 | JMP (DI)(DX*8) 101 | ``` 102 | 103 | You may notice that there is one excessive `MOVQ` there. 104 | There is an explanation [in the end of the article](#why-additional-movq-in-next-op). 105 | 106 | ```x86asm 107 | //; We are going to go one step further and return AX value to the caller. 108 | TEXT op_exit(SB), NOSPLIT, $0-0 109 | MOVQ AX, ret+8(FP) 110 | RET 111 | 112 | TEXT op_add1(SB), NOSPLIT, $0-0 113 | ADDQ $1, AX 114 | next_op 115 | 116 | TEXT op_sub1(SB), NOSPLIT, $0-0 117 | SUBQ $1, AX 118 | next_op 119 | 120 | TEXT op_zero(SB), NOSPLIT, $0-0 121 | XORQ AX, AX 122 | next_op 123 | ``` 124 | 125 | > All routines defined above have zero size frame and parameters space. 126 | > This is to emphasise that those functions are not `CALL`'ed but rather `JMP`'ed into. 127 | 128 | The last thing is entry point, `eval` function. 129 | It's signature in Go would look like this: 130 | 131 | ```go 132 | // eval executes opbytes and returns accumulator value after evaluation ends. 133 | // opbytes must have trailing 0 byte (opExit). 134 | func eval(opbytes *byte) int64 135 | ``` 136 | 137 | For asm, it's important to consider stack frame size and parameters width. 138 | These are shared among all opcode executing routines. 139 | We don't need stack frame, only 16 bytes for input pointer and output int64. 140 | (Our code is for 64-bit platform only, but you can make it more portable.) 
141 | 142 | ```x86asm 143 | TEXT ·eval(SB), NOSPLIT, $0-16 144 | MOVQ opbytes+0(FP), CX //; Set up program counter (PC) 145 | next_op //; Start the evaluation 146 | ``` 147 | 148 | See [eval_amd64.s](/blog/code/eval/eval_amd64.s) for complete asm code. 149 | 150 | ## Calling eval from Go 151 | 152 | Main function can look like this: 153 | 154 | ```go 155 | func main() { 156 | const ( 157 | opExit = iota 158 | opAdd1 159 | opSub1 160 | opZero 161 | ) 162 | prog := []byte{ 163 | opZero, 164 | opAdd1, 165 | opAdd1, 166 | opSub1, 167 | opAdd1, 168 | opExit, 169 | } 170 | fmt.Println(eval(&prog[0])) 171 | } 172 | ``` 173 | 174 | Constants defined purely for convenience reasons. 175 | It is important to keep definitions in sync with asm implementation. 176 | Code generation can help here. 177 | 178 | See [eval.go](/blog/code/eval/eval.go) for complete Go code. 179 | 180 | Put `eval.go` and `eval_amd64.s` in a new directory and run it: 181 | 182 | ```bash 183 | $ go build -o eval.exe . && ./eval.exe 184 | 2 185 | ``` 186 | 187 | ## Pure Go solution 188 | 189 | Without assembly, dispatching would require loop+switch: 190 | 191 | ```go 192 | func eval(opbytes []byte) int64 { 193 | acc := int64(0) 194 | pc := 0 195 | // It's not always the case that instruction consume exactly 1 byte. 196 | // Some instructions may expect immediate bytes right after the opcode. 197 | // This is why we're maintaining pc manually instead of using range over 198 | // the opbytes. If you have fixed-length instructions, range loop 199 | // will be more efficient because it may eliminate all boundary 200 | // checks into opbytes. 201 | for { 202 | switch opbytes[pc] { 203 | case opExit: 204 | return acc 205 | case opAdd1: 206 | acc++ 207 | pc++ 208 | case opSub1: 209 | acc-- 210 | pc++ 211 | case opZero: 212 | acc = 0 213 | pc++ 214 | } 215 | } 216 | return 0 217 | } 218 | ``` 219 | 220 | This is not direct threading anymore. 
221 | 222 | If the number of opcodes is high enough, table dispatch will be consistently faster on most machines. 223 | The recommendation is, as usual: measure before making final decisions. 224 | 225 | There is also indirect threading, but it's usually measurably slower due to function calls. 226 | 227 | ## Why additional MOVQ in next_op? 228 | 229 | Direct translation of `next_op` would be: 230 | 231 | ```x86asm 232 | #define next_op \ 233 | MOVBQZX (CX), DX \ 234 | ADDQ $1, CX \ 235 | JMP $op_labels<>(SB)(DX*8) 236 | ``` 237 | 238 | This way, it would fully match the nasm implementation. 239 | 240 | But unfortunately, this is not valid Go asm syntax. 241 | 242 | You can't use index expressions together with a pseudo register. 243 | And you can't access global data without the `SB` pseudo register. 244 | 245 | This could be fixed in the future, although the probability is pretty low. 246 | The weird syntax is derived from plan9 asm and is shared among multiple architectures. 247 | -------------------------------------------------------------------------------- /content/post/go-avx512.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Fri Jun 8 13:46:49 MSK 2018" 3 | title = "Go AVX-512 support" 4 | tags = [ 5 | "[go]", 6 | "[asm]", 7 | "[avx-512]", 8 | ] 9 | description = "Your guide in Go AVX-512 world. Links to docs, articles and so on." 10 | draft = false 11 | +++ 12 | 13 | ## About this document 14 | 15 | This article is going to be an up-to-date source of AVX-512 information in the Go context. 16 | It references other posts, official documentation and other useful resources. 17 | 18 | It's short. On purpose. 19 | Only English content is referenced (original + translated). 20 | 21 | ## Documentation 22 | 23 | * [AVX-512 support in Go assembler](https://github.com/golang/go/wiki/AVX-512-support-in-Go-assembler): 24 | short reference that focuses on Go-specific implementation of AVX-512.
25 | It describes how to use all AVX-512 special features in Go assembly syntax as well as some encoder details. 26 | Contains some examples. 27 | 28 | * [Disassembling Go AVX-512](/blog/post/disassembling-go-avx512): 29 | how to disassemble and inspect AVX-512 machine code (given that `go tool objdump` can't do it). 30 | 31 | * [Hardware counters collector](https://github.com/intel-go/avx512counters): 32 | a program that runs all supported AVX-512 instructions on your machine and records turbo state perf events. 33 | There are pre-built example tables like [avx512_core_i9_7900x.csv](https://github.com/intel-go/avx512counters/blob/master/avx512_core_i9_7900x.csv). 34 | -------------------------------------------------------------------------------- /content/post/go-nested-functions-and-static-locals.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2017-09-18" 3 | title = "Go nested functions and static locals" 4 | tags = [ 5 | "[go]", 6 | "[closure]", 7 | ] 8 | description = "Avoid global lookup tables. Making things as local (private) as possible." 9 | draft = false 10 | +++ 11 | 12 | ## Symbol visibility 13 | 14 | Default symbol visibility should be as narrow as possible. 15 | This means that you use globals with internal linkage instead of external, 16 | local variables rather than globals, hide class-related constants inside 17 | it's scope, and so on. 18 | 19 | If function is only called inside particular function, 20 | it should become a [nested function](https://en.wikipedia.org/wiki/Nested_function). 21 | 22 | Most of these rely on the language support. 23 | 24 | Go has quite simple model of scopes and symbol visibility. 25 | User-defined identifier can be local or package-local (global). 26 | Package level identifiers can be exported or unexported. 27 | 28 | - No `static` storage class for local variables. 29 | - Function declarations can only appear at top level. No nested functions. 
30 | 31 | As a consequence, you end up using globals for lookup tables, 32 | compiled regular expressions and other objects that should 33 | be initialized once, and then used during every function call. 34 | 35 | Why such encapsulation matters is not a topic of this post. 36 | Instead, this article is focused on the working technique overview. 37 | Benchmarks and quirks list are included. 38 | 39 | ## Closures and immediately-invoked function expressions 40 | 41 | Go permits top level dynamic initialization. 42 | We are interested in [IIFE](https://en.wikipedia.org/wiki/Immediately-invoked_function_expression) 43 | in combination with closures. 44 | 45 | Suppose someone developed `describeString` function listed below. 46 | It is not overly complex, but in order to define it, 47 | programmer also introduced `hasVowel` helper function, 48 | which requires `vowels` global variable. 49 | `describeString` has to check a string against 50 | regular expression, so it was assigned to `golangRE`, 51 | this removes a need to compile regexp during each function call. 52 | 53 | ```go 54 | var vowels = map[rune]bool{ 55 | 'a': true, 'e': true, 'i': true, 56 | 'o': true, 'u': true, 'y': true, 57 | } 58 | var golangRE = regexp.MustCompile(`\b[Gg]o(?:lang)?\b`) 59 | func hasVowel(s string) bool { 60 | for _, c := range s { 61 | if vowels[c] { 62 | return true 63 | } 64 | } 65 | return false 66 | } 67 | func describeString(s string) string { 68 | var attrs []string 69 | if hasVowel(s) { 70 | attrs = append(attrs, "has vowel letter") 71 | } 72 | if golangRE.MatchString(s) { 73 | attrs = append(attrs, "may be about Go language") 74 | } 75 | attrs = append(attrs, fmt.Sprintf("has length of %d", len(s))) 76 | return strings.Join(attrs, "; ") 77 | } 78 | ``` 79 | 80 | So far, **4** global symbols for single function. 81 | With closures and IIFE we can reduce this number to **1**. 
82 | 83 | ```go 84 | var describeString = func() func(string) string { 85 | vowels := map[rune]bool{ 86 | 'a': true, 'e': true, 'i': true, 87 | 'o': true, 'u': true, 'y': true, 88 | } 89 | golangRE := regexp.MustCompile(`\b[Gg]o(?:lang)?\b`) 90 | hasVowel := func(s string) bool { 91 | for _, c := range s { 92 | if vowels[c] { 93 | return true 94 | } 95 | } 96 | return false 97 | } 98 | 99 | return func(s string) string { 100 | var attrs []string 101 | if hasVowel(s) { 102 | attrs = append(attrs, "has vowel letter") 103 | } 104 | if golangRE.MatchString(s) { 105 | attrs = append(attrs, "may be about Go language") 106 | } 107 | attrs = append(attrs, fmt.Sprintf("has length of %d", len(s))) 108 | return strings.Join(attrs, "; ") 109 | } 110 | }() // <- Note this. 111 | ``` 112 | 113 | Note that inner closure body is identical to initial `describeString` implementation. 114 | The rest of this post describes provided solution characteristics. 115 | 116 | ## Performance 117 | 118 | As you may guess there are some performance penalties. 119 | 120 | Two main differences between normal function and closure-based approaches: 121 | 122 | 1. Initialization time. IIFE will be evaluated during package initialization, at run-time. 123 | 2. Function call overhead. IIFE closure calls are never inlined. 124 | 125 | The exact numbers are hard to predict, but you may expect about **1-5%** slowdown. 126 | This may be important if your application is very performance-critical *and* that 127 | function is called inside a tight loop. 128 | 129 | You can use [linked benchmark](/blog/code/closure_test.go) to have an approximation. 130 | Example results are provided in the next snippet. 131 | 132 | ```text 133 | $ go test -bench=. 
134 | BenchmarkNormalFunc-4 20000 90200 ns/op 135 | BenchmarkClosure-4 20000 94576 ns/op 136 | 137 | $ benchstat func.txt closure.txt 138 | name old time/op new time/op delta 139 | NormalFunc-4 88.6µs ± 1% 89.3µs ± 2% +0.85% (p=0.015 n=10+10) 140 | ``` 141 | 142 | ## Potential problems 143 | 144 | This article would be incomplete without a list of known problems with 145 | the proposed solution. 146 | 147 | > Problem 1 - no parameter names hint. 148 | 149 | With a normal function, the call hint may look like `func(s string) string`, 150 | while our closure will get `func(string) string`. 151 | 152 | You can fix that with a simple change. 153 | 154 | ```diff 155 | 156 | -var describeString = func() func(string) string { 157 | +var describeString = func() func(name string) string { 158 | vowels := map[rune]bool{ 159 | 'a': true, 'e': true, 'i': true, 160 | 'o': true, 'u': true, 'y': true, 161 | ``` 162 | 163 | However, this will force you to break the DRY principle, albeit slightly. 164 | The main disadvantage is that you have to change parameter names in two 165 | places instead of one. 166 | 167 | From the other point of view it is an additional flexibility, 168 | because you can use longer, expressive names for the "public" parameters and 169 | shorter identifiers for the implementation itself. 170 | 171 | > Problem 2 - function variable is mutable. 172 | 173 | For unexported functions this is not a problem, 174 | but if the symbol is exported, users may re-assign the variable to 175 | something else. They do not have this opportunity with 176 | functions that are defined in a normal way. 177 | 178 | ## Conclusion 179 | 180 | Closure-based encapsulation is an old trick. 181 | JavaScript programmers use it along with IIFE all the time. 182 | 183 | If you have a question: "do I have to?", 184 | the answer is "no" of course. 185 | But when you look for additional patterns to reduce code 186 | complexity, this solution may prove useful.
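
As a small appendix to "Problem 2", the mutability of an IIFE-initialized function variable can be reproduced in a few lines. This is an illustrative sketch only: `isVowel` is a made-up, cut-down relative of `hasVowel`, not code from the linked benchmark.

```go
package main

import "fmt"

// isVowel is initialized once by an immediately-invoked function
// expression; the lookup table is visible only to the returned closure.
var isVowel = func() func(r rune) bool {
	vowels := map[rune]bool{
		'a': true, 'e': true, 'i': true,
		'o': true, 'u': true, 'y': true,
	}
	return func(r rune) bool { return vowels[r] }
}()

func main() {
	fmt.Println(isVowel('a'), isVowel('z')) // true false

	// Problem 2 in action: isVowel is a variable, so nothing
	// prevents other code in the package from replacing it.
	isVowel = func(rune) bool { return true }
	fmt.Println(isVowel('z')) // true
}
```

A function declared with the `func` keyword at package level cannot be reassigned like this, which is exactly the difference the problem statement points at.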
187 | -------------------------------------------------------------------------------- /content/post/goism-compilation-modes.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2018-02-23" 3 | title = "Goism compilation modes" 4 | tags = [ 5 | "[go]", 6 | "[emacs lisp]", 7 | "[goism project]", 8 | ] 9 | description = "Attempt to optimize compiler for the most desired use patterns." 10 | draft = true 11 | +++ 12 | 13 | ## Fast & slow builds 14 | 15 | This whole article can be reduced to the simple statement: sometimes compilation 16 | speed does not matter, but when it does, you can't make it "too fast". 17 | 18 | The analysis of exact use cases may reveal opportunities that may cut 19 | compile time significantly. This also helps to avoid bad decisions that hurt 20 | user experience. 21 | 22 | ## User-oriented compilation 23 | 24 | Optimizations only make sense when priorities are well-defined: you can't get 25 | everything; there is something you give away in order to get properties that 26 | are more valuable in your situation. 27 | 28 | The famous [space-time tradeoff](https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff) is a nice example, but the optimization term is also applicable to user experience improvements. This article is focused on this aspect of optimizations: designing an instrument that respects user workflows and their dynamic nature. 29 | 30 | Tuning the language toolchain towards usefulness requires specification of 31 | the most common and desirable use patterns. 32 | 33 | In the context of interactive development inside [GNU Emacs](https://www.gnu.org/software/emacs/), I can distinguish 34 | four main use cases that require different trade-offs in compilation design: 35 | 36 | 1. [Single expression evaluation](#single-expression-evaluation) (like in interactive [REPL](https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93print_loop)) 37 | 2.
[Script-like main package execution](#scripts) 38 | 3. [Library compilation](#libraries) 39 | 4. [Go "testing" mode](#go-testing-mode) 40 | 41 | There could be more notable usage schemes, but they are escaping my mind at this moment. 42 | 43 | ## Single expression evaluation 44 | 45 | From time to time we want to run a specific expression and get evaluation results to gain confidence. 46 | 47 | Suppose you don't remember the exact `strings.Split` behavior for empty string input arguments. 48 | Will `len(strings.Split("", "\n"))` result in 0 or 1? 49 | 50 | If there is no built-in support for this in your editor or IDE, 51 | you may find yourself visiting [Go playground](https://play.golang.org/) 52 | (creating a local `main` package is the same kind of solution). 53 | 54 | My main complaints about this approach are: 55 | 56 | 1. Need to wrap code into `main`. 57 | 2. Need to print evaluation results. 58 | 3. Not very convenient to re-use evaluation results for further experiments. 59 | 4. Prior to [goimports](https://godoc.org/golang.org/x/tools/cmd/goimports) integration into playground, you were also responsible for including the required imports (not the case anymore). 60 | 61 | Here are the main design considerations for solving this inside [goism](https://github.com/Quasilyte/goism): 62 | 63 | * `goism-eval-string` function accepts Go code string to be executed and returns 64 | Emacs Lisp object that represents evaluation result of arbitrary type. 65 | * The result is pushed into N-slot stack. Stack top is always last evaluation result value. 66 | User may bind those values to Emacs Lisp variables or refer to them via stack reference. 67 | * Required imports are automatically resolved. User can specify additional lookup patterns 68 | to enable non-standard packages. 69 | * The compilation should be blazingly fast. So fast that the user feels like it's instant.
70 | This may even justify [daemon](https://en.wikipedia.org/wiki/Daemon_(computing))+[IPC](https://en.wikipedia.org/wiki/Inter-process_communication) design to avoid extra overhead 71 | of process spawning per each request. 72 | 73 | The `goism-eval-string` should be enough to implement `goism-eval-last-expr` and 74 | `goism-eval-print-last-expr` interactive commands. 75 | 76 | There is so much more to this. 77 | We may want to select a region, do `M-x eval-region` and replace some variables with 78 | constant arguments, so the selection of `strings.Split(x, "\n")` can be executed 79 | with `x` bound to `""`. This does not require significant foundation changes: proposed model 80 | can enable creation of such great features without any troubles. 81 | 82 | ## Scripts 83 | 84 | Much of the script-like code is written in Emacs Lisp to do code generation 85 | or source code transformations. 86 | 87 | When goism is capable enough to load `go/*` packages that make it quite easy 88 | to manipulate Go sources, it would be possible to write one-off, syntax-aware scripts 89 | as a part of Go development workflow. The "compiled" nature of Go will feel 90 | less restrictive inside Emacs. 91 | 92 | Not to forget that executed code can affect Emacs global state. 93 | This makes Go scripts suitable for data load tasks that may be hard using 94 | only Emacs Lisp due to the lack of high-quality libraries. 95 | 96 | TODO: think about scripts more deeply. 97 | 98 | ## Libraries 99 | 100 | TODO: library development VS library deployment 101 | 102 | ## Go "testing" mode 103 | 104 | TODO: does testing framework require specific features from goism? 105 | 106 | TODO: should I describe compiled object format here? Or mention them at all? 
-------------------------------------------------------------------------------- /content/post/goism-objects-layout-mode.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2018-01-14" 3 | title = "Goism objects layout model" 4 | tags = [ 5 | "[go]", 6 | "[emacs lisp]", 7 | "[goism project]", 8 | ] 9 | description = "How Go objects are represented inside Emacs." 10 | draft = false 11 | +++ 12 | 13 | ## Introduction 14 | 15 | [Goism](https://github.com/Quasilyte/goism) project requires Go pointers 16 | emulation inside Emacs Lisp code. 17 | 18 | This document describes how to achieve (almost) full functionality with 19 | potential to have optimizations that eliminate some of the 20 | emulation-related overhead. 21 | 22 | The actual implementation can diverge. 23 | Only initial design is outlined. 24 | 25 | ## Struct representation 26 | 27 | Go structures are represented by **lists**. 28 | Empty struct run-time value is unspecified, but it satisfies Go spec requirements. 29 | 30 | ```go 31 | type a1 struct { f1 int } 32 | // a1{1} 33 | // (list 1) 34 | 35 | type a2 struct { f1, f2, f3, f4 int } 36 | // a2{1, 2, 3, 4} 37 | // (list 1 2 3 4) 38 | ``` 39 | 40 | For very small (1-3 fields) objects, lists are a better choice than vectors, 41 | but generally, vectors are more memory-efficient and provide faster random access. 42 | 43 | [Pointers](#pointers) section explains why lists were selected over vectors 44 | as the default representation. 45 | 46 | For unexported struct types, the optimizer is permitted to use a "best fit" data type 47 | for run-time values. 48 | This is the reason why it is important to forbid usage of unexported types inside 49 | the Emacs Lisp domain. 50 | 51 | ## Arrays, strings and slices 52 | 53 | Arrays are represented by Emacs Lisp **vectors**. 54 | 55 | Strings are Emacs Lisp **unibyte strings**. Literals are UTF-8 encoded. 56 | Strings created by Go code are considered immutable.
57 | Immutability is not enforced during the execution. 58 | 59 | Slices are implemented by a struct of `length`, `capacity`, `offset` and `data` fields. 60 | The purpose of first two fields is self-explanatory. 61 | The `offset` is used during index calculations; needed for re-sliced slices. 62 | `data` is the underlying vector. 63 | Field order described here is not mandatory. 64 | 65 | ```go 66 | [3]int{1, 2, 3} // [1 2 3] 67 | []int{1, 2, 3} // (3 3 0 [1 2 3]) 68 | ([3]int{1, 2, 3})[1:] // (2 2 1 [1 2 3]) 69 | "abc" // "abc" 70 | "π" // "\317\200" 71 | ``` 72 | 73 | Arrays, strings and slices are reference types in Emacs Lisp. 74 | 75 | ## General boxing 76 | 77 | **Box** term is used when referring to thin wrapper that exist to enable 78 | pointer-like semantics. 79 | 80 | The boxed value required to support `car` and `setcar` operations. 81 | `car` is for dereference, `setcar` is for writes/updates. 82 | 83 | The main purpose of boxing is to implement arbitrary pointer indirection. 84 | We never care about `cdr` part; it's value is undefined on purpose. 85 | 86 | ```lisp 87 | *T (cons T.value ?) 88 | **T (cons (cons T.value ?) ?) 89 | ***T (cons (cons (cons T.value ?) ?) ?) 90 | ;; ... and so on 91 | ``` 92 | 93 | ```go 94 | x := new(int) // x = (cons 0 ?) 95 | *x // (car x) 96 | *x = 10 // (setcar x 10) 97 | ``` 98 | 99 | ## Pointers: reference types 100 | 101 | For reference types (ref types for short), the single level of indirection is the object itself. 102 | This means that `*T` has the same run-time representation as `T`. 103 | 104 | When object has non-pointer struct type, all assignments use `copy-sequence`. 105 | Arrays are assigned via copying, too. 106 | 107 | Higher order indirection uses [general boxing](#general-boxing). 
108 | 109 | ```go 110 | x1 := Point{x: 1, y: 2} // x1 = (list 1 2) 111 | x2 := x1 // x2 = (copy-sequence x1) 112 | 113 | y1 := &Point{x: 1, y: 2} // y1 = (list 1 2) 114 | y2 := y1 // y2 = y1 115 | y3 := *y2 // y3 = (copy-sequence y2) 116 | 117 | z := new(*Point) // z = (cons (list 0 0) ?) 118 | *z = y2 // (setcar z y2) 119 | ``` 120 | 121 | Pointer to the n-th struct member is its `nthcdr`. 122 | This is why lists are the default struct representation - it makes 123 | member address operation possible (and allocation-free). 124 | 125 | ```go 126 | pt := Point{1, 2} // pt = (list 1 2) 127 | x := &pt.x // x = (nthcdr 0 pt) = pt 128 | y := &pt.y // y = (nthcdr 1 pt) = (cdr pt) 129 | *x = 10 // (setcar x 10) 130 | *y = 20 // (setcar y 20) 131 | // pt fields are updated as expected. 132 | ``` 133 | 134 | Because `cdr` pointer part is always ignored, `(x y z)` is a valid pointer for `x`. 135 | 136 | ## Pointers: non-reference types 137 | 138 | In Emacs Lisp, there are non-ref types; they are not mutable. 139 | The solution to this is [interior mutability](https://ricardomartins.cc/2016/06/08/interior-mutability). 140 | 141 | We apply boxing for all such values when they are not part of the struct 142 | or other container. 143 | Temporary values that are used for stores are not boxed either. 144 | 145 | ```go 146 | x := 10 // Boxed as (list 10) 147 | y := 10.5 // Boxed as (list 10.5) 148 | 149 | pt.x = 777 // 777 is not boxed here 150 | pt.y = y // Unboxing is required: (car y) 151 | 152 | passVal(x) // Possibilities: (copy-sequence x), (car x) or just x 153 | passPtr(&x) // Always boxed x; makes x optimizations impossible 154 | ``` 155 | 156 | The negative impact on performance is addressed by [escape analysis](#escape-analysis). 157 | 158 | ## Pointers: array/slice element address 159 | 160 | Situation with arrays and slices is more complicated.
161 | 162 | * Arrays and slices can be very large, which makes lists impractical; 163 | * Speculative layout optimization which is used for structs is not applicable (see below); 164 | 165 | Even if it is possible to determine that particular array never gives element address away, 166 | turning it into "real vector" will not work as it becomes incompatible with unoptimized arrays 167 | of the same static type. 168 | 169 | Proposed solution: 170 | 171 | * It is easy to return a pointer to ref type value. Permit this operation; 172 | * Forbid taking element address of non-ref types; 173 | 174 | This is a trade-off between performance and Go spec compliance. 175 | Enabling this feature without constraints will make arrays (and slices) 176 | very inefficient. 177 | 178 | ```go 179 | xs := [2]Point{} 180 | x := &xs[1] // Valid. 181 | 182 | ys := [2]int{} 183 | y := &ys[1] // Invalid. Compile-time error. 184 | ``` 185 | 186 | ## Escape analysis 187 | 188 | If pointers never existed in Go, we could avoid many complications described above. 189 | 190 | Escape analysis is performed as the last part of optimizations. 191 | It's aim is to find data that is never used in a way that forces us to 192 | generate less efficient code. 193 | 194 | For example, if address operator is never applied to local non-ref type 195 | variable, there is no need to box it. 196 | 197 | Go structures that have particular layout can be optimized if 198 | member field address never taken from any of it's instances. 199 | This analysis can be sound for unexported types. 200 | 201 | If needed, special annotation can select particular struct run-time 202 | representation. 203 | Compiler will reject code that uses such types in 204 | non-compatible ways. 205 | This feature should only be used when particular layout is very important. 
206 | 207 | ```go 208 | //goism:repr=vector 209 | type Foo struct { 210 | A, B int 211 | C Bar 212 | } 213 | 214 | // Instances of Foo are represented as Emacs Lisp vectors. 215 | // It is compile-time error to take address of A and B fields. 216 | // It is OK to take C field address. 217 | ``` 218 | 219 | Possible representations: 220 | 221 | * `list` - nil-terminated cons pairs. Can take address of any field. 222 | * `list*` - improper list. Like lists, but last element address only works for ref types. 223 | * `vector` - same as for arrays. Can take address of any ref type field. 224 | * `string` - all fields must fit into 16bit ints. Can't take field address. 225 | * `bool-vector` - all fields must be booleans. Can't take field address. 226 | * `atom` - unboxed value. Valid only for unit (single field) structs. Can't take address. 227 | 228 | Some representations not only have restricted field address operation, but also 229 | member types/count constraints. 230 | 231 | The upside is the benefits of particular data type. 232 | For example, strings are much cheaper to allocate, but a little slower at 233 | random access, than vectors. 234 | Improper lists are only a slight improvement over proper lists, but add nearly 235 | no additional restrictions. They also work like a charm for 2 field objects. 236 | -------------------------------------------------------------------------------- /content/post/log-fatal-vs-log-panic.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Tue Jan 8 01:46:18 MSK 2019" 3 | title = "log.Fatal vs log.Panic" 4 | tags = [ 5 | "[shortread]", 6 | "[go]", 7 | ] 8 | description = "tl;dr: avoid `os.Exit` near deferred calls." 9 | draft = false 10 | +++ 11 | 12 | > Update: probably the best alternative to `log.Fatalf` inside `main` is `log.Printf` followed by a return statement. 
If your main function contains a lot of such exit points, consider using a [step driven evaluation](https://quasilyte.github.io/blog/post/step-pattern/) pattern. 13 | 14 | I personally don't like `log.Fatal/Fatalf/Fatalln`. I feel sad because of their ubiquity in Go examples as a form of reaction to an error. I personally prefer `log.Panic`. 15 | 16 | The `log.Panic` vs `log.Fatal` is essentially `panic` vs `os.Exit(1)`. The latter will not let deferred calls run. Most of the time, it's not what you want. It also makes testing much harder. It's quite simple to stop a panic in a test by means of `recover`. It's much harder to cope with code that does `os.Exit` somewhere while you're trying to do end2end testing without losing coverage info. 17 | 18 | I have written the `exitAfterDefer` check for the [go-critic](https://github.com/go-critic/go-critic) linter. It warns about `log.Fatal` if the function where it's used deferred any calls prior to that line. 19 | 20 | For example, that check would generate a warning for this kind of code: 21 | 22 | ```go 23 | defer os.Remove(filename) 24 | if err != nil { 25 | log.Fatalf("error: %v", err) 26 | } 27 | ``` 28 | 29 | > warning: log.Fatalf clutters `defer os.Remove(filename)` 30 | 31 | If you can, avoid `log.Fatal`. Don't reject `log.Panic` just because most people follow the "don't panic" mantra too heavily. Exit is worse than panic, period. If you have a choice, choose something that does less potential damage.
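
The claim that a panic is easy to stop in a test can be sketched as follows. This is only a minimal illustration, not code from go-critic: `process` and `tryProcess` are hypothetical names, and the `cleaned` flag is an assumption standing in for a real cleanup such as `os.Remove(filename)`.

```go
package main

import (
	"fmt"
	"log"
)

// process registers a cleanup and then fails via log.Panicf.
// The cleaned flag is a stand-in for a real deferred cleanup.
func process(cleaned *bool) {
	defer func() { *cleaned = true }()
	log.Panicf("error: %v", "disk is full")
}

// tryProcess stops the panic with recover, the same way a test could.
// Deferred calls inside process still run before the panic gets here.
func tryProcess() (cleaned bool, msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	process(&cleaned)
	return cleaned, msg
}

func main() {
	cleaned, msg := tryProcess()
	fmt.Println("cleanup ran:", cleaned)
	fmt.Println("recovered:", msg)
}
```

With `log.Fatalf` in place of `log.Panicf`, the process would exit inside `process`, so neither the cleanup nor the `recover` would get a chance to run.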
32 | 33 | So, you know what to do now: 34 | 35 | ```diff 36 | if err != nil { 37 | - log.Fatalf("ooops: %v", err) 38 | + log.Panicf("ooops: %v", err) 39 | } 40 | ``` -------------------------------------------------------------------------------- /content/post/naive-ssa-alternative.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Sat Mar 26 00:00:00 MSK 2022" 3 | title = "A simpler scheme than SSA" 4 | tags = [ 5 | "[go]", 6 | "[compilers]", 7 | "[shortread]", 8 | "[ssa]", 9 | ] 10 | description = "How to perform peephole optimizations without SSA." 11 | draft = false 12 | +++ 13 | 14 | ## The SSA complexities 15 | 16 | Let's suppose that you're working on some small compiler-like project. At some point, you may start thinking about adding optimizations to the code generated by your compiler. 17 | 18 | Then you realize that it's not enough to just have some IR that is suitable for modifications. It's important to apply only those optimizations that keep the code correct (or at least don't make it more broken than it was before). Hopefully, we're also making it faster or smaller along the way. 19 | 20 | Most likely, you're better off choosing something like the [SSA](https://en.wikipedia.org/wiki/Static_single_assignment_form) form. The SSA form comes with a few complexities that you'll have to deal with: 21 | 22 | 1. SSA introduces a lot of "unique" slots*. You need to perform good dead store eliminations and register allocation later on to keep the number of slots minimal 23 | 2. You either need to insert phi nodes or make basic blocks parametrized (so they get outer values as arguments) 24 | 3. SSA alone is not enough. You need some extra metadata, like the number of SSA value usages (most often you want to check whether `v.Uses == 1`) 25 | 26 | > (*) A slot is an abstract term for a place where we store some value. 
It could be someplace inside a stack, or a register, or a virtual register if we're talking about a VM with a potentially infinite amount of registers. 27 | 28 | Basically, you need to insert some merge points for the SSA values. It can be done with phi nodes or with parametrized basic blocks. It's not a big deal, but it can be messy for a small project that only wants to perform a few local optimizations. Maintaining the SSA invariant can end up being too messy. 29 | 30 | In this article, I'll try to describe a simpler approach that: 31 | 32 | * Keeps the allocated slots after the early compilation phase 33 | * Encodes both the SSA unique value constraint and a single-use invariant (Uses=1) 34 | * Is easy to build and maintain 35 | 36 | ## Unique slots form 37 | 38 | We divide all slots into 2 categories: 39 | 40 | * Normal slots 41 | * Unique slots 42 | 43 | Unique slots have these properties: 44 | 45 | * They don't escape their basic block 46 | * They're only used once 47 | 48 | That being said, both unique and normal slots have an assigned ID that tells which memory location they occupy. The same memory location can be unique in one part of the block and non-unique in another. 49 | 50 | Our abstract slot can look like this: 51 | 52 | ```go 53 | type Slot struct { 54 | ID int // allocated by the compiler 55 | Unique bool // inferred by the optimizer 56 | } 57 | ``` 58 | 59 | Instructions operate on slots. Their arguments can have unique or non-unique slots: 60 | 61 | ```go 62 | type Instruction struct { 63 | Op byte 64 | Args []Slot 65 | } 66 | ``` 67 | 68 | In the code below, `slot0` can be marked as unique: 69 | 70 | ```ruby 71 | return 130 72 | 73 | # => 74 | 75 | load_int_const slot0 = 130 76 | return slot0 77 | ``` 78 | 79 | `slot0` is assigned exactly once, and it's read only once as well. It doesn't leave its basic block either.
80 | 81 | Here is an example of a block where we have a slot with the same ID marked as unique in several places: 82 | 83 | ```ruby 84 | println(130) 85 | println(200) 86 | 87 | # => 88 | 89 | load_int_const slot0 = 130 90 | push_arg slot0 91 | call println 92 | load_int_const slot0 = 200 93 | push_arg slot0 94 | call println 95 | ``` 96 | 97 | `slot0` is assigned twice, but both versions are unique: there is only one use after every assignment. 98 | 99 | ## Marking the slots as unique 100 | 101 | In reality, we need some extra information to infer that some slot is unique. Namely, we need to know where its lifetime ends. This can be done with pseudo varkill instructions. 102 | 103 | > The name "varkill" is borrowed from the Go compiler source code. It uses this pseudo node to 104 | > record that the variable lifetime has ended. 105 | 106 | When the compiler allocates the slots for intermediate results, it knows when their lifetime ends. This lifetime tomb can end up in the same basic block or somewhere else. 107 | 108 | ```ruby 109 | load_int_const slot0 = 130 110 | push_arg slot0 111 | call println 112 | varkill slot0 # <- slot0 is free after this point 113 | load_int_const slot0 = 200 114 | push_arg slot0 115 | call println 116 | varkill slot0 # <- slot0 is free after this point 117 | ``` 118 | 119 | Here is another example, when temporary value outlives its block: 120 | 121 | ```ruby 122 | return x || y 123 | 124 | # => 125 | 126 | move slot0 = x 127 | jump_nz L0 slot0 128 | move slot0 = y 129 | L0: 130 | return slot0 131 | varkill slot0 132 | ``` 133 | 134 | It's important to include the trailing varkill pseudo ops after the basic block exit instruction. 135 | So, the basic blocks for the code above can look like this: 136 | 137 | ```ruby 138 | b0: 139 | move slot0 = x 140 | jump_nz L0 slot0 141 | move slot0 = y 142 | 143 | b1: # L0 144 | return slot0 145 | varkill slot0 146 | ``` 147 | 148 | Note that varkill is a part of the `b1` block. 
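
A single backward pass over a block with varkill markers is enough to find the unique slots. The sketch below is a simplified illustration under stated assumptions, not the article's actual implementation: the `inst` type (one optional destination, a list of source slot IDs, string opcodes) and the `markUnique` function are hypothetical, and the pass only reports unique definitions instead of rewriting the block or removing varkills.

```go
package main

import "fmt"

// inst is a simplified instruction for this sketch: at most one
// destination slot (dst, -1 if none) and any number of source slots.
type inst struct {
	op   string
	dst  int
	srcs []int
}

// markUnique scans a basic block backwards and reports, for every
// instruction index, whether its destination slot is unique: written
// there, read exactly once, then killed inside the same block.
func markUnique(block []inst) []bool {
	unique := make([]bool, len(block))
	reads := map[int]int{} // slot ID -> reads seen since its varkill
	for i := len(block) - 1; i >= 0; i-- {
		in := block[i]
		if in.op == "varkill" {
			reads[in.srcs[0]] = 0 // start tracking this slot
			continue
		}
		// Handle the write first: in execution order it happens
		// after this instruction's own reads.
		if n, ok := reads[in.dst]; ok {
			unique[i] = n == 1 // n == 0 would be a dead store
			delete(reads, in.dst)
		}
		for _, id := range in.srcs {
			if _, ok := reads[id]; ok {
				reads[id]++
			}
		}
	}
	return unique
}

func main() {
	block := []inst{
		{op: "load_int_const", dst: 0},
		{op: "push_arg", dst: -1, srcs: []int{0}},
		{op: "call", dst: -1},
		{op: "varkill", dst: -1, srcs: []int{0}},
		{op: "load_int_const", dst: 0},
		{op: "push_arg", dst: -1, srcs: []int{0}},
		{op: "call", dst: -1},
		{op: "varkill", dst: -1, srcs: []int{0}},
	}
	fmt.Println(markUnique(block))
}
```

Slots without a varkill in the block are never tracked, so they stay non-unique; that matches the conservative intent of the scheme.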
149 | 150 | To compute the unique slots within a block, we need to traverse it only once. 151 | 152 | * Go from the end of a basic block 153 | * Put all varkill IDs into a map 154 | * For every recorded ID, collect the number of reads 155 | * When a write to a recorded ID is reached, check the number of reads 156 | * If the number of reads is 0, this is a dead store 157 | * If the number of reads is 1, mark this slot and its usage as unique 158 | * Otherwise it's not a unique slot, remove ID from the map 159 | * When removing a var or marking it unique, an associated varkill should be removed 160 | 161 | > You don't really need a real map here. It's possible to write a zero alloc uniqs marking. 162 | 163 | After the first round of optimizations, we need to re-compute the unique slots. 164 | 165 | To avoid doing redundant re-calculations, we can skip blocks that don't have any varkills. 166 | 167 | This means we need to store the varkill counter somewhere alongside the basic block. It's also possible to have a "dirty" flag inside a block that tells whether it may have changed after the last scan. 168 | 169 | ```go 170 | type Block struct { 171 | Body []Instruction 172 | Varkills int 173 | Dirty bool 174 | } 175 | ``` 176 | 177 | Strictly speaking, explicit block objects are not required. All metadata can be stored separately, outside of the blocks. It is, however, more convenient to work with explicit block objects. 178 | 179 | ## Where exactly to insert a varkill 180 | 181 | For temporary values that are results of expression computation, it's simple. These are allocated along with the computations. The compiler knows when the expression ends, so it can insert the tombstones right there. 182 | 183 | For local variables, the lifetimes can be computed using their lexical scoping.
184 | 185 | ```go 186 | // x slot is assigned when if statement clause is being compiled 187 | if x := f(); x != nil { 188 | return x 189 | } 190 | // After the if statement is compiled, x variable is no longer alive, 191 | // a varkill for the allocated slot can be inserted. 192 | 193 | { 194 | // This x variable is different from the previous one. 195 | // In this case, the slot for x can be marked unique. 196 | x := 10 197 | println(x) 198 | // When this lexical block ends, x is no longer alive. 199 | } 200 | ``` 201 | 202 | For some simple cases, we can insert a varkill at the point of the variable reassignment. This is a more tricky case though: it's better to be conservative here and insert fewer markers than feeding the incorrect information to the optimizer. 203 | 204 | ```go 205 | { 206 | x := 10 207 | println(x) 208 | x = 20 // re-assigned: a suitable place for a varkill 209 | println(x) 210 | } 211 | 212 | // => 213 | 214 | // load_int_const slot0 = 10 215 | // push_arg slot0 216 | // call println 217 | // varkill slot0 218 | // load_int_const slot0 = 20 219 | // push_arg slot0 220 | // call println 221 | ``` 222 | 223 | We can't analyze all variables, but we can still get some benefits and perform safe optimizations without compromising the generated code correctness. 224 | -------------------------------------------------------------------------------- /content/post/riscv32-custom-instruction-and-its-simulation.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2017-06-21" 3 | title = "RISC-V: custom instruction and its simulation" 4 | tags = [ 5 | "[gcc]", 6 | "[gcc plugin]", 7 | "[risc-v]", 8 | "[tutorial]", 9 | "[hardcore]", 10 | "[compilers]", 11 | ] 12 | description = "Implementing and simulating a new RISC-V instruction with GNU toolchain" 13 | draft = false 14 | +++ 15 | 16 | ## Agenda 17 | 18 | This article shows how to add a new instruction to RISC-V and simulate it. 
19 | 20 | These topics are covered along the way: 21 | 22 | - Whole GNU `riscv` toolchain installation; 23 | - Implementation of a new instruction for `spike` RISC-V ISA simulator; 24 | - Manual instruction encoding in C/C++; 25 | - Custom instruction simulation (with visible output); 26 | - [riscv32-]GCC plugin development; 27 | 28 | You may find [associated repository](https://github.com/Quasilyte/gnu-riscv32_ext) useful. 29 | 30 | Many things can go wrong. 31 | Be prepared to fix upcoming issues by yourself. 32 | The final result is very rewarding, I promise. 33 | 34 | ## Toolchain installation 35 | 36 | Choose installation directory. Call it `RISCV`. 37 | 38 | Add these lines to your `~/.bashrc`: 39 | 40 | ```bash 41 | # Directory which will contain everything we need. 42 | export RISCV_HOME=~/riscv-home 43 | # $RISCV will point to toolchain install location. 44 | export RISCV="${RISCV_HOME}/riscv" 45 | export PATH="${PATH}:${RISCV}/bin" 46 | ``` 47 | 48 | Run `mkdir -p "${RISCV_HOME}" "${RISCV}"`. 49 | 50 | Use [1_install/2_download-repos](https://github.com/Quasilyte/gnu-riscv32_ext/blob/master/1_install/2_download-repos) script to clone all required repositories. 51 | 52 | If you wish to save some time and traffic, avoid recursive clone of 53 | toolchain repository. Instead, clone sub-modules by hand. 54 | You may exclude "riscv-glibc". 55 | 56 | > Be warned: I have not tested partial toolchain build, caveat emptor 57 | 58 | Satisfy [GNU toolchain](https://github.com/riscv/riscv-gnu-toolchain) 59 | prerequisites by installing all required packages. 60 | In addition, spike requires `device-tree-compiler` package. 61 | 62 | We choose: 63 | 64 | - RISCV32 over RISCV64 65 | - newlib over glibc 66 | 67 | Repositories must be built in this order: 68 | 69 | 1. riscv-gnu-toolchain 70 | 2. riscv-fesvr, riscv-pk 71 | 3. 
riscv-isa-sim 72 | 73 | You can use [1_install/3_build-repos](https://github.com/Quasilyte/gnu-riscv32_ext/blob/master/1_install/build-repos) 74 | script as a guideline. 75 | 76 | To check installation, use [1_install/4_check-install](https://github.com/Quasilyte/gnu-riscv32_ext/blob/master/1_install/check-install). 77 | 78 | ## Custom instruction description 79 | 80 | Within the framework of this article, we will implement the [mac](https://en.wikipedia.org/wiki/Multiply%E2%80%93accumulate_operation) instruction. 81 | 82 | `rv32im` has `mul` and `add` instructions, `mac` combines them. 83 | It is defined as `a0 := a0 + a1 * a2` (ordinary 3-address instruction). 84 | 85 | ```ruby 86 | # Without mac (preserve registers): 87 | mv t0, a0 # addi t0, a0, 0 88 | mul a1, a2, a3 89 | add a1, a1, t0 90 | # With mac: 91 | mac a1, a2, a3 92 | ``` 93 | 94 | ## Adding "mac" instruction to the rv32im 95 | 96 | To add an instruction to the simulator: 97 | 1. Describe the instruction's functional behavior; 98 | 2. Add the opcode and opcode mask to "riscv/opcodes.h"; 99 | 100 | The first step is accomplished by adding a `riscv/insns/mac.h` file: 101 | 102 | ```c++ 103 | /* file "$RISCV_HOME/riscv-isa-sim/riscv/insns/mac.h" */ 104 | // 'M' extension means we require integer mul/div standard extension. 105 | require_extension('M'); 106 | // RD = RD + RS1 * RS2 107 | reg_t tmp = sext_xlen(RS1 * RS2); 108 | WRITE_RD(sext_xlen(READ_REG(insn.rd()) + tmp)); 109 | ``` 110 | 111 | For the second step, we use [riscv-opcodes](https://github.com/riscv/riscv-opcodes). 112 | 113 | ```bash 114 | cd "${RISCV_HOME}/riscv-opcodes" 115 | echo -e "mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3\n" >> opcodes 116 | make install 117 | ``` 118 | 119 | It turns out there is a third step which is not documented. 120 | A new entry must be added to the `riscv_insn_list`.
121 | 122 | ```bash 123 | sed -i 's/riscv_insn_list = \\/riscv_insn_list = mac\\/g' \ 124 | "${RISCV_HOME}/riscv-isa-sim/riscv/riscv.mk.in" 125 | ``` 126 | 127 | Rebuild the simulator. 128 | 129 | ```bash 130 | cd "${RISCV}/riscv-isa-sim/build" 131 | sudo make install 132 | ``` 133 | 134 | ## Testing rv32im brand new instruction 135 | 136 | At this stage: 137 | 138 | - Compiler knows nothing about `mac`. It can not emit that instruction; 139 | - Assembler knows nothing about `mac`. We can not use `mac` in inline assembly; 140 | 141 | Our last resort is manual encoding. 142 | 143 | ```c 144 | #include <stdio.h> 145 | // Needed to verify results. 146 | int mac_c(int a, int b, int c) { 147 | a += b * c; // Semantically, it is "mac" 148 | return a; 149 | } 150 | // Should not be inlined, because we expect arguments 151 | // in particular registers. 152 | __attribute__((noinline)) 153 | int mac_asm(int a, int b, int c) { 154 | asm __volatile__ (".word 0x02C5856B\n"); 155 | return a; 156 | } 157 | int main(int argc, char** argv) { 158 | int a = 2, b = 3, c = 4; 159 | printf("%d =?= %d\n", mac_c(a, b, c), mac_asm(a, b, c)); 160 | } 161 | ``` 162 | 163 | Save the test program as `test_mac.c`. 164 | 165 | ```bash 166 | riscv32-unknown-elf-gcc test_mac.c -O1 -march=rv32im -o test_mac 167 | spike --isa=RV32IM "${RISCV_PK}" test_mac 168 | ``` 169 | 170 | You should see `14 =?= 14` printed to stdout. 171 | If the result differs, `riscv32-unknown-elf-gdb` can help you in troubleshooting. 172 | 173 | ## Mac encoding explained 174 | 175 | Be sure to look at [official specifications](https://riscv.org/specifications/) if 176 | you aim for precise descriptions. 177 | 178 | `mac` will mimic `mul` encoding, but use a different opcode.
179 | 180 | ```ruby 181 | # file "riscv-opcodes/opcodes" 182 | # differs 183 | # | 184 | # v 185 | mac rd rs1 rs2 31..25=1 14..12=0 6..2=0x1A 1..0=3 186 | mul rd rs1 rs2 31..25=1 14..12=0 6..2=0x0C 1..0=3 187 | # ^ ^ ^ ^ ^ ^ ^ 188 | # | | | | | | | 189 | # | | | | | | | 190 | # | | | | | | also opcode 2 bits 191 | # | | | | | opcode 5 bits 192 | # | | | | funct3 3 bits 193 | # | | | funct7 7 bits 194 | # | | rs2 (src2) 5 bits 195 | # | rs1 (src1) 5 bits 196 | # dest 5 bits 197 | ``` 198 | 199 | The actual encoding has a different order of components and the opcode is 200 | really a single 7 bit segment. 201 | 202 | > 5 bits per register operand means that we have 32 addressable registers. 203 | 204 | ```ruby 205 | # Encoding used for "mac a0, a1, a2" 206 | 0x02C5856B [base 16] 207 | == 208 | 10110001011000010101101011 [base 2] 209 | == 210 | 00000010110001011000010101101011 [base 2] 211 | # Group by related bit chunks: 212 | 0000001 01100 01011 000 01010 1101011 213 | ^ ^ ^ ^ ^ ^ 214 | | | | | | | 215 | | | | | | opcode (6..2=0x1A 1..0=3) 216 | | | | | dest (10 : a0) 217 | | | | funct3 (14..12=0) 218 | | | src1 (11 : a1) 219 | | src2 (12 : a2) 220 | funct7 (31..25=1) 221 | ``` 222 | 223 | 224 | 225 | ## Plugin vs patch 226 | 227 | There are two ways to extend GCC: 228 | 229 | 1. Patch GCC itself 230 | 2. Write a loadable plugin for GCC 231 | 232 | Prefer plugins to GCC patches whenever possible. 233 | GCC wiki ["plugins"](https://gcc.gnu.org/wiki/plugins) page describes 234 | the advantages in the "Background" section. 235 | 236 | In this guide, both methods will be covered.
237 | 238 | Useful links: 239 | 240 | - [Simple GCC plugin](http://thinkingeek.com/2015/08/16/a-simple-plugin-for-gcc-part-1/) series of posts 241 | - [GCC plugins manual](https://gcc.gnu.org/onlinedocs/gccint/Plugins.html#Plugins) 242 | 243 | ## GCC "rv32imMac" plugin 244 | 245 | **TODO** 246 | 247 | ## GIMPLE "gmac" statement 248 | 249 | **TODO** 250 | 251 | ## The pleasure of intrinsics 252 | 253 | **TODO** 254 | 255 | ## Compiling "mac" without intrinsic 256 | 257 | **TODO** 258 | -------------------------------------------------------------------------------- /content/post/ruleguard-modules.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Mon Dec 21 00:54:25 MSK 2020" 3 | title = "ruleguard rules package management" 4 | tags = [ 5 | "[go]", 6 | "[shortread]", 7 | "[ruleguard]", 8 | "[static-analysis]", 9 | ] 10 | description = "A quick intro into the ruleguard rules packaging." 11 | draft = false 12 | +++ 13 | 14 | **Bundles** is a new feature coming to the [ruleguard](https://github.com/quasilyte/go-ruleguard). It'll make it possible to re-use third-party rules without having to copy/paste them. 15 | 16 | ## Creating an importable bundle 17 | 18 | A package that exports rules must define a [Bundle](https://godoc.org/github.com/quasilyte/go-ruleguard/dsl#Bundle) object: 19 | 20 | ```go 21 | package gorules 22 | 23 | import "github.com/quasilyte/go-ruleguard/dsl" 24 | 25 | // Bundle holds the rules package metadata. 26 | // 27 | // In order to be importable from other gorules package, 28 | // a package must define a Bundle variable. 29 | var Bundle = dsl.Bundle{} 30 | 31 | func boolComparison(m dsl.Matcher) { 32 | m.Match(`$x == true`, 33 | `$x != true`, 34 | `$x == false`, 35 | `$x != false`). 36 | Report(`omit bool literal in expression`) 37 | } 38 | ``` 39 | 40 | That package should be a separate [Go module](https://github.com/golang/go/wiki/Modules). A rules bundle is versioned by its Go module. 
41 | 42 | It's possible to have several ruleguard files inside one Go module. Only one file should define a Bundle object. During a bundle import, all files will be exported. 43 | 44 | > The metadata object is called a `Bundle` to avoid confusion with Go packages and Go modules. It's useful to have a dedicated word for them. 45 | 46 | ## Importing a bundle 47 | 48 | To use an external rule set: 49 | 50 | 1. Import the containing package 51 | 2. In `init()` function, use its **Bundle** variable in [ImportRules()](https://godoc.org/github.com/quasilyte/go-ruleguard/dsl#ImportRules) call 52 | 53 | ```go 54 | package gorules 55 | 56 | import ( 57 | "github.com/quasilyte/go-ruleguard/dsl" 58 | quasilyterules "github.com/quasilyte/ruleguard-rules-test" 59 | ) 60 | 61 | func init() { 62 | // Imported rules will have a "qrules" prefix. 63 | dsl.ImportRules("qrules", quasilyterules.Bundle) 64 | } 65 | 66 | // Then you can define your own rules. 67 | 68 | func emptyStringTest(m dsl.Matcher) { 69 | m.Match(`len($s) == 0`). 70 | Where(m["s"].Type.Is("string")). 71 | Report(`maybe use $s == "" instead?`) 72 | 73 | m.Match(`len($s) != 0`). 74 | Where(m["s"].Type.Is("string")). 75 | Report(`maybe use $s != "" instead?`) 76 | } 77 | ``` 78 | 79 | Now all you need is to install the imported [github.com/quasilyte/ruleguard-rules-test](https://github.com/quasilyte/ruleguard-rules-test) package. Since bundles are Go modules, it's as simple as installing any other Go module: 80 | 81 | ```bash 82 | go get -v github.com/quasilyte/ruleguard-rules-test 83 | ``` 84 | 85 | It's possible to use an empty (`""`) prefix, but you'll risk getting a name collision. If you don't define your own rules, then it's perfectly 86 | fine to use an empty prefix. 87 | 88 | All ruleguard packages are named `gorules`, so you'll need to assign a local package name. In the example above, we used `quasilyterules` name. 
89 | 90 | ## Running the ruleguard 91 | 92 | If you installed the bundle, you should be able to run your main rules file normally: 93 | 94 | ```bash 95 | $ ruleguard -rules rules.go test.go 96 | test.go:4:6: emptyStringTest: maybe use s == "" instead? (rules.go:13) 97 | test.go:5:6: qrules/boolComparison: omit bool literal in expression (rules1.go:8) 98 | ``` 99 | 100 | Using ruleguard from the [go-critic](https://github.com/go-critic/go-critic) or [golangci-lint](https://github.com/golangci/golangci-lint) stays the same. As long as bundles are installed and they can be located by the `go list $package_path`, everything should work fine. 101 | 102 |
103 | 104 | Limitations: 105 | 106 | * Imported packages can't import other bundle packages (could be addressed later) 107 | * Bundles are tied to Go modules; they might not work properly without them 108 | -------------------------------------------------------------------------------- /content/post/single-exit.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Tue Jun 15 00:48:18 MSK 2021" 3 | title = "A single point of exit" 4 | tags = [ 5 | "[shortread]", 6 | ] 7 | description = "A better way to write your `main()` function." 8 | draft = false 9 | +++ 10 | 11 | > There are other similar articles, like [Why you shouldn't use func main in Go](https://pace.dev/blog/2020/02/12/why-you-shouldnt-use-func-main-in-golang-by-mat-ryer.html). This post addresses the issue from a slightly different angle. 12 | 13 | `tl;dr`: Your program should probably have only **one** [os.Exit()](https://golang.org/pkg/os/#Exit) call, if any. 14 | 15 | That includes all indirect calls: [log.Fatal()](https://golang.org/pkg/log/#Fatal) and any other function that calls `os.Exit()` at some point. 16 | 17 | If your main looks like this, then this article is for you: 18 | 19 | ```go 20 | func main() { 21 | x, err := doSomething() 22 | if err != nil { 23 | log.Fatalf("failed to do something: %+v", err) 24 | } 25 | defer x.Close() // Deferred only after x is known to be valid 26 | y, err := doSomethingElse(x) 27 | if err != nil { 28 | log.Fatalf("failed to do something else: %+v", err) 29 | } 30 | // ... and so on 31 | } 32 | ``` 33 | 34 | What's the problem here? It calls `log.Fatal()` several times. 35 | 36 | Why is that a problem? 37 | 38 | * Do you see a deferred `x.Close()` call? If `doSomethingElse()` fails, the `log.Fatalf()` will be executed. That will lead to the `os.Exit()` quitting the program without executing any deferred calls. 39 | 40 | * It's hard to refactor that code.
If you keep the code as is and move it to a separate function, you'll end up with a function that can exit your program. 41 | 42 | * It's even worse if you have `log.Fatal()` calls somewhere below the execution tree. For example, if `doSomethingElse` can exit on its own, we may not have a chance to log an error inside our main function. This makes the program flow more complicated than it could be. 43 | 44 | Good news: you can fix these problems with one simple trick. Adhere to the single exit point idiom. 45 | 46 | ```go 47 | func main() { 48 | if err := mainNoExit(); err != nil { 49 | log.Fatalf("error: %+v", err) 50 | } 51 | } 52 | 53 | func mainNoExit() error { 54 | x, err := doSomething() 55 | if err != nil { 56 | return fmt.Errorf("failed to do something: %+v", err) 57 | } 58 | defer x.Close() // Deferred only after x is known to be valid 59 | y, err := doSomethingElse(x) 60 | if err != nil { 61 | return fmt.Errorf("failed to do something else: %+v", err) 62 | } 63 | // ... and so on 64 | } 65 | ``` 66 | 67 | You can name that `mainNoExit()` function any way you like. Here are some other options: 68 | 69 | * `mainImpl()` 70 | * `appMain()` 71 | * move it to another package and call it `otherpkg.Main()` 72 | 73 | As a bonus, you get a function (mainNoExit) that is far easier to test than the original main. 74 | 75 | If your program needs to exit with different exit codes, consider this: 76 | 77 | ```go 78 | func main() { 79 | if err, exitCode := mainNoExit(); err != nil { 80 | log.Printf("error: %+v", err) 81 | os.Exit(exitCode) 82 | } 83 | } 84 | 85 | // Note: mainNoExit returns 2 values now. 86 | func mainNoExit() (error, int) { 87 | x, err := doSomething() 88 | if err != nil { 89 | return fmt.Errorf("failed to do something: %+v", err), 1 90 | } 91 | defer x.Close() // Deferred only after x is known to be valid 92 | y, err := doSomethingElse(x) 93 | if err != nil { 94 | return fmt.Errorf("failed to do something else: %+v", err), 1 95 | } 96 | // ...
and so on 97 | } 98 | ``` 99 | 100 | If you're using some CLI framework, it can still be possible to decompose the logic a little bit and avoid spreading the baddies across your code. 101 | 102 | Let's suppose that we're using the [github.com/cespare/subcmd](https://github.com/cespare/subcmd) package. The signature for a subcommand is `func ([]string)`. 103 | 104 | We need a wrapper that would provide us the interface we want. It could be a manual function wrapping, a wrapper framework, or a function factory. Choose your poison. 105 | 106 | I'll use a manual function wrapping here. 107 | 108 | ```go 109 | func main() { 110 | log.SetFlags(0) 111 | 112 | cmds := []subcmd.Command{ 113 | { 114 | Name: "bench", 115 | Description: "run benchmark tests", 116 | Do: benchMain, 117 | }, 118 | 119 | // ... and so on 120 | } 121 | 122 | subcmd.Run(cmds) 123 | } 124 | 125 | func benchMain(args []string) { 126 | if err := cmdBench(args); err != nil { 127 | log.Fatalf("bench: error: %v", err) 128 | } 129 | } 130 | 131 | func cmdBench(args []string) error { 132 | // Actual implementation... 133 | } 134 | ``` 135 | 136 | The [go-critic](https://github.com/go-critic/go-critic) static analyzer can detect some "exit after defer" cases. There is an issue, [go-critic#issue1022](https://github.com/go-critic/go-critic/issues/1022), that raises the topic we're discussing here. 137 | 138 | Long story short, using the single exit pattern can help you to avoid some confusing edge cases that make the static analyzers go crazy. 139 | 140 | Let me re-iterate why having a single point of exit is a good thing: 141 | 142 | * It leads to a better code structure. Easier to decompose and move the code around. 143 | 144 | * Your main package may suddenly become easier to test. 145 | 146 | * The program flow becomes simpler. 147 | 148 | * Static analyzers will thank you. 149 | 150 | * Fewer `log.Fatal()` calls, which are [bad](https://quasilyte.dev/blog/post/log-fatal-vs-log-panic/).
151 | -------------------------------------------------------------------------------- /content/post/step-pattern.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Wed Feb 27 22:15:32 MSK 2019" 3 | title = "Step driven evaluation" 4 | tags = [ 5 | "[shortread]", 6 | "[go]", 7 | ] 8 | description = "A pattern for writing multi-step programs." 9 | draft = false 10 | +++ 11 | 12 | If you've heard about [table driven tests](https://github.com/golang/go/wiki/TableDrivenTests), the idea described in this article will be easier to grasp, since it's the same technique, but used outside of the tests. 13 | 14 | Suppose you have a function that executes a lot of other functions. This function probably does two main things: 15 | 16 | 1. It checks for all returned errors as they occur. 17 | 2. It passes one function's outputs as the inputs for another. 18 | 19 | ```go 20 | // queryFile is an example pipeline-like function. 21 | func queryFile(filename, queryText string) (string, error) { 22 | data, err := readData(filename) 23 | if err != nil { 24 | return "", errors.Errorf("read data: %v", err) 25 | } 26 | rows, err := splitData(data) 27 | if err != nil { 28 | return "", errors.Errorf("split data: %v", err) 29 | } 30 | q, err := compileQuery(queryText) 31 | if err != nil { 32 | return "", errors.Errorf("compile query: %v", err) 33 | } 34 | rows, err = filterRows(rows, q) 35 | if err != nil { 36 | return "", errors.Errorf("filter rows: %v", err) 37 | } 38 | result, err := rowsToString(rows) 39 | if err != nil { 40 | return "", errors.Errorf("rows to string: %v", err) 41 | } 42 | return result, nil 43 | } 44 | ``` 45 | 46 | This function consists of 5 steps. Five relevant calls, to be precise. Everything else is a distraction. The order of those calls matters: it's a sequence, the algorithm. 47 | 48 | Let's re-write the code above using the step driven evaluation.
49 | 50 | ```go 51 | func queryFile(filename, queryText string) (string, error) { 52 | var ctx queryFileContext 53 | steps := []struct { 54 | name string 55 | fn func() error 56 | }{ 57 | {"read data", ctx.readData}, 58 | {"split data", ctx.splitData}, 59 | {"compile query", ctx.compileQuery}, 60 | {"filter rows", ctx.filterRows}, 61 | {"rows to string", ctx.rowsToString}, 62 | } 63 | for _, step := range steps { 64 | if err := step.fn(); err != nil { 65 | return "", errors.Errorf("%s: %v", step.name, err) 66 | } 67 | } 68 | return ctx.result, nil 69 | } 70 | ``` 71 | 72 | The pipeline is now explicit; it's easier to reorder steps and to insert or remove them. It is also trivial to add debug logging inside that loop: you need only one new statement, as opposed to `N` statements, one near every function call. 73 | 74 | This approach shines with 4+ steps, when the benefits outweigh the cost of introducing a new type like `queryFileContext`. 75 | 76 | ```go 77 | // queryFileContext might look like the struct below. 78 | 79 | type queryFileContext struct { 80 | data []byte 81 | rows []row 82 | q *query 83 | result string 84 | } 85 | ``` 86 | 87 | Methods like `queryFileContext.splitData` just call the same function while updating the `ctx` object state. 88 | 89 | ```go 90 | func (ctx *queryFileContext) splitData() error { 91 | var err error 92 | ctx.rows, err = splitData(ctx.data) 93 | return err 94 | } 95 | ``` 96 | 97 | This pattern works particularly well for `main` functions. 
98 | 99 | ```go 100 | func main() { 101 | ctx := &context{} 102 | 103 | steps := []struct { 104 | name string 105 | fn func() error 106 | }{ 107 | {"parse flags", ctx.parseFlags}, 108 | {"read schema", ctx.readSchema}, 109 | {"dump schema", ctx.dumpSchema}, // Before transformations 110 | {"remove builtin constructors", ctx.removeBuiltinConstructors}, 111 | {"add adhoc constructors", ctx.addAdhocConstructors}, 112 | {"validate schema", ctx.validateSchema}, 113 | {"decompose arrays", ctx.decomposeArrays}, 114 | {"replace arrays", ctx.replaceArrays}, 115 | {"resolve generics", ctx.resolveGenerics}, 116 | {"dump schema", ctx.dumpSchema}, // After transformations 117 | {"decode combinators", ctx.decodeCombinators}, 118 | {"dump decoded combinators", ctx.dumpDecodedCombinators}, 119 | {"codegen", ctx.codegen}, 120 | } 121 | 122 | for _, step := range steps { 123 | ctx.debugf("start %s step", step.name) 124 | if err := step.fn(); err != nil { 125 | log.Fatalf("%s: %v", step.name, err) 126 | } 127 | } 128 | } 129 | ``` 130 | 131 | An additional benefit is the ease of testing. Even though we use `log.Fatalf`, [which is a bad thing](https://quasilyte.github.io/blog/post/log-fatal-vs-log-panic/), it's trivial to re-create this pipeline inside a test and run a set of steps that fail the test instead of calling `os.Exit`. 132 | 133 | You can also omit some CLI-related steps inside tests, like `"dump schema"` or `"codegen"`, and inject test-specific steps into that list. 134 | 135 | There are a few drawbacks, as always: 136 | 137 | 1. You need to introduce a new type and probably a few methods for it. 138 | 2. It's not always straightforward to figure out an appropriate context object layout that 139 | satisfies the needs of the entire pipeline without getting overly complex. 140 | 141 | Try using it, maybe you'll like it. 
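
The testing idea described above can be sketched as follows. This is a minimal, hypothetical example: `step`, `runSteps`, `pipelineContext` and its two methods are made-up names rather than code from a real project, and it uses the standard `fmt.Errorf` where the earlier snippets use `errors.Errorf`. The loop returns an error instead of calling `log.Fatalf`, so a test can run any subset of steps and inspect the outcome.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one named stage of a pipeline.
type step struct {
	name string
	fn   func() error
}

// runSteps executes steps in order and stops at the first failure,
// returning an error prefixed with the failing step's name.
func runSteps(steps []step) error {
	for _, s := range steps {
		if err := s.fn(); err != nil {
			return fmt.Errorf("%s: %w", s.name, err)
		}
	}
	return nil
}

// pipelineContext is a hypothetical, heavily reduced context object.
type pipelineContext struct {
	schema string
	valid  bool
}

// readSchema stands in for a step that loads the input.
func (ctx *pipelineContext) readSchema() error {
	ctx.schema = "type a = int"
	return nil
}

// validateSchema fails if no schema was loaded by an earlier step.
func (ctx *pipelineContext) validateSchema() error {
	if ctx.schema == "" {
		return errors.New("empty schema")
	}
	ctx.valid = true
	return nil
}
```

In a test, you would build a shorter step list (for example, one without `"codegen"`) and call `t.Fatal(err)` when `runSteps` returns a non-nil error; `main` can keep a single `log.Fatalf` around the same call.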
142 | -------------------------------------------------------------------------------- /content/post/yaml5.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "Thu Sep 10 22:01:24 MSK 2020" 3 | title = "YAML is your JSON5" 4 | tags = [ 5 | "[yaml5]", 6 | "[yaml]", 7 | "[json5]", 8 | "[shortread]", 9 | ] 10 | description = "Let me tell you about the YAML5 idea." 11 | draft = false 12 | +++ 13 | 14 | TL;DR: 15 | 16 | * Write YAML as [JSON5](https://json5.org/) (but use `#` for comments) 17 | * Enforce this style with [yaml5 lint](https://github.com/quasilyte/yaml5) 18 | 19 | ```js 20 | # JSON5? YAML? 21 | {a: 1, b: ['x', 'y']} 22 | ``` 23 | 24 | What you see above is a valid [YAML 1.2](https://yaml.org/spec/1.2/spec.html) document.
25 | It uses the flow-style syntax for objects and arrays. 26 | 27 | If you look at the [JSON5](https://json5.org/) feature list, you can deduce that YAML is a superset of JSON5.
28 | The only difference is the single-line comment syntax. 29 | 30 | Yes, we can [go and re-write](https://github.com/go-critic/go-critic/pull/966) all YAML files in a JSON5 style. 31 | 32 | [YAML5](https://github.com/quasilyte/yaml5) is the "JSON with comments and trailing commas" that some of us were waiting for. Just take YAML and write it like JSON5; the `yaml5 lint` tool can enforce our promise of not using any features outside of the JSON5 subset: 33 | 34 | ```bash 35 | $ cat bad.yaml 36 | foo: 37 | - a 38 | - key: val 39 | 40 | $ yaml5 lint bad.yaml 41 | bad.yaml:1:4: used a key-value outside of an object 42 | bad.yaml:2:3: use a flow array syntax [] instead 43 | bad.yaml:2:5: unquoted strings are not allowed 44 | bad.yaml:3:8: used a key-value outside of an object 45 | bad.yaml:3:10: infinity value should not be used 46 | ``` 47 | 48 | I'm planning to implement a `yaml5 fmt` tool that would pretty-print a YAML document as a YAML5 document. 49 | -------------------------------------------------------------------------------- /hugo_hints.txt: -------------------------------------------------------------------------------- 1 | run server: 2 | hugo server --theme=hugo-steam-theme --buildDrafts 3 | 4 | build site: 5 | hugo --theme=hugo-steam-theme 6 | -------------------------------------------------------------------------------- /layouts/_default/single.html: -------------------------------------------------------------------------------- 1 | {{ define "main" }} 2 |
3 |
4 |

{{ .Title }}

5 | 10 |
11 | 12 | 13 | {{ replace .TableOfContents "" "
" | safeHTML }} 14 | 15 |
16 | {{ .Content }} 17 |
18 | 19 | {{ if .Type | eq "post" }} 20 | 29 | 30 | {{ template "_internal/disqus.html" . }} 31 | {{ partial "share" . }} 32 | 33 |
34 | {{ partial "author" . }} 35 |
36 | {{ end }} 37 |
38 | {{ end }} -------------------------------------------------------------------------------- /layouts/_default/summary.html: -------------------------------------------------------------------------------- 1 |
2 |
3 |

{{ .Title }}

4 |
5 |
6 | {{ if and (not .Description) (.Site.Params.useSummaryIfNoDescription) }} 7 |

{{ .Summary }}…

8 | {{ else }} 9 | {{ .Description | markdownify }} 10 | {{ end }} 11 |
12 | 17 |
-------------------------------------------------------------------------------- /layouts/_default/terms.html: -------------------------------------------------------------------------------- 1 | {{ define "main" }} 2 |
3 |
    4 | {{ $data := .Data }} 5 | {{ range $key, $value := .Data.Terms.ByCount }} 6 | {{ $url := printf "%s/%s" $data.Plural ($value.Name | urlize) }} 7 |
  • {{ $value.Name |title }} {{ $value.Count }}
  • 8 | {{ end }} 9 |
10 |
11 | {{ end }} -------------------------------------------------------------------------------- /layouts/partials/head.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | {{ .Title }}{{ if not .IsHome }} · {{ .Site.Title }}{{ end }} 8 | {{ with .Site.Params.name }}{{ end }} 9 | {{ with .Site.Params.description }}{{ end }} 10 | 11 | {{ .Hugo.Generator }} 12 | 13 | 14 | 15 | 16 | 17 | {{ "" | safeHTML }} 18 | {{ if .RSSLink }} 19 | 21 | 23 | {{ end }} 24 | 25 | 26 | 27 | 28 | {{ "" | safeHTML }} 29 | 30 | 31 | {{ range .Site.Params.customCSS }} 32 | 33 | {{ end }} 34 | 35 | {{ with .Site.Params.favicon }} 36 | 37 | 38 | {{ end }} 39 | 40 | {{ "" | safeHTML }} 41 | {{ partial (printf "themes/%s-theme" .Site.Params.themecolor) . }} 42 | 43 | -------------------------------------------------------------------------------- /layouts/partials/js.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | {{ range .Site.Params.customJS }} 6 | 7 | {{ end }} 8 | 9 | 17 | 18 | 19 | {{ template "_internal/google_analytics.html" . }} -------------------------------------------------------------------------------- /layouts/partials/navigation.html: -------------------------------------------------------------------------------- 1 | {{ if .Site.Menus.main }} 2 | 9 | {{ end }} -------------------------------------------------------------------------------- /scripts/deploy: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if [ -z "${BLOG_DEPLOY_REPO}" ]; then 4 | echo '[ERROR] $BLOG_DEPLOY_REPO is undefined' 5 | exit 1 6 | fi 7 | 8 | echo ' ... building site' && 9 | hugo --theme=hugo-steam-theme && 10 | echo ' ... copying files' && 11 | cp -a 'public/.' 
"${BLOG_DEPLOY_REPO}" && 12 | echo '[OK] done' -------------------------------------------------------------------------------- /scripts/run_server: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | hugo server --theme=hugo-steam-theme --buildDrafts -------------------------------------------------------------------------------- /static/css/normalize.css: -------------------------------------------------------------------------------- 1 | /*! normalize.css v6.0.0 | MIT License | github.com/necolas/normalize.css */ 2 | 3 | /* Document 4 | ========================================================================== */ 5 | 6 | /** 7 | * 1. Correct the line height in all browsers. 8 | * 2. Prevent adjustments of font size after orientation changes in 9 | * IE on Windows Phone and in iOS. 10 | */ 11 | 12 | html { 13 | line-height: 1.15; /* 1 */ 14 | -ms-text-size-adjust: 100%; /* 2 */ 15 | -webkit-text-size-adjust: 100%; /* 2 */ 16 | } 17 | 18 | /* Sections 19 | ========================================================================== */ 20 | 21 | /** 22 | * Add the correct display in IE 9-. 23 | */ 24 | 25 | article, 26 | aside, 27 | footer, 28 | header, 29 | nav, 30 | section { 31 | display: block; 32 | } 33 | 34 | /** 35 | * Correct the font size and margin on `h1` elements within `section` and 36 | * `article` contexts in Chrome, Firefox, and Safari. 37 | */ 38 | 39 | h1 { 40 | font-size: 2em; 41 | margin: 0.67em 0; 42 | } 43 | 44 | /* Grouping content 45 | ========================================================================== */ 46 | 47 | /** 48 | * Add the correct display in IE 9-. 49 | * 1. Add the correct display in IE. 50 | */ 51 | 52 | figcaption, 53 | figure, 54 | main { /* 1 */ 55 | display: block; 56 | } 57 | 58 | /** 59 | * Add the correct margin in IE 8. 60 | */ 61 | 62 | figure { 63 | margin: 1em 40px; 64 | } 65 | 66 | /** 67 | * 1. Add the correct box sizing in Firefox. 68 | * 2. 
Show the overflow in Edge and IE. 69 | */ 70 | 71 | hr { 72 | box-sizing: content-box; /* 1 */ 73 | height: 0; /* 1 */ 74 | overflow: visible; /* 2 */ 75 | } 76 | 77 | /** 78 | * 1. Correct the inheritance and scaling of font size in all browsers. 79 | * 2. Correct the odd `em` font sizing in all browsers. 80 | */ 81 | 82 | pre { 83 | font-family: monospace, monospace; /* 1 */ 84 | font-size: 1em; /* 2 */ 85 | } 86 | 87 | /* Text-level semantics 88 | ========================================================================== */ 89 | 90 | /** 91 | * 1. Remove the gray background on active links in IE 10. 92 | * 2. Remove gaps in links underline in iOS 8+ and Safari 8+. 93 | */ 94 | 95 | a { 96 | background-color: transparent; /* 1 */ 97 | -webkit-text-decoration-skip: objects; /* 2 */ 98 | } 99 | 100 | /** 101 | * 1. Remove the bottom border in Chrome 57- and Firefox 39-. 102 | * 2. Add the correct text decoration in Chrome, Edge, IE, Opera, and Safari. 103 | */ 104 | 105 | abbr[title] { 106 | border-bottom: none; /* 1 */ 107 | text-decoration: underline; /* 2 */ 108 | text-decoration: underline dotted; /* 2 */ 109 | } 110 | 111 | /** 112 | * Prevent the duplicate application of `bolder` by the next rule in Safari 6. 113 | */ 114 | 115 | b, 116 | strong { 117 | font-weight: inherit; 118 | } 119 | 120 | /** 121 | * Add the correct font weight in Chrome, Edge, and Safari. 122 | */ 123 | 124 | b, 125 | strong { 126 | font-weight: bolder; 127 | } 128 | 129 | /** 130 | * 1. Correct the inheritance and scaling of font size in all browsers. 131 | * 2. Correct the odd `em` font sizing in all browsers. 132 | */ 133 | 134 | code, 135 | kbd, 136 | samp { 137 | font-family: monospace, monospace; /* 1 */ 138 | font-size: 1em; /* 2 */ 139 | } 140 | 141 | /** 142 | * Add the correct font style in Android 4.3-. 143 | */ 144 | 145 | dfn { 146 | font-style: italic; 147 | } 148 | 149 | /** 150 | * Add the correct background and color in IE 9-. 
151 | */ 152 | 153 | mark { 154 | background-color: #ff0; 155 | color: #000; 156 | } 157 | 158 | /** 159 | * Add the correct font size in all browsers. 160 | */ 161 | 162 | small { 163 | font-size: 80%; 164 | } 165 | 166 | /** 167 | * Prevent `sub` and `sup` elements from affecting the line height in 168 | * all browsers. 169 | */ 170 | 171 | sub, 172 | sup { 173 | font-size: 75%; 174 | line-height: 0; 175 | position: relative; 176 | vertical-align: baseline; 177 | } 178 | 179 | sub { 180 | bottom: -0.25em; 181 | } 182 | 183 | sup { 184 | top: -0.5em; 185 | } 186 | 187 | /* Embedded content 188 | ========================================================================== */ 189 | 190 | /** 191 | * Add the correct display in IE 9-. 192 | */ 193 | 194 | audio, 195 | video { 196 | display: inline-block; 197 | } 198 | 199 | /** 200 | * Add the correct display in iOS 4-7. 201 | */ 202 | 203 | audio:not([controls]) { 204 | display: none; 205 | height: 0; 206 | } 207 | 208 | /** 209 | * Remove the border on images inside links in IE 10-. 210 | */ 211 | 212 | img { 213 | border-style: none; 214 | } 215 | 216 | /** 217 | * Hide the overflow in IE. 218 | */ 219 | 220 | svg:not(:root) { 221 | overflow: hidden; 222 | } 223 | 224 | /* Forms 225 | ========================================================================== */ 226 | 227 | /** 228 | * Remove the margin in Firefox and Safari. 229 | */ 230 | 231 | button, 232 | input, 233 | optgroup, 234 | select, 235 | textarea { 236 | margin: 0; 237 | } 238 | 239 | /** 240 | * Show the overflow in IE. 241 | * 1. Show the overflow in Edge. 242 | */ 243 | 244 | button, 245 | input { /* 1 */ 246 | overflow: visible; 247 | } 248 | 249 | /** 250 | * Remove the inheritance of text transform in Edge, Firefox, and IE. 251 | * 1. Remove the inheritance of text transform in Firefox. 252 | */ 253 | 254 | button, 255 | select { /* 1 */ 256 | text-transform: none; 257 | } 258 | 259 | /** 260 | * 1. 
Prevent a WebKit bug where (2) destroys native `audio` and `video` 261 | * controls in Android 4. 262 | * 2. Correct the inability to style clickable types in iOS and Safari. 263 | */ 264 | 265 | button, 266 | html [type="button"], /* 1 */ 267 | [type="reset"], 268 | [type="submit"] { 269 | -webkit-appearance: button; /* 2 */ 270 | } 271 | 272 | /** 273 | * Remove the inner border and padding in Firefox. 274 | */ 275 | 276 | button::-moz-focus-inner, 277 | [type="button"]::-moz-focus-inner, 278 | [type="reset"]::-moz-focus-inner, 279 | [type="submit"]::-moz-focus-inner { 280 | border-style: none; 281 | padding: 0; 282 | } 283 | 284 | /** 285 | * Restore the focus styles unset by the previous rule. 286 | */ 287 | 288 | button:-moz-focusring, 289 | [type="button"]:-moz-focusring, 290 | [type="reset"]:-moz-focusring, 291 | [type="submit"]:-moz-focusring { 292 | outline: 1px dotted ButtonText; 293 | } 294 | 295 | /** 296 | * 1. Correct the text wrapping in Edge and IE. 297 | * 2. Correct the color inheritance from `fieldset` elements in IE. 298 | * 3. Remove the padding so developers are not caught out when they zero out 299 | * `fieldset` elements in all browsers. 300 | */ 301 | 302 | legend { 303 | box-sizing: border-box; /* 1 */ 304 | color: inherit; /* 2 */ 305 | display: table; /* 1 */ 306 | max-width: 100%; /* 1 */ 307 | padding: 0; /* 3 */ 308 | white-space: normal; /* 1 */ 309 | } 310 | 311 | /** 312 | * 1. Add the correct display in IE 9-. 313 | * 2. Add the correct vertical alignment in Chrome, Firefox, and Opera. 314 | */ 315 | 316 | progress { 317 | display: inline-block; /* 1 */ 318 | vertical-align: baseline; /* 2 */ 319 | } 320 | 321 | /** 322 | * Remove the default vertical scrollbar in IE. 323 | */ 324 | 325 | textarea { 326 | overflow: auto; 327 | } 328 | 329 | /** 330 | * 1. Add the correct box sizing in IE 10-. 331 | * 2. Remove the padding in IE 10-. 
332 | */ 333 | 334 | [type="checkbox"], 335 | [type="radio"] { 336 | box-sizing: border-box; /* 1 */ 337 | padding: 0; /* 2 */ 338 | } 339 | 340 | /** 341 | * Correct the cursor style of increment and decrement buttons in Chrome. 342 | */ 343 | 344 | [type="number"]::-webkit-inner-spin-button, 345 | [type="number"]::-webkit-outer-spin-button { 346 | height: auto; 347 | } 348 | 349 | /** 350 | * 1. Correct the odd appearance in Chrome and Safari. 351 | * 2. Correct the outline style in Safari. 352 | */ 353 | 354 | [type="search"] { 355 | -webkit-appearance: textfield; /* 1 */ 356 | outline-offset: -2px; /* 2 */ 357 | } 358 | 359 | /** 360 | * Remove the inner padding and cancel buttons in Chrome and Safari on macOS. 361 | */ 362 | 363 | [type="search"]::-webkit-search-cancel-button, 364 | [type="search"]::-webkit-search-decoration { 365 | -webkit-appearance: none; 366 | } 367 | 368 | /** 369 | * 1. Correct the inability to style clickable types in iOS and Safari. 370 | * 2. Change font properties to `inherit` in Safari. 371 | */ 372 | 373 | ::-webkit-file-upload-button { 374 | -webkit-appearance: button; /* 1 */ 375 | font: inherit; /* 2 */ 376 | } 377 | 378 | /* Interactive 379 | ========================================================================== */ 380 | 381 | /* 382 | * Add the correct display in IE 9-. 383 | * 1. Add the correct display in Edge, IE, and Firefox. 384 | */ 385 | 386 | details, /* 1 */ 387 | menu { 388 | display: block; 389 | } 390 | 391 | /* 392 | * Add the correct display in all browsers. 393 | */ 394 | 395 | summary { 396 | display: list-item; 397 | } 398 | 399 | /* Scripting 400 | ========================================================================== */ 401 | 402 | /** 403 | * Add the correct display in IE 9-. 404 | */ 405 | 406 | canvas { 407 | display: inline-block; 408 | } 409 | 410 | /** 411 | * Add the correct display in IE. 
412 | */ 413 | 414 | template { 415 | display: none; 416 | } 417 | 418 | /* Hidden 419 | ========================================================================== */ 420 | 421 | /** 422 | * Add the correct display in IE 10-. 423 | */ 424 | 425 | [hidden] { 426 | display: none; 427 | } 428 | -------------------------------------------------------------------------------- /static/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/favicon.ico -------------------------------------------------------------------------------- /static/favicon_old.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/favicon_old.ico -------------------------------------------------------------------------------- /static/files/go_x86_aliases.txt: -------------------------------------------------------------------------------- 1 | JA => JHI 2 | JAE => JCC 3 | JB => JCS 4 | JBE => JLS 5 | JC => JCS 6 | JCC => JCC 7 | JCS => JCS 8 | JE => JEQ 9 | JEQ => JEQ 10 | JG => JGT 11 | JGE => JGE 12 | JGT => JGT 13 | JHI => JHI 14 | JHS => JCC 15 | JL => JLT 16 | JLE => JLE 17 | JLO => JCS 18 | JLS => JLS 19 | JLT => JLT 20 | JMI => JMI 21 | JNA => JLS 22 | JNAE => JCS 23 | JNB => JCC 24 | JNBE => JHI 25 | JNC => JCC 26 | JNE => JNE 27 | JNG => JLE 28 | JNGE => JLT 29 | JNL => JGE 30 | JNLE => JGT 31 | JNO => JOC 32 | JNP => JPC 33 | JNS => JPL 34 | JNZ => JNE 35 | JO => JOS 36 | JOC => JOC 37 | JOS => JOS 38 | JP => JPS 39 | JPC => JPC 40 | JPE => JPS 41 | JPL => JPL 42 | JPO => JPC 43 | JPS => JPS 44 | JS => JMI 45 | JZ => JEQ 46 | MASKMOVDQU => MASKMOVOU 47 | MOVD => MOVQ 48 | MOVDQ2Q => MOVQ 49 | MOVNTDQ => MOVNTO 50 | MOVOA => MOVO 51 | PSLLDQ => PSLLO 52 | PSRLDQ => PSRLO 53 | PADDD => PADDL 54 | 
-------------------------------------------------------------------------------- /static/fonts/FontAwesome.otf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/fonts/FontAwesome.otf -------------------------------------------------------------------------------- /static/fonts/fontawesome-webfont.eot: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/fonts/fontawesome-webfont.eot -------------------------------------------------------------------------------- /static/fonts/fontawesome-webfont.ttf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/fonts/fontawesome-webfont.ttf -------------------------------------------------------------------------------- /static/fonts/fontawesome-webfont.woff: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/fonts/fontawesome-webfont.woff -------------------------------------------------------------------------------- /static/fonts/fontawesome-webfont.woff2: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/fonts/fontawesome-webfont.woff2 -------------------------------------------------------------------------------- /static/hljs-themes/hybrid.css: -------------------------------------------------------------------------------- 1 | /* 2 | 3 | vim-hybrid theme by w0ng (https://github.com/w0ng/vim-hybrid) 4 | 5 | */ 6 | 7 | /*background color*/ 8 | .hljs { 9 | display: block; 10 | overflow-x: auto; 11 | 
padding: 0.5em; 12 | background: #1d1f21; 13 | } 14 | 15 | /*selection color*/ 16 | .hljs::selection, 17 | .hljs span::selection { 18 | background: #373b41; 19 | } 20 | 21 | .hljs::-moz-selection, 22 | .hljs span::-moz-selection { 23 | background: #373b41; 24 | } 25 | 26 | /*foreground color*/ 27 | .hljs { 28 | color: #c5c8c6; 29 | } 30 | 31 | /*color: fg_yellow*/ 32 | .hljs-title, 33 | .hljs-name { 34 | color: #f0c674; 35 | } 36 | 37 | /*color: fg_comment*/ 38 | .hljs-comment, 39 | .hljs-meta, 40 | .hljs-meta .hljs-keyword { 41 | color: #707880; 42 | } 43 | 44 | /*color: fg_red*/ 45 | .hljs-number, 46 | .hljs-symbol, 47 | .hljs-literal, 48 | .hljs-deletion, 49 | .hljs-link { 50 | color: #cc6666 51 | } 52 | 53 | /*color: fg_green*/ 54 | .hljs-string, 55 | .hljs-doctag, 56 | .hljs-addition, 57 | .hljs-regexp, 58 | .hljs-selector-attr, 59 | .hljs-selector-pseudo { 60 | color: #b5bd68; 61 | } 62 | 63 | /*color: fg_purple*/ 64 | .hljs-attribute, 65 | .hljs-code, 66 | .hljs-selector-id { 67 | color: #b294bb; 68 | } 69 | 70 | /*color: fg_blue*/ 71 | .hljs-keyword, 72 | .hljs-selector-tag, 73 | .hljs-bullet, 74 | .hljs-tag { 75 | color: #81a2be; 76 | } 77 | 78 | /*color: fg_aqua*/ 79 | .hljs-subst, 80 | .hljs-variable, 81 | .hljs-template-tag, 82 | .hljs-template-variable { 83 | color: #8abeb7; 84 | } 85 | 86 | /*color: fg_orange*/ 87 | .hljs-type, 88 | .hljs-built_in, 89 | .hljs-builtin-name, 90 | .hljs-quote, 91 | .hljs-section, 92 | .hljs-selector-class { 93 | color: #de935f; 94 | } 95 | 96 | .hljs-emphasis { 97 | font-style: italic; 98 | } 99 | 100 | .hljs-strong { 101 | font-weight: bold; 102 | } 103 | -------------------------------------------------------------------------------- /static/hljs-themes/wombat.css: -------------------------------------------------------------------------------- 1 | /*background color*/ 2 | .hljs { 3 | display: block; 4 | overflow-x: auto; 5 | padding: 0.5em; 6 | background: #1d1f21; 7 | } 8 | 9 | .hljs::selection, 10 | .hljs 
span::selection { 11 | background: #373b41; 12 | } 13 | 14 | .hljs::-moz-selection, 15 | .hljs span::-moz-selection { 16 | background: #373b41; 17 | } 18 | 19 | .hljs { 20 | color: #c5c8c6; 21 | } 22 | 23 | .hljs-title, 24 | .hljs-name { 25 | color: #cae682 ; 26 | } 27 | 28 | .hljs-comment, 29 | .hljs-meta, 30 | .hljs-meta .hljs-keyword { 31 | color: #707880; 32 | } 33 | 34 | .hljs-number, 35 | .hljs-symbol, 36 | .hljs-deletion, 37 | .hljs-link { 38 | color: #cc6666 39 | } 40 | 41 | .hljs-string { 42 | color: #95e454; 43 | } 44 | 45 | .hljs-doctag, 46 | .hljs-addition, 47 | .hljs-regexp, 48 | .hljs-selector-attr, 49 | .hljs-selector-pseudo { 50 | color: #b5bd68; 51 | } 52 | 53 | .hljs-attribute, 54 | .hljs-code, 55 | .hljs-selector-id { 56 | color: #b294bb; 57 | } 58 | 59 | .hljs-keyword, 60 | .hljs-selector-tag, 61 | .hljs-bullet, 62 | .hljs-tag { 63 | color: #8ac6f2; 64 | font-weight: bold; 65 | } 66 | 67 | .hljs-subst, 68 | .hljs-variable, 69 | .hljs-template-tag { 70 | color: #8abeb7; 71 | } 72 | 73 | .hljs-built_in, 74 | .hljs-literal { 75 | color: #e5786d; 76 | } 77 | 78 | .hljs-template-variable, 79 | .hljs-type, 80 | 81 | .hljs-quote, 82 | .hljs-section, 83 | .hljs-selector-class { 84 | color: #92a65e; 85 | font-weight: bold; 86 | } 87 | 88 | .hljs-emphasis { 89 | font-style: italic; 90 | } 91 | 92 | .hljs-strong { 93 | font-weight: bold; 94 | } 95 | -------------------------------------------------------------------------------- /static/img/avatar.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/avatar.jpg -------------------------------------------------------------------------------- /static/img/genmap1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/genmap1.png 
-------------------------------------------------------------------------------- /static/img/genmap2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/genmap2.png -------------------------------------------------------------------------------- /static/img/genmap3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/genmap3.png -------------------------------------------------------------------------------- /static/img/genmap4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/genmap4.png -------------------------------------------------------------------------------- /static/img/github_watch.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/github_watch.png -------------------------------------------------------------------------------- /static/img/jit_call1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/jit_call1.png -------------------------------------------------------------------------------- /static/img/jit_call2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/jit_call2.png -------------------------------------------------------------------------------- /static/img/pathing_bithack.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_bithack.png -------------------------------------------------------------------------------- /static/img/pathing_comparison.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_comparison.png -------------------------------------------------------------------------------- /static/img/pathing_deltas.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_deltas.png -------------------------------------------------------------------------------- /static/img/pathing_map.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_map.png -------------------------------------------------------------------------------- /static/img/pathing_mathbits.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_mathbits.png -------------------------------------------------------------------------------- /static/img/pathing_pathmem.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_pathmem.png -------------------------------------------------------------------------------- /static/img/pathing_stonks.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/pathing_stonks.png -------------------------------------------------------------------------------- /static/img/reg_table.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/reg_table.png -------------------------------------------------------------------------------- /static/img/zeroalloc.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/img/zeroalloc.png -------------------------------------------------------------------------------- /static/style.css: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/lc19997/blogs/2d7ecee37db001765ab9de78ea880940b758d717/static/style.css -------------------------------------------------------------------------------- /themes/hugo-steam-theme/CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | ## 8th April 2017 4 | 5 | `.Now` will be deprecated with Hugo v0.20. Hence the required minimum version of Hugo is v0.20. 6 | 7 | ## 27th November 2016 8 | 9 | - `favicon` allows you to link a custom favicon by adding a path relative to the `static` folder 10 | - with `customCSS` and `customJS` you can link your own stylesheets and scripts. The files have to be link relative to `static` as well. 11 | 12 | ## 26th November 2016 13 | 14 | Some of the new features of Hugo v0.17 were now introduced in this theme. Since some changes are not backwards compatible you have to update to Hugo v0.17 or newer versions. 15 | 16 | - Steam now uses the Google Analytics template that is shipped with Hugo. 
Just move the `googleAnaltics` variable outside the `params` block. Have a look at the [example config file](https://github.com/digitalcraftsman/hugo-steam-theme/blob/master/exampleSite/config.toml). 17 | - The support for Google Plus comments in now deprecated. 18 | - Formerly, it was only possible to show pages of `type` post on the homepage. Now, all types of pages are shown. You can hide single pages by adding `hide = true` to the (TOML) frontmatter. 19 | - External dependencies like Highlight.js have been updated to the latest version. 20 | -------------------------------------------------------------------------------- /themes/hugo-steam-theme/LICENSE.md: -------------------------------------------------------------------------------- 1 | Copyright (c) 2015 Digitalcraftsman - Released under The MIT License. 2 | 3 | Permission is hereby granted, free of charge, to any person 4 | obtaining a copy of this software and associated documentation 5 | files (the "Software"), to deal in the Software without 6 | restriction, including without limitation the rights to use, 7 | copy, modify, merge, publish, distribute, sublicense, and/or sell 8 | copies of the Software, and to permit persons to whom the 9 | Software is furnished to do so, subject to the following 10 | conditions: 11 | 12 | The above copyright notice and this permission notice shall be 13 | included in all copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 16 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES 17 | OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 18 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT 19 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, 20 | WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 21 | FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR 22 | OTHER DEALINGS IN THE SOFTWARE. 
-------------------------------------------------------------------------------- /themes/hugo-steam-theme/README.md: -------------------------------------------------------------------------------- 1 | # Steam 2 | 3 | Steam is a minimal and customizable theme for bloggers and was developed by [Tommaso Barbato](//github.com/epistrephein). He created it as a slightly adapted version of the [Vapor](//github.com/sethlilly/Vapor) Ghost theme by [Seth Lilly](//github.com/sethlilly). Noteworthy features of this Hugo port are the integration of a comment-system powered by Disqus, the customizable appearance by changing theme colors, support for RSS feeds, syntax highlighting via Highlight.js for source code and the optional use of Google Analytics. Enough to read. Let's take the first steps to get started. 4 | 5 | #### Please note that this theme is no longer maintained. 6 | 7 | ![Screenshot](https://raw.githubusercontent.com/digitalcraftsman/hugo-steam-theme/dev/images/screenshot.png) 8 | 9 | 10 | ## Installation 11 | 12 | Inside the folder of your Hugo site run: 13 | 14 | $ cd themes 15 | $ git clone https://github.com/digitalcraftsman/hugo-steam-theme.git 16 | 17 | For more information read the official [setup guide](//gohugo.io/overview/installing/) of Hugo. 18 | 19 | ### The config file 20 | 21 | Take a look inside the [`exampleSite`](//github.com/digitalcraftsman/hugo-steam-theme/blob/dev/exampleSite/) folder of this theme. You'll find a file called [`config.toml`](//github.com/digitalcraftsman/hugo-steam-theme/blob/dev/exampleSite/config.toml). 22 | 23 | To use it, copy the [`config.toml`](//github.com/digitalcraftsman/hugo-steam-theme/blob/dev/exampleSite/config.toml) in the root folder of your Hugo site. Feel free to change strings as you like to customize your website. 
24 | 25 | ## Add links to the navigation 26 | 27 | You can add custom pages like this by adding `menu = "main"` in the frontmatter: 28 | 29 | ```toml 30 | +++ 31 | date = "2015-08-22" 32 | title = "About me" 33 | menu = "main" 34 | +++ 35 | ``` 36 | 37 | If no document contains menu = "main" in the frontmatter than the navigation will not be shown 38 | 39 | 40 | ## Customize theme colors 41 | 42 | This theme features four different theme colors (green as default, blue, red and orange) that change the appearance of you Hugo site slightly. Just set the `themeColor` variable to the color you like. 43 | 44 | Furthermore you can create your own theme. Under [`layouts/partials/themes`](//github.com/digitalcraftsman/hugo-steam-theme/tree/dev/layouts/partials/themes) you'll find a stylesheet template called [`custom-theme.html`](//github.com/digitalcraftsman/hugo-steam-theme/blob/dev/layouts/partials/themes/custom-theme.html). Customize the colors as you like and save the new theme with the schema `-theme.html` within the same folder. As you can see, the color is the prefix of the stylesheet template. Therefore you just need to set `themeColor` in the [`configs`](//github.com/digitalcraftsman/hugo-steam-theme/blob/dev/exampleSite/config.toml)) to that self-defined prefix. 45 | 46 | ## Comments 47 | 48 | This theme features a comment system powered by Disqus. To enable it you have to add your Disqus shortname to the `disqusShortname` variable in the config file. 49 | 50 | ## Nearly finished 51 | 52 | In order to see your site in action, run Hugo's built-in local server. 53 | 54 | $ hugo server 55 | 56 | Now enter [`localhost:1313`](http://localhost:1313) in the address bar of your browser. 
57 | 58 | ## Changelog 59 | 60 | You can find the latest changes and improvements of this theme in the [CHANGELOG.md](https://github.com/digitalcraftsman/hugo-steam-theme/blob/master/CHANGELOG.md) 61 | 62 | 63 | ## Contributing 64 | 65 | Did you find a bug or have an idea for a new feature? Feel free to use the [issue tracker](//github.com/digitalcraftsman/hugo-steam-theme/issues) to let me know, or open a [pull request](//github.com/digitalcraftsman/hugo-steam-theme/pulls) directly. 66 | 67 | 68 | ## License 69 | 70 | This theme is released under the MIT license. For more information read the [License](//github.com/digitalcraftsman/hugo-steam-theme/blob/master/LICENSE.md). 71 | 72 | 73 | ## Annotations 74 | 75 | Thanks to 76 | 77 | - [Steve Francia](//github.com/spf13) for creating Hugo and the awesome community around the project. 78 | - [Seth Lilly](//github.com/sethlilly) and [Tommaso Barbato](//github.com/epistrephein) for developing the original version(s) of this theme 79 | -------------------------------------------------------------------------------- /themes/hugo-steam-theme/archetypes/default.md: -------------------------------------------------------------------------------- 1 | +++ 2 | 3 | +++ 4 | 5 | -------------------------------------------------------------------------------- /themes/hugo-steam-theme/exampleSite/.gitignore: -------------------------------------------------------------------------------- 1 | public/ 2 | themes -------------------------------------------------------------------------------- /themes/hugo-steam-theme/exampleSite/config.toml: -------------------------------------------------------------------------------- 1 | baseurl = "https://example.org/" 2 | languageCode = "en-us" 3 | title = "Steam - a minimal theme for Hugo" 4 | theme = "hugo-steam-theme" 5 | disqusShortname = "spf13" 6 | # Enable Google Analytics by inserting your tracking code 7 | googleAnalytics = "" 8 | # Number of posts per page 9 | paginate = 10 10 | 11 | 
[params] 12 | title = "Steam" 13 | subtitle = "a minimal theme for ~~Ghost~~ Hugo" 14 | copyright = "Released under the MIT license." 15 | 16 | # You can choose between green, orange, red and blue. 17 | themecolor = "green" 18 | 19 | # Link custom assets relative to /static 20 | favicon = "favicon.ico" 21 | customCSS = [] 22 | customJS = [] 23 | 24 | # To provide some metadata for search engines and the about section in the footer 25 | # feel free to add a few information about you and your website. 26 | name = "John Doe" 27 | bio = "programmer - blogger - coffee aficionado" 28 | description = "Your description of the blog" 29 | 30 | # Link your social networks (optional) 31 | location = "" 32 | twitter = "spf13" 33 | linkedin = "" 34 | googleplus = "" 35 | facebook = "" 36 | instagram = "" 37 | github = "spf13" 38 | gitlab = "" 39 | bitbucket = "" 40 | 41 | # Customize or translate the strings 42 | keepReadingStr = "Keep reading" 43 | backtotopStr = "Back to top" 44 | shareStr = "Share" 45 | pageNotFoundTitle = "404 - Page not found" 46 | -------------------------------------------------------------------------------- /themes/hugo-steam-theme/exampleSite/content/about.md: -------------------------------------------------------------------------------- 1 | +++ 2 | date = "2015-08-22" 3 | title = "Link custom pages" 4 | menu = "main" 5 | url = "about/" 6 | hide = "true" 7 | +++ 8 | 9 | You can add custom pages like this by adding `menu = "main"` in the frontmatter: 10 | 11 | ```toml 12 | +++ 13 | date = "2015-08-22" 14 | title = "About me" 15 | menu = "main" 16 | url = "about/" 17 | +++ 18 | ``` 19 | 20 | This site is just a usual document. Create a new file, e.g. `about.md` in the `content` content directory. The `url` variable in the frontmatter allows you to define the final url of the about page. 
-------------------------------------------------------------------------------- /themes/hugo-steam-theme/exampleSite/content/post/goisforlovers.md: -------------------------------------------------------------------------------- 1 | +++ 2 | title = "(Hu)go Template Primer" 3 | tags = [ 4 | "go", 5 | "golang", 6 | "templates", 7 | "themes", 8 | "development", 9 | ] 10 | date = "2014-04-02" 11 | categories = [ 12 | "Development", 13 | "golang", 14 | ] 15 | description = "Lorem ipsum dolor sit amet, consectetur adipisicing elit. Earum similique, ipsum officia amet blanditiis provident ratione nihil ipsam dolorem repellat." 16 | +++ 17 | 18 | Hugo uses the excellent [go][] [html/template][gohtmltemplate] library for 19 | its template engine. It is an extremely lightweight engine that provides a very 20 | small amount of logic. In our experience that it is just the right amount of 21 | logic to be able to create a good static website. If you have used other 22 | template systems from different languages or frameworks you will find a lot of 23 | similarities in go templates. 24 | 25 | This document is a brief primer on using go templates. The [go docs][gohtmltemplate] 26 | provide more details. 27 | 28 | ## Introduction to Go Templates 29 | 30 | Go templates provide an extremely simple template language. It adheres to the 31 | belief that only the most basic of logic belongs in the template or view layer. 32 | One consequence of this simplicity is that go templates parse very quickly. 33 | 34 | A unique characteristic of go templates is they are content aware. Variables and 35 | content will be sanitized depending on the context of where they are used. More 36 | details can be found in the [go docs][gohtmltemplate]. 37 | 38 | ## Basic Syntax 39 | 40 | Go lang templates are html files with the addition of variables and 41 | functions. 
42 | 43 | **Go variables and functions are accessible within {{ }}** 44 | 45 | Accessing a predefined variable "foo": 46 | 47 | {{ foo }} 48 | 49 | **Parameters are separated using spaces** 50 | 51 | Calling the add function with input of 1, 2: 52 | 53 | {{ add 1 2 }} 54 | 55 | **Methods and fields are accessed via dot notation** 56 | 57 | Accessing the Page Parameter "bar" 58 | 59 | {{ .Params.bar }} 60 | 61 | **Parentheses can be used to group items together** 62 | 63 | {{ if or (isset .Params "alt") (isset .Params "caption") }} Caption {{ end }} 64 | 65 | 66 | ## Variables 67 | 68 | Each go template has a struct (object) made available to it. In hugo each 69 | template is passed either a page or a node struct depending on which type of 70 | page you are rendering. More details are available on the 71 | [variables](/layout/variables) page. 72 | 73 | A variable is accessed by referencing the variable name. 74 | 75 | {{ .Title }} 76 | 77 | Variables can also be defined and referenced. 78 | 79 | {{ $address := "123 Main St."}} 80 | {{ $address }} 81 | 82 | 83 | ## Functions 84 | 85 | Go template ship with a few functions which provide basic functionality. The go 86 | template system also provides a mechanism for applications to extend the 87 | available functions with their own. [Hugo template 88 | functions](/layout/functions) provide some additional functionality we believe 89 | are useful for building websites. Functions are called by using their name 90 | followed by the required parameters separated by spaces. Template 91 | functions cannot be added without recompiling hugo. 92 | 93 | **Example:** 94 | 95 | {{ add 1 2 }} 96 | 97 | ## Includes 98 | 99 | When including another template you will pass to it the data it will be 100 | able to access. To pass along the current context please remember to 101 | include a trailing dot. The templates location will always be starting at 102 | the /layout/ directory within Hugo. 
103 | 104 | **Example:** 105 | 106 | {{ template "chrome/header.html" . }} 107 | 108 | 109 | ## Logic 110 | 111 | Go templates provide the most basic iteration and conditional logic. 112 | 113 | ### Iteration 114 | 115 | Just like in Go, Go templates make heavy use of range to iterate over 116 | a map, array or slice. The following are different examples of how to use 117 | range. 118 | 119 | **Example 1: Using Context** 120 | 121 | {{ range array }} 122 | {{ . }} 123 | {{ end }} 124 | 125 | **Example 2: Declaring value variable name** 126 | 127 | {{range $element := array}} 128 | {{ $element }} 129 | {{ end }} 130 | 131 | **Example 3: Declaring key and value variable names** 132 | 133 | {{range $index, $element := array}} 134 | {{ $index }} 135 | {{ $element }} 136 | {{ end }} 137 | 138 | ### Conditionals 139 | 140 | `if`, `else`, `with`, `or`, and `and` provide the framework for handling conditional 141 | logic in Go Templates. Like range, each statement is closed with `end`. 142 | 143 | 144 | Go Templates treat the following values as false: 145 | 146 | * false 147 | * 0 148 | * any array, slice, map, or string of length zero 149 | 150 | **Example 1: If** 151 | 152 | {{ if isset .Params "title" }}

{{ index .Params "title" }}

{{ end }} 153 | 154 | **Example 2: If -> Else** 155 | 156 | {{ if isset .Params "alt" }} 157 | {{ index .Params "alt" }} 158 | {{else}} 159 | {{ index .Params "caption" }} 160 | {{ end }} 161 | 162 | **Example 3: And & Or** 163 | 164 | {{ if and (or (isset .Params "title") (isset .Params "caption")) (isset .Params "attr")}} 165 | 166 | **Example 4: With** 167 | 168 | An alternative way of writing "if" and then referencing the same value 169 | is to use "with" instead. With rebinds the context `.` within its scope, 170 | and skips the block if the variable is absent. 171 | 172 | The first example above could be simplified as: 173 | 174 | {{ with .Params.title }}

{{ . }}

{{ end }} 175 | 176 | **Example 5: If -> Else If** 177 | 178 | {{ if isset .Params "alt" }} 179 | {{ index .Params "alt" }} 180 | {{ else if isset .Params "caption" }} 181 | {{ index .Params "caption" }} 182 | {{ end }} 183 | 184 | ## Pipes 185 | 186 | One of the most powerful components of go templates is the ability to 187 | stack actions one after another. This is done by using pipes. Borrowed 188 | from unix pipes, the concept is simple: each pipeline's output becomes the 189 | input of the following pipe. 190 | 191 | Because of the very simple syntax of go templates, the pipe is essential 192 | to being able to chain together function calls. One limitation of the 193 | pipes is that they can only work with a single value and that value 194 | becomes the last parameter of the next pipeline. 195 | 196 | A few simple examples should help convey how to use the pipe. 197 | 198 | **Example 1 :** 199 | 200 | {{ if eq 1 1 }} Same {{ end }} 201 | 202 | is the same as 203 | 204 | {{ eq 1 1 | if }} Same {{ end }} 205 | 206 | It does look odd to place the if at the end, but it does provide a good 207 | illustration of how to use the pipes. 208 | 209 | **Example 2 :** 210 | 211 | {{ index .Params "disqus_url" | html }} 212 | 213 | Access the page parameter called "disqus_url" and escape the HTML. 214 | 215 | **Example 3 :** 216 | 217 | {{ if or (or (isset .Params "title") (isset .Params "caption")) (isset .Params "attr")}} 218 | Stuff Here 219 | {{ end }} 220 | 221 | Could be rewritten as 222 | 223 | {{ isset .Params "caption" | or isset .Params "title" | or isset .Params "attr" | if }} 224 | Stuff Here 225 | {{ end }} 226 | 227 | 228 | ## Context (a.k.a. the dot) 229 | 230 | The most easily overlooked concept to understand about go templates is that {{ . }} 231 | always refers to the current context. In the top level of your template this 232 | will be the data set made available to it. Inside an iteration it will have 233 | the value of the current item.
Inside a loop the context changes: `.` 234 | will no longer refer to the data available to the entire page. If you need to 235 | access that data from within the loop, you will likely want to assign it to a variable 236 | instead of depending on the context. 237 | 238 | **Example:** 239 | 240 | {{ $title := .Site.Title }} 241 | {{ range .Params.tags }} 242 |
  • {{ . }} - {{ $title }}
  • 243 | {{ end }} 244 | 245 | Notice how once we have entered the loop the value of {{ . }} has changed. We 246 | have defined a variable outside of the loop so we have access to it from within 247 | the loop. 248 | 249 | # Hugo Parameters 250 | 251 | Hugo provides the option of passing values to the template language 252 | through the site configuration (for sitewide values), or through the meta 253 | data of each specific piece of content. You can define any values of any 254 | type (supported by your front matter/config format) and use them however 255 | you want to inside of your templates. 256 | 257 | 258 | ## Using Content (page) Parameters 259 | 260 | In each piece of content you can provide variables to be used by the 261 | templates. This happens in the [front matter](/content/front-matter). 262 | 263 | An example of this is used in this documentation site. Most of the pages 264 | benefit from having the table of contents provided. Sometimes the TOC just 265 | doesn't make a lot of sense. We've defined a variable in our front matter 266 | of some pages to turn off the TOC from being displayed. 267 | 268 | Here is the example front matter: 269 | 270 | ``` 271 | --- 272 | title: "Permalinks" 273 | date: "2013-11-18" 274 | aliases: 275 | - "/doc/permalinks/" 276 | groups: ["extras"] 277 | groups_weight: 30 278 | notoc: true 279 | --- 280 | ``` 281 | 282 | Here is the corresponding code inside of the template: 283 | 284 | {{ if not .Params.notoc }} 285 |
    286 | {{ .TableOfContents }} 287 |
    288 | {{ end }} 289 | 290 | 291 | 292 | ## Using Site (config) Parameters 293 | In your top-level configuration file (eg, `config.yaml`) you can define site 294 | parameters, which are values which will be available to you in chrome. 295 | 296 | For instance, you might declare: 297 | 298 | ```yaml 299 | params: 300 | CopyrightHTML: "Copyright © 2013 John Doe. All Rights Reserved." 301 | TwitterUser: "spf13" 302 | SidebarRecentLimit: 5 303 | ``` 304 | 305 | Within a footer layout, you might then declare a `