├── .gitignore
├── ChangeLog
├── Makefile
├── README
├── erlfmt
└── erlfmt.escript


/.gitignore:
--------------------------------------------------------------------------------
*.dump
*.beam


--------------------------------------------------------------------------------
/ChangeLog:
--------------------------------------------------------------------------------
2016-08-28 Mark Bucciarelli

	* erlfmt: Fail if stdin is missing a -module() statement.
	OTP-19 has some such files in its source tree; for example,
	./lib/hipe/amd64/hipe_amd64_liveness.erl.

	* erlfmt: Fail if the source file has utf-8 encoding. See
	https://bugs.erlang.org/browse/ERL-234.

2016-06-07 Mark Bucciarelli

	* erlfmt: Use mktemp idiom that works on GNU/Linux as well as OSX.

2016-05-22 Mark Bucciarelli

	* erlfmt: Convert tabs to spaces in prettypr output.

2016-05-10 Mark Bucciarelli

	* README: Minor wordsmithing.

2016-05-09 Mark Bucciarelli

	* README: Add long backstory. Major change in approach---create a
	wrapper around erl_tidy instead of trying to reinvent the wheel.

	* erlfmt.escript: Set keep_unused option to true so eunit tests are
	not deleted.

	* erlfmt: Assume erlfmt.escript is on path.

	* Makefile: New


2016-05-03 Mark Bucciarelli

	* erlfmt.erl: Deal with lines that have comments after the dot. For
	example, -define(TIMEOUT, 20). % milliseconds

2016-05-02 Mark Bucciarelli

	* erlfmt.erl: Refactor two functions into one function with guards.

	* Makefile, erlfmt.erl: erlfmt erlfmt.erl

	* erlfmt.erl: Trim trailing whitespace after dot before tokenizing.
	It's common for people to leave trailing space at the end of a line
	and it is safe to zap it.

	* erlfmt.erl: deal with preprocessor juju.

2016-05-01 Mark Bucciarelli

	* erlfmt.erl: erl_parse doesn't understand pre-processor juju.

	* erlfmt.erl: On parse error, dump tokens and error message.

	* README: Add README.

	* Makefile, erlfmt: Add install target that installs to $HOME/bin.
	Update erlfmt shell script to add $HOME/bin to path.

	* erlfmt.erl: Add license.

	* .gitignore, erlfmt: Add shell script.

	* erlfmt.erl: Output any comments and whitespace before form.

	* erlfmt.erl: Abort if we find a dot not at line ending.

	* erlfmt.erl: Give user a clue if no output.

	* erlfmt.erl: improve comment, no logic change.

	* Works but drops comments.

--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
install: ${HOME}/bin/erlfmt ${HOME}/bin/erlfmt.escript
${HOME}/bin/erlfmt: erlfmt
	cp -i -a erlfmt ${HOME}/bin/erlfmt
	chmod +x ${HOME}/bin/erlfmt

${HOME}/bin/erlfmt.escript: erlfmt.escript
	cp -i -a erlfmt.escript ${HOME}/bin/erlfmt.escript
	chmod +x ${HOME}/bin/erlfmt.escript

--------------------------------------------------------------------------------
/README:
--------------------------------------------------------------------------------
Read Erlang source on stdin, reformat with erl_tidy, and write to stdout.
No options, you get what erl_prettypr:format/2 gives you.


Back story


The problem seemed simple: read Erlang source code on stdin and send a
nicely formatted result to stdout. In vim, I got used to this when I was
learning Go:

    :%!gofmt

Just type the code without any concern for formatting, and then format the
entire file in one command. I now have xmlfmt, jsfmt, javafmt and I
wanted a similar utility for Erlang.

I started with erl_tidy, which

    Tidies and pretty-prints Erlang source code, removing
    unused functions, updating obsolete constructs and
    function calls, etc.

Seems perfect ... except that erl_tidy does not read from stdin. I posted
a question to the Erlang questions mailing list asking if I missed
something and the answer was a clear no. People pointed me to alternative
approaches: an Emacs elisp module, a rebar3 module that wraps erl_tidy,
and a state-machine/parser written in Erlang that the vim Erlang module
uses. The first two didn't solve my problem, and I didn't like the
complexity of the third approach.

First attempt: use basic Erlang modules

While Googling around, I found a short post showing how you can format
Erlang code using erl_scan (string -> tokens), erl_parse (tokens -> form)
and erl_pp (form -> string). So I started coding that up. It worked
great. Initially.

The first problem I hit was that erl_parse:parse_form/1 does not handle
white space or comment tokens. OK, no problem. To keep moving forward I
implemented the quick hack of only dropping comments that came inside a
form, and keeping the ones that came before. Not great, but I wanted to
get something working.

The next problem I hit was pre-processor constructs. It turns out that
erl_parse does not understand pre-processor bits either (macros, imports,
etc.). So, another hack: when we hit a dot-terminated token sequence that
includes a pre-processor construct, just print out the raw text that
generated those tokens and leave it at that.

When that was done, I decided dropping comments was not acceptable. While
searching around for how to re-insert comments, I came across a reference
to the epp_dodger module and a function that re-inserts comments. Both
are used by erl_tidy.
It seemed stupid to re-write erl_tidy so I
went back to square one.

Second attempt: teach erl_tidy about stdin

Erlang can read stdin just fine. In fact, most of the input functions in
the io module read from stdin by default. Erlang provides the standard_io
atom that you can use for an IODevice argument. I started hacking on
erl_tidy, intending to use the "special" file name of a single dash (-) to
tell erl_tidy to read from stdin.

It was simple to add a new read_module("-", Opts) and pass standard_io to
epp_dodger:parse/3. But the next chunk of erl_tidy logic reads comments
from the same file again, which is not possible with stdin, since the
stream was already consumed by parsing.

I briefly looked to see if I could somehow turn a string into an IODevice
(like you can in Java), but nothing turned up. So, it seems like I have
to write stdin to a file in order to use erl_tidy.

Third attempt: call erl_tidy:file/1 from a shell script.

It was trivial to write a shell script that pipes stdin to a file.
Calling erl_tidy from the shell script was not. My first attempt looked
like this:

    $ erl -run erl_tidy file $TMPFN

which produced:

    =ERROR REPORT==== 9-May-2016::20:13:30 ===
    erl_comment_scan: bad filename: `['hmmmm_sup.erl']'

and hung there, waiting in the Erlang interpreter.

It turns out that when you use the -run flag and pass it arguments, Erlang
assumes the receiving function has one argument---a list. That's why the
filename in the error message has brackets around it ... it's a list, not
a string.

Final attempt: call escript from a shell script

The final result is what you see in the repository now. It's so easy to
pipe stdin to a file in a shell script that I saw no reason to re-write
that in Erlang.
So I added a short escript that unpacks the file name
from the list and passes that to erl_tidy.

The result: I was able to erlfmt all 1,795 source files under
/usr/local/Cellar/erlang/18.2.1, with only one error:

    /usr/local/Cellar/erlang/18.2.1/lib/erlang/lib/wx-1.6/src/gen/gl.erl
    ./erlfmt: line 6: 25399 User defined signal 2: 31 ./erlfmt.escript "$TMPF"

That file is a generated one, and is 971KB in size. I got the above error
when I had erl_tidy write the reformatted file to the temporary file.
When instead I told erl_tidy to write the result to stdout, I got this
error:

    escript: exception exit: badarg
      in function erl_tidy:file/2 (erl_tidy.erl, line 295)
      in call from erl_eval:local_func/6 (erl_eval.erl, line 557)
      in call from escript:interpret/4 (escript.erl, line 787)
      in call from escript:start/1 (escript.erl, line 277)
      in call from init:start_it/1
      in call from init:start_em/1

It successfully parsed the other 1,794 files, which gives a success rate of
0.99944. Not too hot in the Erlang world, but I don't expect to be
formatting files that big. Good enough.

The back story back story.

Why such a long README? The Erlang questions mailing list recently had an
interesting thread on code documentation ("rhetorical structure of code").
It was a long thread and a lot of things were said, but one in particular
stuck with me: code documentation rarely shows the starts and stops, the
litany of failed experiments that you encounter on the way to the final
product. And that those failed experiments are sometimes useful in the
future.

--------------------------------------------------------------------------------
/erlfmt:
--------------------------------------------------------------------------------
#! /bin/sh -e
# Run erl_tidy on stdin and send result to stdout.

#
# OSX and GNU/Linux mktemp behave differently with the -t option.
#
#   mktemp -t mkb
#       OSX      : /var/folders/cy/4988f14n4r35fp39wrt1pfq80000gn/T/mkb.UgVBxAPP
#       GNU/Linux: mktemp: too few X's in template ‘mkb’
#
#   mktemp -t mkb.XXXXXX
#       OSX      : /var/folders/cy/4988f14n4r35fp39wrt1pfq80000gn/T/mkb.XXXXXX.UgVBxAPP
#       GNU/Linux: /tmp/mkb.pRykUg
#
# So, "manually" add the TMPDIR prefix.

D=${TMPDIR:-.}
TMPF=$(mktemp "$D/$(basename "$0").XXXXXX")
# Remove the temporary file on every exit path, including the early exits below.
trap 'rm -f "$TMPF"' EXIT

cat - > "$TMPF"

# See https://bugs.erlang.org/browse/ERL-234
if grep 'coding: *utf-8' "$TMPF" > /dev/null ; then
	printf "erlfmt: utf-8 encoding\n" >&2
	exit 1
fi

# Some erl files in OTP-19 don't have module statements; for example,
# ./lib/hipe/amd64/hipe_amd64_liveness.erl
# Skip these too.
if ! grep '^-module' "$TMPF" > /dev/null ; then
	printf "erlfmt: no -module() line.\n" >&2
	exit 1
fi


#
# Convert the tabs output by erl_prettypr to eight spaces.
#
# On OSX, I was getting the error
#
#       sed: RE error: illegal byte sequence
#
# Per StackOverflow, this error means that the byte-sequence in the file does
# not use the same encoding as the shell. In my specific case, I had some
# eunit test cases that included text with some encoded characters.
#
# The fix of setting LC_ALL=C in-line is fine since the only character we care
# about matching is the tab character.
#
#       However, (effectively) setting LC_CTYPE to C treats strings as if each
#       byte were its own character (no interpretation based on encoding rules
#       is performed), with no regard for the - multibyte-on-demand - UTF-8
#       encoding that OS X employs by default, where foreign characters have
#       multibyte encodings.
#
# http://stackoverflow.com/questions/19242275/re-error-illegal-byte-sequence-on-mac-os-x
#

# printf '\t' yields a literal tab, which every sed understands.
erlfmt.escript "$TMPF" | LC_ALL=C sed "s/$(printf '\t')/        /g"
rm "$TMPF"

--------------------------------------------------------------------------------
/erlfmt.escript:
--------------------------------------------------------------------------------
#!/usr/bin/env escript
%%! -shutdown_time 1000
main([Name]) ->
    erl_tidy:file(Name, [{stdout, true}, {keep_unused, true}]);
main(_) ->
    usage().

usage() ->
    io:put_chars("usage: erlfmt.escript \n"),
    halt(1).

--------------------------------------------------------------------------------