├── .gitignore
├── ChangeLog
├── Makefile
├── README
├── erlfmt
└── erlfmt.escript


/.gitignore:
--------------------------------------------------------------------------------
1 | *.dump
2 | *.beam
3 | 


--------------------------------------------------------------------------------
/ChangeLog:
--------------------------------------------------------------------------------
 1 | 2016-08-28  Mark Bucciarelli <mkbucc@gmail.com>
 2 | 
 3 | 	* erlfmt: Fail if stdin is missing a -module() statement.
 4 | 	  OTP-19 has some such files in its source tree; for example,
 5 | 	  ./lib/hipe/amd64/hipe_amd64_liveness.erl.
 6 | 
 7 | 	* erlfmt: Fail if the source file has utf-8 encoding.  See
 8 | 	  https://bugs.erlang.org/browse/ERL-234.
 9 | 
10 | 2016-06-07  Mark Bucciarelli <mkbucc@gmail.com>
11 | 
12 | 	* erlfmt: Use mktemp idiom that works on GNU/Linux as well as OSX.
13 | 
14 | 2016-05-22  Mark Bucciarelli <mkbucc@gmail.com>
15 | 
16 | 	* erlfmt: Convert tabs to spaces in prettypr output.
17 | 
18 | 2016-05-10  Mark Bucciarelli <mkbucc@gmail.com>
19 | 
20 | 	* README: Minor word smithing.
21 | 
22 | 2016-05-09  Mark Bucciarelli <mkbucc@gmail.com>
23 | 
24 | 	* README: Add long backstory.  Major change in approach---create a
25 | 	  wrapper around erl_tidy instead of trying to recreate the wheel.
26 | 
27 | 	* erlfmt.escript: Set keep_unused option to true so eunit tests are
28 | 	  not deleted.
29 | 
30 | 	* erlfmt: Assume erlfmt.escript is on path.
31 | 
32 | 	* Makefile: New
33 | 
34 | 
35 | 2016-05-03  Mark Bucciarelli <mkbucc@gmail.com>
36 | 
37 | 	* erlfmt.erl: Deal with lines that had comments after the dot.  For
38 | 	  example, -define(TIMEOUT, 20). % milliseconds
39 | 
40 | 2016-05-02  Mark Bucciarelli <mkbucc@gmail.com>
41 | 
42 | 	* erlfmt.erl: Refactor two functions into one function with guards.
43 | 
44 | 	* Makefile, erlfmt.erl: erlfmt erlfmt.erl
45 | 
46 | 	* erlfmt.erl: Trim trailing whitespace after dot before tokenizing.
47 | 	  It's common for people to leave trailing space at the end of a line
48 | 	  and it is safe to zap it.
49 | 
50 | 	* erlfmt.erl: deal with preprocessor juju.
51 | 
52 | 2016-05-01  Mark Bucciarelli <mkbucc@gmail.com>
53 | 
54 | 	* erlfmt.erl: erl_parse doesn't understand pre-processor juju.
55 | 
56 | 	* erlfmt.erl: On parse error, dump tokens and error message.
57 | 
58 | 	* README: Add README.
59 | 
60 | 	* Makefile, erlfmt: Add install target that installs to $HOME/bin.
61 | 	  Update erlfmt shell script to add $HOME/bin to path.
62 | 
63 | 	* erlfmt.erl: Add license.
64 | 
65 | 	* .gitignore, erlfmt: Add shell script.
66 | 
67 | 	* erlfmt.erl: Output any comments and whitespace before form.
68 | 
69 | 	* erlfmt.erl: Abort if we find a dot not at line ending.
70 | 
71 | 	* erlfmt.erl: Give user a clue if no output.
72 | 
73 | 	* erlfmt.erl: improve comment, no logic change.
74 | 
75 | 	* Works but drops comments.
76 | 


--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | install: ${HOME}/bin/erlfmt ${HOME}/bin/erlfmt.escript
2 | ${HOME}/bin/erlfmt: erlfmt
3 | 	cp -i -a erlfmt ${HOME}/bin/erlfmt
4 | 	chmod +x ${HOME}/bin/erlfmt
5 | 
6 | ${HOME}/bin/erlfmt.escript: erlfmt.escript
7 | 	cp -i -a erlfmt.escript ${HOME}/bin/erlfmt.escript
8 | 	chmod +x ${HOME}/bin/erlfmt.escript
9 | 


--------------------------------------------------------------------------------
/README:
--------------------------------------------------------------------------------
  1 | Read Erlang source on stdin, reformat with erl_tidy, and write to stdout.
  2 | No options, you get what erl_prettypr:format/2 gives you.
  3 | 
  4 | 
  5 |                                   Back story
  6 | 
  7 | 
  8 |     The problem seemed simple: read Erlang source code on stdin and send a
  9 |     nicely formatted result to stdout.  In vim, I got used to this when I was
 10 |     learning Go:
 11 | 
 12 |             :%!gofmt
 13 | 
 14 |     Just type the code without any concern for formatting, and then format the
 15 |     entire file in one command.  I now have xmlfmt, jsfmt, javafmt and I
 16 |     wanted a similar utility for Erlang.
 17 | 
 18 |     I started with erl_tidy, which
 19 | 
 20 |             Tidies and pretty-prints Erlang source code, removing
 21 |             unused functions, updating obsolete constructs and
 22 |             function calls, etc.
 23 | 
 24 |     Seems perfect ... except that erl_tidy does not read from stdin.  I posted
 25 |     a question to the Erlang questions mailing list asking if I missed
 26 |     something and the answer was a clear no.  People pointed me to alternative
 27 |     approaches: an Emacs elisp module, a rebar3 module that wraps erl_tidy,
 28 |     and a state-machine/parser written in Erlang that the vim Erlang module
 29 |     uses.  The first two didn't solve my problem, and I didn't like the
 30 |     complexity of the third approach.
 31 | 
 32 | First attempt: use basic Erlang modules
 33 |     
 34 |     While Googling around, I found a short post showing how you can format
 35 |     Erlang code using erl_scan (string -> tokens), erl_parse (tokens -> form)
 36 |     and erl_pp (form -> string).  So I started coding that up.  It worked
 37 |     great. Initially.
 38 | 
 39 |     The first problem I hit was that erl_parse:parse_form/1 does not handle
 40 |     white space or comment tokens.  OK, no problem.  To keep moving forward I
 41 |     implemented the quick hack of only dropping comments that came inside a
 42 |     form, and keeping the ones that came before.  Not great, but I wanted to
 43 |     get something working.
 44 | 
 45 |     The next problem I hit was pre-processor constructs.  It turns out that
 46 |     erl_parse does not understand pre-processor bits either (macros, imports,
 47 |     etc).  So, another hack: when we hit a dot-terminated token sequence that
 48 |     includes a pre-processor construct, just print out the raw text that
 49 |     generated those tokens and leave it at that.
 50 | 
 51 |     When that was done, I decided dropping comments was not acceptable.  While
 52 |     searching around for how to re-insert comments, I came across a reference
 53 |     to the epp_dodger module and a function that re-inserts comments.  Both
 54 |     are used by erl_tidy.  It seemed stupid to re-write erl_tidy so I
 55 |     went back to square one.
 56 | 
 57 | Second attempt: teach erl_tidy about stdin
 58 | 
 59 |     Erlang can read stdin just fine.  In fact, most of the input functions in
 60 |     the io module read from stdin by default.  Erlang provides the standard_io
 61 |     atom that you can use for an IODevice argument.  I started hacking on
 62 |     erl_tidy, intending to use the "special" file name of a single dash (-) to
 63 |     tell erl_tidy to read from stdin.
 64 | 
 65 |     It was simple to add a new read_module("-", Opts) and pass standard_io to
 66 |     epp_dodger:parse/3.  But, the next chunk of erl_tidy logic reads comments
 67 |     from the same file again, which is not possible with stdin, since the
 68 |     stream was already consumed by parsing.
 69 | 
 70 |     I briefly looked to see if I could somehow turn a string into an IODevice
 71 |     (like you can in Java), but nothing turned up.  So, it seems like I have
 72 |     to write stdin to a file in order to use erl_tidy.
 73 | 
 74 | Third attempt: call erl_tidy:file/1 from a shell script.
 75 | 
 76 |     It was trivial to write a shell script that pipes stdin to a file.
 77 |     Calling erl_tidy from the shell script was not.  My first attempt looked
 78 |     like this:
 79 | 
 80 |             $ erl -run erl_tidy file $TMPFN
 81 | 
 82 |     which produced:
 83 | 
 84 |             =ERROR REPORT==== 9-May-2016::20:13:30 ===
 85 |             erl_comment_scan: bad filename: `['hmmmm_sup.erl']'
 86 | 
 87 |     and hung there, waiting in the Erlang interpreter.
 88 | 
 89 |     It turns out that when you use the -run flag and pass it arguments,
 90 |     Erlang assumes the receiving function has one argument---a list.  That's
 91 |     why the filename in the error message has brackets around it ... it's a
 92 |     list, not a string.
 93 | 
 94 | Final attempt: call escript from a shell script
 95 | 
 96 |     The final result is what you see in the repository now.   It's so easy to
 97 |     pipe stdin to a file in a shell script that I saw no reason to re-write
 98 |     that in Erlang.  So I added a short escript that unpacks the file name
 99 |     from the list and passes that to erl_tidy.
100 | 
101 |     The result: I was able to erlfmt all 1,795 source files under
102 |     /usr/local/Cellar/erlang/18.2.1, with only one error:
103 | 
104 |             /usr/local/Cellar/erlang/18.2.1/lib/erlang/lib/wx-1.6/src/gen/gl.erl
105 |             ./erlfmt: line 6: 25399 User defined signal 2: 31 ./erlfmt.escript "$TMPF"
106 | 
107 |     That file is a generated one, and is 971KB in size.  I got the above error
108 |     when I had erl_tidy write the reformatted file to the temporary file.
109 |     When instead I told erl_tidy to write the result to stdout, I got this
110 |     error:
111 | 
112 |             escript: exception exit: badarg
113 |               in function  erl_tidy:file/2 (erl_tidy.erl, line 295)
114 |               in call from erl_eval:local_func/6 (erl_eval.erl, line 557)
115 |               in call from escript:interpret/4 (escript.erl, line 787)
116 |               in call from escript:start/1 (escript.erl, line 277)
117 |               in call from init:start_it/1 
118 |               in call from init:start_em/1 
119 |             
120 |     It successfully parsed the other 1,794 files, which gives a success rate
121 |     of 0.99944.  Not too hot in the Erlang world, but I don't expect to be
122 |     formatting files that big.  Good enough.
123 | 
124 | The back story back story.
125 | 
126 |     Why such a long README?  The Erlang questions mailing list recently had
127 |     an interesting thread on code documentation ("rhetorical structure of
128 |     code").  It was a long thread and a lot of things were said, but one in
129 |     particular stuck with me: code documentation rarely shows the starts and
130 |     stops, the litany of failed experiments that you encounter on the way to
131 |     the final product.  And those failed experiments are sometimes useful in
132 |     the future.
133 | 
134 | 
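
The approach the README settles on (save stdin to a temporary file, then hand the file name to a tool that only accepts files) can be sketched in plain sh. This is a minimal sketch under stated assumptions, not the erlfmt script itself: `filter_via_tempfile` is a hypothetical helper name, the `/tmp` fallback is assumed, and `grep -c` merely stands in for erlfmt.escript so the example runs without Erlang installed.

```shell
#!/bin/sh
# Sketch: use a file-only tool as a stdin/stdout filter.
# filter_via_tempfile is a hypothetical helper.  "$@" is the command to run;
# the temp file name is appended as its last argument.
filter_via_tempfile() {
    tmpf=$(mktemp "${TMPDIR:-/tmp}/filter.XXXXXX") || return 1
    cat > "$tmpf"       # stdin can only be consumed once, so save it first
    "$@" "$tmpf"        # the tool may now open the file as often as it needs
    status=$?
    rm -f "$tmpf"
    return $status
}

# Stand-in for erlfmt.escript: count the lines that contain an "o".
printf 'one\ntwo\n' | filter_via_tempfile grep -c o    # prints 2
```

With Erlang installed and erlfmt.escript on the path, the same helper could presumably be called as `filter_via_tempfile erlfmt.escript < foo.erl`, where foo.erl is a placeholder file name.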


--------------------------------------------------------------------------------
/erlfmt:
--------------------------------------------------------------------------------
 1 | #! /bin/sh -e
 2 | # Run erl_tidy on stdin and send result to stdout.
 3 | 
 4 | #
 5 | #   OSX and GNU/Linux mktemp behave differently with the -t option.
 6 | #
 7 | #       mktemp -t mkb
 8 | #           OSX      : /var/folders/cy/4988f14n4r35fp39wrt1pfq80000gn/T/mkb.UgVBxAPP
 9 | #           GNU/Linux: mktemp: too few X's in template ‘mkb’
10 | #
11 | #       mktemp -t mkb.XXXXXX
12 | #           OSX      : /var/folders/cy/4988f14n4r35fp39wrt1pfq80000gn/T/mkb.XXXXXX.UgVBxAPP
13 | #           GNU/Linux: /tmp/mkb.pRykUg
14 | #
15 | #   So, "manually" add the TMPDIR prefix.
16 | 
17 | D=${TMPDIR:-/tmp}
18 | TMPF=$(mktemp "$D/$(basename "$0").XXXXXX")
19 | 
20 | cat - > "$TMPF"
21 | 
22 | # See https://bugs.erlang.org/browse/ERL-234
23 | if grep 'coding: *utf-8' $TMPF > /dev/null ; then
24 | 	printf "erlfmt: utf-8 encoding\n" >&2
25 | 	exit 1
26 | fi
27 | 
28 | # Some erl files in OTP-19 don't have module statements; for example,
29 | #      ./lib/hipe/amd64/hipe_amd64_liveness.erl
30 | # Skip these too.
31 | if ! grep '^-module' $TMPF > /dev/null ; then
32 | 	printf "erlfmt: no -module() line.\n" >&2
33 | 	exit 1
34 | fi
35 | 
36 | 
37 | #
38 | #   Convert the tabs output by erl_prettypr to eight spaces.
39 | #
40 | #   On OSX, I was getting the error
41 | #
42 | #           sed: RE error: illegal byte sequence
43 | #
44 | #   Per StackOverflow, this error means that the byte-sequence in the file does
45 | #   not use the same encoding as the shell.  In my specific case, I had some
46 | #   eunit test cases that included text with some encoded characters.
47 | #
48 | #   The fix of setting LC_ALL=C in-line is fine since the only character we care
49 | #   about matching is the tab character.
50 | #
51 | #       However, (effectively) setting LC_CTYPE to C treats strings as if each
52 | #       byte were its own character (no interpretation based on encoding rules
53 | #       is performed), with no regard for the - multibyte-on-demand - UTF-8
54 | #       encoding that OS X employs by default, where foreign characters have
55 | #       multibyte encodings.
56 | #
57 | #   http://stackoverflow.com/questions/19242275/re-error-illegal-byte-sequence-on-mac-os-x
58 | #
59 | 
60 | erlfmt.escript "$TMPF" | LC_ALL=C sed 's/	/        /g'
61 | rm $TMPF
62 | 
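
The mktemp comment in the script notes that -t behaves differently on OSX and GNU/Linux. One way to avoid -t altogether, sketched here, is to pass mktemp a full path template that already includes the temp directory. `make_tmp` and the `/tmp` fallback are assumptions for illustration, not names taken from the script.

```shell
#!/bin/sh
# Sketch: create a temp file the same way on GNU/Linux and OSX by giving
# mktemp a complete path template instead of relying on -t.
# make_tmp is a hypothetical helper; its argument is a file name prefix.
make_tmp() {
    mktemp "${TMPDIR:-/tmp}/$1.XXXXXX"
}

tmpf=$(make_tmp erlfmt-demo)
echo "created: $tmpf"
rm -f "$tmpf"
```

Because the template is a path, the name mktemp prints is the file it actually created, so no prefix has to be glued on afterwards.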

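The tab conversion at the end of the script can be tried on its own. The sketch below wraps the same sed call in a function; `tabs_to_spaces` and `TAB` are hypothetical names, and the tab is produced with printf only so that it stays visible in the source.

```shell
#!/bin/sh
# Sketch: replace every tab with eight spaces, treating input as raw bytes.
# tabs_to_spaces is a hypothetical name for the sed step used by erlfmt.
TAB=$(printf '\t')

tabs_to_spaces() {
    # LC_ALL=C makes sed work byte by byte, so input that is not valid in
    # the shell's encoding does not trigger "illegal byte sequence".
    LC_ALL=C sed "s/$TAB/        /g"
}

printf 'f() ->\n\tok.\n' | tabs_to_spaces
```

Note that this is a fixed substitution: unlike expand(1), it does not align text to tab stops, so a tab in the middle of a line always becomes exactly eight spaces.
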

--------------------------------------------------------------------------------
/erlfmt.escript:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env escript
 2 | %%! -shutdown_time 1000
 3 | main([Name]) ->
 4 |     erl_tidy:file(Name, [{stdout, true}, {keep_unused, true}]);
 5 | main(_) ->
 6 |     usage().
 7 | 
 8 | usage() ->
 9 |     io:put_chars("usage: erlfmt.escript <file>\n"),
10 |     halt(1).
11 | 


--------------------------------------------------------------------------------