├── dune-project ├── .exclude ├── test ├── dune └── PPrintTest.ml ├── benchmark_old ├── dune ├── Makefile ├── Size.mli ├── Size.ml ├── OldPPrintEngine.mli ├── PPrintBench.ml └── OldPPrintEngine.ml ├── .gitignore ├── src ├── Makefile ├── dune ├── index.mld ├── PPrint.mli ├── PPrint.ml ├── PPrintEngine.mli └── PPrintEngine.ml ├── benchmark_new ├── dune ├── Makefile ├── README.md ├── AST.ml ├── AST2Document.ml ├── benchmark.txt └── main.ml ├── AUTHORS.md ├── blog ├── Makefile ├── style.css └── billet.markdown ├── header ├── README.md ├── pprint.opam ├── TODO.md ├── CHANGES.md ├── Makefile ├── mini └── PPrintMini.ml └── LICENSE /dune-project: -------------------------------------------------------------------------------- 1 | (lang dune 1.3) 2 | -------------------------------------------------------------------------------- /.exclude: -------------------------------------------------------------------------------- 1 | OldPPrintEngine.* 2 | size.* 3 | PPrintBench.* 4 | -------------------------------------------------------------------------------- /test/dune: -------------------------------------------------------------------------------- 1 | (executable 2 | (name PPrintTest) 3 | (libraries unix pprint) 4 | ) 5 | -------------------------------------------------------------------------------- /benchmark_old/dune: -------------------------------------------------------------------------------- 1 | (executable 2 | (name PPrintBench) 3 | (libraries unix pprint) 4 | ) 5 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | _build 3 | blog/billet.html 4 | .merlin 5 | pprint.install 6 | dune-workspace.versions 7 | -------------------------------------------------------------------------------- /src/Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: all clean doc test bench 2 | 3 | all clean doc test bench: 4 | $(MAKE) -C .. $@ 5 | -------------------------------------------------------------------------------- /benchmark_new/dune: -------------------------------------------------------------------------------- 1 | (executable 2 | (name main) 3 | (libraries pprint fix core_unix.command_unix core_bench) 4 | ) 5 | -------------------------------------------------------------------------------- /AUTHORS.md: -------------------------------------------------------------------------------- 1 | PPrint was written by François Pottier and Nicolas Pouillard, with 2 | contributions by Yann Régis-Gianas, Gabriel Scherer, Jonathan 3 | Protzenko, Thomas Refis. 4 | -------------------------------------------------------------------------------- /benchmark_old/Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: all 2 | all: 3 | dune build @all 4 | 5 | .PHONY: clean 6 | clean: 7 | git clean -fX . 8 | 9 | .PHONY: bench 10 | bench: 11 | dune exec ./PPrintBench.exe 12 | -------------------------------------------------------------------------------- /src/dune: -------------------------------------------------------------------------------- 1 | (env 2 | (dev (flags :standard -w @A-4)) 3 | (release (flags :standard)) 4 | ) 5 | 6 | (library 7 | (name pprint) 8 | (public_name pprint) 9 | (wrapped false) 10 | ) 11 | 12 | (documentation) 13 | -------------------------------------------------------------------------------- /benchmark_new/Makefile: -------------------------------------------------------------------------------- 1 | .PHONY: all 2 | all: 3 | dune build @all 4 | 5 | .PHONY: clean 6 | clean: 7 | git clean -fX . 8 | 9 | .PHONY: bench 10 | bench: 11 | dune exec ./main.exe -- time alloc samples -quota 10 12 | 13 | .PHONY: once 14 | once: 15 | dune exec ./main.exe -- time alloc -quota 1x 16 | -------------------------------------------------------------------------------- /blog/Makefile: -------------------------------------------------------------------------------- 1 | # -------------------------------------------------------------------------------- 2 | 3 | # [make billet] creates the blog entry. 4 | 5 | .PHONY: billet clean 6 | 7 | billet: billet.html 8 | 9 | clean: 10 | rm -f billet.html 11 | 12 | %.html: %.markdown 13 | pandoc -s $< -c style.css > $@ 14 | -------------------------------------------------------------------------------- /header: -------------------------------------------------------------------------------- 1 | 2 | PPrint 3 | 4 | François Pottier, Inria Paris 5 | Nicolas Pouillard 6 | 7 | Copyright 2007-2022 Inria. All rights reserved. This file is 8 | distributed under the terms of the GNU Library General Public 9 | License, with an exception, as described in the file LICENSE. 10 | 11 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # PPrint: a Pretty-Printing Toolbox 2 | 3 | `PPrint` is an OCaml library for **pretty-printing textual documents**. It 4 | takes care of **indentation and line breaks**, and is typically used to 5 | **pretty-print code**. 6 | 7 | To install the latest released version, type `opam install pprint`. 8 | 9 | Here is [the documentation of the latest released 10 | version](http://cambium.inria.fr/~fpottier/pprint/doc/pprint/). 11 | -------------------------------------------------------------------------------- /benchmark_new/README.md: -------------------------------------------------------------------------------- 1 | This benchmark measures the performance of PPrint using an artificial benchmark 2 | that involves randomly generating arithmetic expressions, converting them to 3 | PPrint documents, and rendering these documents. 4 | 5 | Caveat: very few of the PPrint combinators are exercised by this benchmark. 6 | 7 | Use `make once` in this directory to run the benchmark just once. 8 | Use `make bench` in this directory to run it several times 9 | and obtain somewhat more reliable results. 10 | -------------------------------------------------------------------------------- /pprint.opam: -------------------------------------------------------------------------------- 1 | opam-version: "2.0" 2 | maintainer: "francois.pottier@inria.fr" 3 | authors: [ 4 | "François Pottier " 5 | "Nicolas Pouillard " 6 | ] 7 | license: "LGPL-2.0-only WITH OCaml-LGPL-linking-exception" 8 | homepage: "https://github.com/fpottier/pprint" 9 | dev-repo: "git+ssh://git@github.com/fpottier/pprint.git" 10 | bug-reports: "francois.pottier@inria.fr" 11 | build: [ 12 | ["dune" "build" "-p" name "-j" jobs] 13 | ] 14 | depends: [ 15 | "ocaml" {>= "4.03"} 16 | "dune" {>= "1.3"} 17 | ] 18 | synopsis: "A pretty-printing combinator library and rendering engine" 19 | description: "This library offers a set of combinators for building so-called documents as 20 | well as an efficient engine for converting documents to a textual, fixed-width 21 | format. The engine takes care of indentation and line breaks, while respecting 22 | the constraints imposed by the structure of the document and by the text width." 23 | -------------------------------------------------------------------------------- /TODO.md: -------------------------------------------------------------------------------- 1 | * Document that `align` is dangerous, and point to Pretty Expressive. 2 | 3 | * Document that the target width can be exceeded. At each group, the engine 4 | asks: can this group fit on the current line? If the answer is positive, 5 | then the engine commits to printing the entire group in flat mode, on the 6 | current line. Yet it is possible that the material that *follows this group* 7 | cannot fit on the current line! There is no "lookahead" when deciding 8 | whether a group should be printed flat or dissolved. As a result, there is 9 | no guarantee that the target width supplied by the user will be respected. 10 | 11 | * Test the interaction of `range` with the automatic removal of trailing 12 | blank characters. Do we obtain the desired behavior? 13 | 14 | * Set up a real test suite. 15 | 16 | * Fix the warnings produced by `make doc`. Review its output. 17 | 18 | * Update the private `Makefile` so as to publish the package documentation 19 | on yquem (or gitlab?). 20 | 21 | * Try to speed up the random generator. 22 | `choose`, applied to a list, is too slow: use an array? 23 | avoid building n suspensions when only one will be forced? 24 | 25 | * Extend `PPrintBench` to also try non-random documents of large size. 26 | -------------------------------------------------------------------------------- /blog/style.css: -------------------------------------------------------------------------------- 1 | body { 2 | width: 800px; 3 | font-family: sans-serif; 4 | margin: 0 auto; 5 | } 6 | 7 | pre.sourceCode, code { 8 | background-color: #eee; 9 | } 10 | 11 | .controls { 12 | text-align: center; 13 | } 14 | 15 | a { 16 | text-decoration: none; 17 | color: #888; 18 | display: inline-block; 19 | } 20 | 21 | a:hover { 22 | text-decoration: underline; 23 | color: #666; 24 | } 25 | 26 | .boxes { 27 | width: 100%; 28 | margin-bottom: 1em; 29 | text-align: center; 30 | } 31 | 32 | .box { 33 | text-align: left; 34 | width: 395px; 35 | display: inline-block; 36 | height: 8em; 37 | border: 1px solid black; 38 | vertical-align: top; 39 | background-color: white; 40 | border-radius: 5px; 41 | } 42 | 43 | .box ul { 44 | margin: 0; 45 | } 46 | 47 | .title { 48 | padding: 5px; 49 | font-weight: bold; 50 | } 51 | 52 | .controls a { 53 | display: inline-block; 54 | padding: 2px; 55 | border: 1px solid #888; 56 | border-radius: 2px; 57 | background-color: #fffef7; 58 | color: #353535; 59 | } 60 | 61 | .controls a:hover { 62 | text-decoration: none; 63 | border: 1px solid #353535; 64 | background-color: #fffceb; 65 | } 66 | 67 | .caption { 68 | text-align: center; 69 | font-style: italic; 70 | margin: 0; 71 | } 72 | 73 | .figure { 74 | border: 1px solid black; 75 | padding: 2em; 76 | } 77 | -------------------------------------------------------------------------------- /benchmark_new/AST.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (* Unary operators. *) 15 | 16 | type unop = 17 | | UNeg 18 | 19 | (* Binary operators. *) 20 | 21 | type binop = 22 | | BAdd 23 | | BSub 24 | | BMul 25 | | BDiv 26 | 27 | (* Expressions. *) 28 | 29 | type expr = 30 | | EConst of int 31 | | EUnOp of unop * expr 32 | | EBinOp of expr * binop * expr 33 | 34 | type main = 35 | expr 36 | -------------------------------------------------------------------------------- /benchmark_old/Size.mli: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (*i $Id: size.mli,v 1.5 2008-07-21 14:53:06 filliatr Exp $ i*) 15 | 16 | (* Sizes of ocaml values (in their memory representation). 17 | Sizes are given in words ([size_w]), bytes ([size_b]) or kilobytes 18 | ([size_kb]), in a system-independent way. *) 19 | 20 | val size_w : 'a -> int 21 | 22 | val size_b : 'a -> int 23 | 24 | val size_kb : 'a -> int 25 | -------------------------------------------------------------------------------- /test/PPrintTest.ml: -------------------------------------------------------------------------------- 1 | (**************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2019 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (**************************************************************************) 12 | 13 | open PPrint 14 | 15 | (* This is a test file. It is not, strictly speaking, part of the library. *) 16 | 17 | let paragraph (s : string) = 18 | flow (break 1) (words s) 19 | 20 | let document = 21 | prefix 2 1 22 | (string "TITLE:") 23 | (string "PPrint") 24 | ^^ 25 | hardline 26 | ^^ 27 | prefix 2 1 28 | (string "AUTHORS:") 29 | (utf8string "François Pottier and Nicolas Pouillard") 30 | ^^ 31 | hardline 32 | ^^ 33 | prefix 2 1 34 | (string "ABSTRACT:") 35 | ( 36 | paragraph "This is an adaptation of Daan Leijen's \"PPrint\" library, 37 | which itself is based on the ideas developed by Philip Wadler in 38 | \"A Prettier Printer\". For more information about Wadler's and Leijen's work, 39 | please consult the following reference:" 40 | ^^ 41 | nest 2 ( 42 | twice (break 1) 43 | ^^ 44 | separate_map (break 1) (fun s -> nest 2 (url s)) [ 45 | "http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf"; 46 | ] 47 | ) 48 | ^^ 49 | twice (break 1) 50 | ^^ 51 | paragraph "To install PPrint, type \"opam install pprint\"." 52 | ^^ 53 | twice (break 1) 54 | ^^ 55 | paragraph "The documentation for PPrint is built by \"make doc\"." 56 | ) 57 | ^^ 58 | hardline 59 | 60 | let () = 61 | ToChannel.pretty 0.5 80 stdout document; 62 | flush stdout 63 | -------------------------------------------------------------------------------- /benchmark_new/AST2Document.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (* Converting an AST to a PPrint document. *) 15 | 16 | open PPrint 17 | open AST 18 | 19 | let lparen = lparen ^^ ifflat empty space 20 | let rparen = ifflat empty hardline ^^ rparen 21 | let add = space ^^ plus ^^ break 1 22 | let sub = space ^^ minus ^^ break 1 23 | let mul = space ^^ star ^^ break 1 24 | let div = space ^^ slash ^^ break 1 25 | 26 | let[@inline] const i = 27 | utf8format "%d" i 28 | 29 | let[@inline] paren d = 30 | nest 2 (lparen ^^ d) ^^ rparen 31 | 32 | let rec factor e = 33 | group begin match e with 34 | | EConst i -> const i 35 | | EUnOp (UNeg, e) -> minus ^^ break 0 ^^ factor e 36 | | _ -> paren (expr e) 37 | end 38 | 39 | and term e = 40 | group begin match e with 41 | | EBinOp (e1, BMul, e2) -> term e1 ^^ mul ^^ factor e2 42 | | EBinOp (e1, BDiv, e2) -> term e1 ^^ div ^^ factor e2 43 | | _ -> factor e 44 | end 45 | 46 | and expr e = 47 | group begin match e with 48 | | EBinOp (e1, BAdd, e2) -> expr e1 ^^ add ^^ term e2 49 | | EBinOp (e1, BSub, e2) -> expr e1 ^^ sub ^^ term e2 50 | | _ -> term e 51 | end 52 | 53 | and main : AST.main -> document = function 54 | | e -> expr e ^^ hardline 55 | -------------------------------------------------------------------------------- /benchmark_old/Size.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (*i $Id: size.ml,v 1.7 2008-07-21 14:53:06 filliatr Exp $ i*) 15 | 16 | (*i*) 17 | open Obj 18 | (*i*) 19 | 20 | (*s Pointers already visited are stored in a hash-table, where 21 | comparisons are done using physical equality. *) 22 | 23 | module H = Hashtbl.Make( 24 | struct 25 | type t = Obj.t 26 | let equal = (==) 27 | let hash o = Hashtbl.hash (magic o : int) 28 | end) 29 | 30 | let node_table = (H.create 257 : unit H.t) 31 | 32 | let in_table o = try H.find node_table o; true with Not_found -> false 33 | 34 | let add_in_table o = H.add node_table o () 35 | 36 | let reset_table () = H.clear node_table 37 | 38 | (*s Objects are traversed recursively, as soon as their tags are less than 39 | [no_scan_tag]. [count] records the numbers of words already visited. *) 40 | 41 | let size_of_double = size (repr 1.0) 42 | 43 | let count = ref 0 44 | 45 | let rec traverse t = 46 | if not (in_table t) then begin 47 | add_in_table t; 48 | if is_block t then begin 49 | let n = size t in 50 | let tag = tag t in 51 | if tag < no_scan_tag then begin 52 | count := !count + 1 + n; 53 | for i = 0 to n - 1 do 54 | let f = field t i in 55 | if is_block f then traverse f 56 | done 57 | end else if tag = string_tag then 58 | count := !count + 1 + n 59 | else if tag = double_tag then 60 | count := !count + size_of_double 61 | else if tag = double_array_tag then 62 | count := !count + 1 + size_of_double * n 63 | else 64 | incr count 65 | end 66 | end 67 | 68 | (*s Sizes of objects in words and in bytes. The size in bytes is computed 69 | system-independently according to [Sys.word_size]. *) 70 | 71 | let size_w o = 72 | reset_table (); 73 | count := 0; 74 | traverse (repr o); 75 | !count 76 | 77 | let size_b o = (size_w o) * (Sys.word_size / 8) 78 | 79 | let size_kb o = (size_w o) / (8192 / Sys.word_size) 80 | 81 | 82 | -------------------------------------------------------------------------------- /benchmark_new/benchmark.txt: -------------------------------------------------------------------------------- 1 | ┌───────────────────────────────┬────────────────┬────────────────────┬─────────────────┬────────────────┬────────────────┬────────────┐ 2 | │ Name │ Runs @ Samples │ Time/Run │ mWd/Run │ mjWd/Run │ Prom/Run │ Percentage │ 3 | ├───────────────────────────────┼────────────────┼────────────────────┼─────────────────┼────────────────┼────────────────┼────────────┤ 4 | │ Generating AST:10 │ 216776 @ 929 │ 464.33ns │ 62.00w │ │ │ │ 5 | │ Generating AST:100 │ 21385 @ 696 │ 4_723.28ns │ 619.99w │ 0.43w │ 0.43w │ │ 6 | │ Generating AST:1000 │ 1989 @ 455 │ 48_160.03ns │ 6_199.93w │ 45.26w │ 45.26w │ │ 7 | │ Generating AST:10000 │ 197 @ 197 │ 517_285.53ns │ 62_003.34w │ 4_764.14w │ 4_764.14w │ 0.03% │ 8 | │ Generating AST:100000 │ 53 @ 53 │ 7_264_986.66ns │ 619_989.24w │ 306_696.78w │ 306_696.78w │ 0.41% │ 9 | │ Generating AST:1000000 │ 16 @ 16 │ 90_710_369.38ns │ 6_199_925.16w │ 3_730_766.82w │ 3_730_766.82w │ 5.13% │ 10 | │ Generating AST:3000000 │ 9 @ 9 │ 304_697_926.02ns │ 18_598_998.73w │ 11_330_178.05w │ 11_330_178.05w │ 17.24% │ 11 | │ Constructing document:10 │ 53868 @ 789 │ 1_856.55ns │ 691.00w │ 0.46w │ 0.46w │ │ 12 | │ Constructing document:100 │ 6327 @ 573 │ 15_400.23ns │ 5_459.00w │ 30.40w │ 30.40w │ │ 13 | │ Constructing document:1000 │ 478 @ 303 │ 189_899.66ns │ 56_479.00w │ 3_359.74w │ 3_359.74w │ 0.01% │ 14 | │ Constructing document:10000 │ 76 @ 76 │ 3_533_742.14ns │ 566_160.00w │ 259_387.04w │ 259_387.04w │ 0.20% │ 15 | │ Constructing document:100000 │ 22 @ 22 │ 47_337_705.54ns │ 5_625_674.00w │ 2_912_119.35w │ 2_912_119.35w │ 2.68% │ 16 | │ Constructing document:1000000 │ 7 @ 7 │ 569_203_287.56ns │ 56_192_506.00w │ 29_708_800.82w │ 29_708_800.82w │ 32.21% │ 17 | │ Constructing document:3000000 │ 4 @ 4 │ 1_767_411_640.50ns │ 168_560_532.00w │ 89_297_060.40w │ 89_297_060.40w │ 100.00% │ 18 | │ Rendering document:10 │ 59499 @ 799 │ 1_498.73ns │ 1_064.00w │ 0.62w │ 0.62w │ │ 19 | │ Rendering document:100 │ 7123 @ 585 │ 13_146.30ns │ 7_916.00w │ 5.82w │ 5.82w │ │ 20 | │ Rendering document:1000 │ 512 @ 311 │ 166_991.10ns │ 81_939.00w │ 2_597.18w │ 68.18w │ │ 21 | │ Rendering document:10000 │ 102 @ 102 │ 1_826_896.93ns │ 821_528.00w │ 42_269.89w │ 415.89w │ 0.10% │ 22 | │ Rendering document:100000 │ 29 @ 29 │ 24_206_210.55ns │ 8_214_669.00w │ 392_796.61w │ 2_931.61w │ 1.37% │ 23 | │ Rendering document:1000000 │ 9 @ 9 │ 258_933_670.28ns │ 82_557_024.00w │ 5_793_610.02w │ 27_404.02w │ 14.65% │ 24 | │ Rendering document:3000000 │ 3 @ 3 │ 874_072_360.80ns │ 248_119_196.00w │ 21_866_036.50w │ 84_359.50w │ 49.45% │ 25 | └───────────────────────────────┴────────────────┴────────────────────┴─────────────────┴────────────────┴────────────────┴────────────┘ 26 | -------------------------------------------------------------------------------- /CHANGES.md: -------------------------------------------------------------------------------- 1 | # Changes 2 | 3 | ## 2023/08/30 4 | 5 | * The new function `is_empty` allows testing (in constant time) whether 6 | a document is empty. 7 | 8 | * Documentation: add a warning about the time and space complexity of 9 | a naive use of `ifflat`. 10 | 11 | * The library now requires OCaml 4.03 or newer. 12 | 13 | * Add a new micro-benchmark, which uses `core_bench` and involves 14 | randomly-generated arithmetic expressions. 15 | 16 | ## 2022/01/03 17 | 18 | * Improved documentation. (Initial proposal by Thomas Refis, extended 19 | by François Pottier.) 20 | 21 | * The internal modules `PPrintEngine`, `PPrintCombinators`, `PPrintRenderer`, 22 | and `PPrintOCaml` have been removed. (Their existence was an implementation 23 | detail.) Please refer to `PPrint`, `PPrint`, `PPrint`, and `PPrint.OCaml` 24 | instead. 25 | 26 | ## 2021/11/29 27 | 28 | * Trailing blank characters at the end of a line are now suppressed. 29 | This includes indentation characters (whose production is implicit) 30 | as well as blank characters that are explicitly produced by the 31 | combinators `space` and `blank`. Trailing blank characters are 32 | suppressed in both rendering modes (pretty and compact). 33 | (Contributed by Thomas Refis, reviewed and polished by François Pottier.) 34 | 35 | * New function `PPrint.OCaml.unit`. 36 | 37 | ## 2020/04/10 38 | 39 | * New function `PPrint.utf8format`. 40 | 41 | ## 2020/03/16 42 | 43 | * New functions `PPrint.OCaml.flowing_list` and `PPrint.OCaml.flowing_array`. 44 | 45 | ## 2020/02/26 46 | 47 | * Change the behavior of `PPrint.ToFormatter` to use `Format.pp_print_text` 48 | internally. This means that a newline character causes a call to 49 | `Format.pp_force_newline`; a space character causes a call to 50 | `Format.pp_print_space`; and every other character is printed using 51 | `Format.pp_print_char`. 52 | 53 | * Switch to `dune`. 54 | 55 | * Avoid a few compilation warnings. 56 | 57 | ## 2018/05/23 58 | 59 | * Add a `line` field to the `state` record, which can be read by the code 60 | that implements custom documents. Add a `range` combinator that allows 61 | retrieving the start and end points of a (sub)document in the output. 62 | (Suggested by Victor Gomes.) 63 | 64 | ## 2017/10/03 65 | 66 | * Update the code and build options to use `-safe-string`. This means that 67 | the library now requires OCaml 4.02 or later, and is compatible with 4.06. 68 | 69 | ## 2015/03/16 70 | 71 | * Moved to github and changed the license to LGPL with an exception. 72 | 73 | ## 2014/04/25 74 | 75 | * Minor changes in the implementation of `string` and `substring`. 76 | Initially committed on 2014/03/24, but left out of the 20140424 77 | release due to a goof-up. 78 | 79 | ## 2014/04/11 80 | 81 | * Changed the behavior of `align`, which was not consistent with its 82 | documentation. `align` now sets the indentation level to the current column. 83 | In particular, this means that `align (align d)` is equivalent to `align d`, 84 | which was not the case previously. Thanks to Dmitry Grebeniuk for reporting 85 | this issue. 86 | 87 | ## 2014/04/03 88 | 89 | * The library is now extensible (in principle). A `custom` document 90 | constructor allows the user to define her own documents, as long as they fit 91 | the manner in which the current rendering engine works. 92 | 93 | * The `compact` rendering engine is now tail-recursive too. 94 | 95 | ## 2014/03/21 96 | 97 | * Minor optimisation in the smart constructor `group`. 98 | 99 | ## 2014/03/13 100 | 101 | * New (simpler) pretty-printing engine. The representation of documents in 102 | memory is slightly larger; document construction is perhaps slightly slower, 103 | while rendering is significantly faster. (Construction dominates rendering.) 104 | The rendering speed is now guaranteed to be independent of the width 105 | parameter. The price to pay for this simplification is that the primitive 106 | document constructors `column` and `nesting` are no longer supported. The 107 | API is otherwise unchanged. 108 | 109 | ## 2013/01/31 110 | 111 | * First official release of PPrint. 112 | -------------------------------------------------------------------------------- /benchmark_new/main.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | open Core_bench 15 | let memoize = Fix.Memoize.Int.memoize 16 | 17 | (* -------------------------------------------------------------------------- *) 18 | 19 | (* Random generation of abstract syntax trees. *) 20 | 21 | module Generate = struct 22 | 23 | open AST 24 | 25 | let uneg e = 26 | EUnOp (UNeg, e) 27 | 28 | let ebinop op (e1, e2) = 29 | EBinOp (e1, op, e2) 30 | 31 | let pay s = 32 | assert (s > 0); 33 | s - 1 34 | 35 | let split s = 36 | assert (s >= 0); 37 | let s1 = Random.int (s + 1) in 38 | let s2 = s - s1 in 39 | s1, s2 40 | 41 | let rec expr (s : int) : expr = 42 | if s = 0 then 43 | EConst 0 44 | else 45 | let s = pay s in 46 | let i = Random.int 5 in 47 | if i = 4 then 48 | EUnOp (UNeg, expr s) 49 | else 50 | let s1, s2 = split s in 51 | let op = List.nth [BAdd; BSub; BMul; BDiv] i in 52 | EBinOp (expr s1, op, expr s2) 53 | 54 | let main (s : int) : main = 55 | (* We want reproducible results, and placing a call to [Random.init] 56 | in the main program does not seem to work (not sure why). *) 57 | Random.init 128; 58 | expr s 59 | 60 | end 61 | 62 | (* -------------------------------------------------------------------------- *) 63 | 64 | (* Each benchmark is run at the following tree sizes. *) 65 | 66 | let args = 67 | [10; 100; 1_000; 10_000; 100_000; 1_000_000; 3_000_000] 68 | 69 | (* -------------------------------------------------------------------------- *) 70 | 71 | (* Generating ASTs. *) 72 | 73 | let generation = 74 | let name = "Generating AST" in 75 | Bench.Test.create_indexed ~name ~args @@ fun s -> 76 | Core.Staged.stage (fun () -> ignore (Generate.main s)) 77 | 78 | (* After [Generate.main] has been benchmarked, a memoized version of 79 | it can be used, so we spend less time preparing data for the next 80 | benchmarks. *) 81 | 82 | let make_ast = 83 | memoize Generate.main 84 | 85 | (* -------------------------------------------------------------------------- *) 86 | 87 | (* Converting ASTs to PPrint documents. *) 88 | 89 | let conversion = 90 | let name = "Constructing document" in 91 | Bench.Test.create_indexed ~name ~args @@ fun s -> 92 | let ast = make_ast s in 93 | Core.Staged.stage (fun () -> ignore (AST2Document.main ast)) 94 | 95 | let make_doc = 96 | memoize @@ fun s -> 97 | make_ast s 98 | |> AST2Document.main 99 | 100 | (* -------------------------------------------------------------------------- *) 101 | 102 | (* Rendering PPrint documents (in memory). *) 103 | 104 | let format document : string = 105 | let b = Buffer.create 1024 in 106 | PPrint.ToBuffer.pretty 0.8 80 b document; 107 | Buffer.contents b 108 | 109 | let formatting = 110 | let name = "Rendering document" in 111 | Bench.Test.create_indexed ~name ~args @@ fun s -> 112 | let document = make_doc s in 113 | Core.Staged.stage (fun () -> ignore (format document)) 114 | 115 | (* -------------------------------------------------------------------------- *) 116 | 117 | (* Running the benchmarks. *) 118 | 119 | let run_all_benchmarks () = 120 | Command_unix.run (Bench.make_command [ 121 | generation; 122 | conversion; 123 | formatting; 124 | ]) 125 | 126 | (* -------------------------------------------------------------------------- *) 127 | 128 | (* Main. *) 129 | 130 | let () = 131 | run_all_benchmarks() 132 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | # ------------------------------------------------------------------------------ 2 | 3 | # The version number is automatically set to the current date, 4 | # unless DATE is defined on the command line. 5 | DATE := $(shell /bin/date +%Y%m%d) 6 | 7 | # The project's name. 8 | THIS := pprint 9 | 10 | # The archive's URL (https). 11 | ARCHIVE := https://github.com/fpottier/$(THIS)/archive/$(DATE).tar.gz 12 | 13 | # ------------------------------------------------------------------------------ 14 | 15 | .PHONY: all 16 | all: 17 | @ dune build @all 18 | 19 | .PHONY: clean 20 | clean: 21 | @ git clean -fdX 22 | 23 | .PHONY: test 24 | test: 25 | @ dune exec test/PPrintTest.exe 26 | 27 | .PHONY: bench 28 | bench: 29 | @ make -C benchmark_old $@ 30 | @ make -C benchmark_new $@ 31 | 32 | .PHONY: install 33 | install: 34 | @ dune clean 35 | @ dune build -p $(THIS) 36 | @ dune install -p $(THIS) 37 | 38 | .PHONY: uninstall 39 | uninstall: 40 | @ ocamlfind remove $(THIS) || true 41 | 42 | .PHONY: reinstall 43 | reinstall: uninstall 44 | @ make install 45 | 46 | .PHONY: show 47 | show: reinstall 48 | @ echo "#require \"pprint\";;\n#show PPrint;;" | ocaml 49 | 50 | .PHONY: pin 51 | pin: 52 | @ opam pin add $(THIS) . --yes 53 | 54 | .PHONY: unpin 55 | unpin: 56 | @ opam pin remove $(THIS) --yes 57 | 58 | # ------------------------------------------------------------------------------ 59 | 60 | # Documentation. 61 | 62 | DOCDIR = _build/default/_doc/_html 63 | DOC = $(DOCDIR)/index.html 64 | 65 | .PHONY: doc 66 | doc: 67 | @ rm -rf _build/default/_doc 68 | @ dune clean 69 | @ dune build @doc 70 | @ echo "You can view the documentation by typing 'make view'". 71 | 72 | .PHONY: view 73 | view: doc 74 | @ echo Attempting to open $(DOC)... 75 | @ if command -v firefox > /dev/null ; then \ 76 | firefox $(DOC) ; \ 77 | else \ 78 | open -a /Applications/Firefox.app/ $(DOC) ; \ 79 | fi 80 | 81 | .PHONY: export 82 | export: doc 83 | ssh yquem.inria.fr rm -rf public_html/$(THIS)/doc 84 | scp -r $(DOCDIR) yquem.inria.fr:public_html/$(THIS)/doc 85 | 86 | # ------------------------------------------------------------------------------ 87 | 88 | # [make versions] compiles the package under many versions of OCaml, 89 | # whose list is specified below. 90 | 91 | # This requires appropriate opam switches to exist. A missing switch 92 | # can be created like this: 93 | # opam switch create 4.03.0 94 | 95 | VERSIONS := \ 96 | 4.03.0 \ 97 | 4.04.2 \ 98 | 4.05.0 \ 99 | 4.06.1 \ 100 | 4.07.1 \ 101 | 4.08.1 \ 102 | 4.09.1 \ 103 | 4.09.0+bytecode-only \ 104 | 4.10.0 \ 105 | 4.11.1 \ 106 | 4.12.0 \ 107 | 4.13.0 \ 108 | 4.14.1 \ 109 | 5.0.0 \ 110 | 5.1.0 \ 111 | 5.2.0 \ 112 | 113 | .PHONY: versions 114 | versions: 115 | @(echo "(lang dune 2.0)" && \ 116 | for v in $(VERSIONS) ; do \ 117 | echo "(context (opam (switch $$v)))" ; \ 118 | done) > dune-workspace.versions 119 | @ dune build --workspace dune-workspace.versions 120 | 121 | # ------------------------------------------------------------------------------ 122 | 123 | # [make headache] updates the headers. 124 | 125 | HEADACHE := headache 126 | HEADER := header 127 | 128 | .PHONY: headache 129 | headache: 130 | @ for f in {src,benchmark_old}/*.{ml,mli} benchmark_new/*.ml ; do \ 131 | $(HEADACHE) -h $(HEADER) $$f ; \ 132 | done 133 | 134 | # ------------------------------------------------------------------------- 135 | 136 | # Publishing a release. 137 | 138 | .PHONY: release 139 | release: 140 | # Make sure the current version can be compiled and installed. 141 | @ make uninstall 142 | @ make clean 143 | @ make install 144 | # Check the current package description. 145 | @ opam lint 146 | # Check if everything has been committed. 147 | @ if [ -n "$$(git status --porcelain)" ] ; then \ 148 | echo "Error: there remain uncommitted changes." ; \ 149 | git status ; \ 150 | exit 1 ; \ 151 | else \ 152 | echo "Now making a release..." ; \ 153 | fi 154 | # Create a git tag. 155 | @ git tag -a $(DATE) -m "Release $(DATE)." 156 | # Upload. (This automatically makes a .tar.gz archive available on gitlab.) 157 | @ git push 158 | @ git push --tags 159 | # Done. 160 | @ echo "Done." 161 | @ echo "If happy, please type:" 162 | @ echo " \"make publish\" to publish a new opam package" 163 | @ echo " \"make export\" to upload the documentation to yquem.inria.fr" 164 | 165 | .PHONY: publish 166 | publish: 167 | # Publish an opam description. 168 | @ opam publish -v $(DATE) $(THIS) $(ARCHIVE) . 169 | 170 | .PHONY: undo 171 | undo: 172 | # Undo the last release (assuming it was done on the same date). 173 | @ git tag -d $(DATE) 174 | @ git push -u origin :$(DATE) 175 | -------------------------------------------------------------------------------- /blog/billet.markdown: -------------------------------------------------------------------------------- 1 | 2 | 3 | I am pleased to announce the first official release of _**PPrint**_, an OCaml 4 | library for pretty-printing textual documents. 5 | 6 | ## A taste of the layout language 7 | 8 | At the heart of _**PPrint**_ is a little domain-specific language of 9 | documents. This language has a well-defined semantics, which the printing 10 | engine implements. This language rests upon a small number of fundamental 11 | concepts. 12 | 13 | There are combinators for creating atomic documents. For 14 | instance, 15 | 16 | ```ocaml 17 | string "hello" 18 | ``` 19 | 20 | is a simple, unbreakable document. 21 | 22 | There is also a concatenation operator, which joins two documents. 23 | For instance, 24 | 25 | ```ocaml 26 | string "hello" ^^ string "world" 27 | ``` 28 | 29 | is a composite document. It is in fact equivalent to `string "helloworld"`. 30 | 31 | So far, nothing very exciting. The next two combinators are more original and 32 | interesting. 33 | 34 | The first of these combinators, `break 1`, is a breakable space. If printed in 35 | flat mode, it produces an ordinary space character; if printed in normal mode, 36 | it produces a newline character. 37 | 38 | Yes, there are two printing modes, namely flat mode and normal mode. The 39 | printing engine goes back and forth between these two modes. Exactly where and 40 | how the engine switches from one mode to the other is controlled by the next 41 | combinator. 42 | 43 | The second of these combinators, `group`, introduces a choice between flat 44 | mode and normal mode. It is a document transformer: if `d` is a document, then 45 | `group d` is a document. When the printing engine encounters `group d`, two 46 | possibilities arise. The first possibility is to print all of `d` on a single 47 | line. This is known as flat mode. The engine tries this first (ignoring any 48 | `group` combinators inside `d`). If it succeeds, great. If it fails, by lack 49 | of space on the current line, then the engine backtracks and reverts to the 50 | second possibility, which is to simply ignore the `group` combinator, and just 51 | print `d`. This has subtle consequences: there might be further groups inside 52 | `d`, and each of these groups will give rise to further choices. 53 | 54 | This gives rise to an interesting language, where `group` is used to indicate 55 | a choice point, and the appearance of `break` is dependent upon the choice 56 | point(s) that appear higher up in the hierarchical structure of the document. 57 | For instance, the document: 58 | 59 | ```ocaml 60 | group (string "This" ^^ break 1 ^^ string "is" ^^ break 1 ^^ string "pretty.") 61 | ``` 62 | 63 | will be printed either on a single line, if it fits, or on three lines. It 64 | will not be printed on two lines: there is just one choice point, so either 65 | the two breakable spaces will be broken, or none of them will. By the way, 66 | this document can be abbreviated as follows: 67 | 68 | ```ocaml 69 | group (string "This" ^/^ string "is" ^/^ string "pretty.") 70 | ``` 71 | 72 | On the other hand, the document: 73 | 74 | ```ocaml 75 | string "This" ^^ 76 | group (break 1 ^^ string "is") ^^ 77 | group (break 1 ^^ string "pretty.") 78 | ``` 79 | 80 | could be printed on one, two, or three lines. There are two choice points, 81 | each of which influences one of the two breakable spaces. The two choices are 82 | independent of one another. Each of the words in the sentence `This is 83 | pretty.` will be printed on the current line if it fits, and on a new line 84 | otherwise. By the way, this document can be abbreviated as follows: 85 | 86 | ```ocaml 87 | flow (break 1) [ 88 | string "This" ; 89 | string "is" ; 90 | string "pretty." 91 | ] 92 | ``` 93 | 94 | There are more combinators, such as `nest`, which controls indentation, and 95 | it is relatively easy to roll your own combinators on top of those that are 96 | provided. 97 | 98 | One limitation of the library is that the document must be entirely built in 99 | memory before it is printed. So far, we have used the library in small- to 100 | medium-scale applications, and this has not been a problem. In principle, 101 | one could work around this limitation by adding a new document constructor 102 | whose argument is a suspended document computation. 103 | 104 | ## Acknowledgements 105 | 106 | The document language and the printing engine are inspired by Daan Leijen's 107 | [PPrint](http://www.cs.uu.nl/~daan/pprint.html) 108 | library, which itself is based on the ideas developed by Philip 109 | Wadler in the paper 110 | [A Prettier Printer](http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf). 111 | 112 | _**PPrint**_ was written by François Pottier and Nicolas Pouillard, with 113 | contributions by Yann Régis-Gianas, Gabriel Scherer, and Jonathan 114 | Protzenko. 115 | 116 | 117 | ## Installation 118 | 119 | The library is available online 120 | ([source code](http://gallium.inria.fr/~fpottier/pprint/pprint.tar.gz), 121 | [documentation](http://gallium.inria.fr/~fpottier/pprint/doc/)), 122 | and can also be installed via OPAM: just type `opam install pprint` 123 | if you already have a working OPAM installation. 124 | 125 | Have fun! Feel free to make comments, suggestions, and to let me know if 126 | and how you are using this library. 127 | 128 | -------------------------------------------------------------------------------- /benchmark_old/OldPPrintEngine.mli: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (** A pretty-printing engine and a set of basic document combinators. *) 15 | 16 | (** {1 Building documents} *) 17 | 18 | (** Documents must be built in memory before they are rendered. This may seem 19 | costly, but it is a simple approach, and works well. *) 20 | 21 | (** The following operations form a set of basic (low-level) combinators for 22 | building documents. On top of these combinators, higher-level combinators 23 | can be defined: see {!PPrintCombinators}. *) 24 | 25 | (** This is the abstract type of documents. *) 26 | type document 27 | 28 | (** The following basic (low-level) combinators allow constructing documents. *) 29 | 30 | (** [empty] is the empty document. *) 31 | val empty: document 32 | 33 | (** [char c] is a document that consists of the single character [c]. This 34 | character must not be a newline. *) 35 | val char: char -> document 36 | 37 | (** [string s] is a document that consists of the string [s]. This string must 38 | not contain a newline. *) 39 | val string: string -> document 40 | 41 | (** [substring s ofs len] is a document that consists of the portion of the 42 | string [s] delimited by the offset [ofs] and the length [len]. This 43 | portion must contain a newline. *) 44 | val substring: string -> int -> int -> document 45 | 46 | (** [fancystring s apparent_length] is a document that consists of the string 47 | [s]. This string must not contain a newline. The string may contain fancy 48 | characters: color escape characters, UTF-8 or multi-byte characters, 49 | etc. Thus, its apparent length (which measures how many columns the text 50 | will take up on screen) differs from its length in bytes. *) 51 | val fancystring: string -> int -> document 52 | 53 | (** [fancysubstring s ofs len apparent_length] is a document that consists of 54 | the portion of the string [s] delimited by the offset [ofs] and the length 55 | [len]. This portion must contain a newline. The string may contain fancy 56 | characters. *) 57 | val fancysubstring : string -> int -> int -> int -> document 58 | 59 | (** [utf8string s] is a document that consists of the UTF-8-encoded string [s]. 60 | This string must not contain a newline. *) 61 | val utf8string: string -> document 62 | 63 | (** [hardline] is a forced newline document. This document forces all enclosing 64 | groups to be printed in non-flattening mode. In other words, any enclosing 65 | groups are dissolved. *) 66 | val hardline: document 67 | 68 | (** [blank n] is a document that consists of [n] blank characters. *) 69 | val blank: int -> document 70 | 71 | (** [break n] is a document which consists of either [n] blank characters, 72 | when forced to display on a single line, or a single newline character, 73 | otherwise. Note that there is no choice at this point: choices are encoded 74 | by the [group] combinator. *) 75 | val break: int -> document 76 | 77 | (** [doc1 ^^ doc2] is the concatenation of the documents [doc1] and [doc2]. *) 78 | val (^^): document -> document -> document 79 | 80 | (** [nest j doc] is the document [doc], in which the indentation level has 81 | been increased by [j], that is, in which [j] blanks have been inserted 82 | after every newline character. Read this again: indentation is inserted 83 | after every newline character. No indentation is inserted at the beginning 84 | of the document. *) 85 | val nest: int -> document -> document 86 | 87 | (** [group doc] encodes a choice. If possible, then the entire document [group 88 | doc] is rendered on a single line. Otherwise, the group is dissolved, and 89 | [doc] is rendered. There might be further groups within [doc], whose 90 | presence will lead to further choices being explored. *) 91 | val group: document -> document 92 | 93 | (** [column f] is the document obtained by applying the function [f] to the 94 | current column number. This combinator allows making the construction of 95 | a document dependent on the current column number. *) 96 | val column: (int -> document) -> document 97 | 98 | (** [nesting f] is the document obtained by applying the function [f] to the 99 | current indentation level, that is, the number of indentation (blank) 100 | characters that were inserted at the beginning of the current line. *) 101 | val nesting: (int -> document) -> document 102 | 103 | (** [ifflat doc1 doc2] is rendered as [doc1] if part of a group that can be 104 | successfully flattened, and is rendered as [doc2] otherwise. Use this 105 | operation with caution. Because the pretty-printer is free to choose 106 | between [doc1] and [doc2], these documents should be semantically 107 | equivalent. *) 108 | val ifflat: document -> document -> document 109 | 110 | (** {1 Rendering documents} *) 111 | 112 | (**This signature describes the document renderers in a manner that 113 | is independent of the type of the output channel. *) 114 | module type RENDERER = sig 115 | 116 | (**The type of the output channel. *) 117 | type channel 118 | 119 | (**The type of documents. *) 120 | type document 121 | 122 | (** [pretty rfrac width channel document] pretty-prints the document 123 | [document] into the output channel [channel]. The parameter [width] is 124 | the maximum number of characters per line. The parameter [rfrac] is the 125 | ribbon width, a fraction relative to [width]. The ribbon width is the 126 | maximum number of non-indentation characters per line. *) 127 | val pretty: float -> int -> channel -> document -> unit 128 | 129 | (** [compact channel document] prints the document [document] to the output 130 | channel [channel]. No indentation is used. All newline instructions are 131 | respected, that is, no groups are flattened. *) 132 | val compact: channel -> document -> unit 133 | 134 | end 135 | 136 | (** This renderer sends its output into an output channel. *) 137 | module ToChannel : RENDERER 138 | with type channel = out_channel 139 | and type document = document 140 | 141 | (** This renderer sends its output into a memory buffer. *) 142 | module ToBuffer : RENDERER 143 | with type channel = Buffer.t 144 | and type document = document 145 | 146 | (** This renderer sends its output into a formatter channel. *) 147 | module ToFormatter : RENDERER 148 | with type channel = Format.formatter 149 | and type document = document 150 | -------------------------------------------------------------------------------- /mini/PPrintMini.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (* -------------------------------------------------------------------------- *) 15 | 16 | (* A type of integers with infinity. *) 17 | 18 | type requirement = 19 | int (* with infinity *) 20 | 21 | (* Infinity is encoded as [max_int]. *) 22 | 23 | let infinity : requirement = 24 | max_int 25 | 26 | (* Addition of integers with infinity. *) 27 | 28 | let (++) (x : requirement) (y : requirement) : requirement = 29 | if x = infinity || y = infinity then 30 | infinity 31 | else 32 | x + y 33 | 34 | (* Comparison of requirements is just ordinary comparison. *) 35 | 36 | (* -------------------------------------------------------------------------- *) 37 | 38 | (* The type of documents. See [PPrintEngine] for documentation. *) 39 | 40 | type document = 41 | | Empty 42 | | FancyString of string * int * int * int 43 | | Blank of int 44 | | IfFlat of document * document 45 | | HardLine 46 | | Cat of requirement * document * document 47 | | Nest of requirement * int * document 48 | | Group of requirement * document 49 | 50 | (* -------------------------------------------------------------------------- *) 51 | 52 | (* Retrieving or computing the space requirement of a document. *) 53 | 54 | let rec requirement = function 55 | | Empty -> 56 | 0 57 | | FancyString (_, _, _, len) 58 | | Blank len -> 59 | len 60 | | IfFlat (doc1, _) -> 61 | requirement doc1 62 | | HardLine -> 63 | infinity 64 | | Cat (req, _, _) 65 | | Nest (req, _, _) 66 | | Group (req, _) -> 67 | req 68 | 69 | (* -------------------------------------------------------------------------- *) 70 | 71 | (* Document constructors. *) 72 | 73 | let empty = 74 | Empty 75 | 76 | let fancysubstring s ofs len apparent_length = 77 | if len = 0 then 78 | empty 79 | else 80 | FancyString (s, ofs, len, apparent_length) 81 | 82 | let fancystring s apparent_length = 83 | fancysubstring s 0 (String.length s) apparent_length 84 | 85 | let utf8_length s = 86 | let rec length_aux s c i = 87 | if i >= String.length s then c else 88 | let n = Char.code (String.unsafe_get s i) in 89 | let k = 90 | if n < 0x80 then 1 else 91 | if n < 0xe0 then 2 else 92 | if n < 0xf0 then 3 else 4 93 | in 94 | length_aux s (c + 1) (i + k) 95 | in 96 | length_aux s 0 0 97 | 98 | let utf8string s = 99 | fancystring s (utf8_length s) 100 | 101 | let char c = 102 | assert (c <> '\n'); 103 | fancystring (String.make 1 c) 1 104 | 105 | let space = 106 | char ' ' 107 | 108 | let hardline = 109 | HardLine 110 | 111 | let blank n = 112 | match n with 113 | | 0 -> 114 | empty 115 | | 1 -> 116 | space 117 | | _ -> 118 | Blank n 119 | 120 | let ifflat doc1 doc2 = 121 | match doc1 with 122 | | IfFlat (doc1, _) 123 | | doc1 -> 124 | IfFlat (doc1, doc2) 125 | 126 | let internal_break i = 127 | ifflat (blank i) hardline 128 | 129 | let break0 = 130 | internal_break 0 131 | 132 | let break1 = 133 | internal_break 1 134 | 135 | let break i = 136 | match i with 137 | | 0 -> 138 | break0 139 | | 1 -> 140 | break1 141 | | _ -> 142 | internal_break i 143 | 144 | let (^^) x y = 145 | match x, y with 146 | | Empty, _ -> 147 | y 148 | | _, Empty -> 149 | x 150 | | _, _ -> 151 | Cat (requirement x ++ requirement y, x, y) 152 | 153 | let nest i x = 154 | assert (i >= 0); 155 | Nest (requirement x, i, x) 156 | 157 | let group x = 158 | let req = requirement x in 159 | if req = infinity then 160 | x 161 | else 162 | Group (req, x) 163 | 164 | (* -------------------------------------------------------------------------- *) 165 | 166 | (* Printing blank space (indentation characters). *) 167 | 168 | let blank_length = 169 | 80 170 | 171 | let blank_buffer = 172 | String.make blank_length ' ' 173 | 174 | let rec blanks output n = 175 | if n <= 0 then 176 | () 177 | else if n <= blank_length then 178 | Buffer.add_substring output blank_buffer 0 n 179 | else begin 180 | Buffer.add_substring output blank_buffer 0 blank_length; 181 | blanks output (n - blank_length) 182 | end 183 | 184 | (* -------------------------------------------------------------------------- *) 185 | 186 | (* The rendering engine maintains the following internal state. *) 187 | 188 | (* For simplicity, the ribbon width is considered equal to the line 189 | width; in other words, there is no ribbon width constraint. *) 190 | 191 | (* For simplicity, the output channel is required to be an OCaml buffer. 192 | It is stored within the [state] record. *) 193 | 194 | type state = 195 | { 196 | (* The line width. *) 197 | width: int; 198 | (* The current column. *) 199 | mutable column: int; 200 | (* The output buffer. *) 201 | output: Buffer.t; 202 | } 203 | 204 | (* -------------------------------------------------------------------------- *) 205 | 206 | (* For simplicity, the rendering engine is *not* in tail-recursive style. *) 207 | 208 | let rec pretty state (indent : int) (flatten : bool) doc = 209 | match doc with 210 | 211 | | Empty -> 212 | () 213 | 214 | | FancyString (s, ofs, len, apparent_length) -> 215 | Buffer.add_substring state.output s ofs len; 216 | state.column <- state.column + apparent_length 217 | 218 | | Blank n -> 219 | blanks state.output n; 220 | state.column <- state.column + n 221 | 222 | | HardLine -> 223 | assert (not flatten); 224 | Buffer.add_char state.output '\n'; 225 | blanks state.output indent; 226 | state.column <- indent 227 | 228 | | IfFlat (doc1, doc2) -> 229 | pretty state indent flatten (if flatten then doc1 else doc2) 230 | 231 | | Cat (_, doc1, doc2) -> 232 | pretty state indent flatten doc1; 233 | pretty state indent flatten doc2 234 | 235 | | Nest (_, j, doc) -> 236 | pretty state (indent + j) flatten doc 237 | 238 | | Group (req, doc) -> 239 | let flatten = flatten || state.column ++ req <= state.width in 240 | pretty state indent flatten doc 241 | 242 | (* -------------------------------------------------------------------------- *) 243 | 244 | (* The engine's entry point. *) 245 | 246 | let pretty width doc = 247 | let output = Buffer.create 512 in 248 | let state = { width; column = 0; output } in 249 | pretty state 0 false doc; 250 | Buffer.contents output 251 | -------------------------------------------------------------------------------- /benchmark_old/PPrintBench.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (* ------------------------------------------------------------------------- *) 15 | 16 | (* The following signature is common to the old and new engines. *) 17 | 18 | module type ENGINE = sig 19 | 20 | type document 21 | val empty: document 22 | val char: char -> document 23 | val string: string -> document 24 | val substring: string -> int -> int -> document 25 | val fancystring: string -> int -> document 26 | val fancysubstring : string -> int -> int -> int -> document 27 | val utf8string: string -> document 28 | val hardline: document 29 | val blank: int -> document 30 | val break: int -> document 31 | val (^^): document -> document -> document 32 | val nest: int -> document -> document 33 | val group: document -> document 34 | val ifflat: document -> document -> document 35 | 36 | module ToBuffer : PPrint.RENDERER 37 | with type channel = Buffer.t 38 | and type document = document 39 | 40 | end 41 | 42 | (* ------------------------------------------------------------------------- *) 43 | 44 | (* We use our own abstract syntax of documents. We produce random documents 45 | in this syntax first, then (as part of the timed test) translate them to 46 | the engine's syntax. This allows timing the engine's document construction 47 | code too. *) 48 | 49 | type mydoc = 50 | | MyEmpty 51 | | MyChar of char 52 | | MyString of string 53 | | MySubString of string * int * int 54 | | MyUtf8String of string 55 | | MyHardLine 56 | | MyBlank of int 57 | | MyBreak of int 58 | | MyCat of mydoc * mydoc 59 | | MyNest of int * mydoc 60 | | MyGroup of mydoc 61 | | MyIfFlat of mydoc * mydoc 62 | 63 | (* ------------------------------------------------------------------------- *) 64 | 65 | (* [measure v] measures the size of an OCaml value [v] in bytes. *) 66 | 67 | let measure v = 68 | (* String.length (Marshal.to_string v []) *) 69 | Size.size_b v 70 | 71 | (* ------------------------------------------------------------------------- *) 72 | 73 | (* [split n] produces two numbers [n1] and [n2] comprised between [0] and [n] 74 | (inclusive) whose sum is [n]. *) 75 | 76 | let split n = 77 | let n1 = Random.int (n + 1) in 78 | let n2 = n - n1 in 79 | n1, n2 80 | 81 | (* [choose xs] randomly and uniformly chooses between the elements of the 82 | array [xs]. *) 83 | 84 | let choose xs = 85 | Array.unsafe_get xs (Random.int (Array.length xs)) 86 | 87 | (* [pick] is analogous, but each element comes with a relative integer weight. *) 88 | 89 | let pick wxs = 90 | (* Compute the total weight. *) 91 | let weight = List.fold_left (fun weight (w, _) -> weight + w) 0 wxs in 92 | assert (weight > 0); 93 | (* Pick a random integer between 0 and the total weight. *) 94 | let i = Random.int weight in 95 | (* Find the corresponding element. *) 96 | let rec loop i wxs = 97 | match wxs with 98 | | [] -> 99 | assert false 100 | | (w, x) :: wxs -> 101 | if i < w then x else loop (i - w) wxs 102 | in 103 | loop i wxs 104 | 105 | (* ------------------------------------------------------------------------- *) 106 | 107 | (* A random document generator. *) 108 | 109 | let leaf = 110 | [| 111 | MyChar 'c'; 112 | MyString "hello"; 113 | MySubString ("the cat", 4, 3); 114 | MyUtf8String "étoile"; 115 | MyHardLine; 116 | MyBlank 2; 117 | MyBreak 2 118 | |] 119 | 120 | let rec random (n : int) : mydoc = 121 | (* If the budget is 0, produce an empty document. *) 122 | if n = 0 then 123 | MyEmpty 124 | (* If the budget is 1, produce a leaf. *) 125 | else if n = 1 then 126 | choose leaf 127 | (* Otherwise, decrement the budget, and produce a node of nonzero 128 | arity, spending the rest of the budget on the children. *) 129 | else 130 | let n = n - 1 in 131 | Lazy.force (pick [ 132 | 10, lazy (let n1, n2 = split n in MyCat (random n1, random n2)); 133 | 2, lazy (MyNest (2, random n)); 134 | 10, lazy (MyGroup (random n)); 135 | 2, lazy (let n1, n2 = split n in MyIfFlat (random n1, random n2)) 136 | ]) 137 | 138 | (* ------------------------------------------------------------------------- *) 139 | 140 | (* Building documents for a particular engine. *) 141 | 142 | module Build (E : ENGINE) = struct 143 | 144 | open E 145 | 146 | let rec build (doc : mydoc) : document = 147 | match doc with 148 | | MyEmpty -> 149 | empty 150 | | MyChar c -> 151 | char c 152 | | MyString s -> 153 | string s 154 | | MySubString (s, ofs, len) -> 155 | substring s ofs len 156 | | MyUtf8String s -> 157 | utf8string s 158 | | MyHardLine -> 159 | hardline 160 | | MyBlank b -> 161 | blank b 162 | | MyBreak b -> 163 | break b 164 | | MyCat (doc1, doc2) -> 165 | build doc1 ^^ build doc2 166 | | MyNest (i, doc) -> 167 | nest i (build doc) 168 | | MyGroup doc -> 169 | group (build doc) 170 | | MyIfFlat (doc1, doc2) -> 171 | ifflat (build doc1) (build doc2) 172 | 173 | end 174 | 175 | (* ------------------------------------------------------------------------- *) 176 | 177 | (* The rendering parameters. *) 178 | 179 | let rfrac = 180 | 0.8 181 | 182 | let width = 183 | 80 184 | 185 | (* ------------------------------------------------------------------------- *) 186 | 187 | (* Testing an engine, alone. *) 188 | 189 | module Test1 (E : ENGINE) = struct 190 | 191 | open E 192 | 193 | (* The size of the randomly generated documents. *) 194 | let n = 195 | 1000 196 | 197 | (* The number of runs. *) 198 | let runs = 199 | 10000 200 | 201 | let () = 202 | let module B = Build(E) in 203 | let s = ref 0 in 204 | for _r = 1 to runs do 205 | let document = B.build (random n) in 206 | s := !s + measure document; 207 | let buffer = Buffer.create 32768 in 208 | ToBuffer.pretty rfrac width buffer document; 209 | let buffer = Buffer.create 32768 in 210 | ToBuffer.compact buffer document 211 | done; 212 | Printf.printf "Test 1: success.\n%!"; 213 | let average = float_of_int !s /. float_of_int runs in 214 | Printf.printf "Average document size: %d bytes.\n%!" (truncate average) 215 | 216 | end 217 | 218 | (* ------------------------------------------------------------------------- *) 219 | 220 | (* Testing two engines and comparing their output. *) 221 | 222 | module Test2 (E1 : ENGINE) (E2 : ENGINE) = struct 223 | 224 | (* The size of the randomly generated documents. *) 225 | let n = 226 | 1000 227 | 228 | (* The number of runs. *) 229 | let runs = 230 | 10000 231 | 232 | let () = 233 | let module B1 = Build(E1) in 234 | let module B2 = Build(E2) in 235 | for _r = 1 to runs do 236 | let document = random n in 237 | let document1 = B1.build document in 238 | let document2 = B2.build document in 239 | let buffer1 = Buffer.create 32768 in 240 | E1.ToBuffer.pretty rfrac width buffer1 document1; 241 | let buffer2 = Buffer.create 32768 in 242 | E2.ToBuffer.pretty rfrac width buffer2 document2; 243 | assert (Buffer.contents buffer1 = Buffer.contents buffer2) 244 | done; 245 | Printf.printf "Test 2: success.\n%!" 246 | 247 | end 248 | 249 | (* ------------------------------------------------------------------------- *) 250 | 251 | (* Timing an engine, alone. *) 252 | 253 | module Time1 (E : ENGINE) (D : sig val n: int val runs: int val docs : mydoc array end) = struct 254 | 255 | open E 256 | open D 257 | 258 | let gc = 259 | false 260 | 261 | let time f x = 262 | if gc then 263 | Gc.major(); 264 | let start = Unix.gettimeofday() in 265 | let y = f x in 266 | let finish = Unix.gettimeofday() in 267 | y, finish -. start 268 | 269 | let () = 270 | let module B = Build(E) in 271 | Printf.printf "Time: building documents...\n%!"; 272 | let docs, duration = time (fun () -> Array.map B.build docs) () in 273 | Printf.printf "Time: built %d documents of size %d in %.2f seconds.\n%!" runs n duration; 274 | let size = Array.fold_left (fun accu doc -> accu + measure doc) 0 docs in 275 | let average = float_of_int size /. float_of_int runs in 276 | Printf.printf "Average document size: %d bytes.\n%!" (truncate average); 277 | let buffer = Buffer.create 32768 in 278 | Printf.printf "Time: rendering documents...\n%!"; 279 | let (), duration = time (fun () -> 280 | Array.iter (fun document -> 281 | ToBuffer.pretty rfrac width buffer document; 282 | Buffer.clear buffer 283 | ) docs 284 | ) () in 285 | Printf.printf "Time: rendered %d documents of size %d in %.2f seconds.\n%!" runs n duration 286 | 287 | end 288 | 289 | (* ------------------------------------------------------------------------- *) 290 | 291 | (* Main. *) 292 | 293 | let test1 () = 294 | (* Testing both engines on the same set of documents. *) 295 | Printf.printf "Testing old engine...\n"; 296 | let state = Random.get_state() in 297 | let module T = Test1(OldPPrintEngine) in 298 | Random.set_state state; 299 | Printf.printf "Testing new engine...\n"; 300 | let module T = Test1(PPrintEngine) in 301 | () 302 | 303 | let test2 () = 304 | (* Comparing the two engines. *) 305 | Printf.printf "Comparing old and new engines...\n"; 306 | let module T = Test2(OldPPrintEngine)(PPrintEngine) in 307 | () 308 | 309 | type engine = Old | New 310 | 311 | let test3 engine = 312 | (* The timing test. Best to run it separately on each engine 313 | (in two different processes), as there are otherwise GC 314 | effects. If a major GC is triggered, the timing test is 315 | severely affected. *) 316 | let module D = struct 317 | (* The size of the randomly generated documents. *) 318 | let n = 10000 319 | (* The number of runs. *) 320 | let runs = 1000 321 | let () = Printf.printf "Generating %d documents of size %d...\n%!" runs n 322 | let docs = Array.init runs (fun _ -> random n) 323 | end in 324 | match engine with 325 | | Old -> 326 | Printf.printf "Timing old engine...\n"; 327 | let module T = Time1(OldPPrintEngine)(D) in 328 | () 329 | | New -> 330 | Printf.printf "Timing new engine...\n"; 331 | let module T = Time1(PPrintEngine)(D) in 332 | () 333 | 334 | let () = 335 | (* The comparison between the old and new engines is now disabled, 336 | because the new engine removes trailing blank characters on every 337 | line, whereas the old engine does not. *) 338 | if false then test2(); 339 | test3 New 340 | -------------------------------------------------------------------------------- /src/index.mld: -------------------------------------------------------------------------------- 1 | {0 PPrint} 2 | 3 | [PPrint] is an OCaml library for {b pretty-printing textual 4 | documents}. It takes care of {b indentation and line breaks}, and is 5 | typically used to {b pretty-print code}. 6 | 7 | {1 API Reference} 8 | 9 | An experienced user may wish to jump directly to a section of the 10 | API documentation: 11 | - {{!PPrint.building}building documents,} 12 | - {{!PPrint.inspecting}inspecting documents,} 13 | - {{!PPrint.rendering}rendering documents,} 14 | - {{!PPrint.defining}defining custom documents,} 15 | - {{!PPrint.combinators}high-level combinators,} 16 | - combinators for {{!PPrint.OCaml}printing OCaml values}; 17 | these reside in the submodule {!PPrint.OCaml}. 18 | 19 | {1 Core Combinators} 20 | 21 | At the heart of [PPrint] is a little {b domain-specific language of 22 | documents}. This language has a well-defined semantics, which the printing 23 | engine implements. The language and its semantics rest upon a small number 24 | of fundamental concepts. 25 | 26 | There are combinators for creating {b atomic documents}. For instance, the 27 | {{!PPrint.string}string} combinator turns an OCaml string (which must not 28 | contain any newline character) into a document. Thus, 29 | 30 | {[ 31 | string "hello" 32 | ]} 33 | 34 | is a simple, unbreakable document. The {{!PPrint.utf8string}utf8string} 35 | combinator is analogous, and should be preferred when working with non-ASCII 36 | strings. The {{!PPrint.utf8format}utf8format} combinator provides a 37 | convenient [sprintf]-style API for constructing a complex string and 38 | turning it into an atomic document. 39 | 40 | There is a {b concatenation} operator {{!PPrint.(^^)}(^^)}, which joins 41 | two documents. For instance, 42 | 43 | {[ 44 | string "hello" ^^ string "world" 45 | ]} 46 | 47 | is a composite document. It is in fact equivalent to [string "helloworld"]. 48 | 49 | A somewhat more interesting combinator is the {b breakable blank} combinator 50 | {{!PPrint.break}break}. This combinator expects a nonnegative integer 51 | argument, the width of the desired breakable blank. If [break n] is printed 52 | in flat mode, it produces [n] blank characters; if it is printed in normal 53 | mode, it produces one newline character. 54 | 55 | As suggested by the previous sentence, there are {b two printing modes}, 56 | namely {b flat mode} and {b normal mode}. The printing engine goes back and 57 | forth between these two modes. Exactly where and how the printing engine 58 | switches from one mode to the other is controlled by the next combinator. 59 | 60 | The {b grouping combinator}, {{!PPrint.group}group}, introduces {b a choice 61 | between flat mode and normal mode}. It is a document transformer: if [d] is 62 | a document, then [group d] is a document. When the printing engine 63 | encounters [group d], two possibilities arise. The first possibility is to 64 | print all of [d] on a single line. This is known as flat mode. The engine 65 | tries this first (ignoring all {{!PPrint.group}group} combinators inside 66 | [d]). If it succeeds, great. If it fails, by lack of space on the current 67 | line, then the engine reverts to the second possibility, which is to 68 | dissolve the group and print the bare document [d] in normal mode. This has 69 | subtle consequences: there might be further groups inside [d], and each of 70 | these groups gives rise to further choices. 71 | 72 | At each group, the choice is resolved in an efficient way. No backtracking 73 | is required. The ideal width of every document is computed (in a bottom-up 74 | manner) when documents are constructed. This allows every choice to be 75 | resolved in constant time. The time complexity of building and rendering 76 | documents is linear in the size of the document. 77 | 78 | {1 Examples} 79 | 80 | The interplay of {{!PPrint.break}break} and {{!PPrint.group}group} gives 81 | rise to an interesting language, where {{!PPrint.group}group} is used to 82 | indicate a choice point, and the appearance of {{!PPrint.break}break} is 83 | dependent upon the choice points that appear higher up in the hierarchical 84 | structure of the document. For instance, the document: 85 | 86 | {[ 87 | group (string "This" ^^ break 1 ^^ string "is" ^^ break 1 ^^ string "pretty.") 88 | ]} 89 | 90 | is printed either on a single line, if it fits, or on three lines. It cannot 91 | be printed on two lines: there is just one choice point, so either the two 92 | breakable blanks are broken, or none of them is. By the way, this document 93 | can be abbreviated as follows: 94 | 95 | {[ 96 | group (string "This" ^/^ string "is" ^/^ string "pretty.") 97 | ]} 98 | 99 | On the other hand, the document: 100 | 101 | {[ 102 | string "This" ^^ 103 | group (break 1 ^^ string "is") ^^ 104 | group (break 1 ^^ string "pretty.") 105 | ]} 106 | 107 | can be printed on one, two, or three lines. There are two choice points, 108 | each of which influences one of the two breakable blanks. The two choices 109 | are independent of one another. Each of the words in the sentence [This is 110 | pretty.] is printed on the current line if it fits, and on a new line 111 | otherwise. By the way, this document can be abbreviated as follows: 112 | 113 | {[ 114 | flow (break 1) [ 115 | string "This"; 116 | string "is"; 117 | string "pretty." 118 | ] 119 | ]} 120 | 121 | or as follows: 122 | 123 | {[ 124 | flow_map (break 1) string [ "This"; "is"; "pretty." ] 125 | ]} 126 | 127 | {1 More Core Combinators} 128 | 129 | As noted earlier, the string that is supplied to {{!PPrint.string}string}, 130 | {{!PPrint.utf8string}utf8string}, or {{!PPrint.utf8format}utf8format} must 131 | not contain any newline characters. If one wishes to impose a line break, 132 | one must use the {b forced newline} combinator {{!PPrint.hardline}hardline}. 133 | 134 | Whereas {{!PPrint.group}group} introduces a choice between flat mode and 135 | normal mode, the {b conditional construct} {{!PPrint.ifflat}ifflat} allows 136 | testing whether the printing engine is currently in flat mode or in normal 137 | mode. The document [ifflat doc1 doc2] is rendered as [doc1] if the engine is 138 | currently in flat mode, and as [doc2] if the engine is currently in normal 139 | mode. This is a powerful combinator; however, one must be aware that 140 | {i both branches are evaluated and constructed in memory} during the 141 | document construction phase. So, if used naively, 142 | {{!PPrint.ifflat}ifflat} can cause exponential time and space usage. 143 | To avoid this danger, {{!PPrint.ifflat}ifflat} should 144 | typically be applied to documents of constant size. 145 | 146 | The {b blank combinator} {{!PPrint.blank}blank} is analogous to 147 | {{!PPrint.break}break}, but produces non-breakable blank characters. A blank 148 | character is like an ordinary ASCII space character [string " "], except 149 | that blank characters at the end of a line are automatically suppressed. 150 | Thus, the printing engine guarantees that no trailing blank characters are 151 | ever produced. 152 | 153 | To illustrate the power of these combinators, let us reveal that 154 | {{!PPrint.break}break} is in reality not a primitive combinator: it is 155 | defined in terms of {{!PPrint.hardline}hardline}, {{!PPrint.blank}blank}, 156 | and {{!PPrint.ifflat}ifflat}. A possible definition of [break 1] is [ifflat 157 | (blank 1) hardline]. 158 | 159 | The {b nesting} combinator {{!PPrint.nest}nest} deals with indentation. At 160 | every time, the printing engine maintains a {b current indentation level}, 161 | which is a nonnegative integer. The current indentation level is initially 162 | zero. To render the document [nest 2 d], the printing engine temporarily 163 | increases the current indentation level by 2, renders the document [d], then 164 | restores the previous indentation level. The effect of the current 165 | indentation level is as follows: {b every time a newline character is 166 | emitted, it is immediately followed by [n] blank characters}, where [n] is 167 | the current indentation level. 168 | 169 | To illustrate the use of indentation, let us look at this document: 170 | 171 | {[ 172 | group ( 173 | string "begin" ^^ 174 | nest 2 (break 1 ^^ string "work") ^^ 175 | break 1 ^^ string "end" 176 | ) 177 | ]} 178 | 179 | Although this document looks somewhat complicated, understanding its 180 | behavior is relatively easy, because there is only one 181 | {{!PPrint.group}group} combinator in it. This document can be printed in one 182 | of two ways. If it fits on the current line, then the content of the group 183 | is rendered in flat mode: [break 1] becomes equivalent to [blank 1], and 184 | (because no newline characters are emitted) [nest 2] has no effect. The 185 | document is then rendered as follows: 186 | 187 | {[ 188 | begin work end 189 | ]} 190 | 191 | If the document does {i not} fit on the current line, then the group is 192 | dissolved, and [break 1] becomes equivalent to [hardline]. Thus, the 193 | document becomes equivalent to: 194 | 195 | {[ 196 | string "begin" ^^ 197 | nest 2 (hardline ^^ string "work") ^^ 198 | hardline ^^ string "end" 199 | ]} 200 | 201 | Thanks to the {{!PPrint.nest}nest} combinator, the first 202 | {{!PPrint.hardline}hardline} is immediately followed with two blank 203 | characters, whereas the second {{!PPrint.hardline}hardline} is not. 204 | The document is then rendered as follows: 205 | 206 | {[ 207 | begin 208 | work 209 | end 210 | ]} 211 | 212 | The {b alignment} combinator {{!PPrint.align}align} can be used to change 213 | the current indentation level in a more subtle way. The effect of this 214 | combinator is to set the current indentation level to the current column. To 215 | understand what this means, let us look at this document: 216 | 217 | {[ 218 | string "please" ^/^ align (group (string "align" ^/^ string "here")) 219 | ]} 220 | 221 | If this document fits on the current line, then neither 222 | {{!PPrint.align}align} nor {{!PPrint.group}group} have any effect, so the 223 | document is rendered as follows: 224 | 225 | {[ 226 | please align here 227 | ]} 228 | 229 | If the document does {i not} fit on the current line, then the group is 230 | dissolved. The second concatenation operator {{!PPrint.(^/^)}(^/^)} inserts 231 | a breakable blank [break 1], which is in this case is equivalent to 232 | [hardline]. Because the current indentation level is set by 233 | {{!PPrint.align}align} to the column that follows "[please ]", the document 234 | is rendered as follows: 235 | 236 | {[ 237 | please align 238 | here 239 | ]} 240 | 241 | This concludes our review of [PPrint]'s core combinators. Not every 242 | combinator has been mentioned here; for further details, please consult 243 | {{!PPrint.building}the complete list} of the core combinators for building 244 | documents. 245 | 246 | On top of the core combinators, it is up to the user of the library to 247 | define higher-level combinators that are more convenient or better suited to 248 | a particular use case. [PPrint] itself comes with {{!PPrint.combinators}a 249 | collection of high-level combinators}, and the submodule {!PPrint.OCaml} 250 | offers a collection of combinators for {{!PPrint.OCaml}printing OCaml 251 | values}. These collections are not as complete and thoughtfully designed as 252 | they could be. They are subject to change in the future. 253 | 254 | {1 Rendering Documents} 255 | 256 | The submodules {!PPrint.ToChannel}, {!PPrint.ToBuffer}, and 257 | {!PPrint.ToFormatter} give access to the printing engine, 258 | and send their output respectively to an output channel 259 | of type [out_channel], to a buffer of type [Buffer.t], 260 | and to a formatter channel of type [Format.formatter]. 261 | 262 | Each of these submodules offers a choice between two printing engines. The 263 | {{!PPrint.ToChannel.pretty}pretty} printing engine should be preferred in 264 | most situations; it attempts to respects the maximum line width and ribbon 265 | width specified by the user. The {{!PPrint.ToChannel.compact}compact} 266 | printing engine can be used when the readability of the output does not 267 | matter: it assumes a maximum line width of zero (so it never flattens a 268 | group) and does not emit any indentation characters. 269 | 270 | {1 Defining Custom Documents} 271 | 272 | It is possible to extend [PPrint] with custom document constructors, 273 | provided they meet the expectations of the printing engine. In short, the {b 274 | custom document} combinator {{!PPrint.val-custom}custom} expects an object 275 | of class {{!PPrint.class-type-custom}custom}. This object must provide three 276 | methods. The method [requirement] must compute the ideal width of the custom 277 | document. The methods [pretty] and [compact] must render the custom 278 | document. For this purpose, they have access to the {{!PPrint.output}output 279 | channel} and to the {{!PPrint.state}state} of the printing engine. For more 280 | details, see {{!PPrint.defining}Defining Custom Documents}. 281 | 282 | {1 History and Acknowledgements} 283 | 284 | The document language and the printing engine are inspired by Daan Leijen's 285 | {{:https://hackage.haskell.org/package/wl-pprint}wl-pprint} library, which 286 | itself is based on the ideas developed by Philip Wadler in the paper 287 | {{:http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf}A 288 | Prettier Printer}. This Haskell library exploits laziness to achieve a very 289 | low memory requirement: the entire document never needs to reside in memory. 290 | [PPrint] achieves greater simplicity and possibly higher throughput by 291 | requiring the entire document to be built in memory before it is printed. 292 | 293 | [PPrint] was written by {{:http://cambium.inria.fr/~fpottier/}François 294 | Pottier} and Nicolas Pouillard, with contributions by Yann Régis-Gianas, 295 | Gabriel Scherer, Jonathan Protzenko, and Thomas Refis. 296 | -------------------------------------------------------------------------------- /src/PPrint.mli: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | include module type of PPrintEngine (** @inline *) 15 | 16 | (** {1:combinators High-Level Combinators} *) 17 | 18 | (** {2 Single Characters} *) 19 | 20 | (**The following atomic documents consist of a single character. Each of 21 | them is a synonym for the application of {!char} to some constant 22 | character. For instance, {!lparen} is a synonym for [char '(']. *) 23 | 24 | val lparen: document 25 | val rparen: document 26 | val langle: document 27 | val rangle: document 28 | val lbrace: document 29 | val rbrace: document 30 | val lbracket: document 31 | val rbracket: document 32 | val squote: document 33 | val dquote: document 34 | val bquote: document 35 | val semi: document 36 | val colon: document 37 | val comma: document 38 | val dot: document 39 | val sharp: document 40 | val slash: document 41 | val backslash: document 42 | val equals: document 43 | val qmark: document 44 | val tilde: document 45 | val at: document 46 | val percent: document 47 | val dollar: document 48 | val caret: document 49 | val ampersand: document 50 | val star: document 51 | val plus: document 52 | val minus: document 53 | val underscore: document 54 | val bang: document 55 | val bar: document 56 | 57 | (** {2 Delimiters} *) 58 | 59 | (**[precede l x] is [l ^^ x]. *) 60 | val precede: document -> document -> document 61 | 62 | (**[terminate r x] is [x ^^ r]. *) 63 | val terminate: document -> document -> document 64 | 65 | (**[enclose l r x] is [l ^^ x ^^ r]. *) 66 | val enclose: document -> document -> document -> document 67 | 68 | (**The following combinators enclose a document within a pair of delimiters. 69 | They are partial applications of [enclose]. No whitespace or line break 70 | is introduced. *) 71 | 72 | val squotes: document -> document 73 | val dquotes: document -> document 74 | val bquotes: document -> document 75 | val braces: document -> document 76 | val parens: document -> document 77 | val angles: document -> document 78 | val brackets: document -> document 79 | 80 | (** {2 Repetition} *) 81 | 82 | (**[twice doc] is the document obtained by concatenating two copies of 83 | the document [doc]. *) 84 | val twice: document -> document 85 | 86 | (**[repeat n doc] is the document obtained by concatenating [n] copies of 87 | the document [doc]. *) 88 | val repeat: int -> document -> document 89 | 90 | (** {2 Lists and Options} *) 91 | 92 | (**[concat docs] is the concatenation of the documents in the list [docs]. *) 93 | val concat: document list -> document 94 | 95 | (**[separate sep docs] is the concatenation of the documents in the list 96 | [docs]. The separator [sep] is inserted between every two adjacent 97 | documents. *) 98 | val separate: document -> document list -> document 99 | 100 | (**[concat_map f xs] is equivalent to [concat (List.map f xs)]. *) 101 | val concat_map: ('a -> document) -> 'a list -> document 102 | 103 | (**[separate_map sep f xs] is equivalent to [separate sep (List.map f xs)]. *) 104 | val separate_map: document -> ('a -> document) -> 'a list -> document 105 | 106 | (**[separate2 sep last_sep docs] is the concatenation of the documents in 107 | the list [docs]. The separator [sep] is inserted between every two 108 | adjacent documents, except between the last two documents, where the 109 | separator [last_sep] is used instead. *) 110 | val separate2: document -> document -> document list -> document 111 | 112 | (**[optional f None] is the empty document. [optional f (Some x)] is 113 | the document [f x]. *) 114 | val optional: ('a -> document) -> 'a option -> document 115 | 116 | (** {2 Text} *) 117 | 118 | (**[lines s] is the list of documents obtained by splitting [s] at newline 119 | characters, and turning each line into a document via [substring]. This 120 | code is not UTF-8 aware. *) 121 | val lines: string -> document list 122 | 123 | (**[arbitrary_string s] is equivalent to [separate (break 1) (lines s)]. 124 | It is analogous to [string s], but is valid even if the string [s] 125 | contains newline characters. *) 126 | val arbitrary_string: string -> document 127 | 128 | (**[words s] is the list of documents obtained by splitting [s] at whitespace 129 | characters, and turning each word into a document via [substring]. All 130 | whitespace is discarded. This code is not UTF-8 aware. *) 131 | val words: string -> document list 132 | 133 | (**[split ok s] splits the string [s] before and after every occurrence of a 134 | character that satisfies the predicate [ok]. The substrings thus obtained 135 | are turned into documents, and a list of documents is returned. No 136 | information is lost: the concatenation of the documents yields the 137 | original string. This code is not UTF-8 aware. *) 138 | val split: (char -> bool) -> string -> document list 139 | 140 | (**[flow sep docs] separates the documents in the list [docs] with the 141 | separator [sep] and arranges for a new line to begin whenever a document 142 | does not fit on the current line. This is useful for typesetting 143 | free-flowing, ragged-right text. A typical choice of [sep] is [break b], 144 | where [b] is the number of spaces that must be inserted between two 145 | consecutive words (when displayed on the same line). *) 146 | val flow: document -> document list -> document 147 | 148 | (**[flow_map sep f docs] is equivalent to [flow sep (List.map f docs)]. *) 149 | val flow_map: document -> ('a -> document) -> 'a list -> document 150 | 151 | (**[url s] is a possible way of displaying the URL [s]. A potential line 152 | break is inserted immediately before and immediately after every slash 153 | and dot character. *) 154 | val url: string -> document 155 | 156 | (** {2 Alignment and Indentation} *) 157 | 158 | (**[hang n doc] is analogous to [align], but additionally indents all lines, 159 | except the first one, by [n]. Thus, the text in the box forms a hanging 160 | indent. *) 161 | val hang: int -> document -> document 162 | 163 | (**[prefix n b left right] has the following flat layout: 164 | {[ 165 | left right 166 | ]} 167 | and the following non-flat layout: 168 | {[ 169 | left 170 | right 171 | ]} 172 | The parameter [n] controls the nesting of [right] (when not flat). 173 | The parameter [b] controls the number of spaces between [left] and [right] 174 | (when flat). *) 175 | val prefix: int -> int -> document -> document -> document 176 | 177 | (**[jump n b right] is equivalent to [prefix n b empty right]. *) 178 | val jump: int -> int -> document -> document 179 | 180 | (**[infix n b middle left right] has the following flat layout: 181 | {[ 182 | left middle right 183 | ]} 184 | and the following non-flat layout: 185 | {[ 186 | left middle 187 | right 188 | ]} 189 | The parameter [n] controls the nesting of [right] (when not flat). 190 | The parameter [b] controls the number of spaces between [left] and [middle] 191 | (always) and between [middle] and [right] (when flat). *) 192 | val infix: int -> int -> document -> document -> document -> document 193 | 194 | (**[surround n b opening contents closing] has the following flat layout: 195 | {[ 196 | opening contents closing 197 | ]} 198 | and the following non-flat layout: 199 | {[ 200 | opening 201 | contents 202 | closing 203 | ]} 204 | The parameter [n] controls the nesting of [contents] (when not flat). 205 | The parameter [b] controls the number of spaces between [opening] and 206 | [contents] and between [contents] and [closing] (when flat). 207 | *) 208 | val surround: int -> int -> document -> document -> document -> document 209 | 210 | (**[soft_surround] is analogous to [surround], but involves more than one 211 | group, so it offers possibilities other than the completely flat layout 212 | (where [opening], [contents], and [closing] appear on a single line) and 213 | the completely developed layout (where [opening], [contents], and 214 | [closing] appear on separate lines). It tries to place the beginning of 215 | [contents] on the same line as [opening], and to place [closing] on the 216 | same line as the end of [contents], if possible. *) 217 | val soft_surround: int -> int -> document -> document -> document -> document 218 | 219 | (**[surround_separate n b void opening sep closing docs] is equivalent to 220 | [surround n b opening (separate sep docs) closing], except when the list 221 | [docs] is empty, in which case it reduces to [void]. *) 222 | val surround_separate: 223 | int -> int -> 224 | document -> document -> document -> document -> 225 | document list -> document 226 | 227 | (**[surround_separate_map n b void opening sep closing f xs] is equivalent 228 | to [surround_separate n b void opening sep closing (List.map f xs)]. *) 229 | val surround_separate_map: 230 | int -> int -> 231 | document -> document -> document -> document -> 232 | ('a -> document) -> 'a list -> document 233 | 234 | (** {2 Short-Hands} *) 235 | 236 | (**[!^s] is a short-hand for [string s]. *) 237 | val ( !^ ) : string -> document 238 | 239 | (**[x ^/^ y] separates [x] and [y] with a breakable space. 240 | It is a short-hand for [x ^^ break 1 ^^ y]. *) 241 | val ( ^/^ ) : document -> document -> document 242 | 243 | (**[x ^//^ y] is a short-hand for [prefix 2 1 x y]. *) 244 | val ( ^//^ ) : document -> document -> document 245 | 246 | (** {1:ocaml Printing OCaml Values} *) 247 | 248 | (**This module offers document combinators that help print OCaml values. The 249 | strings produced by rendering these documents are supposed to be accepted 250 | by the OCaml parser as valid values. 251 | 252 | These functions do {i not} distinguish between mutable and immutable 253 | values. They do {i not} recognize sharing, and do {i not} incorporate a 254 | protection against cyclic values. *) 255 | module OCaml : sig 256 | 257 | (* The signature of this module is compatible with that expected by the 258 | [camlp4] generator [Camlp4RepresentationGenerator]. This explains why 259 | some functions have unused parameters. This is also the reason why 260 | there is a type [representation]. *) 261 | 262 | type constructor = string 263 | type type_name = string 264 | type record_field = string 265 | type tag = int 266 | 267 | (**[variant _ dc _ args] represents a constructed value whose data 268 | constructor is [dc] and whose arguments are [args]. The other two 269 | parameters are presently unused. *) 270 | val variant : type_name -> constructor -> tag -> document list -> document 271 | 272 | (**[record _ fields] represents a record value whose fields are [fields]. 273 | The other parameter is presently unused. *) 274 | val record : type_name -> (record_field * document) list -> document 275 | 276 | (**[tuple args] represents a tuple value whose components are [args]. *) 277 | val tuple : document list -> document 278 | 279 | (**[string s] represents the literal string [s]. *) 280 | val string : string -> document 281 | 282 | (**[int i] represents the literal integer [i]. *) 283 | val int : int -> document 284 | 285 | (**[int32 i] represents the literal 32-bit integer [i]. *) 286 | val int32 : int32 -> document 287 | 288 | (**[int64 i] represents the literal 64-bit integer [i]. *) 289 | val int64 : int64 -> document 290 | 291 | (**[nativeint i] represents the literal native integer [i]. *) 292 | val nativeint : nativeint -> document 293 | 294 | (**[float f] represents the literal floating-point number [f]. *) 295 | val float : float -> document 296 | 297 | (**[char c] represents the literal character [c]. *) 298 | val char : char -> document 299 | 300 | (**[bool b] represents the Boolean value [b]. *) 301 | val bool : bool -> document 302 | 303 | (**[unit] represents the unit constant [()]. *) 304 | val unit : document 305 | 306 | (**[option f o] represents the option [o]. The representation of the 307 | element, if present, is computed by the function [f]. *) 308 | val option : ('a -> document) -> 'a option -> document 309 | 310 | (**[list f xs] represents the list [xs]. The representation of each element 311 | is computed by the function [f]. If the whole list fits on a single line, 312 | then it is printed on a single line; otherwise each element is printed on 313 | a separate line. *) 314 | val list : ('a -> document) -> 'a list -> document 315 | 316 | (**[flowing_list f xs] represents the list [xs]. The representation of each 317 | element is computed by the function [f]. As many elements are possible 318 | are printed on each line. *) 319 | val flowing_list : ('a -> document) -> 'a list -> document 320 | 321 | (**[array f xs] represents the array [xs]. The representation of each 322 | element is computed by the function [f]. If the whole array fits on a 323 | single line, then it is printed on a single line; otherwise each element 324 | is printed on a separate line. *) 325 | val array : ('a -> document) -> 'a array -> document 326 | 327 | (**[flowing_array f xs] represents the array [xs]. The representation of 328 | each element is computed by the function [f]. As many elements are 329 | possible are printed on each line. *) 330 | val flowing_array : ('a -> document) -> 'a array -> document 331 | 332 | (**[ref r] represents the reference [r]. The representation of the content 333 | is computed by the function [f]. *) 334 | val ref : ('a -> document) -> 'a ref -> document 335 | 336 | (** [unknown t _] represents an unknown value of type [t]. It is rendered 337 | as a string of the form []. *) 338 | val unknown : type_name -> 'a -> document 339 | 340 | (**/**) 341 | 342 | type representation = 343 | document 344 | 345 | end (* OCaml *) 346 | -------------------------------------------------------------------------------- /src/PPrint.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | include PPrintEngine 15 | 16 | (* -------------------------------------------------------------------------- *) 17 | 18 | (* Predefined single-character documents. *) 19 | 20 | let lparen = char '(' 21 | let rparen = char ')' 22 | let langle = char '<' 23 | let rangle = char '>' 24 | let lbrace = char '{' 25 | let rbrace = char '}' 26 | let lbracket = char '[' 27 | let rbracket = char ']' 28 | let squote = char '\'' 29 | let dquote = char '"' 30 | let bquote = char '`' 31 | let semi = char ';' 32 | let colon = char ':' 33 | let comma = char ',' 34 | let dot = char '.' 35 | let sharp = char '#' 36 | let slash = char '/' 37 | let backslash = char '\\' 38 | let equals = char '=' 39 | let qmark = char '?' 40 | let tilde = char '~' 41 | let at = char '@' 42 | let percent = char '%' 43 | let dollar = char '$' 44 | let caret = char '^' 45 | let ampersand = char '&' 46 | let star = char '*' 47 | let plus = char '+' 48 | let minus = char '-' 49 | let underscore = char '_' 50 | let bang = char '!' 51 | let bar = char '|' 52 | 53 | (* -------------------------------------------------------------------------- *) 54 | 55 | (* Repetition. *) 56 | 57 | let[@inline] twice doc = 58 | doc ^^ doc 59 | 60 | let repeat n doc = 61 | let rec loop n doc accu = 62 | if n = 0 then 63 | accu 64 | else 65 | loop (n - 1) doc (doc ^^ accu) 66 | in 67 | loop n doc empty 68 | 69 | (* -------------------------------------------------------------------------- *) 70 | 71 | (* Delimiters. *) 72 | 73 | let[@inline] precede l x = l ^^ x 74 | let[@inline] terminate r x = x ^^ r 75 | let[@inline] enclose l r x = l ^^ x ^^ r 76 | 77 | let[@inline] squotes x = enclose squote squote x 78 | let[@inline] dquotes x = enclose dquote dquote x 79 | let[@inline] bquotes x = enclose bquote bquote x 80 | let[@inline] braces x = enclose lbrace rbrace x 81 | let[@inline] parens x = enclose lparen rparen x 82 | let[@inline] angles x = enclose langle rangle x 83 | let[@inline] brackets x = enclose lbracket rbracket x 84 | 85 | (* -------------------------------------------------------------------------- *) 86 | 87 | (* Some functions on lists. *) 88 | 89 | (* A variant of [fold_left] that keeps track of the element index. *) 90 | 91 | let foldli (f : int -> 'b -> 'a -> 'b) (accu : 'b) (xs : 'a list) : 'b = 92 | let r = ref 0 in 93 | List.fold_left (fun accu x -> 94 | let i = !r in 95 | r := i + 1; 96 | f i accu x 97 | ) accu xs 98 | 99 | (* -------------------------------------------------------------------------- *) 100 | 101 | (* Working with lists of documents. *) 102 | 103 | let concat docs = 104 | (* We take advantage of the fact that [^^] operates in constant 105 | time, regardless of the size of its arguments. The document 106 | that is constructed is essentially a reversed list (i.e., a 107 | tree that is biased towards the left). This is not a problem; 108 | when pretty-printing this document, the engine will descend 109 | along the left branch, pushing the nodes onto its stack as 110 | it goes down, effectively reversing the list again. *) 111 | List.fold_left (^^) empty docs 112 | 113 | let separate sep docs = 114 | foldli (fun i accu doc -> 115 | if i = 0 then 116 | doc 117 | else 118 | accu ^^ sep ^^ doc 119 | ) empty docs 120 | 121 | let concat_map f xs = 122 | List.fold_left (fun accu x -> 123 | accu ^^ f x 124 | ) empty xs 125 | 126 | let separate_map sep f xs = 127 | foldli (fun i accu x -> 128 | if i = 0 then 129 | f x 130 | else 131 | accu ^^ sep ^^ f x 132 | ) empty xs 133 | 134 | let separate2 sep last_sep docs = 135 | let n = List.length docs in 136 | foldli (fun i accu doc -> 137 | if i = 0 then 138 | doc 139 | else 140 | accu ^^ (if i < n - 1 then sep else last_sep) ^^ doc 141 | ) empty docs 142 | 143 | let optional f = function 144 | | None -> 145 | empty 146 | | Some x -> 147 | f x 148 | 149 | (* -------------------------------------------------------------------------- *) 150 | 151 | (* Text. *) 152 | 153 | (* This variant of [String.index_from] returns an option. *) 154 | 155 | let index_from s i c = 156 | try 157 | Some (String.index_from s i c) 158 | with Not_found -> 159 | None 160 | 161 | (* [lines s] chops the string [s] into a list of lines, which are turned 162 | into documents. *) 163 | 164 | let lines s = 165 | let rec chop accu i = 166 | match index_from s i '\n' with 167 | | Some j -> 168 | let accu = substring s i (j - i) :: accu in 169 | chop accu (j + 1) 170 | | None -> 171 | substring s i (String.length s - i) :: accu 172 | in 173 | List.rev (chop [] 0) 174 | 175 | let arbitrary_string s = 176 | separate (break 1) (lines s) 177 | 178 | (* [split ok s] splits the string [s] at every occurrence of a character 179 | that satisfies the predicate [ok]. The substrings thus obtained are 180 | turned into documents, and a list of documents is returned. No information 181 | is lost: the concatenation of the documents yields the original string. 182 | This code is not UTF-8 aware. *) 183 | 184 | let split ok s = 185 | let n = String.length s in 186 | let rec index_from i = 187 | if i = n then 188 | None 189 | else if ok s.[i] then 190 | Some i 191 | else 192 | index_from (i + 1) 193 | in 194 | let rec chop accu i = 195 | match index_from i with 196 | | Some j -> 197 | let accu = substring s i (j - i) :: accu in 198 | let accu = char s.[j] :: accu in 199 | chop accu (j + 1) 200 | | None -> 201 | substring s i (String.length s - i) :: accu 202 | in 203 | List.rev (chop [] 0) 204 | 205 | (* [words s] chops the string [s] into a list of words, which are turned 206 | into documents. *) 207 | 208 | let words s = 209 | let n = String.length s in 210 | (* A two-state finite automaton. *) 211 | (* In this state, we have skipped at least one blank character. *) 212 | let rec skipping accu i = 213 | if i = n then 214 | (* There was whitespace at the end. Drop it. *) 215 | accu 216 | else match s.[i] with 217 | | ' ' 218 | | '\t' 219 | | '\n' 220 | | '\r' -> 221 | (* Skip more whitespace. *) 222 | skipping accu (i + 1) 223 | | _ -> 224 | (* Begin a new word. *) 225 | word accu i (i + 1) 226 | (* In this state, we have skipped at least one non-blank character. *) 227 | and word accu i j = 228 | if j = n then 229 | (* Final word. *) 230 | substring s i (j - i) :: accu 231 | else match s.[j] with 232 | | ' ' 233 | | '\t' 234 | | '\n' 235 | | '\r' -> 236 | (* A new word has been identified. *) 237 | let accu = substring s i (j - i) :: accu in 238 | skipping accu (j + 1) 239 | | _ -> 240 | (* Continue inside the current word. *) 241 | word accu i (j + 1) 242 | in 243 | List.rev (skipping [] 0) 244 | 245 | let flow_map sep f docs = 246 | foldli (fun i accu doc -> 247 | if i = 0 then 248 | f doc 249 | else 250 | accu ^^ 251 | (* This idiom allows beginning a new line if [doc] does not 252 | fit on the current line. *) 253 | group (sep ^^ f doc) 254 | ) empty docs 255 | 256 | let flow sep docs = 257 | flow_map sep (fun x -> x) docs 258 | 259 | let url s = 260 | flow (break 0) (split (function '/' | '.' -> true | _ -> false) s) 261 | 262 | (* -------------------------------------------------------------------------- *) 263 | (* Alignment and indentation. *) 264 | 265 | let hang i d = 266 | align (nest i d) 267 | 268 | let ( !^ ) = string 269 | 270 | let[@inline] ( ^/^ ) x y = 271 | x ^^ break 1 ^^ y 272 | 273 | let prefix n b x y = 274 | group (x ^^ nest n (break b ^^ y)) 275 | 276 | let[@inline] (^//^) x y = 277 | prefix 2 1 x y 278 | 279 | let jump n b y = 280 | group (nest n (break b ^^ y)) 281 | 282 | let infix n b op x y = 283 | prefix n b (x ^^ blank b ^^ op) y 284 | 285 | let surround n b opening contents closing = 286 | group (opening ^^ nest n ( break b ^^ contents) ^^ break b ^^ closing ) 287 | 288 | let soft_surround n b opening contents closing = 289 | group (opening ^^ nest n (group (break b) ^^ contents) ^^ group (break b ^^ closing)) 290 | 291 | let surround_separate n b void opening sep closing docs = 292 | match docs with 293 | | [] -> 294 | void 295 | | _ :: _ -> 296 | surround n b opening (separate sep docs) closing 297 | 298 | let surround_separate_map n b void opening sep closing f xs = 299 | match xs with 300 | | [] -> 301 | void 302 | | _ :: _ -> 303 | surround n b opening (separate_map sep f xs) closing 304 | 305 | (* -------------------------------------------------------------------------- *) 306 | (* Printing OCaml values. *) 307 | 308 | module OCaml = struct 309 | 310 | open Printf 311 | 312 | type constructor = string 313 | type type_name = string 314 | type record_field = string 315 | type tag = int 316 | 317 | (* -------------------------------------------------------------------------- *) 318 | 319 | (* This internal [sprintf]-like function produces a document. We use [string], 320 | as opposed to [arbitrary_string], because the strings that we produce will 321 | never contain a newline character. *) 322 | 323 | let[@inline] dsprintf format = 324 | ksprintf string format 325 | 326 | (* -------------------------------------------------------------------------- *) 327 | 328 | (* Nicolas prefers using this code as opposed to just [sprintf "%g"] or 329 | [sprintf "%f"]. The latter print [inf] and [-inf], whereas OCaml 330 | understands [infinity] and [neg_infinity]. [sprintf "%g"] does not add a 331 | trailing dot when the number happens to be an integral number. [sprintf 332 | "%F"] seems to lose precision and ignores the precision modifier. *) 333 | 334 | let valid_float_lexeme (s : string) : string = 335 | let l = String.length s in 336 | let rec loop i = 337 | if i >= l then 338 | (* If we reach the end of the string and have found only characters in 339 | the set '0' .. '9' and '-', then this string will be considered as an 340 | integer literal by OCaml. Adding a trailing dot makes it a float 341 | literal. *) 342 | s ^ "." 343 | else 344 | match s.[i] with 345 | | '0' .. '9' | '-' -> loop (i + 1) 346 | | _ -> s 347 | in loop 0 348 | 349 | (* This function constructs a string representation of a floating point 350 | number. This representation is supposed to be accepted by OCaml as a 351 | valid floating point literal. *) 352 | 353 | let float_representation (f : float) : string = 354 | match classify_float f with 355 | | FP_nan -> 356 | "nan" 357 | | FP_infinite -> 358 | if f < 0.0 then "neg_infinity" else "infinity" 359 | | _ -> 360 | (* Try increasing precisions and validate. *) 361 | let s = sprintf "%.12g" f in 362 | if f = float_of_string s then valid_float_lexeme s else 363 | let s = sprintf "%.15g" f in 364 | if f = float_of_string s then valid_float_lexeme s else 365 | sprintf "%.18g" f 366 | 367 | (* -------------------------------------------------------------------------- *) 368 | 369 | (* A few constants and combinators, used below. *) 370 | 371 | let some = 372 | string "Some" 373 | 374 | let none = 375 | string "None" 376 | 377 | let lbracketbar = 378 | string "[|" 379 | 380 | let rbracketbar = 381 | string "|]" 382 | 383 | let seq1 opening separator closing = 384 | surround_separate 2 0 385 | (opening ^^ closing) opening (separator ^^ break 1) closing 386 | 387 | let seq2 opening separator closing = 388 | surround_separate_map 2 1 389 | (opening ^^ closing) opening (separator ^^ break 1) closing 390 | 391 | (* -------------------------------------------------------------------------- *) 392 | 393 | (* The following functions are printers for many types of OCaml values. *) 394 | 395 | (* There is no protection against cyclic values. *) 396 | 397 | let tuple = 398 | seq1 lparen comma rparen 399 | 400 | let variant _ cons _ args = 401 | match args with 402 | | [] -> 403 | !^cons 404 | | _ :: _ -> 405 | !^cons ^^ tuple args 406 | 407 | let record _ fields = 408 | seq2 lbrace semi rbrace (fun (k, v) -> infix 2 1 equals !^k v) fields 409 | 410 | let option f = function 411 | | None -> 412 | none 413 | | Some x -> 414 | some ^^ tuple [f x] 415 | 416 | let list f xs = 417 | seq2 lbracket semi rbracket f xs 418 | 419 | let flowing_list f xs = 420 | group (lbracket ^^ space ^^ nest 2 ( 421 | flow_map (semi ^^ break 1) f xs 422 | ) ^^ space ^^ rbracket) 423 | 424 | let array f xs = 425 | seq2 lbracketbar semi rbracketbar f (Array.to_list xs) 426 | 427 | let flowing_array f xs = 428 | group (lbracketbar ^^ space ^^ nest 2 ( 429 | flow_map (semi ^^ break 1) f (Array.to_list xs) 430 | ) ^^ space ^^ rbracketbar) 431 | 432 | let ref f x = 433 | record "ref" ["contents", f !x] 434 | 435 | let float f = 436 | string (float_representation f) 437 | 438 | let int = 439 | dsprintf "%d" 440 | 441 | let int32 = 442 | dsprintf "%ld" 443 | 444 | let int64 = 445 | dsprintf "%Ld" 446 | 447 | let nativeint = 448 | dsprintf "%nd" 449 | 450 | let char = 451 | dsprintf "%C" 452 | 453 | let bool = 454 | dsprintf "%B" 455 | 456 | let unit = 457 | dsprintf "()" 458 | 459 | let string = 460 | dsprintf "%S" 461 | 462 | let unknown tyname _ = 463 | dsprintf "" tyname 464 | 465 | type representation = 466 | document 467 | 468 | end (* OCaml *) 469 | -------------------------------------------------------------------------------- /src/PPrintEngine.mli: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (**[PPrint] is an OCaml library for {b pretty-printing textual documents}. 15 | It takes care of {b indentation and line breaks}, and is typically used 16 | to {b pretty-print code}. *) 17 | 18 | (** {1:building Building Documents} *) 19 | 20 | (**The abstract type of documents. *) 21 | type document 22 | 23 | (** {2 Atomic Documents} *) 24 | 25 | (**[empty] is the empty document. *) 26 | val empty: document 27 | 28 | (**[char c] is an atomic document that consists of the single character [c]. 29 | This character must not be a newline character. *) 30 | val char: char -> document 31 | 32 | (**[string s] is an atomic document that consists of the string [s]. This 33 | string must not contain a newline. The printing engine assumes that the 34 | ideal width of this string is [String.length s]. This assumption is safe 35 | if this is an ASCII string. Otherwise, {!fancystring} or {!utf8string} 36 | should be preferred. *) 37 | val string: string -> document 38 | 39 | (**[substring s ofs len] is an atomic document that consists of the portion 40 | of the string [s] delimited by the offset [ofs] and the length [len]. 41 | This portion must not contain a newline. [substring s ofs len] is 42 | equivalent to [string (String.sub s ofs len)], but is expected to be more 43 | efficient, as the substring is not actually extracted. *) 44 | val substring: string -> int -> int -> document 45 | 46 | (**[fancystring s alen] is an atomic document that consists of the string 47 | [s]. This string must not contain a newline. The string may contain fancy 48 | characters: color escape characters, UTF-8 characters, etc. Thus, its 49 | apparent length (which measures how many columns the text will take up on 50 | screen) differs from its length in bytes. The printing engine assumes 51 | that its apparent length is [alen]. *) 52 | val fancystring: string -> int -> document 53 | 54 | (**[fancysubstring s ofs len alen] is equivalent to [fancystring (String.sub 55 | s ofs len) alen]. *) 56 | val fancysubstring : string -> int -> int -> int -> document 57 | 58 | (**[utf8string s] is an atomic document that consists of the UTF-8-encoded 59 | string [s]. This string must not contain a newline. [utf8string s] is 60 | equivalent to [fancystring s (utf8_length s)], where [utf8_length s] is 61 | the apparent length of the UTF-8-encoded string [s]. *) 62 | val utf8string: string -> document 63 | 64 | (** [utf8format format ...] is equivalent to 65 | [utf8string (Printf.sprintf format ...)]. *) 66 | val utf8format: ('a, unit, string, document) format4 -> 'a 67 | 68 | (** {2 Blanks and Newlines} *) 69 | 70 | (**The atomic document [hardline] represents a forced newline. This document 71 | has infinite ideal width: thus, if there is a choice between printing it 72 | in flat mode and printing it in normal mode, normal mode is preferred. In 73 | other words, when [hardline] is placed directly inside a group, this 74 | group is dissolved: [group hardline] is equivalent to [hardline]. This 75 | combinator should be seldom used; consider using {!break} instead. *) 76 | val hardline: document 77 | 78 | (**The atomic document [blank n] consists of [n] blank characters. A blank 79 | character is like an ordinary ASCII space character [char ' '], except 80 | that blank characters that appear at the end of a line are automatically 81 | suppressed. *) 82 | val blank: int -> document 83 | 84 | (**[space] is a synonym for [blank 1]. It consists of one blank character. 85 | It is therefore not equivalent to [char ' ']. *) 86 | val space: document 87 | 88 | (**The document [break n] is a breakable blank of width [n]. It produces [n] 89 | blank characters if the printing engine is in flat mode, and a single 90 | newline character if the printing engine is in normal mode. [break 1] is 91 | equivalent to [ifflat (blank 1) hardline]. *) 92 | val break: int -> document 93 | 94 | (** {2 Composite Documents} *) 95 | 96 | (**[doc1 ^^ doc2] is the concatenation of the documents [doc1] and [doc2]. *) 97 | val (^^): document -> document -> document 98 | 99 | (**[group doc] encodes a choice. If the document [doc] fits on the current 100 | line, then it is rendered on a single line, in flat mode. (All [group] 101 | combinators inside it are then ignored.) Otherwise, this group is 102 | dissolved, and [doc] is rendered in normal mode. There might be more 103 | groups within [doc], whose presence leads to further choices being 104 | explored. *) 105 | val group: document -> document 106 | 107 | (**[ifflat doc1 doc2] is rendered as [doc1] if the printing engine is in 108 | flat mode, that is, if the printing engine has determined that some 109 | enclosing group fits on the current line. Otherwise, it is rendered as 110 | [doc2]. Use this combinator with caution! Because the printing engine is 111 | free to choose between [doc1] and [doc2], these documents must be 112 | semantically equivalent. It is up to the user to enforce this property. *) 113 | val ifflat: document -> document -> document 114 | 115 | (**To render the document [nest j doc], the printing engine temporarily 116 | increases the current indentation level by [j], then renders [doc]. The 117 | effect of the current indentation level is as follows: every time a 118 | newline character is emitted, it is immediately followed by [n] blank 119 | characters, where [n] is the current indentation level. Thus, one may 120 | think of [nest j doc] roughly as the document [doc] in which [j] blank 121 | characters have been inserted after every newline character. *) 122 | val nest: int -> document -> document 123 | 124 | (**To render [align doc], the printing engine sets the current indentation 125 | level to the current column, then renders [doc]. In other words, the 126 | document [doc] is rendered within a box whose upper left corner is the 127 | current position of the printing engine. *) 128 | val align: document -> document 129 | 130 | (**A point is a pair of a line number and a column number. *) 131 | type point = 132 | int * int 133 | 134 | (**A range is a pair of points. *) 135 | type range = 136 | point * point 137 | 138 | (**The document [range hook doc] is printed like the document [doc], but 139 | allows the caller to register a hook that is applied, when the document 140 | is printed, to the range occupied by this document in the output text. 141 | This offers a way of mapping positions in the output text back to 142 | (sub)documents. *) 143 | val range: (range -> unit) -> document -> document 144 | 145 | (** {1:inspecting Inspecting Documents} *) 146 | 147 | (**Documents are abstract, and cannot be inspected. Nevertheless, it is 148 | possible to test whether a document is empty. *) 149 | 150 | (**[is_empty doc] determines whether the document [doc] is empty. Most ways 151 | of constructing empty documents, such as [empty], [empty ^^ empty], 152 | [nest j empty], and so on, are recognized as such. However, a document 153 | constructed by {!val-custom} or {!val-range} is never considered empty. *) 154 | val is_empty: document -> bool 155 | 156 | (** {1:rendering Rendering Documents} *) 157 | 158 | (**Three renderers are available. They offer the same API, described 159 | by the signature {!RENDERER}, and differ only in the nature of the 160 | output channel that they use. *) 161 | 162 | (**This signature describes the document renderers in a manner that 163 | is independent of the type of the output channel. *) 164 | module type RENDERER = sig 165 | 166 | (**The type of the output channel. *) 167 | type channel 168 | 169 | (**The type of documents. *) 170 | type document 171 | 172 | (** [pretty rfrac width channel document] pretty-prints the document 173 | [document] into the output channel [channel]. The parameter [width] is 174 | the maximum number of characters per line. The parameter [rfrac] is the 175 | ribbon width, a fraction relative to [width]. The ribbon width is the 176 | maximum number of non-indentation characters per line. *) 177 | val pretty: float -> int -> channel -> document -> unit 178 | 179 | (** [compact channel document] prints the document [document] to the output 180 | channel [channel]. No indentation is used. All newline instructions are 181 | respected, that is, no groups are flattened. *) 182 | val compact: channel -> document -> unit 183 | 184 | end 185 | 186 | (**This renderer sends its output into an output channel. *) 187 | module ToChannel : RENDERER 188 | with type channel = out_channel 189 | and type document = document 190 | 191 | (**This renderer sends its output into a memory buffer. *) 192 | module ToBuffer : RENDERER 193 | with type channel = Buffer.t 194 | and type document = document 195 | 196 | (**This renderer sends its output into a formatter channel. *) 197 | module ToFormatter : RENDERER 198 | with type channel = Format.formatter 199 | and type document = document 200 | 201 | (** {1:defining Defining Custom Documents} *) 202 | 203 | (**It is possible to define custom document constructors, provided they meet 204 | the expectations of the printing engine. In short, the custom document 205 | combinator {!val-custom} expects an object of class {!class-type-custom}. 206 | This object must provide three methods. The method [requirement] must 207 | compute the ideal width of the custom document. The methods [pretty] and 208 | [compact] must render the custom document. For this purpose, they have 209 | access to the {{!output}output channel} and to the {{!state}state} of the 210 | printing engine. *) 211 | 212 | (** A width requirement is expressed as an integer. The value [max_int] 213 | is reserved and represents infinity. *) 214 | type requirement = int 215 | 216 | (**[infinity] represents an infinite width requirement. *) 217 | val infinity : requirement 218 | 219 | (**An output channel is abstractly represented as an object equipped with 220 | methods for displaying one character and for displaying a substring. *) 221 | class type output = object 222 | 223 | (**[char c] sends the character [c] to the output channel. *) 224 | method char: char -> unit 225 | 226 | (**[substring s ofs len] sends the substring of [s] delimited by the 227 | offset [ofs] and the length [len] to the output channel. *) 228 | method substring: string -> int (* offset *) -> int (* length *) -> unit 229 | 230 | end 231 | 232 | (**The internal state of the rendering engine is exposed to the user who 233 | wishes to define custom documents. However, its structure is subject to 234 | change in future versions of the library. *) 235 | type state = { 236 | 237 | width: int; 238 | (** The line width. This parameter is fixed throughout the execution of 239 | the renderer. *) 240 | 241 | ribbon: int; 242 | (** The ribbon width. This parameter is fixed throughout the execution of 243 | the renderer. *) 244 | 245 | mutable last_indent: int; 246 | (** The number of blanks that were printed at the beginning of the current 247 | line. This field is updated (only) when a hardline is emitted. It is 248 | used (only) to determine whether the ribbon width constraint is 249 | respected. *) 250 | 251 | mutable line: int; 252 | (** The current line. This field is updated (only) when a hardline is 253 | emitted. It is not used by the pretty-printing engine itself. *) 254 | 255 | mutable column: int; 256 | (** The current column. This field must be updated whenever something is 257 | sent to the output channel. It is used (only) to determine whether the 258 | width constraint is respected. *) 259 | 260 | } 261 | 262 | (**A custom document is defined by implementing an object of class 263 | {!class-type-custom}. *) 264 | class type custom = object 265 | 266 | (**A custom document must publish the width (i.e., the number of columns) 267 | that it would like to occupy if printed on a single line (in flat 268 | mode). The special value [infinity] means that this document cannot be 269 | printed on a single line; this value causes any groups that contain 270 | this document to be dissolved. This method should in principle work in 271 | constant time. *) 272 | method requirement: requirement 273 | 274 | (**The method [pretty] is used by the main rendering algorithm. It has 275 | access to the output channel and to the printing engine's internal 276 | state. In addition, it receives the current indentation level and a 277 | Boolean flag that tells whether the engine is currently in flat mode. 278 | If the engine is in flat mode, then the document must be printed on a 279 | single line, in a manner that is consistent with the width requirement 280 | that was published ahead of time. If the engine is in normal mode, then 281 | there is no such obligation. The state must be updated in a manner that 282 | is consistent with what is sent to the output channel. *) 283 | method pretty: output -> state -> int -> bool -> unit 284 | 285 | (**The method [compact] is used by the compact rendering algorithm. It 286 | has access to the output channel only. *) 287 | method compact: output -> unit 288 | 289 | end 290 | 291 | (**[custom] constructs a custom document out an object of type 292 | {!class-type-custom}. *) 293 | val custom: custom -> document 294 | 295 | (**Some of the key functions of the library are exposed, in the hope that 296 | they may be useful to authors of custom (leaf and composite) documents. 297 | In the case of a leaf document, they can help perform certain basic 298 | functions; for instance, applying the function {!pretty} to the document 299 | {!hardline} is a simple way of printing a hardline, while respecting the 300 | indentation parameters and updating the state in a correct manner. 301 | Similarly, applying {!pretty} to the document [blank n] is a simple way 302 | of printing [n] blank characters. In the case of a composite document 303 | (one that contains subdocuments), these functions are essential: they 304 | allow computing the width requirement of a subdocument and displaying a 305 | subdocument. *) 306 | 307 | (**[requirement doc] computes the width requirement of the document [doc]. 308 | It runs in constant time. *) 309 | val requirement: document -> requirement 310 | 311 | (**[pretty output state indent flatten doc] prints the document [doc]. See 312 | the documentation of the method [pretty] in the class 313 | {!class-type-custom}. *) 314 | val pretty: output -> state -> int -> bool -> document -> unit 315 | 316 | (**[compact output doc] prints the document [doc]. See the documentation of 317 | the method [compact] in the class {!class-type-custom}. *) 318 | val compact: output -> document -> unit 319 | -------------------------------------------------------------------------------- /benchmark_old/OldPPrintEngine.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (* ------------------------------------------------------------------------- *) 15 | 16 | (* The last element of a non-empty list. *) 17 | 18 | let rec last x xs = 19 | match xs with 20 | | [] -> 21 | x 22 | | x :: xs -> 23 | last x xs 24 | 25 | let last = function 26 | | [] -> 27 | assert false 28 | | x :: xs -> 29 | last x xs 30 | 31 | (* ------------------------------------------------------------------------- *) 32 | (* ------------------------------------------------------------------------- *) 33 | 34 | (* A uniform interface for output channels. *) 35 | 36 | module type OUTPUT = sig 37 | type channel 38 | val char: channel -> char -> unit 39 | val substring: channel -> string -> int (* offset *) -> int (* length *) -> unit 40 | end 41 | 42 | (* ------------------------------------------------------------------------- *) 43 | 44 | (* Three implementations of the above interface, respectively based on output 45 | channels, memory buffers, and formatters. This compensates for the fact 46 | that ocaml's standard library does not allow creating an output channel 47 | that feeds into a memory buffer (a regrettable omission). *) 48 | 49 | module ChannelOutput : OUTPUT with type channel = out_channel = struct 50 | type channel = out_channel 51 | let char = output_char 52 | let substring = output_substring (* requires OCaml >= 4.02 *) 53 | end 54 | 55 | module BufferOutput : OUTPUT with type channel = Buffer.t = struct 56 | type channel = Buffer.t 57 | let char = Buffer.add_char 58 | let substring = Buffer.add_substring 59 | end 60 | 61 | module FormatterOutput : OUTPUT with type channel = Format.formatter = struct 62 | type channel = Format.formatter 63 | let char = Format.pp_print_char 64 | let substring fmt = fst (Format.pp_get_formatter_output_functions fmt ()) 65 | end 66 | 67 | (* ------------------------------------------------------------------------- *) 68 | (* ------------------------------------------------------------------------- *) 69 | 70 | (* Here is the algebraic data type of documents. It is analogous to Daan 71 | Leijen's version, but the binary constructor [Union] is replaced with 72 | the unary constructor [Group], and the constant [Line] is replaced with 73 | more general constructions, namely [IfFlat], which provides alternative 74 | forms depending on the current flattening mode, and [HardLine], which 75 | represents a newline character, and causes a failure in flattening mode. *) 76 | 77 | type document = 78 | 79 | (* [Empty] is the empty document. *) 80 | 81 | | Empty 82 | 83 | (* [Char c] is a document that consists of the single character [c]. We 84 | enforce the invariant that [c] is not a newline character. *) 85 | 86 | | Char of char 87 | 88 | (* [String (s, ofs, len)] is a document that consists of the portion of 89 | the string [s] delimited by the offset [ofs] and the length [len]. We 90 | assume, but do not check, that this portion does not contain a newline 91 | character. *) 92 | 93 | | String of string * int * int 94 | 95 | (* [FancyString (s, ofs, len, apparent_length)] is a (portion of a) string 96 | that may contain fancy characters: color escape characters, UTF-8 or 97 | multi-byte characters, etc. Thus, the apparent length (which corresponds 98 | to what will be visible on screen) differs from the length (which is a 99 | number of bytes, and is reported by [String.length]). We assume, but do 100 | not check, that fancystrings do not contain a newline character. *) 101 | 102 | | FancyString of string * int * int * int 103 | 104 | (* [Blank n] is a document that consists of [n] blank characters. *) 105 | 106 | | Blank of int 107 | 108 | (* When in flattening mode, [IfFlat (d1, d2)] turns into the document 109 | [d1]. When not in flattening mode, it turns into the document [d2]. *) 110 | 111 | | IfFlat of document * document 112 | 113 | (* When in flattening mode, [HardLine] causes a failure, which requires 114 | backtracking all the way until the stack is empty. When not in flattening 115 | mode, it represents a newline character, followed with an appropriate 116 | number of indentation. A common way of using [HardLine] is to only use it 117 | directly within the right branch of an [IfFlat] construct. *) 118 | 119 | | HardLine 120 | 121 | (* [Cat doc1 doc2] is the concatenation of the documents [doc1] and 122 | [doc2]. *) 123 | 124 | | Cat of document * document 125 | 126 | (* [Nest (j, doc)] is the document [doc], in which the indentation level 127 | has been increased by [j], that is, in which [j] blanks have been 128 | inserted after every newline character. *) 129 | 130 | | Nest of int * document 131 | 132 | (* [Group doc] represents an alternative: it is either a flattened form of 133 | [doc], in which occurrences of [Group] disappear and occurrences of 134 | [IfFlat] resolve to their left branch, or [doc] itself. *) 135 | 136 | | Group of document 137 | 138 | (* [Column f] is the document obtained by applying [f] to the current 139 | column number. *) 140 | 141 | | Column of (int -> document) 142 | 143 | (* [Nesting f] is the document obtained by applying [f] to the current 144 | indentation level, that is, the number of blanks that were printed 145 | at the beginning of the current line. *) 146 | 147 | | Nesting of (int -> document) 148 | 149 | (* ------------------------------------------------------------------------- *) 150 | 151 | (* The above algebraic data type is not exposed to the user. Instead, we 152 | expose the following functions. *) 153 | 154 | let empty = 155 | Empty 156 | 157 | let char c = 158 | assert (c <> '\n'); 159 | Char c 160 | 161 | let substring s ofs len = 162 | if len = 0 then 163 | Empty 164 | else 165 | String (s, ofs, len) 166 | 167 | let string s = 168 | substring s 0 (String.length s) 169 | 170 | let fancysubstring s ofs len apparent_length = 171 | if len = 0 then 172 | Empty 173 | else 174 | FancyString (s, ofs, len, apparent_length) 175 | 176 | let fancystring s apparent_length = 177 | fancysubstring s 0 (String.length s) apparent_length 178 | 179 | (* The following function was stolen from [Batteries]. *) 180 | let utf8_length s = 181 | let rec length_aux s c i = 182 | if i >= String.length s then c else 183 | let n = Char.code (String.unsafe_get s i) in 184 | let k = 185 | if n < 0x80 then 1 else 186 | if n < 0xe0 then 2 else 187 | if n < 0xf0 then 3 else 4 188 | in 189 | length_aux s (c + 1) (i + k) 190 | in 191 | length_aux s 0 0 192 | 193 | let utf8string s = 194 | fancystring s (utf8_length s) 195 | 196 | let hardline = 197 | HardLine 198 | 199 | let blank n = 200 | match n with 201 | | 0 -> 202 | Empty 203 | | 1 -> 204 | Blank 1 205 | | _ -> 206 | Blank n 207 | 208 | let internal_break i = 209 | IfFlat (blank i, HardLine) 210 | 211 | let break0 = 212 | internal_break 0 213 | 214 | let break1 = 215 | internal_break 1 216 | 217 | let break i = 218 | match i with 219 | | 0 -> 220 | break0 221 | | 1 -> 222 | break1 223 | | _ -> 224 | internal_break i 225 | 226 | let (^^) x y = 227 | match x, y with 228 | | Empty, x 229 | | x, Empty -> 230 | x 231 | | _, _ -> 232 | Cat (x, y) 233 | 234 | let nest i x = 235 | assert (i >= 0); 236 | Nest (i, x) 237 | 238 | let group x = 239 | Group x 240 | 241 | let column f = 242 | Column f 243 | 244 | let nesting f = 245 | Nesting f 246 | 247 | let ifflat x y = 248 | IfFlat (x, y) 249 | 250 | (* ------------------------------------------------------------------------- *) 251 | 252 | (* The pretty rendering algorithm: preliminary declarations. *) 253 | 254 | (* The renderer is supposed to behave exactly like Daan Leijen's, although its 255 | implementation is quite radically different. Instead of relying on 256 | Haskell's lazy evaluation mechanism, we implement an abstract machine with 257 | mutable current state, forking, backtracking (via an explicit stack of 258 | choice points), and cut (disposal of earlier choice points). *) 259 | 260 | (* The renderer's input consists of an ordered sequence of documents. Each 261 | document carries an extra indentation level, akin to an implicit [Nest] 262 | constructor, and a ``flattening'' flag, which, if set, means that this 263 | document should be printed in flattening mode. *) 264 | 265 | (* An alternative coding style would be to avoid decorating each input 266 | document with an indentation level and a flattening mode, and allow 267 | the input sequence to contain instructions that set the current 268 | nesting level or reset the flattening mode. That would perhaps be 269 | slightly more readable, and slightly less efficient. *) 270 | 271 | type input = 272 | | INil 273 | | ICons of int * bool * document * input 274 | 275 | (* When possible (that is, when the stack is empty), the renderer writes 276 | directly to the output channel. Otherwise, output is buffered until either 277 | a failure point is reached (then, the buffered output is discarded) or a 278 | cut is reached (then, all buffered output is committed to the output 279 | channel). At all times, the length of the buffered output is at most one 280 | line. *) 281 | 282 | (* The buffered output consists of a list of characters and strings. It is 283 | stored in reverse order (the head of the list should be printed last). *) 284 | 285 | type output = 286 | | OEmpty 287 | | OChar of char * output 288 | | OString of string * int * int * output 289 | | OBlank of int * output 290 | 291 | (* The renderer maintains the following state record. For efficiency, the 292 | record is mutable; it is copied when the renderer forks, that is, at 293 | choice points. *) 294 | 295 | type 'channel state = { 296 | 297 | (* The line width and ribbon width. *) 298 | 299 | width: int; 300 | ribbon: int; 301 | 302 | (* The output channel. *) 303 | 304 | channel: 'channel; 305 | 306 | (* The current indentation level. This is the number of blanks that 307 | were printed at the beginning of the current line. *) 308 | 309 | mutable indentation: int; 310 | 311 | (* The current column. *) 312 | 313 | mutable column: int; 314 | 315 | (* The renderer's input. For efficiency, the input is assumed to never be 316 | empty, and the leading [ICons] constructor is inlined within the state 317 | record. In other words, the fields [nest1], [flatten1], and [input1] 318 | concern the first input document, and the field [input] contains the 319 | rest of the input sequence. *) 320 | 321 | mutable indent1: int; 322 | mutable flatten1: bool; 323 | mutable input1: document; 324 | mutable input: input; 325 | 326 | (* The renderer's buffered output. *) 327 | 328 | mutable output: output; 329 | 330 | } 331 | 332 | (* The renderer maintains a stack of resumptions, that is, states in which 333 | execution should be resumed if the current thread of execution fails by 334 | lack of space on the current line. *) 335 | 336 | (* It is not difficult to prove that the stack is empty if and only if 337 | flattening mode is off. Furthermore, when flattening mode is on, 338 | all groups are ignored, so no new choice points are pushed onto the 339 | stack. As a result, the stack has height one at most at all times, 340 | so that the stack height is zero when flattening mode is off and 341 | one when flattening mode is on. *) 342 | 343 | type 'channel stack = 344 | 'channel state list 345 | 346 | (* ------------------------------------------------------------------------- *) 347 | 348 | (* The pretty rendering algorithm: code. *) 349 | 350 | (* The renderer is parameterized over an implementation of output channels. *) 351 | 352 | module Renderer (Output : OUTPUT) = struct 353 | 354 | type channel = 355 | Output.channel 356 | 357 | type dummy = 358 | document 359 | type document = 360 | dummy 361 | 362 | (* Printing blank space (indentation characters). *) 363 | 364 | let blank_length = 365 | 80 366 | 367 | let blank_buffer = 368 | String.make blank_length ' ' 369 | 370 | let rec blanks channel n = 371 | if n <= 0 then 372 | () 373 | else if n <= blank_length then 374 | Output.substring channel blank_buffer 0 n 375 | else begin 376 | Output.substring channel blank_buffer 0 blank_length; 377 | blanks channel (n - blank_length) 378 | end 379 | 380 | (* Committing buffered output to the output channel. The list is printed in 381 | reverse order. The code is not tail recursive, but there is no risk of 382 | stack overflow, since the length of the buffered output cannot exceed one 383 | line. *) 384 | 385 | let rec commit channel = function 386 | | OEmpty -> 387 | () 388 | | OChar (c, output) -> 389 | commit channel output; 390 | Output.char channel c 391 | | OString (s, ofs, len, output) -> 392 | commit channel output; 393 | Output.substring channel s ofs len 394 | | OBlank (n, output) -> 395 | commit channel output; 396 | blanks channel n 397 | 398 | (* The renderer's abstract machine. *) 399 | 400 | (* The procedures [run], [shift], [emit_char], [emit_string], and 401 | [emit_blanks] are mutually recursive, and are tail recursive. They 402 | maintain a stack and a current state. The states in the stack, and the 403 | current state, are pairwise distinct, so that the current state can be 404 | mutated without affecting the contents of the stack. *) 405 | 406 | (* An invariant is: the buffered output is nonempty only when the stack is 407 | nonempty. The contrapositive is: if the stack is empty, then the buffered 408 | output is empty. Indeed, the fact that the stack is empty means that no 409 | choices were made, so we are not in a speculative mode of execution: as a 410 | result, all output can be sent directly to the output channel. On the 411 | contrary, when the stack is nonempty, there is a possibility that we 412 | might backtrack in the future, so all output should be held in a 413 | buffer. *) 414 | 415 | (* [run] is allowed to call itself recursively only when no material is 416 | printed. In that case, the check for failure is skipped -- indeed, 417 | this test is performed only within [shift]. *) 418 | 419 | let rec run (stack : channel stack) (state : channel state) : unit = 420 | 421 | (* Examine the first piece of input, as well as (in some cases) the 422 | current flattening mode. *) 423 | 424 | match state.input1, state.flatten1 with 425 | 426 | (* The first piece of input is an empty document. Discard it 427 | and continue. *) 428 | 429 | | Empty, _ -> 430 | shift stack state 431 | 432 | (* The first piece of input is a character. Emit it and continue. *) 433 | 434 | | Char c, _ -> 435 | emit_char stack state c 436 | 437 | (* The first piece of input is a string. Emit it and continue. *) 438 | 439 | | String (s, ofs, len), _ -> 440 | emit_string stack state s ofs len len 441 | | FancyString (s, ofs, len, apparent_length), _ -> 442 | emit_string stack state s ofs len apparent_length 443 | | Blank n, _ -> 444 | emit_blanks stack state n 445 | 446 | (* The first piece of input is a hard newline instruction. *) 447 | 448 | (* If flattening mode is off, then we behave as follows. We emit a newline 449 | character, followed by the prescribed amount of indentation. We update 450 | the current state to record how many indentation characters were 451 | printed and to to reflect the new column number. Then, we discard the 452 | current piece of input and continue. *) 453 | 454 | | HardLine, false -> 455 | assert (stack = []); (* since flattening mode is off, the stack must be empty. *) 456 | Output.char state.channel '\n'; 457 | let i = state.indent1 in 458 | blanks state.channel i; 459 | state.column <- i; 460 | state.indentation <- i; 461 | shift stack state 462 | 463 | (* If flattening mode is on, then [HardLine] causes an immediate failure. We 464 | backtrack all the way to the state found at the bottom of the stack. 465 | (Indeed, if we were to backtrack to the state found at the top of the stack, 466 | then we would come back to this point in flattening mode, and fail again.) 467 | This will take us back to non-flattening mode, so that, when we come back 468 | to this [HardLine], we will be able to honor it. *) 469 | 470 | | HardLine, true -> 471 | assert (stack <> []); (* since flattening mode is on, the stack must be non-empty. *) 472 | run [] (last stack) 473 | 474 | (* The first piece of input is an [IfFlat] conditional instruction. *) 475 | 476 | | IfFlat (doc, _), true 477 | | IfFlat (_, doc), false -> 478 | state.input1 <- doc; 479 | run stack state 480 | 481 | (* The first piece of input is a concatenation operator. We take it 482 | apart and queue both documents in the input sequence. *) 483 | 484 | | Cat (doc1, doc2), _ -> 485 | state.input1 <- doc1; 486 | state.input <- ICons (state.indent1, state.flatten1, doc2, state.input); 487 | run stack state 488 | 489 | (* The first piece of input is a [Nest] operator. We increase the amount 490 | of indentation to be applied to the first input document. *) 491 | 492 | | Nest (j, doc), _ -> 493 | state.indent1 <- state.indent1 + j; 494 | state.input1 <- doc; 495 | run stack state 496 | 497 | (* The first piece of input is a [Group] operator, and flattening mode 498 | is currently off. This introduces a choice point: either we flatten 499 | this whole group, or we don't. We try the former possibility first: 500 | this is done by enabling flattening mode. Should this avenue fail, 501 | we push the current state, in which flattening mode is disabled, 502 | onto the stack. *) 503 | 504 | (* Note that the current state is copied before continuing, so that 505 | the state that is pushed on the stack is not affected by future 506 | modifications. This is a fork. *) 507 | 508 | | Group doc, false -> 509 | state.input1 <- doc; 510 | run (state :: stack) { state with flatten1 = true } 511 | 512 | (* The first piece of input is a [Group] operator, and flattening mode 513 | is currently on. The operator is ignored. *) 514 | 515 | | Group doc, true -> 516 | state.input1 <- doc; 517 | run stack state 518 | 519 | (* The first piece of input is a [Column] operator. The current column 520 | is fed into it, so as to produce a document, with which we continue. *) 521 | 522 | | Column f, _ -> 523 | state.input1 <- f state.column; 524 | run stack state 525 | 526 | (* The first piece of input is a [Nesting] operator. The current 527 | indentation level is fed into it, so as to produce a document, with 528 | which we continue. *) 529 | 530 | | Nesting f, _ -> 531 | state.input1 <- f state.indentation; 532 | run stack state 533 | 534 | (* [shift] discards the first document in the input sequence, so that the 535 | second input document, if there is one, becomes first. The renderer stops 536 | if there is none. *) 537 | 538 | and shift stack state = 539 | 540 | assert (state.output = OEmpty || stack <> []); 541 | assert (state.flatten1 = (stack <> [])); 542 | 543 | (* If the stack is nonempty and we have exceeded either the width or the 544 | ribbon width parameters, then fail. Backtracking is implemented by 545 | discarding the current state, popping a state off the stack, and making 546 | it the current state. *) 547 | 548 | match stack with 549 | | resumption :: stack 550 | when state.column > state.width 551 | || state.column - state.indentation > state.ribbon -> 552 | run stack resumption 553 | | _ -> 554 | 555 | match state.input with 556 | | INil -> 557 | 558 | (* End of input. Commit any buffered output and stop. *) 559 | 560 | commit state.channel state.output 561 | 562 | | ICons (indent, flatten, head, tail) -> 563 | 564 | (* There is an input document. Move it one slot ahead and 565 | check if we are leaving flattening mode. *) 566 | 567 | state.indent1 <- indent; 568 | state.input1 <- head; 569 | state.input <- tail; 570 | if state.flatten1 && not flatten then begin 571 | 572 | (* Leaving flattening mode means success: we have flattened 573 | a certain group, and fitted it all on a line, without 574 | reaching a failure point. We would now like to commit our 575 | decision to flatten this group. This is a Prolog cut. We 576 | discard the stack of choice points, replacing it with an 577 | empty stack, and commit all buffered output. *) 578 | 579 | state.flatten1 <- flatten; (* false *) 580 | commit state.channel state.output; 581 | state.output <- OEmpty; 582 | run [] state 583 | 584 | end 585 | else 586 | run stack state 587 | 588 | (* [emit_char] prints a character (either to the output channel or to the 589 | output buffer), increments the current column, discards the first piece 590 | of input, and continues. *) 591 | 592 | and emit_char stack state c = 593 | begin match stack with 594 | | [] -> 595 | Output.char state.channel c 596 | | _ -> 597 | state.output <- OChar (c, state.output) 598 | end; 599 | state.column <- state.column + 1; 600 | shift stack state 601 | 602 | (* [emit_string] prints a string (either to the output channel or to the 603 | output buffer), updates the current column, discards the first piece of 604 | input, and continues. *) 605 | 606 | and emit_string stack state s ofs len apparent_length = 607 | begin match stack with 608 | | [] -> 609 | Output.substring state.channel s ofs len 610 | | _ -> 611 | state.output <- OString (s, ofs, len, state.output) 612 | end; 613 | state.column <- state.column + apparent_length; 614 | shift stack state 615 | 616 | (* [emit_blanks] prints a blank string (either to the output channel or to 617 | the output buffer), updates the current column, discards the first piece 618 | of input, and continues. *) 619 | 620 | and emit_blanks stack state n = 621 | begin match stack with 622 | | [] -> 623 | blanks state.channel n 624 | | _ -> 625 | state.output <- OBlank (n, state.output) 626 | end; 627 | state.column <- state.column + n; 628 | shift stack state 629 | 630 | (* This is the renderer's main entry point. *) 631 | 632 | let pretty rfrac width channel document = 633 | run [] { 634 | width = width; 635 | ribbon = max 0 (min width (truncate (float_of_int width *. rfrac))); 636 | channel = channel; 637 | indentation = 0; 638 | column = 0; 639 | indent1 = 0; 640 | flatten1 = false; 641 | input1 = document; 642 | input = INil; 643 | output = OEmpty; 644 | } 645 | 646 | (* ------------------------------------------------------------------------- *) 647 | 648 | (* The compact rendering algorithm. *) 649 | 650 | let compact channel document = 651 | 652 | let column = 653 | ref 0 654 | in 655 | 656 | let rec scan = function 657 | | Empty -> 658 | () 659 | | Char c -> 660 | Output.char channel c; 661 | column := !column + 1 662 | | String (s, ofs, len) -> 663 | Output.substring channel s ofs len; 664 | column := !column + len 665 | | FancyString (s, ofs, len, apparent_length) -> 666 | Output.substring channel s ofs len; 667 | column := !column + apparent_length 668 | | Blank n -> 669 | blanks channel n; 670 | column := !column + n 671 | | HardLine -> 672 | Output.char channel '\n'; 673 | column := 0 674 | | Cat (doc1, doc2) -> 675 | scan doc1; 676 | scan doc2 677 | | IfFlat (doc, _) 678 | | Nest (_, doc) 679 | | Group doc -> 680 | scan doc 681 | | Column f -> 682 | scan (f !column) 683 | | Nesting f -> 684 | scan (f 0) 685 | in 686 | 687 | scan document 688 | 689 | end 690 | 691 | (* ------------------------------------------------------------------------- *) 692 | 693 | (* Instantiating the renderers for the three kinds of output channels. *) 694 | 695 | module type RENDERER = sig 696 | type channel 697 | type document 698 | val pretty: float -> int -> channel -> document -> unit 699 | val compact: channel -> document -> unit 700 | end 701 | 702 | module ToChannel = 703 | Renderer(ChannelOutput) 704 | 705 | module ToBuffer = 706 | Renderer(BufferOutput) 707 | 708 | module ToFormatter = 709 | Renderer(FormatterOutput) 710 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | In the following, "the Library" refers to the OCaml source files that 2 | form the PPrint library. The names of these files match the pattern 3 | PPrint*.{ml,mli}. 4 | 5 | The Library is distributed under the terms of the GNU Library General 6 | Public License version 2 (included below). 7 | 8 | As a special exception to the GNU Library General Public License, you 9 | may link, statically or dynamically, a "work that uses the Library" 10 | with a publicly distributed version of the Library to produce an 11 | executable file containing portions of the Library, and distribute 12 | that executable file under terms of your choice, without any of the 13 | additional requirements listed in clause 6 of the GNU Library General 14 | Public License. By "a publicly distributed version of the Library", 15 | we mean either the unmodified Library as distributed by INRIA, or a 16 | modified version of the Library that is distributed under the 17 | conditions defined in clause 2 of the GNU Library General Public 18 | License. This exception does not however invalidate any other reasons 19 | why the executable file might be covered by the GNU Library General 20 | Public License. 21 | 22 | ---------------------------------------------------------------------- 23 | 24 | GNU LIBRARY GENERAL PUBLIC LICENSE 25 | Version 2, June 1991 26 | 27 | Copyright (C) 1991 Free Software Foundation, Inc. 28 | 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA 29 | Everyone is permitted to copy and distribute verbatim copies 30 | of this license document, but changing it is not allowed. 31 | 32 | [This is the first released version of the library GPL. It is 33 | numbered 2 because it goes with version 2 of the ordinary GPL.] 34 | 35 | Preamble 36 | 37 | The licenses for most software are designed to take away your 38 | freedom to share and change it. By contrast, the GNU General Public 39 | Licenses are intended to guarantee your freedom to share and change 40 | free software--to make sure the software is free for all its users. 41 | 42 | This license, the Library General Public License, applies to some 43 | specially designated Free Software Foundation software, and to any 44 | other libraries whose authors decide to use it. You can use it for 45 | your libraries, too. 46 | 47 | When we speak of free software, we are referring to freedom, not 48 | price. Our General Public Licenses are designed to make sure that you 49 | have the freedom to distribute copies of free software (and charge for 50 | this service if you wish), that you receive source code or can get it 51 | if you want it, that you can change the software or use pieces of it 52 | in new free programs; and that you know you can do these things. 53 | 54 | To protect your rights, we need to make restrictions that forbid 55 | anyone to deny you these rights or to ask you to surrender the rights. 56 | These restrictions translate to certain responsibilities for you if 57 | you distribute copies of the library, or if you modify it. 58 | 59 | For example, if you distribute copies of the library, whether gratis 60 | or for a fee, you must give the recipients all the rights that we gave 61 | you. You must make sure that they, too, receive or can get the source 62 | code. If you link a program with the library, you must provide 63 | complete object files to the recipients so that they can relink them 64 | with the library, after making changes to the library and recompiling 65 | it. And you must show them these terms so they know their rights. 66 | 67 | Our method of protecting your rights has two steps: (1) copyright 68 | the library, and (2) offer you this license which gives you legal 69 | permission to copy, distribute and/or modify the library. 70 | 71 | Also, for each distributor's protection, we want to make certain 72 | that everyone understands that there is no warranty for this free 73 | library. If the library is modified by someone else and passed on, we 74 | want its recipients to know that what they have is not the original 75 | version, so that any problems introduced by others will not reflect on 76 | the original authors' reputations. 77 | 78 | Finally, any free program is threatened constantly by software 79 | patents. We wish to avoid the danger that companies distributing free 80 | software will individually obtain patent licenses, thus in effect 81 | transforming the program into proprietary software. To prevent this, 82 | we have made it clear that any patent must be licensed for everyone's 83 | free use or not licensed at all. 84 | 85 | Most GNU software, including some libraries, is covered by the ordinary 86 | GNU General Public License, which was designed for utility programs. This 87 | license, the GNU Library General Public License, applies to certain 88 | designated libraries. This license is quite different from the ordinary 89 | one; be sure to read it in full, and don't assume that anything in it is 90 | the same as in the ordinary license. 91 | 92 | The reason we have a separate public license for some libraries is that 93 | they blur the distinction we usually make between modifying or adding to a 94 | program and simply using it. Linking a program with a library, without 95 | changing the library, is in some sense simply using the library, and is 96 | analogous to running a utility program or application program. However, in 97 | a textual and legal sense, the linked executable is a combined work, a 98 | derivative of the original library, and the ordinary General Public License 99 | treats it as such. 100 | 101 | Because of this blurred distinction, using the ordinary General 102 | Public License for libraries did not effectively promote software 103 | sharing, because most developers did not use the libraries. We 104 | concluded that weaker conditions might promote sharing better. 105 | 106 | However, unrestricted linking of non-free programs would deprive the 107 | users of those programs of all benefit from the free status of the 108 | libraries themselves. This Library General Public License is intended to 109 | permit developers of non-free programs to use free libraries, while 110 | preserving your freedom as a user of such programs to change the free 111 | libraries that are incorporated in them. (We have not seen how to achieve 112 | this as regards changes in header files, but we have achieved it as regards 113 | changes in the actual functions of the Library.) The hope is that this 114 | will lead to faster development of free libraries. 115 | 116 | The precise terms and conditions for copying, distribution and 117 | modification follow. Pay close attention to the difference between a 118 | "work based on the library" and a "work that uses the library". The 119 | former contains code derived from the library, while the latter only 120 | works together with the library. 121 | 122 | Note that it is possible for a library to be covered by the ordinary 123 | General Public License rather than by this special one. 124 | 125 | GNU LIBRARY GENERAL PUBLIC LICENSE 126 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 127 | 128 | 0. This License Agreement applies to any software library which 129 | contains a notice placed by the copyright holder or other authorized 130 | party saying it may be distributed under the terms of this Library 131 | General Public License (also called "this License"). Each licensee is 132 | addressed as "you". 133 | 134 | A "library" means a collection of software functions and/or data 135 | prepared so as to be conveniently linked with application programs 136 | (which use some of those functions and data) to form executables. 137 | 138 | The "Library", below, refers to any such software library or work 139 | which has been distributed under these terms. A "work based on the 140 | Library" means either the Library or any derivative work under 141 | copyright law: that is to say, a work containing the Library or a 142 | portion of it, either verbatim or with modifications and/or translated 143 | straightforwardly into another language. (Hereinafter, translation is 144 | included without limitation in the term "modification".) 145 | 146 | "Source code" for a work means the preferred form of the work for 147 | making modifications to it. For a library, complete source code means 148 | all the source code for all modules it contains, plus any associated 149 | interface definition files, plus the scripts used to control compilation 150 | and installation of the library. 151 | 152 | Activities other than copying, distribution and modification are not 153 | covered by this License; they are outside its scope. The act of 154 | running a program using the Library is not restricted, and output from 155 | such a program is covered only if its contents constitute a work based 156 | on the Library (independent of the use of the Library in a tool for 157 | writing it). Whether that is true depends on what the Library does 158 | and what the program that uses the Library does. 159 | 160 | 1. You may copy and distribute verbatim copies of the Library's 161 | complete source code as you receive it, in any medium, provided that 162 | you conspicuously and appropriately publish on each copy an 163 | appropriate copyright notice and disclaimer of warranty; keep intact 164 | all the notices that refer to this License and to the absence of any 165 | warranty; and distribute a copy of this License along with the 166 | Library. 167 | 168 | You may charge a fee for the physical act of transferring a copy, 169 | and you may at your option offer warranty protection in exchange for a 170 | fee. 171 | 172 | 2. You may modify your copy or copies of the Library or any portion 173 | of it, thus forming a work based on the Library, and copy and 174 | distribute such modifications or work under the terms of Section 1 175 | above, provided that you also meet all of these conditions: 176 | 177 | a) The modified work must itself be a software library. 178 | 179 | b) You must cause the files modified to carry prominent notices 180 | stating that you changed the files and the date of any change. 181 | 182 | c) You must cause the whole of the work to be licensed at no 183 | charge to all third parties under the terms of this License. 184 | 185 | d) If a facility in the modified Library refers to a function or a 186 | table of data to be supplied by an application program that uses 187 | the facility, other than as an argument passed when the facility 188 | is invoked, then you must make a good faith effort to ensure that, 189 | in the event an application does not supply such function or 190 | table, the facility still operates, and performs whatever part of 191 | its purpose remains meaningful. 192 | 193 | (For example, a function in a library to compute square roots has 194 | a purpose that is entirely well-defined independent of the 195 | application. Therefore, Subsection 2d requires that any 196 | application-supplied function or table used by this function must 197 | be optional: if the application does not supply it, the square 198 | root function must still compute square roots.) 199 | 200 | These requirements apply to the modified work as a whole. If 201 | identifiable sections of that work are not derived from the Library, 202 | and can be reasonably considered independent and separate works in 203 | themselves, then this License, and its terms, do not apply to those 204 | sections when you distribute them as separate works. But when you 205 | distribute the same sections as part of a whole which is a work based 206 | on the Library, the distribution of the whole must be on the terms of 207 | this License, whose permissions for other licensees extend to the 208 | entire whole, and thus to each and every part regardless of who wrote 209 | it. 210 | 211 | Thus, it is not the intent of this section to claim rights or contest 212 | your rights to work written entirely by you; rather, the intent is to 213 | exercise the right to control the distribution of derivative or 214 | collective works based on the Library. 215 | 216 | In addition, mere aggregation of another work not based on the Library 217 | with the Library (or with a work based on the Library) on a volume of 218 | a storage or distribution medium does not bring the other work under 219 | the scope of this License. 220 | 221 | 3. You may opt to apply the terms of the ordinary GNU General Public 222 | License instead of this License to a given copy of the Library. To do 223 | this, you must alter all the notices that refer to this License, so 224 | that they refer to the ordinary GNU General Public License, version 2, 225 | instead of to this License. (If a newer version than version 2 of the 226 | ordinary GNU General Public License has appeared, then you can specify 227 | that version instead if you wish.) Do not make any other change in 228 | these notices. 229 | 230 | Once this change is made in a given copy, it is irreversible for 231 | that copy, so the ordinary GNU General Public License applies to all 232 | subsequent copies and derivative works made from that copy. 233 | 234 | This option is useful when you wish to copy part of the code of 235 | the Library into a program that is not a library. 236 | 237 | 4. You may copy and distribute the Library (or a portion or 238 | derivative of it, under Section 2) in object code or executable form 239 | under the terms of Sections 1 and 2 above provided that you accompany 240 | it with the complete corresponding machine-readable source code, which 241 | must be distributed under the terms of Sections 1 and 2 above on a 242 | medium customarily used for software interchange. 243 | 244 | If distribution of object code is made by offering access to copy 245 | from a designated place, then offering equivalent access to copy the 246 | source code from the same place satisfies the requirement to 247 | distribute the source code, even though third parties are not 248 | compelled to copy the source along with the object code. 249 | 250 | 5. A program that contains no derivative of any portion of the 251 | Library, but is designed to work with the Library by being compiled or 252 | linked with it, is called a "work that uses the Library". Such a 253 | work, in isolation, is not a derivative work of the Library, and 254 | therefore falls outside the scope of this License. 255 | 256 | However, linking a "work that uses the Library" with the Library 257 | creates an executable that is a derivative of the Library (because it 258 | contains portions of the Library), rather than a "work that uses the 259 | library". The executable is therefore covered by this License. 260 | Section 6 states terms for distribution of such executables. 261 | 262 | When a "work that uses the Library" uses material from a header file 263 | that is part of the Library, the object code for the work may be a 264 | derivative work of the Library even though the source code is not. 265 | Whether this is true is especially significant if the work can be 266 | linked without the Library, or if the work is itself a library. The 267 | threshold for this to be true is not precisely defined by law. 268 | 269 | If such an object file uses only numerical parameters, data 270 | structure layouts and accessors, and small macros and small inline 271 | functions (ten lines or less in length), then the use of the object 272 | file is unrestricted, regardless of whether it is legally a derivative 273 | work. (Executables containing this object code plus portions of the 274 | Library will still fall under Section 6.) 275 | 276 | Otherwise, if the work is a derivative of the Library, you may 277 | distribute the object code for the work under the terms of Section 6. 278 | Any executables containing that work also fall under Section 6, 279 | whether or not they are linked directly with the Library itself. 280 | 281 | 6. As an exception to the Sections above, you may also compile or 282 | link a "work that uses the Library" with the Library to produce a 283 | work containing portions of the Library, and distribute that work 284 | under terms of your choice, provided that the terms permit 285 | modification of the work for the customer's own use and reverse 286 | engineering for debugging such modifications. 287 | 288 | You must give prominent notice with each copy of the work that the 289 | Library is used in it and that the Library and its use are covered by 290 | this License. You must supply a copy of this License. If the work 291 | during execution displays copyright notices, you must include the 292 | copyright notice for the Library among them, as well as a reference 293 | directing the user to the copy of this License. Also, you must do one 294 | of these things: 295 | 296 | a) Accompany the work with the complete corresponding 297 | machine-readable source code for the Library including whatever 298 | changes were used in the work (which must be distributed under 299 | Sections 1 and 2 above); and, if the work is an executable linked 300 | with the Library, with the complete machine-readable "work that 301 | uses the Library", as object code and/or source code, so that the 302 | user can modify the Library and then relink to produce a modified 303 | executable containing the modified Library. (It is understood 304 | that the user who changes the contents of definitions files in the 305 | Library will not necessarily be able to recompile the application 306 | to use the modified definitions.) 307 | 308 | b) Accompany the work with a written offer, valid for at 309 | least three years, to give the same user the materials 310 | specified in Subsection 6a, above, for a charge no more 311 | than the cost of performing this distribution. 312 | 313 | c) If distribution of the work is made by offering access to copy 314 | from a designated place, offer equivalent access to copy the above 315 | specified materials from the same place. 316 | 317 | d) Verify that the user has already received a copy of these 318 | materials or that you have already sent this user a copy. 319 | 320 | For an executable, the required form of the "work that uses the 321 | Library" must include any data and utility programs needed for 322 | reproducing the executable from it. However, as a special exception, 323 | the source code distributed need not include anything that is normally 324 | distributed (in either source or binary form) with the major 325 | components (compiler, kernel, and so on) of the operating system on 326 | which the executable runs, unless that component itself accompanies 327 | the executable. 328 | 329 | It may happen that this requirement contradicts the license 330 | restrictions of other proprietary libraries that do not normally 331 | accompany the operating system. Such a contradiction means you cannot 332 | use both them and the Library together in an executable that you 333 | distribute. 334 | 335 | 7. You may place library facilities that are a work based on the 336 | Library side-by-side in a single library together with other library 337 | facilities not covered by this License, and distribute such a combined 338 | library, provided that the separate distribution of the work based on 339 | the Library and of the other library facilities is otherwise 340 | permitted, and provided that you do these two things: 341 | 342 | a) Accompany the combined library with a copy of the same work 343 | based on the Library, uncombined with any other library 344 | facilities. This must be distributed under the terms of the 345 | Sections above. 346 | 347 | b) Give prominent notice with the combined library of the fact 348 | that part of it is a work based on the Library, and explaining 349 | where to find the accompanying uncombined form of the same work. 350 | 351 | 8. You may not copy, modify, sublicense, link with, or distribute 352 | the Library except as expressly provided under this License. Any 353 | attempt otherwise to copy, modify, sublicense, link with, or 354 | distribute the Library is void, and will automatically terminate your 355 | rights under this License. However, parties who have received copies, 356 | or rights, from you under this License will not have their licenses 357 | terminated so long as such parties remain in full compliance. 358 | 359 | 9. You are not required to accept this License, since you have not 360 | signed it. However, nothing else grants you permission to modify or 361 | distribute the Library or its derivative works. These actions are 362 | prohibited by law if you do not accept this License. Therefore, by 363 | modifying or distributing the Library (or any work based on the 364 | Library), you indicate your acceptance of this License to do so, and 365 | all its terms and conditions for copying, distributing or modifying 366 | the Library or works based on it. 367 | 368 | 10. Each time you redistribute the Library (or any work based on the 369 | Library), the recipient automatically receives a license from the 370 | original licensor to copy, distribute, link with or modify the Library 371 | subject to these terms and conditions. You may not impose any further 372 | restrictions on the recipients' exercise of the rights granted herein. 373 | You are not responsible for enforcing compliance by third parties to 374 | this License. 375 | 376 | 11. If, as a consequence of a court judgment or allegation of patent 377 | infringement or for any other reason (not limited to patent issues), 378 | conditions are imposed on you (whether by court order, agreement or 379 | otherwise) that contradict the conditions of this License, they do not 380 | excuse you from the conditions of this License. If you cannot 381 | distribute so as to satisfy simultaneously your obligations under this 382 | License and any other pertinent obligations, then as a consequence you 383 | may not distribute the Library at all. For example, if a patent 384 | license would not permit royalty-free redistribution of the Library by 385 | all those who receive copies directly or indirectly through you, then 386 | the only way you could satisfy both it and this License would be to 387 | refrain entirely from distribution of the Library. 388 | 389 | If any portion of this section is held invalid or unenforceable under any 390 | particular circumstance, the balance of the section is intended to apply, 391 | and the section as a whole is intended to apply in other circumstances. 392 | 393 | It is not the purpose of this section to induce you to infringe any 394 | patents or other property right claims or to contest validity of any 395 | such claims; this section has the sole purpose of protecting the 396 | integrity of the free software distribution system which is 397 | implemented by public license practices. Many people have made 398 | generous contributions to the wide range of software distributed 399 | through that system in reliance on consistent application of that 400 | system; it is up to the author/donor to decide if he or she is willing 401 | to distribute software through any other system and a licensee cannot 402 | impose that choice. 403 | 404 | This section is intended to make thoroughly clear what is believed to 405 | be a consequence of the rest of this License. 406 | 407 | 12. If the distribution and/or use of the Library is restricted in 408 | certain countries either by patents or by copyrighted interfaces, the 409 | original copyright holder who places the Library under this License may add 410 | an explicit geographical distribution limitation excluding those countries, 411 | so that distribution is permitted only in or among countries not thus 412 | excluded. In such case, this License incorporates the limitation as if 413 | written in the body of this License. 414 | 415 | 13. The Free Software Foundation may publish revised and/or new 416 | versions of the Library General Public License from time to time. 417 | Such new versions will be similar in spirit to the present version, 418 | but may differ in detail to address new problems or concerns. 419 | 420 | Each version is given a distinguishing version number. If the Library 421 | specifies a version number of this License which applies to it and 422 | "any later version", you have the option of following the terms and 423 | conditions either of that version or of any later version published by 424 | the Free Software Foundation. If the Library does not specify a 425 | license version number, you may choose any version ever published by 426 | the Free Software Foundation. 427 | 428 | 14. If you wish to incorporate parts of the Library into other free 429 | programs whose distribution conditions are incompatible with these, 430 | write to the author to ask for permission. For software which is 431 | copyrighted by the Free Software Foundation, write to the Free 432 | Software Foundation; we sometimes make exceptions for this. Our 433 | decision will be guided by the two goals of preserving the free status 434 | of all derivatives of our free software and of promoting the sharing 435 | and reuse of software generally. 436 | 437 | NO WARRANTY 438 | 439 | 15. BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO 440 | WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 441 | EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR 442 | OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY 443 | KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE 444 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 445 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE 446 | LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME 447 | THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 448 | 449 | 16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN 450 | WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY 451 | AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU 452 | FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR 453 | CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE 454 | LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING 455 | RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A 456 | FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF 457 | SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH 458 | DAMAGES. 459 | 460 | END OF TERMS AND CONDITIONS 461 | 462 | Appendix: How to Apply These Terms to Your New Libraries 463 | 464 | If you develop a new library, and you want it to be of the greatest 465 | possible use to the public, we recommend making it free software that 466 | everyone can redistribute and change. You can do so by permitting 467 | redistribution under these terms (or, alternatively, under the terms of the 468 | ordinary General Public License). 469 | 470 | To apply these terms, attach the following notices to the library. It is 471 | safest to attach them to the start of each source file to most effectively 472 | convey the exclusion of warranty; and each file should have at least the 473 | "copyright" line and a pointer to where the full notice is found. 474 | 475 | 476 | Copyright (C) 477 | 478 | This library is free software; you can redistribute it and/or 479 | modify it under the terms of the GNU Library General Public 480 | License as published by the Free Software Foundation; either 481 | version 2 of the License, or (at your option) any later version. 482 | 483 | This library is distributed in the hope that it will be useful, 484 | but WITHOUT ANY WARRANTY; without even the implied warranty of 485 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 486 | Library General Public License for more details. 487 | 488 | You should have received a copy of the GNU Library General Public 489 | License along with this library; if not, write to the Free 490 | Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, 491 | MA 02111-1307, USA 492 | 493 | Also add information on how to contact you by electronic and paper mail. 494 | 495 | You should also get your employer (if you work as a programmer) or your 496 | school, if any, to sign a "copyright disclaimer" for the library, if 497 | necessary. Here is a sample; alter the names: 498 | 499 | Yoyodyne, Inc., hereby disclaims all copyright interest in the 500 | library `Frob' (a library for tweaking knobs) written by James Random Hacker. 501 | 502 | , 1 April 1990 503 | Ty Coon, President of Vice 504 | 505 | That's all there is to it! 506 | -------------------------------------------------------------------------------- /src/PPrintEngine.ml: -------------------------------------------------------------------------------- 1 | (******************************************************************************) 2 | (* *) 3 | (* PPrint *) 4 | (* *) 5 | (* François Pottier, Inria Paris *) 6 | (* Nicolas Pouillard *) 7 | (* *) 8 | (* Copyright 2007-2022 Inria. All rights reserved. This file is *) 9 | (* distributed under the terms of the GNU Library General Public *) 10 | (* License, with an exception, as described in the file LICENSE. *) 11 | (* *) 12 | (******************************************************************************) 13 | 14 | (** A point is a pair of a line number and a column number. *) 15 | type point = 16 | int * int 17 | 18 | (** A range is a pair of points. *) 19 | type range = 20 | point * point 21 | 22 | (* ------------------------------------------------------------------------- *) 23 | 24 | (* A type of integers with infinity. *) 25 | 26 | type requirement = 27 | int (* with infinity *) 28 | 29 | (* Infinity is encoded as [max_int]. *) 30 | 31 | let infinity : requirement = 32 | max_int 33 | 34 | (* Addition of integers with infinity. *) 35 | 36 | let (++) (x : requirement) (y : requirement) : requirement = 37 | if x = infinity || y = infinity then 38 | infinity 39 | else 40 | x + y 41 | 42 | (* Comparison between an integer with infinity and a normal integer. *) 43 | 44 | let (<==) (x : requirement) (y : int) = 45 | x <= y 46 | 47 | (* ------------------------------------------------------------------------- *) 48 | 49 | (* A uniform interface for output channels. *) 50 | 51 | class type output = object 52 | 53 | (** [char c] sends the character [c] to the output channel. *) 54 | method char: char -> unit 55 | 56 | (** [substring s ofs len] sends the substring of [s] delimited by the 57 | offset [ofs] and the length [len] to the output channel. *) 58 | method substring: string -> int (* offset *) -> int (* length *) -> unit 59 | 60 | end 61 | 62 | (* ------------------------------------------------------------------------- *) 63 | 64 | (* Printing blank space. This is used both internally (to emit indentation 65 | characters) and via the public combinator [blank]. *) 66 | 67 | let blank_length = 68 | 80 69 | 70 | let blank_buffer = 71 | String.make blank_length ' ' 72 | 73 | let rec blanks (output : output) n = 74 | if n <= 0 then 75 | () 76 | else if n <= blank_length then 77 | output#substring blank_buffer 0 n 78 | else begin 79 | output#substring blank_buffer 0 blank_length; 80 | blanks output (n - blank_length) 81 | end 82 | 83 | (* ------------------------------------------------------------------------- *) 84 | 85 | (* The class [buffering] implements a wrapper that delays the printing of 86 | blank characters. This includes indentation characters and characters 87 | produced by the combinator [blank]. The printing of these characters is 88 | delayed until it is known that they are followed by something on the same 89 | line; if they are not followed with anything, then it is canceled. 90 | 91 | The actual printing task is delegated to the object [delegate], whose type 92 | is [output]; the new object has type [output] as well. *) 93 | 94 | class buffering (delegate : output) : output = object (self) 95 | 96 | (* The number of blank characters that are withholding. *) 97 | val mutable buffered = 0 98 | 99 | (* [flush] sends out the blank characters that have been withheld. *) 100 | method private flush = 101 | blanks delegate buffered; 102 | buffered <- 0 103 | 104 | method char c : unit = 105 | begin match c with 106 | | '\n' -> 107 | (* The current line ends here. Any blank characters that were withheld 108 | are destroyed. This is where we avoid printing blank characters if 109 | nothing follows them. *) 110 | buffered <- 0 111 | | _ -> 112 | (* The current line is nonempty. Any blank characters that were 113 | withheld can now be flushed. *) 114 | self#flush 115 | end; 116 | (* Print this character as usual. *) 117 | delegate#char c 118 | 119 | method substring s pos len = 120 | (* If this is a string of length zero, then there is nothing to do. *) 121 | if len = 0 then 122 | () 123 | (* If this is a blank string (which we recognize by its address), then 124 | its content is withheld. *) 125 | else if s == blank_buffer then 126 | buffered <- buffered + len 127 | (* If this is not a blank string, then the blank characters that were 128 | withheld up to this point can now be flushed. *) 129 | else begin 130 | self#flush; 131 | delegate#substring s pos len 132 | end 133 | 134 | end 135 | 136 | (* ------------------------------------------------------------------------- *) 137 | 138 | (* Three kinds of output channels are wrapped so as to satisfy the above 139 | interface: OCaml output channels, OCaml memory buffers, and OCaml 140 | formatters. *) 141 | 142 | class channel_output channel = object 143 | method char = output_char channel 144 | method substring = output_substring channel 145 | (* We used to use [output], but, as of OCaml 4.02 and with -safe-string 146 | enabled, the type of [output] has changed: this function now expects 147 | an argument of type [bytes]. The new function [output_substring] must 148 | be used instead. Furthermore, as of OCaml 4.06, -safe-string is enabled 149 | by default. In summary, we require OCaml 4.02, use [output_substring], 150 | and enable -safe-string. *) 151 | end 152 | 153 | class buffer_output buffer = object 154 | method char = Buffer.add_char buffer 155 | method substring = Buffer.add_substring buffer 156 | end 157 | 158 | class formatter_output fmt = object 159 | method char = function 160 | | '\n' -> Format.pp_force_newline fmt () 161 | | ' ' -> Format.pp_print_space fmt () 162 | | c -> Format.pp_print_char fmt c 163 | 164 | method substring str ofs len = 165 | Format.pp_print_text fmt ( 166 | if ofs = 0 && len = String.length str 167 | then str 168 | else String.sub str ofs len 169 | ) 170 | end 171 | 172 | (* ------------------------------------------------------------------------- *) 173 | 174 | (** The rendering engine maintains the following internal state. Its structure 175 | is subject to change in future versions of the library. Nevertheless, it is 176 | exposed to the user who wishes to define custom documents. *) 177 | 178 | type state = { 179 | 180 | width: int; 181 | (** The line width. This parameter is fixed throughout the execution of 182 | the renderer. *) 183 | 184 | ribbon: int; 185 | (** The ribbon width. This parameter is fixed throughout the execution of 186 | the renderer. *) 187 | 188 | mutable last_indent: int; 189 | (** The number of blanks that were printed at the beginning of the current 190 | line. This field is updated (only) when a hardline is emitted. It is 191 | used (only) to determine whether the ribbon width constraint is 192 | respected. *) 193 | 194 | mutable line: int; 195 | (** The current line. This field is updated (only) when a hardline is 196 | emitted. It is not used by the pretty-printing engine itself. *) 197 | 198 | mutable column: int; 199 | (** The current column. This field must be updated whenever something is 200 | sent to the output channel. It is used (only) to determine whether the 201 | width constraint is respected. *) 202 | 203 | } 204 | 205 | (* ------------------------------------------------------------------------- *) 206 | 207 | (* [initial rfrac width] creates a fresh initial state. *) 208 | 209 | let initial rfrac width = { 210 | width = width; 211 | ribbon = max 0 (min width (truncate (float_of_int width *. rfrac))); 212 | last_indent = 0; 213 | line = 0; 214 | column = 0 215 | } 216 | 217 | (* ------------------------------------------------------------------------- *) 218 | 219 | (** A custom document is defined by implementing the following methods. *) 220 | 221 | class type custom = object 222 | 223 | (** A custom document must publish the width (i.e., the number of columns) 224 | that it would like to occupy if it is printed on a single line (that is, 225 | in flattening mode). The special value [infinity] means that this 226 | document cannot be printed on a single line; this value causes any 227 | groups that contain this document to be dissolved. This method should 228 | in principle work in constant time. *) 229 | method requirement: requirement 230 | 231 | (** The method [pretty] is used by the main rendering algorithm. It has 232 | access to the output channel and to the algorithm's internal state, as 233 | described above. In addition, it receives the current indentation level 234 | and the current flattening mode (on or off). If flattening mode is on, 235 | then the document must be printed on a single line, in a manner that is 236 | consistent with the requirement that was published ahead of time. If 237 | flattening mode is off, then there is no such obligation. The state must 238 | be updated in a manner that is consistent with what is sent to the 239 | output channel. *) 240 | method pretty: output -> state -> int -> bool -> unit 241 | 242 | (** The method [compact] is used by the compact rendering algorithm. It has 243 | access to the output channel only. *) 244 | method compact: output -> unit 245 | 246 | end 247 | 248 | (* ------------------------------------------------------------------------- *) 249 | 250 | (* Here is the algebraic data type of documents. It is analogous to Daan 251 | Leijen's version, but the binary constructor [Union] is replaced with 252 | the unary constructor [Group], and the constant [Line] is replaced with 253 | more general constructions, namely [IfFlat], which provides alternative 254 | forms depending on the current flattening mode, and [HardLine], which 255 | represents a newline character, and causes a failure in flattening mode. *) 256 | 257 | type document = 258 | 259 | (* [Empty] is the empty document. *) 260 | 261 | | Empty 262 | 263 | (* [Char c] is a document that consists of the single character [c]. We 264 | enforce the invariant that [c] is not a newline character. *) 265 | 266 | | Char of char 267 | 268 | (* [String s] is a document that consists of just the string [s]. We 269 | assume, but do not check, that this string does not contain a newline 270 | character. [String] is a special case of [FancyString], which takes up 271 | less space in memory. *) 272 | 273 | | String of string 274 | 275 | (* [FancyString (s, ofs, len, apparent_length)] is a (portion of a) string 276 | that may contain fancy characters: color escape characters, UTF-8 or 277 | multi-byte characters, etc. Thus, the apparent length (which corresponds 278 | to what will be visible on screen) differs from the length (which is a 279 | number of bytes, and is reported by [String.length]). We assume, but do 280 | not check, that fancystrings do not contain a newline character. *) 281 | 282 | | FancyString of string * int * int * int 283 | 284 | (* [Blank n] is a document that consists of [n] blank characters. *) 285 | 286 | | Blank of int 287 | 288 | (* When in flattening mode, [IfFlat (d1, d2)] turns into the document 289 | [d1]. When not in flattening mode, it turns into the document [d2]. *) 290 | 291 | | IfFlat of document * document 292 | 293 | (* When in flattening mode, [HardLine] causes a failure, which requires 294 | backtracking all the way until the stack is empty. When not in flattening 295 | mode, it represents a newline character, followed with an appropriate 296 | number of indentation. A common way of using [HardLine] is to only use it 297 | directly within the right branch of an [IfFlat] construct. *) 298 | 299 | | HardLine 300 | 301 | (* The following constructors store their space requirement. This is the 302 | document's apparent length, if printed in flattening mode. This 303 | information is computed in a bottom-up manner when the document is 304 | constructed. *) 305 | 306 | (* In other words, the space requirement is the number of columns that the 307 | document needs in order to fit on a single line. We express this value in 308 | the set of `integers extended with infinity', and use the value 309 | [infinity] to indicate that the document cannot be printed on a single 310 | line. *) 311 | 312 | (* Storing this information at [Group] nodes is crucial, as it allows us to 313 | avoid backtracking and buffering. *) 314 | 315 | (* Storing this information at other nodes allows the function [requirement] 316 | to operate in constant time. This means that the bottom-up computation of 317 | requirements takes linear time. *) 318 | 319 | (* [Cat (req, doc1, doc2)] is the concatenation of the documents [doc1] and 320 | [doc2]. The space requirement [req] is the sum of the requirements of 321 | [doc1] and [doc2]. *) 322 | 323 | | Cat of requirement * document * document 324 | 325 | (* [Nest (req, j, doc)] is the document [doc], in which the indentation 326 | level has been increased by [j], that is, in which [j] blanks have been 327 | inserted after every newline character. The space requirement [req] is 328 | the same as the requirement of [doc]. *) 329 | 330 | | Nest of requirement * int * document 331 | 332 | (* [Group (req, doc)] represents an alternative: it is either a flattened 333 | form of [doc], in which occurrences of [Group] disappear and occurrences 334 | of [IfFlat] resolve to their left branch, or [doc] itself. The space 335 | requirement [req] is the same as the requirement of [doc]. *) 336 | 337 | | Group of requirement * document 338 | 339 | (* [Align (req, doc)] increases the indentation level to reach the current 340 | column. Thus, the document [doc] is rendered within a box whose upper 341 | left corner is the current position. The space requirement [req] is the 342 | same as the requirement of [doc]. *) 343 | 344 | | Align of requirement * document 345 | 346 | (* [Range (req, hook, doc)] is printed like [doc]. After it is printed, the 347 | function [hook] is applied to the range that is occupied by [doc] in the 348 | output. *) 349 | 350 | | Range of requirement * (range -> unit) * document 351 | 352 | (* [Custom (req, f)] is a document whose appearance is user-defined. *) 353 | 354 | | Custom of custom 355 | 356 | (* ------------------------------------------------------------------------- *) 357 | 358 | (* Retrieving or computing the space requirement of a document. *) 359 | 360 | let rec requirement = function 361 | | Empty -> 362 | 0 363 | | Char _ -> 364 | 1 365 | | String s -> 366 | String.length s 367 | | FancyString (_, _, _, len) 368 | | Blank len -> 369 | len 370 | | IfFlat (doc1, _) -> 371 | (* The requirement of a document is the space that it needs when it is 372 | printed in flattening mode. So, the requirement of [ifflat x y] is 373 | just the requirement of its flat version, [x]. *) 374 | (* The smart constructor [ifflat] ensures that [IfFlat] is never nested 375 | in the left-hand side of [IfFlat], so this recursive call is not a 376 | problem; the function [requirement] has constant time complexity. *) 377 | requirement doc1 378 | | HardLine -> 379 | (* A hard line cannot be printed in flattening mode. *) 380 | infinity 381 | | Cat (req, _, _) 382 | | Nest (req, _, _) 383 | | Group (req, _) 384 | | Align (req, _) 385 | | Range (req, _, _) -> 386 | (* These nodes store their requirement -- which is computed when the 387 | node is constructed -- so as to allow us to answer in constant time 388 | here. *) 389 | req 390 | | Custom c -> 391 | c#requirement 392 | 393 | (* ------------------------------------------------------------------------- *) 394 | 395 | (* The above algebraic data type is not exposed to the user. Instead, we 396 | expose the following smart constructors. These functions construct a raw 397 | document and compute its requirement, so as to obtain a document. *) 398 | 399 | (* The smart constructors ensure that [Empty] is the only empty document; 400 | that is, there is no other way of constructing a document that behaves 401 | (in all contexts) as an empty document. (This claim could be violated 402 | by constructing [range hook empty] where [hook] has no effect, or by 403 | constructing a [custom] document that behaves like an empty document. 404 | These violations seem benign.) *) 405 | 406 | let empty = 407 | Empty 408 | 409 | let char c = 410 | assert (c <> '\n'); 411 | Char c 412 | 413 | let space = 414 | Blank 1 415 | 416 | let string s = 417 | if String.length s = 0 then 418 | empty 419 | else 420 | String s 421 | 422 | let fancysubstring s ofs len apparent_length = 423 | if len = 0 then 424 | empty 425 | else 426 | FancyString (s, ofs, len, apparent_length) 427 | 428 | let[@inline] substring s ofs len = 429 | fancysubstring s ofs len len 430 | 431 | let[@inline] fancystring s apparent_length = 432 | fancysubstring s 0 (String.length s) apparent_length 433 | 434 | (* The following function was stolen from [Batteries]. *) 435 | let utf8_length s = 436 | let rec length_aux s c i = 437 | if i >= String.length s then c else 438 | let n = Char.code (String.unsafe_get s i) in 439 | let k = 440 | if n < 0x80 then 1 else 441 | if n < 0xe0 then 2 else 442 | if n < 0xf0 then 3 else 4 443 | in 444 | length_aux s (c + 1) (i + k) 445 | in 446 | length_aux s 0 0 447 | 448 | let[@inline] utf8string s = 449 | fancystring s (utf8_length s) 450 | 451 | let[@inline] utf8format f = 452 | Printf.ksprintf utf8string f 453 | 454 | let hardline = 455 | HardLine 456 | 457 | let blank n = 458 | match n with 459 | | 0 -> 460 | empty 461 | | _ -> 462 | Blank n 463 | 464 | let ifflat doc1 doc2 = 465 | match doc1, doc2 with 466 | (* If both documents are empty then the result is empty. *) 467 | | Empty, Empty -> 468 | empty 469 | (* We avoid nesting [IfFlat] inside the left-hand side of [IfFlat]. That 470 | would be redundant; and the function [requirement] relies on the fact 471 | that the left child of [IfFlat] cannot be [IfFlat]. On the right-hand 472 | side, a symmetric optimization would be valid as well, but is not 473 | useful. *) 474 | | IfFlat (doc1, _), doc2 475 | | doc1, doc2 -> 476 | IfFlat (doc1, doc2) 477 | 478 | let[@inline] internal_break i = 479 | IfFlat (blank i, hardline) 480 | 481 | let break0 = 482 | IfFlat (Empty, HardLine) (* this is [internal_break 0] *) 483 | 484 | let break1 = 485 | IfFlat (Blank 1, HardLine) (* this is [internal_break 1] *) 486 | 487 | let break i = 488 | match i with 489 | | 0 -> 490 | break0 491 | | 1 -> 492 | break1 493 | | _ -> 494 | internal_break i 495 | 496 | let (^^) x y = 497 | match x, y with 498 | | Empty, _ -> 499 | y 500 | | _, Empty -> 501 | x 502 | | _, _ -> 503 | Cat (requirement x ++ requirement y, x, y) 504 | 505 | let nest i x = 506 | assert (i >= 0); 507 | match x with 508 | | Empty -> 509 | Empty 510 | | _ -> 511 | Nest (requirement x, i, x) 512 | 513 | let group x = 514 | match x with 515 | | Empty -> 516 | Empty 517 | | _ -> 518 | let req = requirement x in 519 | (* Minor optimisation: an infinite requirement dissolves a group. *) 520 | if req = infinity then 521 | x 522 | else 523 | Group (req, x) 524 | 525 | let align x = 526 | match x with 527 | | Empty -> 528 | Empty 529 | | _ -> 530 | Align (requirement x, x) 531 | 532 | let[@inline] range hook x = 533 | Range (requirement x, hook, x) 534 | 535 | let custom c = 536 | (* Sanity check. *) 537 | assert (c#requirement >= 0); 538 | Custom c 539 | 540 | (* ------------------------------------------------------------------------- *) 541 | 542 | (* Because the smart constructors ensure that [Empty] is the only empty 543 | document, [is_empty] can be implemented in a simple and efficient way. *) 544 | 545 | let is_empty x = 546 | match x with Empty -> true | _ -> false 547 | 548 | (* ------------------------------------------------------------------------- *) 549 | 550 | (* This function expresses the following invariant: if we are in flattening 551 | mode, then we must be within bounds, i.e. the width and ribbon width 552 | constraints must be respected. *) 553 | 554 | let ok state flatten : bool = 555 | not flatten || 556 | state.column <= state.width && state.column <= state.last_indent + state.ribbon 557 | 558 | (* ------------------------------------------------------------------------- *) 559 | 560 | (* The pretty rendering engine. *) 561 | 562 | (* The renderer is supposed to behave exactly like Daan Leijen's, although its 563 | implementation is quite radically different, and simpler. Our documents are 564 | constructed eagerly, as opposed to lazily. This means that we pay a large 565 | space overhead, but in return, we get the ability of computing information 566 | bottom-up, as described above, which allows to render documents without 567 | backtracking or buffering. *) 568 | 569 | (* The [state] record is never copied; it is just threaded through. In 570 | addition to it, the parameters [indent] and [flatten] influence the 571 | manner in which the document is rendered. *) 572 | 573 | (* The code is written in tail-recursive style, so as to avoid running out of 574 | stack space if the document is very deep. Each [KCons] cell in a 575 | continuation represents a pending call to [pretty]. Each [KRange] cell 576 | represents a pending call to a user-provided range hook. *) 577 | 578 | type cont = 579 | | KNil 580 | | KCons of int * bool * document * cont 581 | | KRange of (range -> unit) * point * cont 582 | 583 | let rec pretty 584 | (output : output) 585 | (state : state) 586 | (indent : int) 587 | (flatten : bool) 588 | (doc : document) 589 | (cont : cont) 590 | : unit = 591 | match doc with 592 | 593 | | Empty -> 594 | continue output state cont 595 | 596 | | Char c -> 597 | output#char c; 598 | state.column <- state.column + 1; 599 | (* assert (ok state flatten); *) 600 | continue output state cont 601 | 602 | | String s -> 603 | let len = String.length s in 604 | output#substring s 0 len; 605 | state.column <- state.column + len; 606 | (* assert (ok state flatten); *) 607 | continue output state cont 608 | 609 | | FancyString (s, ofs, len, apparent_length) -> 610 | output#substring s ofs len; 611 | state.column <- state.column + apparent_length; 612 | (* assert (ok state flatten); *) 613 | continue output state cont 614 | 615 | | Blank n -> 616 | blanks output n; 617 | state.column <- state.column + n; 618 | (* assert (ok state flatten); *) 619 | continue output state cont 620 | 621 | | HardLine -> 622 | (* We cannot be in flattening mode, because a hard line has an [infinity] 623 | requirement, and we attempt to render a group in flattening mode only 624 | if this group's requirement is met. *) 625 | assert (not flatten); 626 | (* Emit a hardline. *) 627 | output#char '\n'; 628 | blanks output indent; 629 | state.line <- state.line + 1; 630 | state.column <- indent; 631 | state.last_indent <- indent; 632 | (* Continue. *) 633 | continue output state cont 634 | 635 | | IfFlat (doc1, doc2) -> 636 | (* Pick an appropriate sub-document, based on the current flattening 637 | mode. *) 638 | pretty output state indent flatten (if flatten then doc1 else doc2) cont 639 | 640 | | Cat (_, doc1, doc2) -> 641 | (* Push the second document onto the continuation. *) 642 | pretty output state indent flatten doc1 (KCons (indent, flatten, doc2, cont)) 643 | 644 | | Nest (_, j, doc) -> 645 | pretty output state (indent + j) flatten doc cont 646 | 647 | | Group (req, doc) -> 648 | (* If we already are in flattening mode, stay in flattening mode; we 649 | are committed to it. If we are not already in flattening mode, we 650 | have a choice of entering flattening mode. We enter this mode only 651 | if we know that this group fits on this line without violating the 652 | width or ribbon width constraints. Thus, we never backtrack. *) 653 | let flatten = 654 | flatten || 655 | let column = state.column ++ req in 656 | column <== state.width && column <== state.last_indent + state.ribbon 657 | in 658 | pretty output state indent flatten doc cont 659 | 660 | | Align (_, doc) -> 661 | (* The effect of this combinator is to set [indent] to [state.column]. 662 | Usually [indent] is equal to [state.last_indent], hence setting it 663 | to [state.column] increases it. However, if [nest] has been used 664 | since the current line began, then this could cause [indent] to 665 | decrease. *) 666 | (* assert (state.column > state.last_indent); *) 667 | pretty output state state.column flatten doc cont 668 | 669 | | Range (_, hook, doc) -> 670 | let start : point = (state.line, state.column) in 671 | pretty output state indent flatten doc (KRange (hook, start, cont)) 672 | 673 | | Custom c -> 674 | (* Invoke the document's custom rendering function. *) 675 | c#pretty output state indent flatten; 676 | (* Sanity check. *) 677 | assert (ok state flatten); 678 | (* Continue. *) 679 | continue output state cont 680 | 681 | and continue output state = function 682 | | KNil -> 683 | () 684 | | KCons (indent, flatten, doc, cont) -> 685 | pretty output state indent flatten doc cont 686 | | KRange (hook, start, cont) -> 687 | let finish : point = (state.line, state.column) in 688 | hook (start, finish); 689 | continue output state cont 690 | 691 | (* Publish a version of [pretty] that does not take an explicit continuation. 692 | This function may be used by authors of custom documents. We do not expose 693 | the internal [pretty] -- the one that takes a continuation -- because we 694 | wish to simplify the user's life. The price to pay is that calls that go 695 | through a custom document cannot be tail calls. *) 696 | 697 | let pretty output state indent flatten doc = 698 | pretty output state indent flatten doc KNil 699 | 700 | (* ------------------------------------------------------------------------- *) 701 | 702 | (* The compact rendering algorithm. *) 703 | 704 | let rec compact output doc cont = 705 | match doc with 706 | | Empty -> 707 | continue output cont 708 | | Char c -> 709 | output#char c; 710 | continue output cont 711 | | String s -> 712 | let len = String.length s in 713 | output#substring s 0 len; 714 | continue output cont 715 | | FancyString (s, ofs, len, _apparent_length) -> 716 | output#substring s ofs len; 717 | continue output cont 718 | | Blank n -> 719 | blanks output n; 720 | continue output cont 721 | | HardLine -> 722 | output#char '\n'; 723 | continue output cont 724 | | Cat (_, doc1, doc2) -> 725 | compact output doc1 (doc2 :: cont) 726 | | IfFlat (doc, _) 727 | | Nest (_, _, doc) 728 | | Group (_, doc) 729 | | Align (_, doc) 730 | | Range (_, _, doc) -> 731 | compact output doc cont 732 | | Custom c -> 733 | (* Invoke the document's custom rendering function. *) 734 | c#compact output; 735 | continue output cont 736 | 737 | and continue output cont = 738 | match cont with 739 | | [] -> 740 | () 741 | | doc :: cont -> 742 | compact output doc cont 743 | 744 | let compact output doc = 745 | compact output doc [] 746 | 747 | (* ------------------------------------------------------------------------- *) 748 | 749 | (* We now instantiate the renderers for the three kinds of output channels. *) 750 | 751 | (* This is just boilerplate. *) 752 | 753 | module type RENDERER = sig 754 | type channel 755 | type document 756 | val pretty: float -> int -> channel -> document -> unit 757 | val compact: channel -> document -> unit 758 | end 759 | 760 | module MakeRenderer (X : sig 761 | type channel 762 | val output: channel -> output 763 | end) 764 | : RENDERER with type channel = X.channel and type document = document 765 | = struct 766 | type channel = X.channel 767 | type nonrec document = document 768 | let pretty rfrac width channel doc = pretty (X.output channel) (initial rfrac width) 0 false doc 769 | let compact channel doc = compact (X.output channel) doc 770 | end 771 | 772 | module ToChannel = 773 | MakeRenderer(struct 774 | type channel = out_channel 775 | let output channel = new buffering (new channel_output channel) 776 | end) 777 | 778 | module ToBuffer = 779 | MakeRenderer(struct 780 | type channel = Buffer.t 781 | let output buffer = new buffering (new buffer_output buffer) 782 | end) 783 | 784 | module ToFormatter = 785 | MakeRenderer(struct 786 | type channel = Format.formatter 787 | let output fmt = new buffering (new formatter_output fmt) 788 | end) 789 | --------------------------------------------------------------------------------