├── .github └── workflows │ └── build.yml ├── .gitignore ├── CHANGELOG.md ├── LICENSE ├── README.md ├── commonmark-bench ├── benchmarks │ └── commonmark │ │ └── pro-git.rkt └── info.rkt ├── commonmark-doc ├── info.rkt └── scribblings │ ├── commonmark.scrbl │ ├── commonmark │ └── private │ │ ├── commonmark.css │ │ └── scribble-render.rkt │ └── info.rkt ├── commonmark-lib ├── commonmark │ ├── main.rkt │ ├── parse.rkt │ ├── private │ │ ├── parse │ │ │ ├── block.rkt │ │ │ ├── common.rkt │ │ │ ├── entities.json │ │ │ ├── entity.rkt │ │ │ └── inline.rkt │ │ ├── regexp.rkt │ │ ├── render.rkt │ │ └── struct.rkt │ ├── render │ │ └── html.rkt │ └── struct.rkt └── info.rkt ├── commonmark-test ├── info.rkt └── tests │ └── commonmark │ ├── parse │ ├── footnote.rkt │ ├── gh4.rkt │ └── gh5.rkt │ ├── spec-0.31.2.json │ ├── spec.rkt │ └── test-util.rkt └── commonmark └── info.rkt /.github/workflows/build.yml: -------------------------------------------------------------------------------- 1 | name: build 2 | on: [push] 3 | defaults: 4 | run: 5 | working-directory: repo 6 | jobs: 7 | test: 8 | runs-on: ubuntu-latest 9 | strategy: 10 | fail-fast: false 11 | matrix: 12 | racket-version: [ '7.4', '7.9', '8.0', '8.3', stable ] 13 | steps: 14 | - uses: actions/checkout@v2 15 | with: { path: repo } 16 | - uses: Bogdanp/setup-racket@v1.5 17 | with: 18 | version: ${{ matrix.racket-version }} 19 | dest: '$GITHUB_WORKSPACE/racket' 20 | sudo: never 21 | - name: install 22 | run: raco pkg install --installation --auto --link commonmark-{bench,doc,lib,test} 23 | - name: test 24 | run: raco test -ep commonmark-{bench,doc,lib,test} 25 | 26 | - name: deploy_docs 27 | if: ${{ github.event_name != 'pull_request' && github.ref == 'refs/heads/master' && matrix.racket-version == 'stable' }} 28 | run: | 29 | set -e 30 | scribble +m --redirect https://docs.racket-lang.org/local-redirect/index.html \ 31 | --dest docs --dest-name index commonmark-doc/scribblings/commonmark.scrbl 32 | cd docs 33 | git init 
-b gh-pages 34 | git config user.name 'GitHub Actions' 35 | git config user.email 'lexi.lambda@gmail.com' 36 | git add . 37 | git commit -m 'Deploy to GitHub Pages' 38 | git push --force 'https://lexi-lambda:${{ github.token }}@github.com/${{ github.repository }}' gh-pages 39 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | /build/ 2 | compiled/ 3 | doc/ 4 | *~ 5 | -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | ## 1.2 (2024-10-07) 2 | 3 | * Updated from CommonMark v0.30 to v0.31.2. ([f28bafb](https://github.com/lexi-lambda/racket-commonmark/commit/f28bafb69a3cdf4ffc9f0f0a77aefadb507421f7)) 4 | 5 | The behavioral changes are quite minimal. The relevant bullets from [the CommonMark changelog](https://spec.commonmark.org/changelog.txt) are: 6 | 7 | > * Add symbols to unicode punctuation (Titus Wormer). 8 | > * Add `search` element to list of known block elements (Titus Wormer). 9 | > * Remove `source` element as HTML block start condition (Lukas Spieß). 10 | > * Remove restrictive limitation on inline comments; now we match the HTML spec (Titus Wormer). 11 | 12 | ## 1.1.1 (2024-10-07) 13 | 14 | * Fixed bug that caused inline links to sometimes fail to parse. ([#4](https://github.com/lexi-lambda/racket-commonmark/issues/4), [f96082a](https://github.com/lexi-lambda/racket-commonmark/commit/f96082a21d5577c57c5c00d916f666567cb41a1c)) 15 | * Fixed nested list tightness sometimes being incorrect. ([#5](https://github.com/lexi-lambda/racket-commonmark/issues/5), [e0b9dec](https://github.com/lexi-lambda/racket-commonmark/commit/e0b9dec454e9ebca23c4578f7e58a3774546d4a9)) 16 | 17 | ## 1.1 (2021-11-22) 18 | 19 | * Added support for footnotes as an optional extension. 
([d40156b](https://github.com/lexi-lambda/racket-commonmark/commit/d40156bce42088aea1a742d6cce4c8697318db70)) 20 | 21 | ## 1.0 (2021-11-20) 22 | 23 | * Initial release. 24 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | ISC License 2 | 3 | Copyright (c) 2021, Alexis King 4 | 5 | Permission to use, copy, modify, and/or distribute this software 6 | for any purpose with or without fee is hereby granted, provided 7 | that the above copyright notice and this permission notice appear 8 | in all copies. 9 | 10 | THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL 11 | WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED 12 | WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE 13 | AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL 14 | DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA 15 | OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER 16 | TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR 17 | PERFORMANCE OF THIS SOFTWARE. 18 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # commonmark [![Build Status](https://github.com/lexi-lambda/racket-commonmark/actions/workflows/build.yml/badge.svg?branch=master)](https://github.com/lexi-lambda/racket-commonmark/actions/workflows/build.yml) [![Scribble Docs](https://img.shields.io/badge/docs-built-blue)][commonmark-doc] 2 | 3 | This library provides a fast, [CommonMark]-compliant Markdown parser, implemented natively in Racket. 
To use it, install the `commonmark` package: 4 | 5 | ``` 6 | $ raco pkg install commonmark 7 | ``` 8 | 9 | [For more information, see the documentation.][commonmark-doc] 10 | 11 | [commonmark-doc]: https://lexi-lambda.github.io/racket-commonmark/ 12 | [CommonMark]: https://commonmark.org/ 13 | -------------------------------------------------------------------------------- /commonmark-bench/benchmarks/commonmark/pro-git.rkt: -------------------------------------------------------------------------------- 1 | #lang racket/base 2 | 3 | ;; This module benchmarks commonmark against markdown, using an input corpus 4 | ;; derived by concatenating the Markdown sources of all the localizations of the 5 | ;; first edition of Pro Git by Scott Chacon. (This is the benchmarking technique 6 | ;; used by cmark.) 7 | 8 | (require benchmark 9 | net/git-checkout 10 | racket/file 11 | racket/format 12 | racket/list 13 | racket/match 14 | racket/path 15 | racket/port 16 | 17 | (prefix-in cm: commonmark) 18 | (prefix-in md: markdown)) 19 | 20 | (define-logger cm-bench) 21 | 22 | (define current-build-directory (make-parameter "build")) 23 | 24 | (define (bench-path . sub) 25 | (simplify-path (apply build-path (current-build-directory) "bench" sub))) 26 | 27 | (define (clone-progit) 28 | (define dest-dir (bench-path "progit")) 29 | (cond 30 | [(directory-exists? 
dest-dir) 31 | (log-cm-bench-debug "clone-progit: ‘~a’ already exists, skipping" dest-dir)] 32 | [else 33 | (log-cm-bench-info "clone-progit: cloning into ‘~a’" dest-dir) 34 | (make-parent-directory* dest-dir) 35 | (git-checkout #:transport 'https "github.com" "progit/progit.git" 36 | #:dest-dir dest-dir)]) 37 | dest-dir) 38 | 39 | (define document-sizes #(tiny small medium large)) 40 | 41 | (define (build-bench-inputs) 42 | (define progit-dir (clone-progit)) 43 | (define langs '("ar" "az" "be" "ca" "cs" "de" "en" "eo" "es" "es-ni" 44 | "fa" "fi" "fr" "hi" "hu" "id" "it" "ja" "ko" 45 | "mk" "nl" "no-nb" "pl" "pt-br" "ro" "ru" "sr" 46 | "th" "tr" "uk" "vi" "zh" "zh-tw")) 47 | 48 | (for/vector #:length (vector-length document-sizes) ([size (in-vector document-sizes)]) 49 | (define out-path (bench-path "input" (~a size ".md"))) 50 | (cond 51 | [(file-exists? out-path) 52 | (log-cm-bench-debug "build-bench-input: ‘~a’ already exists, skipping" out-path) 53 | (file->string out-path)] 54 | [else 55 | (log-cm-bench-info "build-bench-input: writing ‘~a’" out-path) 56 | (make-parent-directory* out-path) 57 | (define str-out (open-output-string)) 58 | (call-with-output-file* #:mode 'text out-path 59 | (λ (out) 60 | (for* ([lang (in-list (match size 61 | [(or 'tiny 'small) '("en")] 62 | ['medium (take langs 15)] 63 | ['large langs]))] 64 | [in-path (in-directory (match size 65 | ['tiny (build-path progit-dir lang "01-introduction")] 66 | [_ (build-path progit-dir lang)]))] 67 | #:when (file-exists? in-path) 68 | #:when (equal? 
(path-get-extension in-path) #".markdown")) 69 | (call-with-input-file* #:mode 'text in-path 70 | (λ (in) (copy-port in str-out out)))))) 71 | (get-output-string str-out)]))) 72 | 73 | (define (benchmark-results-path) 74 | (bench-path "results" "result")) 75 | 76 | (define (size->string bytes) 77 | (define Ki 1024) 78 | (define Mi (* Ki Ki)) 79 | (cond 80 | [(< bytes Ki) (~a (~r (/ bytes Ki) #:precision 1) " KiB")] 81 | [(< bytes Mi) (~a (~r (/ bytes Ki) #:precision 0) " KiB")] 82 | [else (~a (~r (/ bytes Mi) #:precision 0) " MiB")])) 83 | 84 | (define (do-run-benchmarks #:num-trials [num-trials 1]) 85 | (define bench-inputs (build-bench-inputs)) 86 | (define results-file (benchmark-results-path)) 87 | (make-parent-directory* results-file) 88 | 89 | (log-cm-bench-info "running benchmarks...") 90 | (run-benchmarks 91 | #:extract-time 'delta-time 92 | #:num-trials num-trials 93 | #:make-name (λ (size) 94 | (define bytes (string-utf-8-length (vector-ref bench-inputs size))) 95 | (~a (vector-ref document-sizes size) " (" (size->string bytes) ")")) 96 | #:results-file results-file 97 | (range (vector-length document-sizes)) 98 | '([commonmark markdown]) 99 | (λ (size impl) 100 | (define input (vector-ref bench-inputs size)) 101 | (match impl 102 | ['commonmark (cm:document->html (cm:string->document input))] 103 | ['markdown (map md:xexpr->string (md:parse-markdown input))])))) 104 | 105 | (module+ main 106 | (require plot 107 | racket/class) 108 | 109 | (define (visualize-benchmark-results [results (get-past-results (benchmark-results-path))]) 110 | (parameterize ([plot-x-ticks no-ticks] 111 | [current-benchmark-color-scheme (cons '("white" "black") '(solid))]) 112 | (define frame 113 | (plot-frame 114 | #:title "commonmark vs markdown" 115 | #:x-label "input size" 116 | #:y-label "normalized time" 117 | (render-benchmark-alts '(commonmark) results))) 118 | (send frame show #t))) 119 | 120 | (visualize-benchmark-results (do-run-benchmarks))) 121 | 
-------------------------------------------------------------------------------- /commonmark-bench/info.rkt: -------------------------------------------------------------------------------- 1 | #lang info 2 | 3 | (define version "1.2") 4 | 5 | (define collection 'multi) 6 | 7 | (define deps 8 | '("base" 9 | "benchmark" 10 | "commonmark-lib" 11 | "markdown" 12 | "plot-lib" 13 | "plot-gui-lib")) 14 | (define build-deps '()) 15 | -------------------------------------------------------------------------------- /commonmark-doc/info.rkt: -------------------------------------------------------------------------------- 1 | #lang info 2 | 3 | (define version "1.2") 4 | 5 | (define collection 'multi) 6 | 7 | (define deps 8 | '("base")) 9 | (define build-deps 10 | '(["commonmark-lib" #:version "1.2"] 11 | "racket-doc" 12 | "scribble-lib" 13 | "threading-lib")) 14 | -------------------------------------------------------------------------------- /commonmark-doc/scribblings/commonmark.scrbl: -------------------------------------------------------------------------------- 1 | #lang scribble/manual 2 | 3 | @(require (for-label commonmark 4 | commonmark/struct 5 | racket/base 6 | racket/contract 7 | racket/port 8 | (except-in xml document document? struct:document)) 9 | (only-in commonmark/parse current-parse-footnotes?) 10 | racket/format 11 | racket/string 12 | scribble/core 13 | scribble/decode 14 | scribble/example 15 | scribble/html-properties 16 | threading 17 | "commonmark/private/scribble-render.rkt") 18 | 19 | @title{CommonMark: Standard Markdown} 20 | @author{@author+email["Alexis King" "lexi.lambda@gmail.com"]} 21 | @margin-note{The source of this manual is available on @hyperlink["https://github.com/lexi-lambda/racket-commonmark/blob/master/commonmark-doc/scribblings/commonmark.scrbl"]{GitHub.}} 22 | 23 | @(define highlight-style (style 'tt (list (alt-tag "mark") 24 | (make-background-color-property "yellow")))) 25 | @(define (highlight . 
content) 26 | (element highlight-style content)) 27 | 28 | @(define (reftech . pre-content) 29 | (apply tech pre-content #:doc '(lib "scribblings/reference/reference.scrbl"))) 30 | @(define (xml-tech . pre-content) 31 | (apply tech pre-content #:doc '(lib "xml/xml.scrbl"))) 32 | @(define X-expression @xml-tech{X-expression}) 33 | @(define X-expressions @xml-tech{X-expressions}) 34 | 35 | @(define mod:markdown @racketmodname[markdown #:indirect]) 36 | 37 | @(define CommonMark @hyperlink["https://commonmark.org/"]{CommonMark}) 38 | @(define cmark-gfm @hyperlink["https://github.com/github/cmark-gfm"]{cmark-gfm}) 39 | 40 | @(define (cm-link #:singularize? [singularize? #f] #:style [style #f] tag . pre-content) 41 | (define maybe-singularize (if singularize? 42 | (λ~> (string-replace #px"s$" "")) 43 | values)) 44 | (apply hyperlink 45 | #:style style 46 | (~a "https://spec.commonmark.org/0.31.2/#" 47 | (~> (string-foldcase tag) 48 | maybe-singularize 49 | (string-replace #px"[^a-z]+" "-"))) 50 | pre-content)) 51 | 52 | @(define (cm-tech . pre-content) 53 | (define content (decode-content pre-content)) 54 | (cm-link (content->string content) content #:singularize? #t)) 55 | @(define (cm-section . pre-content) 56 | (define content (decode-content pre-content)) 57 | (cm-link (content->string content) "§" ~ content)) 58 | 59 | @(define (see-cm what where) 60 | @margin-note{See @where in the @CommonMark specification for more information about @|what|.}) 61 | @(define (see-extension what where) 62 | @margin-note{@what are an @tech{extension} to the @CommonMark specification and are not enabled by default; see @where in the @secref{extensions} section of this manual for more details.}) 63 | 64 | @(define make-commonmark-eval (make-eval-factory '(commonmark 65 | commonmark/struct 66 | racket/list 67 | racket/match))) 68 | @(define-syntax-rule (cm-examples body ...) 
69 | (examples #:eval (make-commonmark-eval) #:once body ...)) 70 | 71 | @defmodule[commonmark]{ 72 | 73 | The @racketmodname[commonmark] library implements a @|CommonMark|-compliant Markdown parser. Currently, it passes all test cases in @hyperlink["https://spec.commonmark.org/0.31.2/"]{v0.31.2 of the specification}. By default, only the Markdown features specified by @CommonMark are supported, but non-standard support for @tech{footnotes} can be optionally enabled; see the @secref{extensions} section of this manual for more details. 74 | 75 | The @racketmodname[commonmark] module reprovides all of the bindings provided by @racketmodname[commonmark/parse] and @racketmodname[commonmark/render/html] (but @emph{not} the bindings provided by @racketmodname[commonmark/struct]).} 76 | 77 | @local-table-of-contents[] 78 | 79 | @section[#:tag "quick-start"]{Quick start} 80 | 81 | @(define quick-eval (make-base-eval)) 82 | 83 | @margin-note{For information about the Markdown syntax supported by @racketmodname[commonmark], see the @CommonMark website.} 84 | 85 | In @racketmodname[commonmark], processing Markdown is split into two steps: @seclink["parsing"]{parsing} and @seclink["rendering-html"]{rendering}. To get started, use @racket[string->document] or @racket[read-document] to parse Markdown input into a @tech{document} structure: 86 | 87 | @(examples 88 | #:eval quick-eval 89 | #:label #f 90 | (eval:alts @#,racket[(require @#,racketmodname[commonmark])] 91 | (require commonmark)) 92 | (define doc (string->document "*Hello*, **markdown**!")) 93 | doc) 94 | 95 | A @tech{document} is an abstract syntax tree representing Markdown content. 
Most uses of Markdown render it to HTML, so @racketmodname[commonmark] also provides the @racket[document->html] and @racket[write-document-html] functions, which render a @tech{document} to HTML in the way recommended by the @CommonMark specification: 96 | 97 | @(examples 98 | #:eval quick-eval 99 | #:label #f 100 | (write-document-html doc)) 101 | 102 | The @racket[document->xexprs] function can also be used to render a @tech{document} to a @reftech{list} of @X-expressions, which can make it more convenient to incorporate rendered Markdown into a larger HTML document (though do be aware of the caveats involving @tech{HTML blocks} and @tech{HTML spans} described in the documentation for @racket[document->xexprs]): 103 | 104 | @(examples 105 | #:eval quick-eval 106 | #:label #f 107 | (document->xexprs doc)) 108 | 109 | @(close-eval quick-eval) 110 | 111 | @section[#:tag "parsing"]{Parsing} 112 | @declare-exporting[commonmark/parse commonmark] 113 | @defmodule[commonmark/parse #:no-declare]{ 114 | 115 | The @racketmodname[commonmark/parse] module provides functions for parsing Markdown content into a @tech{document} structure. To render Markdown to HTML, use this module in combination with the functions provided by @racketmodname[commonmark/render/html]. 116 | 117 | All of the bindings provided by @racketmodname[commonmark/parse] are also provided by @racketmodname[commonmark].} 118 | 119 | @defproc[(string->document [str string?]) document?]{ 120 | Parses @racket[str] as a Markdown @tech{document}. 121 | 122 | @(cm-examples 123 | #:label "Example:" 124 | (define doc (string->document "*Hello*, **markdown**!")) 125 | doc 126 | (write-document-html doc)) 127 | 128 | This function cannot fail: every string of Unicode characters is a valid Markdown document.} 129 | 130 | @defproc[(read-document [in input-port?]) document?]{ 131 | Like @racket[string->document], but the input is read from the given @reftech{input port} rather than from a @reftech{string}. 
132 | 133 | @(cm-examples 134 | #:label "Example:" 135 | (define doc (read-document (open-input-string "*Hello*, **markdown**!"))) 136 | doc 137 | (write-document-html doc)) 138 | 139 | This function may be more efficient than @racket[(string->document (port->string in))], but probably not substantially, as the entire @tech{document} structure must be realized in memory regardless.} 140 | 141 | @defboolparam[current-parse-footnotes? parse-footnotes? #:value #f]{ 142 | Enables or disables @tech{footnote} parsing, which is an @tech{extension} to the @CommonMark specification; see @secref{extension:footnotes} for more details. 143 | 144 | Note that the value of @racket[current-parse-footnotes?] only affects parsing, @emph{not} rendering. If a @tech{document} containing @tech{footnotes} is rendered to HTML, the @tech{footnotes} will still be rendered even if @racket[(current-parse-footnotes?)] is @racket[#f]. 145 | 146 | @history[#:added "1.1"]} 147 | 148 | @section[#:tag "rendering-html"]{Rendering HTML} 149 | @declare-exporting[commonmark/render/html commonmark] 150 | @defmodule[commonmark/render/html #:no-declare]{ 151 | 152 | The @racketmodname[commonmark/render/html] module provides functions for rendering a parsed Markdown @tech{document} to HTML as recommended by the @CommonMark specification. This module should generally be used in combination with @racketmodname[commonmark/parse], which provides functions for producing a @tech{document} structure from Markdown input. 153 | 154 | All of the bindings provided by @racketmodname[commonmark/render/html] are also provided by @racketmodname[commonmark].} 155 | 156 | @defproc[(document->html [doc document?]) string?]{ 157 | Renders @racket[doc] to HTML in the format recommended by the @CommonMark specification. 158 | 159 | @(cm-examples 160 | (document->html (string->document "*Hello*, **markdown**!")))} 161 | 162 | @defproc[(write-document-html [doc document?] [out output-port? 
(current-output-port)]) void?]{ 163 | Like @racket[document->html], but writes the rendered HTML directly to @racket[out] rather than returning it as a @reftech{string}. 164 | 165 | @(cm-examples 166 | (write-document-html (string->document "*Hello*, **markdown**!")))} 167 | 168 | @defproc[(document->xexprs [doc document?]) (listof xexpr/c)]{ 169 | Like @racket[document->html], but returns the rendered HTML as a @reftech{list} of @X-expressions rather than as a string. 170 | 171 | @(cm-examples 172 | (document->xexprs (string->document "*Hello*, **markdown**!"))) 173 | 174 | Note that @tech{HTML blocks} and @tech{HTML spans} are not parsed and may even contain invalid HTML, which makes them difficult to represent as an @|X-expression|. As a workaround, raw HTML will be represented as @racket[cdata] elements: 175 | 176 | @(cm-examples 177 | #:label #f 178 | (document->xexprs 179 | (string->document "A paragraph with raw HTML."))) 180 | 181 | This generally yields the desired result, as @racket[xexpr->string] renders @racket[cdata] elements directly as their unescaped content. However, strictly speaking, it is an abuse of @racket[cdata].} 182 | 183 | @deftogether[(@defparam[current-italic-tag tag symbol? #:value 'em] 184 | @defparam[current-bold-tag tag symbol? #:value 'strong])]{ 185 | These @reftech{parameters} determine which HTML tag is used to render @tech{italic spans} and @tech{bold spans}, respectively. The default values of @racket['em] and @racket['strong] correspond to those required by the @CommonMark specification, but this can be semantically incorrect if “emphasis” syntax is used for purposes other than emphasis, such as italicizing the title of a book. 186 | 187 | Reasonable alternate values for @racket[current-italic-tag] and @racket[current-bold-tag] include @racket['i], @racket['b], @racket['mark], @racket['cite], or @racket['dfn], all of which are elements with semantic (rather than presentational) meaning in HTML5. 
Of course, the “most correct” choice depends on how @tech{italic spans} and @tech{bold spans} will actually be used. 188 | 189 | @(cm-examples 190 | (eval:alts 191 | (parameterize ([current-italic-tag 'cite] 192 | [current-bold-tag 'mark]) 193 | (document->xexprs 194 | (string->document 195 | (string-append 196 | "> First, programming is about stating and solving problems,\n" 197 | "> and this activity normally takes place in a context with its\n" 198 | "> own language of discourse; **good programmers ought to\n" 199 | "> formulate this language as a programming language**.\n" 200 | "\n" 201 | "— *The Racket Manifesto* (emphasis mine)")))) 202 | (parameterize ([current-italic-tag 'cite] 203 | [current-bold-tag 'mark]) 204 | ; In this example, we’ll end up with really long strings in the output 205 | ; containing \n characters, which looks bad in the docs, so we want to quietly 206 | ; split them on \n characters just to make the example’s output more readable. 207 | (define (split-inline-strs v) 208 | (match v 209 | [(? list?) (flatten (map split-inline-strs v))] 210 | [(document v fns) (document (split-inline-strs v) fns)] 211 | [(paragraph v) (paragraph (split-inline-strs v))] 212 | [(blockquote v) (blockquote (split-inline-strs v))] 213 | [(bold v) (bold (split-inline-strs v))] 214 | [(italic v) (italic (split-inline-strs v))] 215 | [(? string?) 
(regexp-split #px"(?<=\n)" v)])) 216 | (document->xexprs 217 | (split-inline-strs 218 | (string->document 219 | (string-append 220 | "> First, programming is about stating and solving problems,\n" 221 | "> and this activity normally takes place in a context with its\n" 222 | "> own language of discourse; **good programmers ought to\n" 223 | "> formulate this language as a programming language**.\n" 224 | "\n" 225 | "— *The Racket Manifesto* (emphasis mine)")))))))} 226 | 227 | @section[#:tag "structure"]{Document structure} 228 | @defmodule[commonmark/struct]{ 229 | 230 | The @racketmodname[commonmark/struct] module provides @reftech{structure types} used to represent Markdown content as abstract syntax. The root of every syntax tree is a @tech{document}, which contains @tech{blocks}, which in turn contain @tech{inline content}. Most users will not need to interact with these structures directly, but doing so can be useful to perform additional processing on the document before rendering it, or to render Markdown to a format other than HTML. 231 | 232 | Note that the bindings in this section are only provided by @racketmodname[commonmark/struct], @emph{not} by @racketmodname[commonmark].} 233 | 234 | @defstruct*[document ([blocks (listof block?)] 235 | [footnotes (listof footnote-definition?)]) 236 | #:transparent]{ 237 | A parsed Markdown @deftech{document}, which has a body @tech{flow} and a @reftech{list} of @tech{footnote definitions}. It can be parsed from Markdown input using @racket[read-document] or @racket[string->document] and can be rendered to HTML using @racket[document->html]. 
238 | 239 | @history[#:changed "1.1" @elem{Added the @racket[footnotes] field.}]} 240 | 241 | @defstruct*[footnote-definition ([blocks (listof block?)] [label string?]) #:transparent]{ 242 | @see-extension[@tech{Footnotes} @secref{extension:footnotes}] 243 | 244 | A @tech{footnote definition} contains a @tech{flow} that can be referenced by a @tech{footnote reference} via its @tech{footnote label}. 245 | 246 | Note: although @tech{footnote definitions} are syntactically blocks in Markdown input, they are @emph{not} a type of @tech{block} (as recognized by the @racket[block?] predicate) and cannot be included directly in the main @tech{document} @tech{flow}. @tech{Footnote definitions} are collected into the separate @racket[document-footnotes] field of the @tech{document} structure during parsing, since they represent auxiliary definitions, and their precise location in the Markdown input does not matter. 247 | 248 | (This is quite similar to the way the parser processes @cm-tech{link reference definitions}, except that @tech{footnote definitions} must be retained separately for later rendering, whereas @cm-tech{link reference definitions} can be discarded after all link targets have been resolved.) 249 | 250 | @history[#:added "1.1"]} 251 | 252 | @subsection[#:tag "blocks"]{Blocks} 253 | 254 | @defproc[(block? [v any/c]) boolean?]{ 255 | @see-cm[@tech{blocks} @cm-section{Blocks and inlines}] 256 | 257 | Returns @racket[#t] if @racket[v] is a @deftech{block}: a @tech{paragraph}, @tech{itemization}, @tech{block quote}, @tech{code block}, @tech{HTML block}, @tech{heading}, or @tech{thematic break}. Otherwise, returns @racket[#f]. 258 | 259 | A @deftech{flow} is a list of @tech{blocks}. 
The body of a @tech{document}, the contents of a @tech{block quote}, and each item in an @tech{itemization} are flows.} 260 | 261 | @defstruct*[paragraph ([content inline?]) #:transparent]{ 262 | @see-cm[@tech{paragraphs} @cm-section{Paragraphs}] 263 | 264 | A @deftech{paragraph} is a @tech{block} that contains @tech{inline content}. In HTML output, it corresponds to a @tt{<p>} element. Most blocks in a @tech{document} are paragraphs.} 265 | 266 | @defstruct*[itemization ([blockss (listof (listof block?))] 267 | [style (or/c 'loose 'tight)] 268 | [start-num (or/c exact-nonnegative-integer? #f)]) 269 | #:transparent]{ 270 | @see-cm[@tech{itemizations} @elem{@cm-section{Lists} and @cm-section{List items}}] 271 | 272 | An @deftech{itemization} is a @tech{block} that contains a list of @tech{flows}. In HTML output, it corresponds to a @tt{