`, `<pre>`, `<p>`, etc. — must be separated from
227 | > surrounding content by blank lines, and the start and end tags of the
228 | > block should not be indented with tabs or spaces.
229 |
230 | In some ways Gruber's rule is more restrictive than the one given
231 | here:
232 |
233 | - It requires that an HTML block be preceded and followed by a blank line.
234 | - It does not allow the start tag to be indented.
235 | - It does not allow the end tag to be indented.
236 | - It does not require that the open tag be an HTML block-level tag.
237 |
238 | Indeed, most Markdown implementations, including some of Gruber's
239 | own Perl implementations, do not impose these restrictions.
240 |
241 | However, unlike Gruber's rule, this one requires that the open
242 | tag be on a line by itself. It also differs from most Markdown
243 | implementations in how it handles the case where there is no matching
244 | closing tag (a case not mentioned in Gruber's rule). In such a case,
245 | the rule stated above includes the whole rest of the document in the
246 | HTML block.
247 |
248 |
--------------------------------------------------------------------------------
/test/spec_tests.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 |
4 | import sys
5 | from difflib import unified_diff
6 | import argparse
7 | import re
8 | import json
9 | from cmark import CMark
10 | from normalize import normalize_html
11 |
12 | if __name__ == "__main__":
13 | parser = argparse.ArgumentParser(description='Run cmark tests.')
14 | parser.add_argument('-p', '--program', dest='program', nargs='?', default=None,
15 | help='program to test')
16 | parser.add_argument('-s', '--spec', dest='spec', nargs='?', default='spec.txt',
17 | help='path to spec')
18 | parser.add_argument('-P', '--pattern', dest='pattern', nargs='?',
19 | default=None, help='limit to sections matching regex pattern')
20 | parser.add_argument('--library-dir', dest='library_dir', nargs='?',
21 | default=None, help='directory containing dynamic library')
22 | parser.add_argument('--no-normalize', dest='normalize',
23 | action='store_const', const=False, default=True,
24 | help='do not normalize HTML')
25 | parser.add_argument('-d', '--dump-tests', dest='dump_tests',
26 | action='store_const', const=True, default=False,
27 | help='dump tests in JSON format')
28 | parser.add_argument('--debug-normalization', dest='debug_normalization',
29 | action='store_const', const=True,
30 | default=False, help='filter stdin through normalizer for testing')
31 | parser.add_argument('-n', '--number', type=int, default=None,
32 | help='only consider the test with the given number')
33 | args = parser.parse_args(sys.argv[1:])
34 |
35 | def out(str):
36 | sys.stdout.buffer.write(str.encode('utf-8'))
37 |
38 | def print_test_header(headertext, example_number, start_line, end_line):
39 | out("Example %d (lines %d-%d) %s\n" % (example_number,start_line,end_line,headertext))
40 |
41 | def do_test(test, normalize, result_counts):
42 | [retcode, actual_html, err] = cmark.to_html(test['markdown'])
43 | if retcode == 0:
44 | expected_html = test['html']
45 | unicode_error = None
46 | if normalize:
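            # Compare normalized HTML (see normalize.py) so that insignificant
            # differences such as attribute order and whitespace around block
            # tags do not count as failures.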
47 | try:
48 | passed = normalize_html(actual_html) == normalize_html(expected_html)
49 | except UnicodeDecodeError as e:
50 | unicode_error = e
51 | passed = False
52 | else:
53 | passed = actual_html == expected_html
54 | if passed:
55 | result_counts['pass'] += 1
56 | else:
57 | print_test_header(test['section'], test['example'], test['start_line'], test['end_line'])
58 | out(test['markdown'] + '\n')
59 | if unicode_error:
60 | out("Unicode error: " + str(unicode_error) + '\n')
61 | out("Expected: " + repr(expected_html) + '\n')
62 | out("Got: " + repr(actual_html) + '\n')
63 | else:
64 | expected_html_lines = expected_html.splitlines(True)
65 | actual_html_lines = actual_html.splitlines(True)
66 | for diffline in unified_diff(expected_html_lines, actual_html_lines,
67 | "expected HTML", "actual HTML"):
68 | out(diffline)
69 | out('\n')
70 | result_counts['fail'] += 1
71 | else:
72 | print_test_header(test['section'], test['example'], test['start_line'], test['end_line'])
73 | out("program returned error code %d\n" % retcode)
74 | sys.stdout.buffer.write(err)
75 | result_counts['error'] += 1
76 |
77 | def get_tests(specfile):
78 | line_number = 0
79 | start_line = 0
80 | end_line = 0
81 | example_number = 0
82 | markdown_lines = []
83 | html_lines = []
84 | state = 0 # 0 regular text, 1 markdown example, 2 html output
85 | headertext = ''
86 | tests = []
87 |
88 | header_re = re.compile('#+ ')
89 |
90 | with open(specfile, 'r', encoding='utf-8', newline='\n') as specf:
91 | for line in specf:
92 | line_number = line_number + 1
93 | l = line.strip()
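            # A test opens with a fence of 32 backticks followed by " example",
            # a line containing just "." separates the markdown from the
            # expected HTML, and a bare 32-backtick fence closes the test.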
94 | if l == "`" * 32 + " example":
95 | state = 1
96 | elif state == 2 and l == "`" * 32:
97 | state = 0
98 | example_number = example_number + 1
99 | end_line = line_number
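                # "→" in the spec's examples stands for a literal tab.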
100 | tests.append({
101 | "markdown":''.join(markdown_lines).replace('→',"\t"),
102 | "html":''.join(html_lines).replace('→',"\t"),
103 | "example": example_number,
104 | "start_line": start_line,
105 | "end_line": end_line,
106 | "section": headertext})
107 | start_line = 0
108 | markdown_lines = []
109 | html_lines = []
110 | elif l == ".":
111 | state = 2
112 | elif state == 1:
113 | if start_line == 0:
114 | start_line = line_number - 1
115 | markdown_lines.append(line)
116 | elif state == 2:
117 | html_lines.append(line)
118 | elif state == 0 and re.match(header_re, line):
119 | headertext = header_re.sub('', line).strip()
120 | return tests
121 |
122 | if __name__ == "__main__":
123 | if args.debug_normalization:
124 | out(normalize_html(sys.stdin.read()))
125 | exit(0)
126 |
127 | all_tests = get_tests(args.spec)
128 | if args.pattern:
129 | pattern_re = re.compile(args.pattern, re.IGNORECASE)
130 | else:
131 | pattern_re = re.compile('.')
132 | tests = [ test for test in all_tests if re.search(pattern_re, test['section']) and (not args.number or test['example'] == args.number) ]
133 | if args.dump_tests:
134 | out(json.dumps(tests, ensure_ascii=False, indent=2))
135 | exit(0)
136 | else:
137 | skipped = len(all_tests) - len(tests)
138 | cmark = CMark(prog=args.program, library_dir=args.library_dir)
139 | result_counts = {'pass': 0, 'fail': 0, 'error': 0, 'skip': skipped}
140 | for test in tests:
141 | do_test(test, args.normalize, result_counts)
142 | out("{pass} passed, {fail} failed, {error} errored, {skip} skipped\n".format(**result_counts))
143 | exit(result_counts['fail'] + result_counts['error'])
144 |
--------------------------------------------------------------------------------
/test/normalize.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | from html.parser import HTMLParser
3 | import urllib.parse
4 |
5 | try:
6 | from html.parser import HTMLParseError
7 | except ImportError:
8 | # HTMLParseError was removed in Python 3.5. It could never be
9 | # thrown, so we define a placeholder instead.
10 | class HTMLParseError(Exception):
11 | pass
12 |
13 | from html.entities import name2codepoint
14 | import sys
15 | import re
16 | import html
17 |
18 | # Normalization code, adapted from
19 | # https://github.com/karlcow/markdown-testsuite/
20 | significant_attrs = ["alt", "href", "src", "title"]
21 | whitespace_re = re.compile('\s+')
22 | class MyHTMLParser(HTMLParser):
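    # Parses HTML and re-emits it in a canonical form: attributes sorted and
    # lowercased, whitespace collapsed outside <pre>, and character/entity
    # references resolved (except for <, >, & and ", which stay as entities).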
23 | def __init__(self):
24 | HTMLParser.__init__(self)
25 | self.convert_charrefs = False
26 | self.last = "starttag"
27 | self.in_pre = False
28 | self.output = ""
29 | self.last_tag = ""
30 | def handle_data(self, data):
31 | after_tag = self.last == "endtag" or self.last == "starttag"
32 | after_block_tag = after_tag and self.is_block_tag(self.last_tag)
33 | if after_tag and self.last_tag == "br":
34 | data = data.lstrip('\n')
35 | if not self.in_pre:
36 | data = whitespace_re.sub(' ', data)
37 | if after_block_tag and not self.in_pre:
38 | if self.last == "starttag":
39 | data = data.lstrip()
40 | elif self.last == "endtag":
41 | data = data.strip()
42 | self.output += data
43 | self.last = "data"
44 | def handle_endtag(self, tag):
45 | if tag == "pre":
46 | self.in_pre = False
47 | elif self.is_block_tag(tag):
48 | self.output = self.output.rstrip()
49 |         self.output += "</" + tag + ">"
50 | self.last_tag = tag
51 | self.last = "endtag"
52 | def handle_starttag(self, tag, attrs):
53 | if tag == "pre":
54 | self.in_pre = True
55 | if self.is_block_tag(tag):
56 | self.output = self.output.rstrip()
57 | self.output += "<" + tag
58 | # For now we don't strip out 'extra' attributes, because of
59 | # raw HTML test cases.
60 | # attrs = filter(lambda attr: attr[0] in significant_attrs, attrs)
61 | if attrs:
62 | attrs.sort()
63 | for (k,v) in attrs:
64 | self.output += " " + k
65 |                 if k in ['href','src']:
66 |                     self.output += ("=" + '"' +
67 |                         urllib.parse.quote(urllib.parse.unquote(v), safe='/') + '"')
68 | elif v != None:
69 | self.output += ("=" + '"' + html.escape(v,quote=True) + '"')
70 | self.output += ">"
71 | self.last_tag = tag
72 | self.last = "starttag"
73 | def handle_startendtag(self, tag, attrs):
74 |         """Ignore closing tag for self-closing <br/>"""
75 | self.handle_starttag(tag, attrs)
76 | self.last_tag = tag
77 | self.last = "endtag"
78 | def handle_comment(self, data):
79 |         self.output += '<!--' + data + '-->'
80 | self.last = "comment"
81 | def handle_decl(self, data):
82 |         self.output += '<!' + data + '>'
83 | self.last = "decl"
84 | def unknown_decl(self, data):
85 |         self.output += '<!' + data + '>'
86 | self.last = "decl"
87 | def handle_pi(self,data):
88 |         self.output += '<?' + data + '>'
89 | self.last = "pi"
90 | def handle_entityref(self, name):
91 | try:
92 | c = chr(name2codepoint[name])
93 | except KeyError:
94 | c = None
95 | self.output_char(c, '&' + name + ';')
96 | self.last = "ref"
97 | def handle_charref(self, name):
98 | try:
99 | if name.startswith("x"):
100 | c = chr(int(name[1:], 16))
101 | else:
102 | c = chr(int(name))
103 | except ValueError:
104 | c = None
105 | self.output_char(c, '&' + name + ';')
106 | self.last = "ref"
107 | # Helpers.
108 | def output_char(self, c, fallback):
109 | if c == '<':
110 |             self.output += "&lt;"
111 |         elif c == '>':
112 |             self.output += "&gt;"
113 |         elif c == '&':
114 |             self.output += "&amp;"
115 |         elif c == '"':
116 |             self.output += "&quot;"
117 | elif c == None:
118 | self.output += fallback
119 | else:
120 | self.output += c
121 |
122 | def is_block_tag(self,tag):
123 | return (tag in ['article', 'header', 'aside', 'hgroup', 'blockquote',
124 | 'hr', 'iframe', 'body', 'li', 'map', 'button', 'object', 'canvas',
125 | 'ol', 'caption', 'output', 'col', 'p', 'colgroup', 'pre', 'dd',
126 | 'progress', 'div', 'section', 'dl', 'table', 'td', 'dt',
127 | 'tbody', 'embed', 'textarea', 'fieldset', 'tfoot', 'figcaption',
128 | 'th', 'figure', 'thead', 'footer', 'tr', 'form', 'ul',
129 | 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'video', 'script', 'style'])
130 |
131 | def normalize_html(html):
132 | r"""
133 | Return normalized form of HTML which ignores insignificant output
134 | differences:
135 |
136 | Multiple inner whitespaces are collapsed to a single space (except
137 | in pre tags):
138 |
139 |         >>> normalize_html("<p>a  \t b</p>")
140 |         '<p>a b</p>'
141 | 
142 |         >>> normalize_html("<p>a  \t\nb</p>")
143 |         '<p>a b</p>'
144 | 
145 |     * Whitespace surrounding block-level tags is removed.
146 | 
147 |         >>> normalize_html("<p>a  b</p>")
148 |         '<p>a b</p>'
149 | 
150 |         >>> normalize_html(" <p>a  b</p>")
151 |         '<p>a b</p>'
152 | 
153 |         >>> normalize_html("<p>a  b</p> ")
154 |         '<p>a b</p>'
155 | 
156 |         >>> normalize_html("\n\t<p>\n\t\ta  b\t\t</p>\n\t")
157 |         '<p>a b</p>'
158 | 
159 |         >>> normalize_html("<i>a  b</i> ")
160 |         '<i>a b</i> '
161 | 
162 |     * Self-closing tags are converted to open tags.
163 | 
164 |         >>> normalize_html("<br />")
165 |         '<br>'
166 | 
167 |     * Attributes are sorted and lowercased.
168 | 
169 |         >>> normalize_html('<a title="bar" HREF="foo">x</a>')
170 |         '<a href="foo" title="bar">x</a>'
171 | 
172 |     * References are converted to unicode, except that '<', '>', '&', and
173 |       '"' are rendered using entities.
174 | 
175 |         >>> normalize_html("&forall;&amp;&gt;&lt;&quot;")
176 |         '\u2200&amp;&gt;&lt;&quot;'
177 |
178 | """
179 |     html_chunk_re = re.compile("(\<!\[CDATA\[.*?\]\]\>|\<[^>]*\>|[^<]+)")
180 | try:
181 | parser = MyHTMLParser()
182 | # We work around HTMLParser's limitations parsing CDATA
183 | # by breaking the input into chunks and passing CDATA chunks
184 | # through verbatim.
185 | for chunk in re.finditer(html_chunk_re, html):
186 |             if chunk.group(0)[:8] == "<![CDATA":
187 |                 parser.output += chunk.group(0)
188 |             else:
189 |                 parser.feed(chunk.group(0))
190 |         parser.close()
191 |         return parser.output
192 |     except HTMLParseError:
193 |         return html

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
13 | 
14 | This repository contains the spec itself, along with tools for
15 | running tests against the spec, and for creating HTML and PDF versions
16 | of the spec.
17 |
18 | The reference implementations live in separate repositories:
19 |
20 | - <https://github.com/commonmark/cmark> (C)
21 | - <https://github.com/commonmark/commonmark.js> (JavaScript)
22 |
23 | There is a list of third-party libraries
24 | in a dozen different languages
25 | [here](https://github.com/commonmark/CommonMark/wiki/List-of-CommonMark-Implementations).
26 |
27 | Running tests against the spec
28 | ------------------------------
29 |
30 | [The spec] contains over 500 embedded examples which serve as conformance
31 | tests. To run the tests using an executable `$PROG`:
32 |
33 | python3 test/spec_tests.py --program $PROG
34 |
35 | If you want to extract the raw test data from the spec without
36 | actually running the tests, you can do:
37 |
38 | python3 test/spec_tests.py --dump-tests
39 |
40 | and you'll get all the tests in JSON format.
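
For example, here is a small helper (a sketch, not part of the repository)
that runs the dump and counts how many examples each section contains; the
field names match the JSON printed by `--dump-tests`:

```python
import json
import subprocess
from collections import Counter

# Dump the embedded examples as JSON, exactly as `--dump-tests` prints them.
dump = subprocess.run(
    ["python3", "test/spec_tests.py", "--dump-tests"],
    capture_output=True, text=True, check=True).stdout

tests = json.loads(dump)

# Count how many conformance examples each section of the spec contains.
counts = Counter(test["section"] for test in tests)
for section, n in counts.most_common():
    print(f"{n:4d}  {section}")
```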
41 |
42 | JavaScript developers may find it more convenient to use the
43 | [`commonmark-spec` npm package], which is published from this
44 | repository. It exports an array `tests` of JSON objects with
45 | the format
46 |
47 | ```json
48 | {
49 | "markdown": "Foo\nBar\n---\n",
50 |   "html": "<h2>Foo\nBar</h2>\n",
51 | "section": "Setext headings",
52 | "number": 65
53 | }
54 | ```
55 |
56 | [`commonmark-spec` npm package]: https://www.npmjs.com/package/commonmark-spec
57 |
58 | The spec
59 | --------
60 |
61 | The source of [the spec] is `spec.txt`. This is basically a Markdown
62 | file, with code examples written in a shorthand form:
63 |
64 | ```````````````````````````````` example
65 | Markdown source
66 | .
67 | expected HTML output
68 | ````````````````````````````````
69 |
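As a rough sketch of how that shorthand is consumed, a hypothetical
`split_example` helper (mirroring what `test/spec_tests.py` does, including
its convention that `→` stands for a tab) splits an example block on the
lone `.` line:

```python
def split_example(block: str):
    """Split one shorthand example into (markdown, expected_html)."""
    lines = block.splitlines(keepends=True)
    sep = lines.index(".\n")  # the line containing only "." divides input from output
    markdown = "".join(lines[:sep]).replace("→", "\t")
    html = "".join(lines[sep + 1:]).replace("→", "\t")
    return markdown, html
```
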
70 | To build an HTML version of the spec, do `make spec.html`. To build a
71 | PDF version, do `make spec.pdf`. For both versions, you must
72 | have the lua rock `lcmark` installed: after installing lua and
73 | lua rocks, `luarocks install lcmark`. For the PDF you must also
74 | have xelatex installed.
75 |
76 | The spec is written from the point of view of the human writer, not
77 | the computer reader. It is not an algorithm---an English translation of
78 | a computer program---but a declarative description of what counts as a block
79 | quote, a code block, and each of the other structural elements that can
80 | make up a Markdown document.
81 |
82 | Because John Gruber's [canonical syntax
83 | description](http://daringfireball.net/projects/markdown/syntax) leaves
84 | many aspects of the syntax undetermined, writing a precise spec requires
85 | making a large number of decisions, many of them somewhat arbitrary.
86 | In making them, we have appealed to existing conventions and
87 | considerations of simplicity, readability, expressive power, and
88 | consistency. We have tried to ensure that "normal" documents in the many
89 | incompatible existing implementations of Markdown will render, as far as
90 | possible, as their authors intended. And we have tried to make the rules
91 | for different elements work together harmoniously. In places where
92 | different decisions could have been made (for example, the rules
93 | governing list indentation), we have explained the rationale for
94 | our choices. In a few cases, we have departed slightly from the canonical
95 | syntax description, in ways that we think further the goals of Markdown
96 | as stated in that description.
97 |
98 | For the most part, we have limited ourselves to the basic elements
99 | described in Gruber's canonical syntax description, eschewing extensions
100 | like footnotes and definition lists. It is important to get the core
101 | right before considering such things. However, we have included a visible
102 | syntax for line breaks and fenced code blocks.
103 |
104 | Differences from original Markdown
105 | ----------------------------------
106 |
107 | There are only a few places where this spec says things that contradict
108 | the canonical syntax description:
109 |
110 | - It allows all punctuation symbols to be backslash-escaped,
111 | not just the symbols with special meanings in Markdown. We found
112 | that it was just too hard to remember which symbols could be
113 | escaped.
114 |
115 | - It introduces an alternative syntax for hard line
116 | breaks, a backslash at the end of the line, supplementing the
117 | two-spaces-at-the-end-of-line rule. This is motivated by persistent
118 | complaints about the “invisible” nature of the two-space rule.
119 |
120 | - Link syntax has been made a bit more predictable (in a
121 | backwards-compatible way). For example, `Markdown.pl` allows single
122 | quotes around a title in inline links, but not in reference links.
123 | This kind of difference is really hard for users to remember, so the
124 | spec allows single quotes in both contexts.
125 |
126 | - The rule for HTML blocks differs, though in most real cases it
127 | shouldn't make a difference. (See the section on HTML Blocks
128 | for details.) The spec's proposal makes it easy to include Markdown
129 | inside HTML block-level tags, if you want to, but also allows you to
130 | exclude this. It also makes parsing much easier, avoiding
131 | expensive backtracking.
132 |
133 | - It does not collapse adjacent bird-track blocks into a single
134 | blockquote:
135 |
136 | > these are two
137 |
138 | > blockquotes
139 |
140 | > this is a single
141 | >
142 | > blockquote with two paragraphs
143 |
144 | - Rules for content in lists differ in a few respects, though (as with
145 | HTML blocks), most lists in existing documents should render as
146 | intended. There is some discussion of the choice points and
147 | differences in the subsection of List Items entitled Motivation.
148 | We think that the spec's proposal does better than any existing
149 | implementation in rendering lists the way a human writer or reader
150 | would intuitively understand them. (We could give numerous examples
151 | of perfectly natural looking lists that nearly every existing
152 | implementation flubs up.)
153 |
154 | - Changing bullet characters, or changing from bullets to numbers or
155 | vice versa, starts a new list. We think that is almost always going
156 | to be the writer's intent.
157 |
158 | - The number that begins an ordered list item may be followed by
159 | either `.` or `)`. Changing the delimiter style starts a new
160 | list.
161 |
162 | - The start number of an ordered list is significant.
163 |
164 | - Fenced code blocks are supported, delimited by either
165 | backticks (```` ``` ````) or tildes (` ~~~ `).
166 |
167 | Contributing
168 | ------------
169 |
170 | There is a [forum for discussing
171 | CommonMark](http://talk.commonmark.org); you should use it instead of
172 | github issues for questions and possibly open-ended discussions.
173 | Use the [github issue tracker](http://github.com/commonmark/CommonMark/issues)
174 | only for simple, clear, actionable issues.
175 |
176 | Authors
177 | -------
178 |
179 | The spec was written by John MacFarlane, drawing on
180 |
181 | - his experience writing and maintaining Markdown implementations in several
182 | languages, including the first Markdown parser not based on regular
183 | expression substitutions ([pandoc](http://github.com/jgm/pandoc)) and
184 | the first markdown parsers based on PEG grammars
185 | ([peg-markdown](http://github.com/jgm/peg-markdown),
186 | [lunamark](http://github.com/jgm/lunamark))
187 | - a detailed examination of the differences between existing Markdown
188 | implementations using [BabelMark 2](http://johnmacfarlane.net/babelmark2/),
189 | and
190 | - extensive discussions with David Greenspan, Jeff Atwood, Vicent
191 | Marti, Neil Williams, and Benjamin Dumke-von der Ehe.
192 |
193 | Since the first announcement, many people have contributed ideas.
194 | Kārlis Gaņģis was especially helpful in refining the rules for
195 | emphasis, strong emphasis, links, and images.
196 |
--------------------------------------------------------------------------------
/tools/make_spec.lua:
--------------------------------------------------------------------------------
1 | local lcmark = require('lcmark')
2 | local cmark = require('cmark')
3 |
4 | local format = arg[1] or 'html'
5 |
6 | local trim = function(s)
7 | return s:gsub("^%s+",""):gsub("%s+$","")
8 | end
9 |
10 | local warn = function(s)
11 | io.stderr:write('WARNING: ' .. s .. '\n')
12 | end
13 |
14 | local to_identifier = function(s)
15 | return trim(s):lower():gsub('[^%w]+', ' '):gsub('[%s]+', '-')
16 | end
17 |
18 | local render_number = function(tbl)
19 | local buf = {}
20 | for i,x in ipairs(tbl) do
21 | buf[i] = tostring(x)
22 | end
23 | return table.concat(buf, '.')
24 | end
25 |
26 | local extract_label = function(cur)
27 | local label = ""
28 | for subcur, subentering, subnode_type in cmark.walk(cur) do
29 | if subentering and subnode_type == cmark.NODE_TEXT then
30 | label = label .. cmark.node_get_literal(subcur)
31 | elseif subentering and subnode_type == cmark.NODE_SOFTBREAK then
32 | label = label .. " "
33 | end
34 | end
35 | return label
36 | end
37 |
38 | local extract_references = function(doc)
39 | local cur, entering, node_type
40 | local refs = {}
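  -- refs maps each definition/heading label to its anchor identifier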
41 | for cur, entering, node_type in cmark.walk(doc) do
42 | if not entering and
43 | ((node_type == cmark.NODE_LINK and cmark.node_get_url(cur) == '@') or
44 | node_type == cmark.NODE_HEADING) then
45 | local label = extract_label(cur)
46 | local ident = to_identifier(label)
47 | if refs[label] then
48 | warn("duplicate reference " .. label)
49 | end
50 | refs[label] = ident
51 | if not refs[label .. 's'] then
52 | -- plural too
53 | refs[label .. 's'] = ident
54 | end
55 | end
56 | end
57 | -- check for duplicate IDs
58 | local idents = {}
59 |   for _,id in pairs(refs) do
60 |     if idents[id] then
61 |       warn("duplicate identifier " .. id)
62 |     end
63 |     idents[id] = true
64 | end
65 | return refs
66 | end
67 |
68 | local make_toc = function(toc)
69 | -- we create a commonmark string, then parse it
70 | local toclines = {}
71 | for _,entry in ipairs(toc) do
72 | if entry.level <= 2 then
73 | local indent = string.rep(' ', entry.level - 1)
74 | toclines[#toclines + 1] = indent .. '* [' ..
75 | (entry.number == '' and ''
76 |             or '<span class="number">' .. entry.number .. '</span>') ..
77 | entry.label .. '](#' .. entry.ident .. ')'
78 | end
79 | end
80 | -- now parse our cm list and return the resulting list node:
81 | local doc = cmark.parse_string(table.concat(toclines, '\n'), cmark.OPT_SMART)
82 | return cmark.node_first_child(doc)
83 | end
84 |
85 | local make_html_element = function(block, tagname, attrs)
86 | local div = cmark.node_new(block and cmark.NODE_CUSTOM_BLOCK or
87 | cmark.NODE_CUSTOM_INLINE)
88 | local attribs = {}
89 | for _,attr in ipairs(attrs) do
90 | attribs[#attribs + 1] = ' ' .. attr[1] .. '="' .. attr[2] .. '"'
91 | end
92 | local opentag = '<' .. tagname .. table.concat(attribs, '') .. '>'
93 |   local closetag = '</' .. tagname .. '>'
94 | cmark.node_set_on_enter(div, opentag)
95 | cmark.node_set_on_exit(div, closetag)
96 | return div
97 | end
98 |
99 | local make_html_block = function(tagname, attrs)
100 | return make_html_element(true, tagname, attrs)
101 | end
102 |
103 | local make_html_inline = function(tagname, attrs)
104 | return make_html_element(false, tagname, attrs)
105 | end
106 |
107 | local make_latex = function(spec)
108 | local latex = cmark.node_new(spec.block and cmark.NODE_CUSTOM_BLOCK or
109 | cmark.NODE_CUSTOM_INLINE)
110 | cmark.node_set_on_enter(latex, spec.start)
111 | cmark.node_set_on_exit(latex, spec.stop)
112 | return latex
113 | end
114 |
115 | local make_text = function(s)
116 | local text = cmark.node_new(cmark.NODE_TEXT)
117 | cmark.node_set_literal(text, s)
118 | return text
119 | end
120 |
121 | local create_anchors = function(doc, meta, to)
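  -- lcmark filter: turns `[...](@)` definition links and headings into
  -- anchors, numbers the sections, splits each `example` code block into
  -- side-by-side markdown/html columns, and builds meta.toc.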
122 | local cur, entering, node_type
123 | local toc = {}
124 | local number = {0}
125 | local example = 0
126 | for cur, entering, node_type in cmark.walk(doc) do
127 | if not entering and
128 | ((node_type == cmark.NODE_LINK and cmark.node_get_url(cur) == '@') or
129 | node_type == cmark.NODE_HEADING) then
130 |
131 | local anchor
132 | local label = extract_label(cur)
133 | local ident = to_identifier(label)
134 | if node_type == cmark.NODE_LINK then
135 | if format == 'latex' then
136 | anchor = make_latex({start="\\hypertarget{" .. ident .. "}{",
137 | stop="\\label{" .. ident .. "}}",
138 | block = true})
139 | else
140 | anchor = make_html_inline('a', {{'id', ident}, {'href', '#'..ident},
141 | {'class', 'definition'}})
142 | end
143 |
144 | else -- NODE_HEADING
145 |
146 | local level = cmark.node_get_heading_level(cur)
147 | local last_level = #toc == 0 and 1 or toc[#toc].level
148 | if #number > 0 then
149 | if level > last_level then -- subhead
150 | number[level] = 1
151 | else
152 | while last_level > level do
153 | number[last_level] = nil
154 | last_level = last_level - 1
155 | end
156 | number[level] = number[level] + 1
157 | end
158 | end
159 | table.insert(toc, { label = label, ident = ident, level = level, number = render_number(number) })
160 | local num = render_number(number)
161 | local section_cmds = {"\\section", "\\subsection",
162 | "\\subsubsection", "\\chapter"}
163 | if format == 'latex' then
164 | anchor = make_latex({start="\\hypertarget{" .. ident .. "}{" ..
165 | section_cmds[level] .. "{",
166 | stop="}\\label{" .. ident .. "}}",
167 | block = true})
168 | else
169 | anchor = make_html_block('h' .. tostring(math.floor(level)),
170 | {{'id', ident},
171 | {'class', 'definition'}})
172 | if num ~= '' then
173 | local numspan = make_html_inline('span', {{'class','number'}})
174 | cmark.node_append_child(numspan, make_text(num))
175 | cmark.node_append_child(anchor, numspan)
176 | end
177 | end
178 | end
179 | local children = {}
180 | local child = cmark.node_first_child(cur)
181 | while child do
182 | children[#children + 1] = child
183 | child = cmark.node_next(child)
184 | end
185 | for _,child in ipairs(children) do
186 | cmark.node_append_child(anchor, child)
187 | end
188 | cmark.node_insert_before(cur, anchor)
189 | cmark.node_unlink(cur)
190 | elseif entering and node_type == cmark.NODE_CODE_BLOCK and
191 | cmark.node_get_fence_info(cur) == 'example' then
192 | example = example + 1
193 | -- split into two code blocks
194 | local code = cmark.node_get_literal(cur)
195 | local sepstart, sepend = code:find("[\n\r]+%.[\n\r]+")
196 | if not sepstart then
197 |       warn("Could not find separator in:\n" .. code)
198 | end
199 | local markdown_code = cmark.node_new(cmark.NODE_CODE_BLOCK)
200 | local html_code = cmark.node_new(cmark.NODE_CODE_BLOCK)
201 | -- note: we replace the ␣ with a special span after rendering
202 | local markdown_code_string = code:sub(1, sepstart):gsub(' ', '␣')
203 | local html_code_string = code:sub(sepend + 1):gsub(' ', '␣')
204 | cmark.node_set_literal(markdown_code, markdown_code_string)
205 | cmark.node_set_fence_info(markdown_code, 'markdown')
206 | cmark.node_set_literal(html_code, html_code_string)
207 | cmark.node_set_fence_info(html_code, 'html')
208 |
209 | local example_div, leftcol_div, rightcol_div
210 | if format == 'latex' then
211 | example_div = make_latex({start = '\\begin{minipage}[t]{\\textwidth}\n{\\scriptsize Example ' .. tostring(example) .. '}\n\n\\vspace{-0.4em}\n', stop = '\\end{minipage}', block = true})
212 | leftcol_div = make_latex({start = "\\begin{minipage}[t]{0.49\\textwidth}\n\\definecolor{shadecolor}{gray}{0.85}\n\\begin{snugshade}\\small\n", stop = "\\end{snugshade}\n\\end{minipage}\n\\hfill", block = true})
213 | rightcol_div = make_latex({start = "\\begin{minipage}[t]{0.49\\textwidth}\n\\definecolor{shadecolor}{gray}{0.95}\n\\begin{snugshade}\\small\n", stop = "\\end{snugshade}\n\\end{minipage}\n\\vspace{0.8em}", block = true})
214 | cmark.node_append_child(leftcol_div, markdown_code)
215 | cmark.node_append_child(rightcol_div, html_code)
216 | cmark.node_append_child(example_div, leftcol_div)
217 | cmark.node_append_child(example_div, rightcol_div)
218 | else
219 | leftcol_div = make_html_block('div', {{'class','column'}})
220 | rightcol_div = make_html_block('div', {{'class', 'column'}})
221 | cmark.node_append_child(leftcol_div, markdown_code)
222 | cmark.node_append_child(rightcol_div, html_code)
223 | local examplenum_div = make_html_block('div', {{'class', 'examplenum'}})
224 | local interact_link = make_html_inline('a', {{'class', 'dingus'},
225 | {'title', 'open in interactive dingus'}})
226 | cmark.node_append_child(interact_link, make_text("Try It"))
227 | local examplenum_link = cmark.node_new(cmark.NODE_LINK)
228 | cmark.node_set_url(examplenum_link, '#example-' .. tostring(example))
229 | cmark.node_append_child(examplenum_link,
230 | make_text("Example " .. tostring(example)))
231 | cmark.node_append_child(examplenum_div, examplenum_link)
232 | if format == 'html' then
233 | cmark.node_append_child(examplenum_div, interact_link)
234 | end
235 | example_div = make_html_block('div', {{'class', 'example'},
236 | {'id','example-' .. tostring(example)}})
237 | cmark.node_append_child(example_div, examplenum_div)
238 | cmark.node_append_child(example_div, leftcol_div)
239 | cmark.node_append_child(example_div, rightcol_div)
240 | end
241 | cmark.node_insert_before(cur, example_div)
242 | cmark.node_unlink(cur)
243 | cmark.node_free(cur)
244 | elseif node_type == cmark.NODE_HTML_BLOCK and
245 |            cmark.node_get_literal(cur) == '<!-- END TESTS -->\n' then
246 | -- change numbering
247 | number = {}
248 | if format ~= 'latex' then
249 | local appendices = make_html_block('div', {{'class','appendices'}})
250 | cmark.node_insert_after(cur, appendices)
251 | -- put the remaining sections in an appendix
252 | local tmp = cmark.node_next(appendices)
253 | while tmp do
254 | cmark.node_append_child(appendices, tmp)
255 |           tmp = cmark.node_next(appendices)
256 | end
257 | end
258 | end
259 | end
260 | meta.toc = make_toc(toc)
261 | end
262 |
263 | local to_ref = function(ref)
264 |   return '[' .. ref.label .. ']: #' .. ref.ident .. '\n'
265 | end
266 |
267 | local inp = io.read("*a")
268 | local doc1 = cmark.parse_string(inp, cmark.OPT_DEFAULT)
269 | local refs = extract_references(doc1)
270 | local refblock = '\n'
271 | for lab,ident in pairs(refs) do
272 | refblock = refblock .. '[' .. lab .. ']: #' .. ident .. '\n'
273 | -- refblock = refblock .. '[' .. lab .. 's]: #' .. ident .. '\n'
274 | end
275 | -- append references and parse again
276 | local contents, meta, msg = lcmark.convert(inp .. refblock, format,
277 | { smart = true,
278 | yaml_metadata = true,
279 | safe = false,
280 | filters = { create_anchors }
281 | })
282 |
283 | if contents then
284 | local f = io.open("tools/template." .. format, 'r')
285 | if not f then
286 | io.stderr:write("Could not find template!")
287 | os.exit(1)
288 | end
289 | local template = f:read("*a")
290 |
291 | if format == 'html' then
292 | contents = contents:gsub('␣', ' ')
293 | end
294 | meta.body = contents
295 | local rendered, msg = lcmark.render_template(template, meta)
296 | if not rendered then
297 | io.stderr:write(msg)
298 | os.exit(1)
299 | end
300 | io.write(rendered)
301 | os.exit(0)
302 | else
303 | io.stderr:write(msg)
304 | os.exit(1)
305 | end
306 |
--------------------------------------------------------------------------------
/changelog.txt:
--------------------------------------------------------------------------------
1 | [0.30]
2 |
3 | * Add note clarifying that not every feature of HTML examples is normative
4 | (#672).
5 | * Move "Backslash escapes" and "Character references" to "Preliminaries"
6 | (#600). It was confusing having them in the "Inline" section, since
7 | they also affect some block contexts (e.g. reference link definitions).
8 | * Clarify wording in spec for character groups (#618, Titus Wormer, with
9 |   Johel Ernesto Guerrero Peña).
10 | + Remove line tabulation, form feed from whitespace
11 | + Rename newline to line feed or line ending
12 | + Reword spec to be more explicit about whitespace
13 | + Rename `Punctuation` to `Unicode punctuation`
14 | + Reword description of line breaks
15 | + Link unicode punctuation
16 | + Clarify link whitespace by describing "link information"
17 | + Clarify link destination and title
18 | * Add definition of ASCII control characters (#603, Titus Wormer).
19 | * Fix wording in start condition of type 7 HTML (#665, Titus Wormer).
20 | * Add `textarea` to list of literal HTML block tags (#657).
21 | Like `script`, `style`, and `pre`, `textarea` blocks can contain
22 |   blank lines without the contents being interpreted as commonmark.
23 | * Add `textarea` to more cases (#667, Titus Wormer).
24 | * Remove superfluous restrictions on declarations (#620, Titus Wormer).
25 | HTML declarations need not be limited to all capital ASCII letters.
26 | * Add inline link examples with empty link text (#636, jsteuer).
27 | * Remove outdated restriction on list item (#627, Johel Ernesto Guerrero
28 | Peña). This is a holdover from the days when two blank lines
29 | broke out of a list.
30 | * Add example with unbalanced parens in link destination.
31 | * Clarify that new blocks are added to *container* blocks. (#598, Jay
32 | Weisskopf).
33 | * Clarify language for backtick code spans (#589, Johel Ernesto Guerrero
34 | Peña).
35 | * Fix link text grouping sample (#584, Ashe Connor, Johel Ernesto
36 | Guerrero Peña).
37 | * Fix misleading text for full reference link (#581). There is no
38 | "first link label" here, just a link text.
39 | * Use better example to test unicode case fold for reference links (#582).
40 | The earlier test could be passed by implementations that just uppercase.
41 | * Test new entity length constraints (#575, Miha Zupan).
42 | * normalize.py: replace cgi.escape with html.escape (#656,
43 | Christopher Fujino).
44 | * tools/make_spec.lua:
45 | + Fix unqualified calls to node_append_child.
46 | + Remove extraneous href attribute on headings (#625, Christoph Päper).
47 | + Properly handle cross-refs (#578). Previously it broke in a few cases,
48 | e.g. with soft breaks.
49 | + Use unsafe mode so HTML isn't filtered out.
50 | + Changes for compatibility with lua 5.3's new number type.
51 | * CSS and HTML improvements in HTML version of spec
52 |   (#639, #641, #642, Andrei Korzhyts).
53 | * Revise emphasis parsing algorithm description in light of
54 | commonmark/cmark#383.
55 | * Add documentation of the npm package to README.md (thanks to
56 | Shawn Erquhart).
57 | * Fix anchor definition for 'end condition'.
58 | * Remove duplicate example (#660, Titus Wormer).
59 | * Remove duplicate links in spec (#655, Levi Gruspe).
60 | * Various typo fixes (#632, Scott Abbey; #623, Anthony Fok; #601, #617, #659,
61 | Titus Wormer).
62 |
63 | [0.29]
64 |
65 | * Clarify that entities can't be used to indicate structure (#474).
66 |   For example, you can't use `&#42;` instead of `*` for a bullet
67 |   list, or `&#10;&#10;` to create a new paragraph.
68 | * Limit numerical entities to 6 hex or 7 decimal digits (#487).
69 | This is all that is needed given the upper bound on
70 | unicode code points.
71 | * Specify dropping of initial/final whitespace in setext heading content
72 | (#461).
73 | * Add example with a reference link definition where the reference is never
74 | used (#454).
75 | * Add example with setext header after reference link defs (#395).
76 | * Clarify that script, pre, style close tags can begin an HTML block (#517).
77 | * Remove `meta` from list of block tags in start condition #6 of
78 | HTML blocks (#527). meta tags are used in some inline contexts (though
79 | this isn't valid HTML5), e.g. in schema.org.
80 | * Disallow newlines inside of unquoted attributes (#507,
81 |   Shyouhei Urabe) as per the HTML spec:
82 |   <https://html.spec.whatwg.org/>.
83 | * Remove vestigial restriction in list item spec (#543).
84 | The "not separated by more than one blank line" was a left-over
85 | from an earlier version of the spec in which two blank lines
86 | ended a list.
87 | * Fix tests where list items are indented 4+ spaces (#497).
88 | The tests did not accord with the spec here; these
89 | lines should be continuation lines (if no blank space)
90 | or indented code blocks (if blank space).
91 | * Clarify tildes and backticks in info strings (#119).
92 | We disallow backticks in info strings after backtick fences
93 | only. Both backticks and tildes are allowed in info strings
94 | after tilde fences. Add example.
95 | * Indicate that info string is trimmed of all whitespace (#505, Ashe
96 |   Connor).  As noted in the discussion of
97 |   that issue, the info string
98 | is not only trimmed of "spaces" (U+0020) but also of tabs.
99 | * Don't strip spaces in code span containing only spaces (#569).
100 | This allows one to include a code span with just spaces,
101 | using the most obvious syntax.
102 | * Add example w/ reference link w/ empty destination in `<>` (#172).
103 | * Disallow link destination beginning with `<` unless it is inside `<..>`
104 | (#538). This brings the description in line with the spec example:
105 | ```
106 |   [foo]: <bar>(baz)
107 | 
108 |   [foo]
109 |   .
110 |   <p>[foo]: <bar>(baz)</p>
111 |   <p>[foo]</p>
112 | ```
113 | * Allow spaces inside link destinations in pointy brackets (#503).
114 | This reverts a change in 0.24 and should make things easier
115 | for people working with image paths containing spaces.
116 | * Remove redundant condition. We don't need to specify that the absolute
117 | URI in an autolink doesn't contain `<`, since this is specified in
118 | the description of an absolute URI.
119 | * Add additional spec examples involving link destinations in `<>` (#473).
120 | * Add test for `[test]()` (#562).
121 | * Disallow unescaped `(` in parenthesized link titles (#526).
122 | * Add example showing need for space before title in reference link (#469).
123 | * Add codepoints for punctuation characters (#564, Christoph Päper).
124 | * Clarify the left- and right-flanking definitions (#534, Jay Martin).
125 | * Match interior delimiter runs if lengths of both are multiples of 3
126 | (#528). This gives better results on `a***b***c` without giving bad
127 | results on the cases that motivated the original multiple of 3 rule.
128 | * Add another test case for emphasis (#509, Michael Howell).
129 | * Code spans: don't collapse interior space (#532).
130 | * Simplify revisions to code span normalization rules.
131 | Remove the complex rule about ignoring newlines adjacent
132 | to spaces. Now newlines are simply converted to spaces.
133 | * Replace image 'url' with 'destination' (#512, Phill).
134 | * Add some links for occurrences of "above" (#480).
135 | * Various typo fixes (#514, Kenta Sato; #533, nikolas;
136 | tnaia, #556; #551, Grahame Grieve).
137 | * Create .gitattributes so that changelog.txt is highlighted as
138 | markdown (#499, Christoph Päper).
139 | * Update GitHub links (Morten Piibeleht).
140 | * Update references to container and leaf block headers to use the
141 | correct pluralization (#531, Elijah Hamovitz).
142 | * Rephrase example #111 to indicate that the rendering is not mandated
143 | (#568).
144 | * Improve documentation of parsing strategy (#563).
145 | Note that `openers_bottom` needs to be indexed to
146 | delimiter run lengths as well as types.
147 | * make_spec.lua: Fix migration of children nodes in create_anchors (#536,
148 | Zhiming Wang). This resulted in some bugs in the rendered spec
149 | (where words would be dropped from links).
150 | * Fix dingus link when double clicking Markdown code (#535, Zhiming Wang).
151 | Prior to this commit, the link opened is always `/dingus/?text=` (no
152 | text).
153 | * Add spec.json generator to Makefile (M Pacer).
154 |
155 | [0.28]
156 |
157 | * Allow unlimited balanced pairs of parentheses in link URLs
158 | (@kivikakk, commonmark/cmark#166). The rationale is that there are many URLs
159 | containing unescaped nested parentheses. Note that
160 | implementations are allowed to impose limits on parentheses
161 | nesting for performance reasons, but at least three levels
162 | of nesting should be supported.
163 | * Change Rule 14 for Emphasis. Previously the nesting
164 | Strong (Emph (...)) was preferred over Emph (Strong (...)).
165 | This change makes Emph (Strong (...)) preferred.
166 | Note that the commonmark reference implementations
167 | were not entirely consistent about this rule, giving
168 | different results for
169 |
170 | ***hi***
171 |
172 | and
173 |
174 | ***hi****
175 |
176 | This change simplifies parsing. It goes against the majority
177 | of implementations, but only on something utterly trivial.
178 | * Clarify definition of delimiter runs (Aidan Woods, #485).
179 | * Clarify precedence of thematic breaks over list items
180 | in the list spec, not just the thematic break spec (Aidan Woods).
181 | * Clarify definition of link labels and normalization
182 | procedure (Matthias Geier, #479).
183 | * Clarify that the end of an HTML block is determined by its
184 | start, and is not changed by HTML tags encountered inside
185 | the block. Give an example of a case where a block ends
186 | before an interior element has been closed (Yuki Izumi, #459).
187 | In the future we might consider changing the spec to handle
188 | this case more intuitively; this change just documents more
189 | explicitly what the spec requires.
190 | * Acknowledge Aaron Swartz's role in developing Markdown.
191 | * Remove misleading backslash in example for disabling image markup
192 | (Matthias Geier).
193 | * Fix Unicode terminology (general category, not class)
194 | (Philipp Matthias Schaefer).
195 | * Add another illustrative case for code spans (#463).
196 | * Remove possibly misleading 'either's (#467).
197 | * Fix typo (Aidan Woods).
198 | * Clarify that some blocks can contain code spans (Matthias Geier).
199 | * Fix typo and clarified tab expansion rule (Scott Abbey).
200 | * Add a missing "iff" (Matthias Geier).
201 | * Add release checklist.
202 | * Added npm package for spec (Vitaly Puzrin).
203 | * Remove SPEC variable from Makefile.
204 |
205 | [0.27]
206 |
207 | * Update statement on blank lines and lists (Jesse Rosenthal).
208 | The definition of a list still said that "two blank lines end all
209 | containing lists." That rule has been removed.
210 | * Clarify that the exception for ordered lists only applies to first
211 | item in list (#420).
212 | * Added cases clarifying precedence of shortcut links (#427).
213 | * Added h2..h6 to block tag list (#430).
214 | * Remove duplicated test (Maxim Dikun). Tests 390 and 391 were the same.
215 | * Use fenced code blocks for markdown examples that are not test cases
216 | for uniformity.
217 | * Added closing paren (#428).
218 | * Test suite: Don't mess up on examples with 32 backticks (#423).
219 | * Removed duplicate reference to "container block".
220 | * Add examples for Unicode whitespace (Timothy Gu). In light of
221 | commonmark/commonmark.js#107, add a few examples/test cases to make sure the
222 | distinction between Unicode whitespace and regular whitespace is kept.
223 | * Fix missing closing paren typo (Robin Stocker).
224 |
225 | [0.26]
226 |
227 | * Empty list items can no longer interrupt a paragraph.
228 | This removes an ambiguity between setext headers and
229 | lists in cases like
230 |
231 | foo
232 | -
233 |
234 | Removed the "two blank lines breaks out of lists" rule.
235 | This is incompatible with the principle of uniformity
236 | (and indeed with the spec for list items, which requires
237 | that the meaning of a chunk of lines not change when it
238 | is put into a list item.)
239 | * Ordered list markers that interrupt a paragraph must start with 1.
240 | * Improved the section on tabs. Added some test cases for ATX
241 | headers and thematic breaks. Clarified that it's not just cases
242 | that affect indentation that matter, but all cases where whitespace
243 | matters for block structure.
244 | * Removed example of ATX header with tab after `#`.
245 | * Allow HTML blocks to end on the last line of their container
246 | (Colin O'Dell, #103).
247 | * Spec changes in strong/emph parsing. See
248 | https://talk.commonmark.org/t/emphasis-strong-emphasis-corner-cases/2123
249 | for motivation. This restores intuitive parsings for a number of cases.
250 | The main change is to disallow strong or emph when one of
251 | the delimiters is "internal" and the sum of the lengths of
252 | the enclosing delimiter runs is a multiple of 3. Thus,
253 |   `**foo*bar***` gets parsed `<strong>foo*bar</strong>` rather than
254 |   `<strong>foo<em>bar</em></strong>**` as before.  Thanks to Fletcher Penney
255 | for discussion.
256 | * Add tests to check that markdown parsing is working fine after an HTML
257 | block end tag (Alexandre Mutel).
258 | * Add test case for nested lists with an indent > 4 (Alexandre Mutel).
259 | * Cleaned up terminology around lazy continuation lines. Added some links.
260 | * Fixed broken links and typos (Artyom, Matthias Geier, Sam Estep).
261 | * Use `≤` instead of `<` in list item spec for clarity.
262 | * Add point about readability to "What is Markdown" section.
263 | * Add example for list item starting with a blank line with spaces
264 | (Robin Stocker).
265 | * Make interact more button-like and clearer (Jeff Atwood).
266 | * `spec_tests.py`: exit code is now sum of failures and errors.
267 | * `spec_tests.py`: specify newline when opening file.
268 |
269 | [0.25]
270 |
271 | * Added several more tab-handling cases (see commonmark/cmark#101).
272 | * Fixed spec test for tabs. In the blockquote with a tab following
273 | the `>`, only one space should be consumed, yielding two spaces
274 | at the beginning of the content.
275 | * Update license year range to 2016 (Prayag Verma).
276 | * Fixed typo: setext heading line -> setext heading underline (#389).
277 | * Fixed date 2015->2016 (#388)
278 |
279 | [0.24]
280 |
281 | * New format for spec tests, new lua formatter for specs.
282 | The format for the spec examples has changed from
283 |
284 | .
285 | markdown
286 | .
287 | html
288 | .
289 |
290 | to
291 |
292 | ```````````````````````````````` example
293 | markdown
294 | .
295 | html
296 | ````````````````````````````````
297 |
298 | One advantage of this is that `spec.txt` becomes a valid
299 | CommonMark file.
300 | * Change `tests/spec_test.py` to use the new format.
301 | * Replace `tools/makespec.py` with a lua script, `tools/make_spec.lua`,
302 | which uses the `lcmark` rock (and indirectly libcmark). It can
303 | generate HTML, LaTeX, and CommonMark versions of the spec. Pandoc
304 | is no longer needed for the latex/PDF version. And, since the new
305 | program uses the cmark API and operates directly on the parse tree,
306 | we get much more reliable translations than we got with the old
307 | Python script (#387).
308 | * Remove whitelist of valid schemes. Now a scheme is any sequence
309 | of 2-32 characters, beginning with an ASCII letter, and containing
310 | only ASCII letters, digits, and the symbols `-`, `+`, `.`.
311 |   Discussion at <https://talk.commonmark.org/>.
312 | * Added an example: URI schemes must be more than one character.
313 | * Disallow spaces in link destinations, even inside pointy braces.
314 |   Discussion at <https://talk.commonmark.org/>.
316 | * Modify setext heading spec to allow multiline headings.
317 | Text like
318 |
319 | Foo
320 | bar
321 | ---
322 | baz
323 |
324 | is now interpreted as heading + paragraph, rather than
325 | paragraph + thematic break + paragraph.
326 | * Call U+FFFD the REPLACEMENT CHARACTER, not the "unknown code
327 | point character."
328 | * Change misleading undefined entity name example.
329 | * Remove misleading claim about entity references in raw HTML
330 | (a regression in 0.23). Entity references are not treated
331 | as literal text in raw HTML; they are just passed through.
332 | * CommonMark.dtd: allow `item` in `custom_block`.
333 |
334 | [0.23]
335 |
336 | * Don't allow space between link text and link label in a
337 | reference link. This fixes the problem of inadvertent capture:
338 |
339 | [foo] [bar]
340 |
341 | [foo]: /u1
342 | [bar]: /u2
343 | * Rename "horizontal rule" -> "thematic break". This matches the HTML5
344 | meaning for the hr element, and recognizes that the element may be
345 | rendered in various ways (not always as a horizontal rule).
346 | See http://talk.commonmark.org/t/horizontal-rule-or-thematic-break/912/3
347 | * Rename "header" -> "heading". This avoids a confusion that might arise
348 | now that HTML5 has a "header" element, distinct from the "headings"
349 | h1, h2, ... Our headings correspond to HTML5 headings, not HTML5 headers.
350 | The terminology of 'headings' is more natural, too.
351 | * ATX headers: clarify that a space (or EOL) is needed; other whitespace
352 | won't do (#373). Added a test case.
353 | * Rewrote "Entities" section with more correct terminology (#375).
354 | Entity references and numeric character references.
355 | * Clarified that spec does not dictate URL encoding/normalization policy.
356 | * New test case: list item code block with empty line (Craig M.
357 | Brandenburg).
358 | * Added example with escaped backslash at end of link label (#325).
359 | * Shortened an example so it doesn't wrap (#371).
360 | * Fixed duplicate id "attribute".
361 | * Fix four link targets (Lucas Werkmeister).
362 | * Fix typo for link to "attributes" (Robin Stocker).
363 | * Fix "delimiter" spellings and links (Sam Rawlins).
364 | * Consistent usage of "we" instead of "I" (renzo).
365 | * CommonMark.dtd - Rename `html` -> `html_block`,
366 | `inline_html` -> `html_inline` for consistency. (Otherwise it is too
367 | hard to remember whether `html` is block or inline, a source of
368 | some bugs.)
369 | * CommonMark.dtd - added `xmlns` attribute to document.
370 | * CommonMark.dtd - added `custom_block`, `custom_inline`.
371 | * CommonMark.dtd - renamed `hrule` to `thematic_break`.
372 | * Fixed some HTML inline tests, which were actually HTML blocks, given
373 | the changes to block parsing rules since these examples were written
374 | (#382).
375 | * Normalize URLs when comparing test output. This way we don't fail
376 | tests for legitimate variations in URL escaping/normalization policies
377 | (#334).
378 | * `normalize.py`: don't use `HTMLParseError`, which has been removed
379 | as of python 3.5 (#380).
380 | * Minor spacing adjustments in test output, to match cmark's output,
381 | since we test cmark without normalization.
382 | * `makespec.py`: remove need for link anchors to be on one line.
383 | * `makespec.py`: Only do two levels in the TOC.
384 | * Use `display:inline-block` rather than floats for side-by-side.
385 | This works when printed too.
386 | * Added better print CSS.
387 |
388 | [0.22]
389 |
390 | * Don't list `title` twice as HTML block tag (Robin Stocker).
391 | * More direct example of type 7 HTML block starting with closing tag.
392 | * Clarified rule 7 for HTML blocks. `pre`, `script`, and `style`
393 | are excluded because they're covered by other rules.
394 | * Clarified that type 7 HTML blocks can start with a closing tag (#349).
395 | * Removed `pre` tag from rule 6 of HTML blocks (#355).
396 | It is already covered by rule 1, so this removes an ambiguity.
397 | * Added `iframe` to list of tags that always start HTML blocks (#352).
398 | * Added example of list item starting with two blanks (#332).
399 | * Added test case clarifying laziness in block quotes (see
400 | commonmark/commonmark.js#60).
401 | * Add an example with mixed indentation code block in "Tabs" section
402 | (Robin Stocker). This makes sure that implementations skip columns instead
403 | of offsets for continued indented code blocks.
404 | * Clarified that in ATX headers, the closing `#`s must be unescaped,
405 | and removed misleading reference to "non-whitespace character" in
406 | an example (#356).
407 | * Changed anchor for "non-whitespace character" to reflect new name.
408 | * Removed ambiguities concerning lines and line endings (#357, Lasse
409 | R.H. Nielsen). The previous spec allowed, technically, that a line
410 | ending in `\r\n` might be considered to have two line endings,
411 | or that the `\r` might be considered part of the line and the
412 | `\n` the line ending. These fixes rule out those interpretations.
413 | * Clarify that a character is any code point.
414 | * Space in "code point".
415 | * Capitalize "Unicode".
416 | * Reflow paragraph to avoid unwanted list item (#360, #361).
417 | * Avoid extra space before section number in `spec.md`.
418 | * `makespec.py`: Use `check_output` for simpler `pipe_through_prog`.
419 | * In README, clarified build requirements for `spec.html`, `spec.pdf`.
420 | * Fixed some encoding issues in `makespec.py` (#353).
421 | * Fixed various problems with `spec.pdf` generation (#353).
422 | * Added version to coverpage in PDF version of spec.
423 |
424 | [0.21.1]
425 |
426 | * Added date.
427 |
428 | [0.21]
429 |
430 | * Changed handling of tabs. Instead of having a preprocessing step
431 | where tabs are converted to spaces, we now handle tabs directly in
432 | the parser. This allows tabs to be retained in code blocks and code
433 | spans. This change adds some general language to the effect that,
434 | for purposes of determining block structure, tabs are to be treated
435 | just like equivalent spaces.
436 | * Completely rewrote spec for HTML blocks. The new spec provides
437 |   better handling of tags like `<del>`, which can be either block
438 |   or inline level content, better handling of custom tags, and
439 |   better handling of verbatim contexts like `<pre>`, comments,
440 |   and `<script>`.