├── .github └── workflows │ └── release-please.yml ├── 1.0-semantics.norg ├── 1.0-specification.norg ├── CHANGELOG.md ├── design-decisions.norg ├── gtd-1.0.0-rc1.norg ├── norg.peg ├── readme.md ├── readme.norg ├── stdlib.janet └── stdlib.norg /.github/workflows/release-please.yml: -------------------------------------------------------------------------------- 1 | name: Release Please Automatic Semver 2 | 3 | on: 4 | push: 5 | branches: 6 | - main 7 | 8 | jobs: 9 | release: 10 | name: release 11 | runs-on: ubuntu-latest 12 | steps: 13 | - uses: google-github-actions/release-please-action@v3 14 | with: 15 | release-type: simple 16 | package-name: norg-specs 17 | -------------------------------------------------------------------------------- /1.0-semantics.norg: -------------------------------------------------------------------------------- 1 | @document.meta 2 | title: 1.0-semantics 3 | description: 4 | authors: vhyrro 5 | categories: 6 | created: 2022-09-10 7 | version: 0.0.13 8 | @end 9 | 10 | - ( ) Document stdlib macros/carryover tags/ranged tags 11 | - ( ) Describe how tags are evaluated 12 | - ( ) Document inbuilt attached modifier extensions and their behaviours 13 | - (x) When evaluating macros for attributes (inline elements w/ attached mod ext) and 14 | the `&var&` syntax they should be placed on a new line and /then/ expanded. 15 | This prevents user error. 16 | - ( ) Explain how extendable links are macros under the hood. 17 | - (x) Force `#eval` to take in a vararg of variable names to transfer to the janet side? 18 | How does `#eval` know the parameters passed to the current function? 19 | 20 | - :: 21 | Example: 22 | @code norg 23 | : . : Char 24 | : > : Name/Type 25 | : > : Categories 26 | : _ : `*` 27 | : > : Headings 28 | : > : structural, nestable 29 | : _ : `-` 30 | 31 | : . 
: Char 32 | : v : `*` 33 | : v : `-` 34 | : v : 35 | : / : Name/Type 36 | : v : Headings 37 | @end 38 | --- 39 | 40 | * Introduction 41 | 42 | This file contains a formal description of the semantics of the Norg file format. For an 43 | introduction of what the Norg file format is, it is recommended that you read the [specification] 44 | first. When writing a syntax parser, it's not necessary for you to understand exactly how Norg is 45 | intended to behave - however, when writing a more sophisticated tool like 46 | [Neorg]{https://github.com/nvim-neorg/neorg}, understanding how various tags and dynamic elements 47 | behave is quite crucial. 48 | 49 | This specification, in contrast to the syntax specification, reads more like a book, with 50 | recommendations and requirements written out as they are applicable. 51 | 52 | * Tags 53 | 54 | All dynamic/extendable parts of Norg are centered around tags and macros. 55 | The single {# macro tag} defines macros, and all of the other tag types 56 | execute a macro in some specific way. 57 | 58 | ** Common Definitions 59 | 60 | $ Macro Expansion 61 | The process of expanding a macro involves supplying it with the correct parameters, 62 | then replacing the macro invocation with the result, expanding any other macros that 63 | may be present in said result. 64 | 65 | $ Variable 66 | A variable is a macro that takes no parameters as input and always produces the same output 67 | regardless of context. 68 | 69 | ** Macro Tag 70 | 71 | The base form, the [macro tag]{:1.0-specification:*** Macro Tags} can be seen as a function 72 | definition - it defines the name of the macro, its parameters, and also what the macro evaluates 73 | to. 74 | 75 | An example implementation of a macro may look like such: 76 | |example 77 | =greet name 78 | Hello, &name&! 79 | =end 80 | |end 81 | 82 | We define a macro called `greet`, which takes in a mandatory parameter called `name`. 
83 | When this macro is invoked, it evaluates to `Hello, &name&!` 84 | where the `&name&` variable is replaced with whatever we provided to the macro (see: {# macro 85 | expansion}, {# inline macro expansion}). 86 | 87 | *** Macro Redefinitions 88 | 89 | Redefining a macro is permitted. There are many reasons to redefine a macro, most 90 | notably to redefine {$ variable}s (see {# parameters as macros}). 91 | 92 | *** Parameters as Macros 93 | 94 | Since {$ variable}s are officially just macros without parameters, parameters supplied to 95 | macros are also themselves macros (hence they can be expanded with the `&inline macro 96 | expansion&` syntax). 97 | 98 | *** Supplying Parameters to Macros 99 | 100 | By default, every parameter that a macro expects must be supplied, else an error should be 101 | thrown. If excess values are supplied, then the excess values should be highlighted 102 | in some form in your editor or interpreter and a warning should be issued. Upon execution 103 | excess values should be discarded. 104 | 105 | *** Parameter Modifiers (suffixes) 106 | 107 | There are three suffixes that may be applied to a macro parameter to express some properties of the 108 | parameter. These are: `?` (optional variable), `*` (optional vararg), `+` (vararg with at least one 109 | element). 110 | 111 | To illustrate via an example: 112 | 113 | |example 114 | =mymacro variable? 115 | Where did my &variable& go. 116 | =end 117 | |end 118 | 119 | Now, it is possible to not supply the variable and not get an error. When expanding a "null" 120 | object, it should evaluate to an empty string, or in other words, to nothing. 121 | 122 | *NOTE:* Norg does not actually have a notion of null values. Instead, the null value is 123 | represented as an empty {* abstract objects}[abstract object]. See that section for more 124 | details.
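The parameter rules above (mandatory by default, `?`, `*` and `+` suffixes, excess values discarded with a warning) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Norg or Neorg: the names `parse_param` and `bind_params` are hypothetical, varargs are held as plain Python lists, and parameter filters (prefixes) are not handled.

```python
import warnings


def parse_param(spec):
    """Split a parameter spec such as 'name', 'end?', 'args*' or 'args+' into (name, suffix)."""
    if spec and spec[-1] in "?*+":
        return spec[:-1], spec[-1]
    return spec, ""


def bind_params(specs, args):
    """Bind supplied values to macro parameters (hypothetical helper, not a Norg API)."""
    bound = {}
    remaining = list(args)
    for spec in specs:
        name, suffix = parse_param(spec)
        if suffix in ("*", "+"):
            # A vararg swallows every remaining value; '+' needs at least one.
            if suffix == "+" and not remaining:
                raise ValueError(f"vararg '{name}' needs at least one value")
            bound[name] = remaining
            remaining = []
        elif remaining:
            bound[name] = remaining.pop(0)
        elif suffix == "?":
            # A missing optional parameter expands to nothing (an empty string).
            bound[name] = ""
        else:
            raise ValueError(f"missing mandatory parameter '{name}'")
    if remaining:
        # Excess values produce a warning and are then discarded.
        warnings.warn(f"discarding excess values: {remaining}")
    return bound
```

Calling `bind_params(["variable?"], [])` binds `variable` to an empty string, which matches the `mymacro` example: the macro can be invoked without an argument and the expansion simply contributes nothing.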
125 | 126 | For the vararg variations, they exist to store an arbitrary amount of parameters within a single 127 | value, which may be thought of as a list of objects. Yet again, Norg does not have a notion of 128 | lists in their traditional sense, it simply encodes the list as a macro that, when expanded, 129 | evaluates to the contents of the list (space separated). For example: 130 | 131 | |example 132 | =vararg.expand args+ 133 | Here are my args: &args& 134 | =end 135 | |end 136 | 137 | Given the input `test-arg1 test-arg2` the macro will evaluate to `Here are my args: test-arg1 138 | test-arg2`. 139 | 140 | When referencing the parameters in e.g. `&variable&` expansions you do not include the 141 | succeeding modifier. 142 | 143 | *** Parameter Filters (prefixes) 144 | 145 | Apart from being able to supply a modifier to a parameter, you may also choose to filter 146 | out/support only certain kinds of parameters supplied from different sources. For example, 147 | you may only want your macro to be used in the context of a ranged verbatim tag, or you 148 | may only want your function to be usable when ran through an {# extendable links}[extendable 149 | link]. 150 | 151 | Norg recognizes the following set of filters: 152 | - `=` - the content of an extendable link 153 | - `@` - content of a ranged verbatim tag 154 | - `\|` - content of a standard ranged tag 155 | - `>` - the next object (content of all carryover tag types). There is no way to differentiate 156 | between a `#` and `+` invocation, as both have consistent behaviours and return the same types 157 | of data 158 | - `&` - the "variable reference". Used seldom, but this prefix ensures that the value you 159 | provide as a string is a name of a variable that exists in the current scope. When paired with 160 | the `*` vararg and if no values are provided by the user, all variables in the current scope 161 | should be supplied as the content of the variable (see {# parameter filters (prefixes)}). 
162 | Expanding the parameter reference should expand the variable the reference is pointing to. 163 | 164 | When referencing the arguments through e.g. the {# inline macro expansion}, you do not include 165 | the preceding modifier in the parameter name. 166 | 167 | *** Macro Return Values 168 | 169 | Macros may return only one of three values: 170 | - Raw Norg Markup 171 | - An {# Abstract Objects}[abstract object] 172 | - An abstract object future 173 | 174 | When dealing with raw norg macros, one may only return raw norg markup. 175 | When invoking {* janet} from within the macro via {* the `eval` carryover tag}, 176 | then the return value is determined by the last expression in the eval block. 177 | 178 | If the return value is not a {https://janet-lang.org/docs/strings.html}[string] , a 179 | `(neorg/abstract-object)` nor a `(neorg/await-abstract-object)`, then an appropriate error 180 | should be raised. 181 | 182 | ** Infirm Tags 183 | 184 | The simplest way to invoke a macro is via the infirm tag. It does not auto-supply any parameters 185 | like the rest do. 186 | 187 | An example usage looks like so: 188 | 189 | @code norg 190 | 191 | =greet name 192 | Hello, &name&! 193 | =end 194 | 195 | We then invoke the macro with the `.` prefix: 196 | 197 | .greet Vhyrro 198 | 199 | @end 200 | 201 | When expanding the macro, you get: 202 | @code norg 203 | 204 | Hello, 205 | Vhyrro 206 | \! 207 | 208 | @end 209 | 210 | Which, after an automatic reformatting, converts to `Hello, Vhyrro!`. 211 | See {# inline macro expansion} rules for more information. 212 | 213 | ** Ranged Tags 214 | 215 | Ranged tags are a special type of macro invocation. They take in arbitrary parameters, and then 216 | also consume some content until an end marker is reached. This content may be norg markup, in 217 | which case the tag in question is a /standard ranged tag/, otherwise the tag takes in verbatim 218 | markup and is a /verbatim ranged tag/. 
219 | 220 | To create a macro that specifically targets a ranged tag, you may use one of the prefixes 221 | described {# parameter filters (prefixes)}[here]. Below is an example for the verbatim ranged 222 | tag that takes in a string and slices it by some amount: 223 | @code norg 224 | 225 | =slice start end? @content 226 | .eval janet (string/slice content start end) 227 | =end 228 | 229 | @end 230 | 231 | *NOTE:* When invoking the macro using the `@` tag syntax, the `content` variable should be auto-supplied 232 | by Norg. The content should always target the last parameter. 233 | 234 | The example uses {* janet} to perform the actual slicing logic. To invoke the macro, we use the 235 | ranged tag syntax: 236 | |example 237 | 238 | @slice 0 5 239 | hello world! 240 | @end 241 | 242 | |end 243 | 244 | When the macro is expanded, it evaluates to the norg markup `hello`. This is because the `.eval` 245 | call returns a string (see {# macro return values}). 246 | 247 | * Inline Macro Expansion 248 | 249 | Inline macro expansion is the process of expanding the content of a macro in-line via the 250 | `&` attached modifiers. 251 | There is no way to supply parameters to an inline macro expansion and therefore, by definition, 252 | only {# variable}s may be expanded through this syntax. 253 | 254 | When performing inline expansion, the expansion must be isolated onto its own separate line. 255 | This involves prefixing and postfixing the inline macro expansion with newlines /and/ postfixing 256 | the expansion with an escape character (this prevents punctuation that may occur after the 257 | macro expansion from being treated as a detached modifier). This means that given the input `Hello, 258 | &name&!` (like in our [greet example]{# macro tag}), the macro should be isolated like this: 259 | 260 | @code norg 261 | 262 | Hello, 263 | &name& 264 | \! 265 | 266 | @end 267 | 268 | And only then should `&name&` be expanded to its underlying value.
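The isolate-then-expand order described above can be sketched as follows. This is a rough illustration, not a reference implementation: the function names and the regular expression for variable names are assumptions, verbatim regions are ignored, and an expansion at the very end of the text would leave a dangling escape character that a real implementation would need to handle.

```python
import re

# Assumed shape of an inline expansion: '&', a name without spaces or '&', then '&'.
EXPANSION = re.compile(r"&([^&\s]+)&")


def isolate(text):
    """Put every inline expansion on its own line and escape whatever follows it."""
    return EXPANSION.sub(lambda m: "\n" + m.group(0) + "\n\\", text)


def expand(text, variables):
    """Isolate first, and only then substitute each variable's value."""
    isolated = isolate(text)
    return EXPANSION.sub(lambda m: variables[m.group(1)], isolated)
```

For the greet example, `isolate("Hello, &name&!")` yields three lines (`Hello, `, `&name&` and `\!`), and `expand` then replaces the middle line with the value bound to `name`.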
269 | 270 | Such a rule allows these variables to expand to complex data types like headings, footnotes and 271 | other detached modifiers without unintentionally breaking. 272 | 273 | This rule also carries over to {* extendable links}. 274 | 275 | ** Compressing Expanded Results 276 | 277 | After the result has been expanded, your application may want to automatically reformat the 278 | output of the macro. Spreading what could be a single line over several lines is unpleasant to look at, 279 | not to mention the extraneous escape character that the expansion might generate. 280 | 281 | Once the expansion is complete, an inbuilt formatter may choose to stitch the text back together 282 | in a coherent fashion, if the output of the macro does not begin with a detached modifier or tag. 283 | Following on from the example in the previous section: 284 | 285 | @code norg 286 | 287 | Hello, 288 | Vhyrro 289 | \! 290 | 291 | @end 292 | 293 | May be compressed back to: 294 | 295 | @code norg 296 | 297 | Hello, Vhyrro! 298 | 299 | @end 300 | 301 | * (=) Attributes 302 | 303 | Attributes are a special data type in Norg as they serve as a way to retain a function invocation 304 | for later use, and allow for combinatoric logic by chaining macro calls. If a carryover tag is 305 | applied to an attribute, then the carryover tag should /not/ be invoked on the attribute, but 306 | should instead be stored in the attribute's metadata internally in your application through the 307 | standard library function . 308 | 309 | * Null Attached Modifier 310 | 311 | The null attached modifier serves two purposes - when used standalone, all text in between the two 312 | `%` modifiers gets removed. This means that, on its own, `%this%` acts as an inline comment 313 | syntax. When paired with attached modifier extensions, the attached modifier extensions determine 314 | how the text is displayed. For example, `%blue text%(color:blue)` renders the text as blue, 315 | without any further modifications.
This allows for easy inline styling of objects or words, as `%` 316 | is non-verbatim (other markup can exist within it). 317 | 318 | * Extendable Links 319 | 320 | Apart from the traditional inbuilt links, extendable links exist to allow custom search behaviour 321 | to be implemented within Norg. The extendable link is prefixed with a `=`, and has its behaviour 322 | governed by a succeeding attached modifier extension. 323 | 324 | The extendable link is simply a fancy macro invocation with two parameters, the second being 325 | optional: the content of the link, and the content of the link description (if any). 326 | 327 | This means that the following: 328 | @code norg 329 | {= some text}[a description](macro-name) 330 | @end 331 | 332 | Is equivalent to: 333 | @code norg 334 | .macro-name some\ text a\ description 335 | @end 336 | 337 | To explicitly capture extendable links, you may use the {# parameter filters (prefixes)}[`=` 338 | prefix] before your parameter name. 339 | 340 | * The Standard Library 341 | 342 | Norg implements a cross-platform standard library that any [Neorg]-like implementation may use. 343 | Rolling your own standard library is not recommended. 344 | 345 | Despite this, a single macro is /required/ to be implemented by every client that implements Norg 346 | macros - this is the `.invoke-janet` macro. Its syntax looks like so: 347 | @code norg 348 | 349 | .invoke-janet code+ 350 | 351 | @end 352 | 353 | Where `code` is a vararg of strings that get stitched together with a single space character before 354 | executing the code. Code should be executed in a Janet {# sandboxing}[sandbox]. It forms the baseline for the 355 | `#eval` carryover tag. 356 | 357 | The `stdlib.norg` file can be found {/ stdlib.norg}[here]. 358 | 359 | * The `#eval` Carryover Tag 360 | 361 | Complex macros and their behaviours are only possible with a scripting language backing them.
{* 362 | Janet} is our first class citizen language of choice, and there must be some way to execute it and 363 | return its result. 364 | 365 | For this, the `#eval` carryover tag exists, which is a wrapper around `invoke-janet`, ensuring 366 | every variable captured is valid. The `#eval` tag delegates the rest of the invocation logic to 367 | `(neorg/execute)` (see {# norg-janet standard library}), which also allows for execution of code 368 | in different programming languages. 369 | 370 | `#eval` always expects a `@code` block to follow itself containing the code to execute. 371 | 372 | * Abstract Objects 373 | 374 | Abstract Objects (AOs) are an opaque data type within Norg. They serve as a way to represent some 375 | intermediate information about an object without having concrete Norg markup backing it. 376 | 377 | Abstract Objects have a few builtin properties - some must be provided, whereas others are left as 378 | optional: 379 | - The {# AST Node} that the abstract object would like to bind itself to (*optional*) 380 | - A translation function to convert the intermediate representation of the AO to any given target 381 | format (markdown, asciidoc etc.). The target may also be Norg itself. Returning `nil` tells 382 | Neorg that said object does not have a representation for the given target. (*required*) 383 | - Custom data that the AO would like to keep for future reference (*optional*) 384 | 385 | A prime example of AOs in use are `@code` blocks. There is no Norg syntax that `@code` blocks 386 | could possibly evaluate to, as Norg does not have a built-in code block syntax. Instead, 387 | when the `@code` macro is evaluated, it gets translated into an abstract object with some 388 | properties. Now, when the user wants to export their Norg document into markdown, the translation 389 | function is invoked, and the `@code` block is converted into a markdown fenced code block 390 | (`|```|`). 
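To make the three properties of an abstract object more concrete, here is a minimal Python sketch of the `@code` case described above. Everything in it is an assumption made for illustration: `AbstractObject` and `make_code_block` are hypothetical names, the real objects live on the Janet side, and this sketch only knows how to translate to markdown.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class AbstractObject:
    # Required: maps a target format name to markup, or None if unsupported.
    translate: Callable[[str], Optional[str]]
    # Optional: the AST node the object would like to bind itself to.
    node: Any = None
    # Optional: custom data kept for future reference.
    data: dict = field(default_factory=dict)


def make_code_block(language, body):
    """Build an abstract object standing in for a `@code` block (illustrative only)."""
    fence = "`" * 3  # built at runtime so this example stays easy to embed

    def translate(target):
        if target == "markdown":
            return f"{fence}{language}\n{body}\n{fence}"
        # No representation for any other target in this sketch.
        return None

    return AbstractObject(translate=translate, data={"language": language})
```

An exporter would call `translate("markdown")` on each abstract object it meets and skip, or warn about, objects whose translation function returns `None` for the requested target.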
391 | 392 | ** "Null" Objects 393 | 394 | Null objects are a special type of AO, formed when the {# AST Node} is left as 395 | `nil`, and whose translation function always returns an empty string (`""`). 396 | 397 | To produce a null AO, you may use the `(neorg/null-abstract-object)` function (see {* janet}[this 398 | section] on janet support). 399 | 400 | ** `&...&` expansion overrides 401 | 402 | When attempting to expand a {$ variable} through the inline macro expansion syntax (`&this&`), 403 | Norg should look at a few factors: 404 | 405 | ~ If the variable contains raw norg markup, paste the contents of the raw markup within the 406 | document, respecting the {# inline macro expansion} rules. 407 | ~ If the variable is an abstract object, then execute the translation function with the target 408 | language set to `norg`. If the returned result is raw norg markup, then perform step 1. If the 409 | translation function returns `nil`, then the macro is considered {# Bakeability}[unbakeable]. 410 | When this is the case, issue a warning to your user letting them know that baking the macro is 411 | not possible. 412 | 413 | * Bakeability 414 | 415 | "Baking" refers to the process of expanding a macro permanently and irreversibly by pasting its 416 | expanded form in place of the original macro invocation. Most macros can be baked, but not all. 417 | 418 | Baking is a process manually triggered by the user (with a keybind or command in your 419 | application). 420 | 421 | Baking of an object boils down to invoking the translation function of an {# Abstract 422 | Objects}[abstract object] with the target set to `norg`. If the function returns `nil`, then the 423 | macro is not bakeable. 424 | 425 | * Tables 426 | 427 | Tables in Norg are rather alien. Instead of opting for a visual approach of defining a table, 428 | Norg's syntax serves as a way to "program" a table more than to create one on the spot.
429 | For a visual approach, one may use the `@table` tag, which evaluates to the official table syntax. 430 | 431 | In usual table implementations, tables start at the root (0, 0)\/A1. Afterwards, the user visually 432 | populates the table with entries in a dynamic fashion - new cells grow the table as they are 433 | encountered. In Norg you are initially provided with an /infinite spreadsheet/ to act as a canvas, 434 | with a root (A1). Then, you define /cells/ which exist at some position in this canvas, e.g. A5, 435 | and the content of that cell gets placed at the specified position. You build tables by defining 436 | *both the what and the where*, instead of just the *what*. 437 | 438 | As stated previously, tables are defined by individual /cells/. Each `:` part of a table is 439 | considered a cell: 440 | @code norg 441 | : A1 442 | Cell one. 443 | : B1 444 | Cell two. 445 | @end 446 | 447 | The title portion of the `:` detached modifier determines /where to place the content of the 448 | cell/. Absolute positions are marked with a `[A-Z]+[0-9]+` syntax. For example, `A3` means "first 449 | column, third row". More unusual examples include: `AA230`, 450 | `B032` (leading zeroes are trimmed, leaving `B32` as the final cell location). 451 | 452 | ** Motions 453 | 454 | Apart from absolute positions, one may opt for /relative positions/, using a combination of the 455 | following *motions*: 456 | 457 | - *Root motion*: `.` - this motion is a shorthand alternative to writing `A1`. It denotes the root 458 | of the table. 459 | - *General motions*: `<`, `>`, `^`, `v` - these define a single movement left, right, up and down 460 | relative to the previous cell, respectively. 461 | - *Floor motion*: `_` - moves down once and continues moving left until the leftmost /populated/ 462 | cell is encountered.
This allows for easy creation of left-to-right tables, as you may choose a 463 | starting position for your table, use the `>` motion to populate entries to the right, and then 464 | slide back to the left with the `_` operator as if you were sliding back the paper bail of a 465 | typewriter - your "cell cursor" is now back at the beginning, just a row lower. 466 | - *Ceiling motion*: `/` - another paradigm for creating tables that isn't left-to-right is 467 | top-to-bottom. In this case, you define a starting position and use the `v` motion to move down 468 | and populate cells, after which you use the `/` operator to move one cell to the right and to 469 | move back to the top. The `/` operator should continue searching upwards for the topmost 470 | /populated/ cell, and consider that the "ceiling". 471 | 472 | *** Motion Repetition 473 | 474 | Motions may be prefixed with a count to repeat the motion `n` times. 475 | Different motions may also be combined. 476 | 477 | Examples include: 478 | - `3>` - move to the right three times 479 | - `2_` - perform two floor motions 480 | - `2>v` - move to the right twice, then down a single cell 481 | 482 | *** Left-side Underflow 483 | 484 | Norg allows a semantic edge case for the `<` motion - when the cell cursor is at the first column and the 485 | left motion is used, the motion underflows to the row above, occupying the position of the 486 | rightmost /populated/ cell. It may be considered the inverse (or the undoing) of the floor 487 | motion (`_`). 488 | 489 | ** Cell Notation 490 | 491 | The notation for a cell is the same as the notation used in spreadsheet applications like Excel. 492 | Letters determine the column whereas numbers determine the row (e.g. `C2` is column 3 row 2). 493 | 494 | ** Intersecting Modifiers 495 | 496 | Commonly you'll see the intersecting modifier used to make tables appear simpler. 497 | Instead of writing out the full syntax: 498 | @code norg 499 | : A1 500 | Content.
501 | @end 502 | 503 | When the cell only contains text users will commonly write: 504 | @code norg 505 | : A1 : Content. 506 | @end 507 | for brevity. 508 | 509 | ** Dynamic Positioning 510 | 511 | A unique side effect of Norg allowing cells to exist at arbitrary positions in an infinite canvas 512 | includes /being able to place cells at dynamic positions using macros/: 513 | @code norg 514 | : &position& 515 | Content. 516 | @end 517 | 518 | The `position` variable may depend on parameters or other data within the current table, 519 | yielding complex behaviours. 520 | 521 | ** ( ) Examples 522 | 523 | TODO 524 | 525 | * Janet 526 | 527 | **** Norg-Janet Standard Library 528 | 529 | ** AST Nodes 530 | 531 | === 532 | 533 | %| vim: set tw=100 :|% 534 | -------------------------------------------------------------------------------- /1.0-specification.norg: -------------------------------------------------------------------------------- 1 | @document.meta 2 | title: The 1.0 Norg Specification 3 | authors: [ 4 | vhyrro 5 | mrossinek 6 | ] 7 | categories: specifications 8 | version: 1.0 9 | @end 10 | 11 | * Norg File Format Specification 12 | This file contains the formal file format specification of the Norg syntax version 1.0. 13 | This document is written in the Norg format in its original form and, thus, attempts to be 14 | self-documenting. 15 | 16 | Please note that this is *not* a reference implementation - this is an established rule set that 17 | should be strictly followed. 18 | 19 | * Introduction 20 | Before diving into the details we will start with an introduction. The Norg file format was 21 | designed as part of the [Neorg]{https://github.com/nvim-neorg/neorg} plugin for Neovim which was 22 | started by /Vhyrro (@vhyrro)/ in April 2021. Soon after starting this work, /Max Rossmannek 23 | (@mrossinek)/ joined the development team, and, with the help of the [Neorg] community, the two 24 | have shaped the Norg syntax to what it has become today. 
25 | 26 | ** What is Norg? 27 | The Norg syntax is a /structured/ plain-text file format which aims to be human-readable when 28 | viewed standalone while also providing a suite of markup utilities for typesetting structured 29 | documents. Compared to other plain-text file formats such as Markdown, Org, RST or AsciiDoc, it 30 | sets itself apart most notably by following a strict philosophy to abide by the following simple 31 | rules: 32 | ~ *Consistency:* the syntax should be consistent. Even if you know only a part of the syntax, 33 | learning new parts should not be surprising but rather feel predictable and intuitive. 34 | ~ *Unambiguity:* the syntax should leave _no_ room for ambiguity. This is especially motivated by 35 | the use of [tree-sitter]{https://tree-sitter.github.io/tree-sitter/} for the original syntax 36 | parser, which takes a strict left-to-right parsing approach and only has single-character 37 | look-ahead. 38 | ~ *[Free-form]{https://en.wikipedia.org/wiki/Free-form_language}:* whitespace is _only_ used to 39 | delimit tokens but has no other significance! This is probably the most contrasting feature to 40 | other plain-text formats which often adhere to the 41 | [off-side rule]{https://en.wikipedia.org/wiki/Off-side_rule}, meaning that the syntax relies on 42 | whitespace-indentation to carry meaning. 43 | 44 | Although built with [Neorg] in mind, Norg can be utilized in a wide range of applications, 45 | from external note-taking plugins to even messaging applications. Thanks to its {* layers}[layer] 46 | system one can choose the feature set they'd like to support and can ignore the higher levels. 47 | 48 | * Preliminaries 49 | First, we define some basic concepts which will be used in this specification. 50 | 51 | ** Characters 52 | A Norg file is made up of /characters/. 53 | A character is any Unicode [code point]{https://en.wikipedia.org/wiki/Code_point} or 54 | [grapheme]{https://www.unicode.org/glossary/#grapheme}.
55 | 56 | *** Whitespace 57 | A {** characters}[character] is considered *whitespace* if it constitutes any code point in the 58 | [Unicode Zs general category]{https://www.fileformat.info/info/unicode/category/Zs/list.htm}. 59 | 60 | Any combination of the above is also considered whitespace. 61 | 62 | Tabs are not expanded to spaces and since whitespace has no semantic meaning there is no need 63 | to define a default tab stop. However, if a parser must (for implementation reasons) define a 64 | tab stop, we suggest setting it to 4 spaces. 65 | 66 | Any line may be preceded by a variable amount of whitespace, which should be ignored. Upon 67 | entering the beginning of a new line, it is recommended for parsers to continue consuming (and 68 | discarding) consecutive whitespace characters exhaustively. 69 | 70 | The "start of a line" is considered to be /after/ this initial whitespace has been parsed. 71 | Keep this in mind when reading the rest of the document. 72 | 73 | *** Line Endings 74 | Line endings in Norg serve as a termination character. They are used e.g. to terminate 75 | {** paragraph segments}, {** paragraphs} and other elements like the endings of {** range-able 76 | detached modifiers}. They are not considered {*** whitespace}. 77 | 78 | The following chars are considered line endings: 79 | - A line feed `U+000A` 80 | - A form feed `U+000C` 81 | - A carriage return `U+000D` 82 | 83 | The following line ending combinations are permitted: 84 | - A single line feed 85 | - A single carriage return 86 | - A carriage return immediately followed by a line feed 87 | 88 | *** Punctuation 89 | A {** characters}[character] is considered *punctuation* if it is any of the following: 90 | - A standard ASCII punctuation character: `|!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~|` 91 | - Anything in the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po` or `Ps`. 
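The character classes defined so far (whitespace, line endings, punctuation) map fairly directly onto Unicode general categories. The following Python sketch shows one possible classifier; the function name `classify` and the label strings are made up for this example, and treating the tab character as whitespace is an assumption drawn from the tab-stop remark above, since a tab is not in the `Zs` category.

```python
import string
import unicodedata

# The three line ending characters listed above: line feed, form feed, carriage return.
LINE_ENDINGS = {"\n", "\f", "\r"}
# string.punctuation holds exactly the 32 ASCII punctuation characters.
ASCII_PUNCTUATION = set(string.punctuation)
PUNCTUATION_CATEGORIES = {"Pc", "Pd", "Pe", "Pf", "Pi", "Po", "Ps"}


def classify(ch):
    """Return 'line_ending', 'whitespace', 'punctuation' or 'other' for one character."""
    if ch in LINE_ENDINGS:
        return "line_ending"
    # Assumption: tabs count as whitespace even though they are not in Zs.
    if ch == "\t" or unicodedata.category(ch) == "Zs":
        return "whitespace"
    if ch in ASCII_PUNCTUATION or unicodedata.category(ch) in PUNCTUATION_CATEGORIES:
        return "punctuation"
    return "other"
```

Line endings are checked first so that they are never reported as whitespace, in line with the statement that line endings are not considered whitespace.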
92 | 93 | *** Escaping 94 | A single {** characters}[character] can be escaped if it is immediately preceded by a backslash, 95 | `|\|` (`U+005C`). 96 | 97 | Escaping renders the next character /verbatim/. Any {** characters}[character] may be escaped 98 | /apart from/ {** characters} within free-form and ranged verbatim segments (see {** free-form 99 | attached modifiers} and {*** verbatim ranged tags}). 100 | 101 | For more information about precedence, take a look at the {* precedence} section. 102 | 103 | *** Regular Characters 104 | Any other character not described by the preceding sections is treated as a generic code 105 | point/character. 106 | 107 | ** Words 108 | The Norg format is designed to be parsed on a word-by-word basis from left-to-right through the 109 | entire document /in a single pass/. This is possible because the language is [free-form], meaning 110 | that whitespace has no semantic meaning, and because the markup follows strict rules which are 111 | outlined in the later sections of this document. 112 | 113 | A *word* is considered to be any combination of {*** regular characters}. 114 | 115 | ** Paragraph Segments 116 | {** Words} are first combined into *paragraph segments*. A paragraph segment may then contain any 117 | inline element of type: 118 | - {* attached modifiers} 119 | - {* linkables} 120 | 121 | Usually, a {*** line endings}[line ending] terminates the paragraph segment. 122 | This means that a paragraph segment is simply a line of text: 123 | |example 124 | I am a paragraph segment. 125 | I am another paragraph segment. 126 | Together we form a paragraph. 127 | |end 128 | 129 | ** Verbatim Paragraph Segments 130 | These are structurally equivalent to regular {** paragraph segments} with a single exception. 131 | Verbatim paragraph segments are built up from /only/ {** words}. This means that attached 132 | modifiers and linkables are simply parsed as raw text within a verbatim paragraph segment. 
133 | 134 | ** Paragraphs 135 | Paragraphs are then formed of consecutive {** paragraph segments}. A paragraph is terminated by: 136 | - A {$ paragraph break} 137 | - Any of the {* detached modifiers} 138 | - Any of the {** delimiting modifiers} 139 | - Any of the {** ranged tags} 140 | - Any of the {*** strong carryover tags} 141 | 142 | $ Paragraph Break 143 | A paragraph break is defined as an _empty line_. In the simplest case that means two consecutive 144 | {*** line endings} but since Norg is a /free-form/ markup language, a line which only contains 145 | whitespace is also considered empty. 146 | 147 | * Detached Modifiers 148 | Norg has several detached modifiers. The name originates from their contrast with the 149 | {* attached modifiers}, which will be discussed later. These make up the majority of the syntax. 150 | 151 | All detached modifiers must abide by the following rules: 152 | - A detached modifier can _only_ occur at the beginning of the line. 153 | - Depending on the modifier type, one, two or an arbitrary amount of the same consecutive 154 | characters may initiate the detached modifier. 155 | - A detached modifier must be immediately followed by {*** whitespace}. 156 | 157 | The following table outlines all valid *detached modifiers*. It also adds various possible 158 | properties to each category which will be explained in more detail below. 159 | : . 
: Character 160 | : > : Name 161 | : > : Categories 162 | : _ : `*` 163 | : > : Headings 164 | :: > 165 | - Structural 166 | - Nestable 167 | :: 168 | : _ : `-` 169 | : > : Unordered Lists 170 | :: > 171 | - Nestable 172 | :: 173 | : _ : `~` 174 | : > : Ordered Lists 175 | :: > 176 | - Nestable 177 | :: 178 | : _ : `>` 179 | : > : Quotes 180 | :: > 181 | - Nestable 182 | :: 183 | : _ : `$` 184 | : > : Definitions 185 | :: > 186 | - Range-able 187 | :: 188 | : _ : `^` 189 | : > : Footnotes 190 | :: > 191 | - Range-able 192 | :: 193 | : _ : `:` 194 | : > : Table cells 195 | :: > 196 | - Range-able 197 | :: 198 | : _ : `%` 199 | : > : Attributes 200 | :: > 201 | - Nestable 202 | :: 203 | 204 | ** Structural Detached Modifiers 205 | The first detached modifier type is the /structural/ modifier type. As the name suggests, 206 | modifiers under this category *structure* the Norg document in some form or another. 207 | 208 | After a structural modifier, one {# paragraph segments}[paragraph segment] is consumed as the 209 | /title/ of the modifier. 210 | 211 | A property of structural detached modifiers is that they consume *all* other non-structural 212 | detached modifiers, lower-level structural modifiers, inline markup and {** paragraphs}; 213 | they are the most important detached modifier in the hierarchy of modifiers. 214 | 215 | To manually terminate a structural detached modifier (like a heading) you must use a 216 | {** delimiting modifiers}[delimiting modifier]. Structural detached modifiers are automatically 217 | closed when you use another structural modifier of the same or lower level. 
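The three rules for detached modifiers and the table above can be sketched as a small line classifier. This is a minimal Python sketch, not a reference implementation: the names `match_detached_modifier`, `NESTABLE` and `RANGEABLE`, and the shape of the returned tuple, are hypothetical. It assumes that leading whitespace before the modifier is permitted because the format is free-form, and it does not handle the closing half of a ranged modifier (two characters followed directly by a line ending).

```python
import re

# Characters and names taken from the detached modifier table.
NESTABLE = {"*": "heading", "-": "unordered_list", "~": "ordered_list",
            ">": "quote", "%": "attribute"}
RANGEABLE = {"$": "definition", "^": "footnote", ":": "table_cell"}

# One modifier character, repeated zero or more times, then whitespace.
_PATTERN = re.compile(r"^[ \t]*([*\-~>$^:%])(\1*)[ \t]+(.*)$")


def match_detached_modifier(line: str):
    """Return (name, level, rest) or None if the line starts no modifier.

    For range-able modifiers, level is "single" or "ranged".
    For all other modifiers, level is the number of repeated characters.
    """
    m = _PATTERN.match(line)
    if m is None:
        return None
    char = m.group(1)
    count = 1 + len(m.group(2))
    rest = m.group(3)
    if char in RANGEABLE:
        if count > 2:
            return None  # range-able modifiers use one or two characters only
        return (RANGEABLE[char], "ranged" if count == 2 else "single", rest)
    return (NESTABLE[char], count, rest)
```

Because the pattern requires whitespace directly after the run of identical characters, inputs such as `>I am not a quote` or `>- text` are rejected, while `> > text` is recognised only as a level 1 quote whose content starts with `>`.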
218 | 219 | *** Headings 220 | |example 221 | * Heading level 1 222 | ** Heading level 2 223 | *** Heading level 3 224 | **** Heading level 4 225 | ***** Heading level 5 226 | ****** Heading level 6 227 | ******* Heading level 7 (falls back to level 6 in the tree-sitter parser) 228 | |end 229 | 230 | Although headings are both structural /and/ nestable (see next section), the former takes 231 | precedence over the latter, meaning that headings only affect a single 232 | {** paragraph segments}[paragraph segment] as their title. This is for user convenience as it 233 | does not require an empty line right below a heading. Because of this precedence, headings are 234 | also non-{** grouping}. 235 | 236 | Headings serve as a way to categorize and organize other elements into smaller chunks for better 237 | readability. They are currently the /only/ structural detached modifier present in the Norg 238 | syntax. 239 | 240 | ** Nestable Detached Modifiers 241 | Nestable detached modifiers are a kind which may be repeated multiple times in order to produce a 242 | _nested_ object of the given type. 243 | 244 | Furthermore, in contrast to most other {* detached modifiers}, this detached modifier 245 | type has /no/ title, and consumes the following `paragraph` instead of only the next 246 | {# paragraph segments}[paragraph segment]. Said paragraph then becomes the modifier's /content/. 247 | This means that in order to terminate the detached modifier contents, you must use a {$ paragraph 248 | break}. 249 | 250 | Below you will find examples of nestable detached modifiers. 
251 | 252 | *** Unordered Lists 253 | |example 254 | - Unordered list level 1 255 | -- Unordered list level 2 256 | --- Unordered list level 3 257 | ---- Unordered list level 4 258 | ----- Unordered list level 5 259 | ------ Unordered list level 6 260 | ------- Unordered list level 7 (falls back to level 6 in the tree-sitter parser) 261 | 262 | - Unordered list level 1 263 | This text is still part of the level 1 list item. 264 | -- Unordered list level 2 265 | This text is still part of the level 2 list item. 266 | --- Unordered list level 3 267 | This text is still part of the level 3 list item. 268 | ---- Unordered list level 4 269 | This text is still part of the level 4 list item. 270 | ----- Unordered list level 5 271 | This text is still part of the level 5 list item. 272 | ------ Unordered list level 6 273 | This text is still part of the level 6 list item. 274 | ------- Unordered list level 7 (falls back to level 6 in the tree-sitter parser) 275 | This text is still part of the level 7 list item. 276 | |end 277 | 278 | Unordered lists provide an easy way to enumerate items in an unordered fashion. Useful for data 279 | that's categorically similar but doesn't need to follow a strict order. 280 | 281 | *** Ordered Lists 282 | |example 283 | ~ Ordered list level 1 284 | ~~ Ordered list level 2 285 | ~~~ Ordered list level 3 286 | ~~~~ Ordered list level 4 287 | ~~~~~ Ordered list level 5 288 | ~~~~~~ Ordered list level 6 289 | ~~~~~~~ Ordered list level 7 (falls back to level 6 in the tree-sitter parser) 290 | 291 | ~ Ordered list level 1 292 | This text is still part of the level 1 list item. 293 | ~~ Ordered list level 2 294 | This text is still part of the level 2 list item. 295 | ~~~ Ordered list level 3 296 | This text is still part of the level 3 list item. 297 | ~~~~ Ordered list level 4 298 | This text is still part of the level 4 list item. 299 | ~~~~~ Ordered list level 5 300 | This text is still part of the level 5 list item. 
301 | ~~~~~~ Ordered list level 6 302 | This text is still part of the level 6 list item. 303 | ~~~~~~~ Ordered list level 7 (falls back to level 6 in the tree-sitter parser) 304 | This text is still part of the level 7 list item. 305 | |end 306 | 307 | This list type is only useful for data that needs to be kept in sequence. In contrast to other 308 | formats which may use a syntax like `1.`/`1)`, Norg counts the items automatically - this 309 | reduces complexity and makes reordering items simple. 310 | 311 | *** Quotes 312 | |example 313 | > Quote level 1 314 | >> Quote level 2 315 | >>> Quote level 3 316 | >>>> Quote level 4 317 | >>>>> Quote level 5 318 | >>>>>> Quote level 6 319 | >>>>>>> Quote level 7 (falls back to level 6 in the tree-sitter parser) 320 | 321 | > Quote level 1 322 | This text is still part of the level 1 quote. 323 | >> Quote level 2 324 | This text is still part of the level 2 quote. 325 | >>> Quote level 3 326 | This text is still part of the level 3 quote. 327 | >>>> Quote level 4 328 | This text is still part of the level 4 quote. 329 | >>>>> Quote level 5 330 | This text is still part of the level 5 quote. 331 | >>>>>> Quote level 6 332 | This text is still part of the level 6 quote. 333 | >>>>>>> Quote level 7 (falls back to level 6 in the tree-sitter parser) 334 | This text is still part of the level 7 quote. 335 | |end 336 | 337 | Quotes are rather self-explanatory - they allow you to cite e.g. a passage from another source. 338 | 339 | *** Invalid Nestable Detached Modifier Examples 340 | |example 341 | >I am not a quote 342 | 343 | some preceding text > I am also not a quote 344 | 345 | >- I am not a valid detached modifier 346 | 347 | > > I am only a level 1 quote 348 | 349 | * 350 | I am not a valid heading title. 
351 | |end 352 | 353 | ** Range-able Detached Modifiers 354 | Range-able detached modifiers can occur in two forms: 355 | - With a single character in which case they consume: 356 | -- The following *verbatim* paragraph segment which becomes the /title/. 357 | -- Any following paragraph which becomes the /content/. 358 | - With two consecutive characters in which case: 359 | -- The following *verbatim* paragraph segment also becomes the /title/. 360 | -- The content continues until the "closing" detached modifier is found. Said closing modifier is 361 | made up of the same two consecutive characters that initially opened the range-able detached 362 | modifier, however is immediately followed by a {*** line endings}[line ending]. 363 | 364 | Below you may find all available range-able detached modifiers within the Norg syntax. 365 | 366 | *** Definitions 367 | Definitions are primarily of use to people who write technical documents. 368 | They consist of a term, and then are followed by a definition of that term. 369 | 370 | |example 371 | $ Term 372 | Definition content. 373 | |end 374 | 375 | To create longer definitions, use the ranged definition syntax instead: 376 | |example 377 | $$ Term 378 | Content of the definition. 379 | 380 | Which scans up to the closing modifier. 381 | $$ 382 | |end 383 | 384 | *** Footnotes 385 | Footnotes allow the user to give supplementary information related to some text without 386 | polluting the paragraph itself. Footnotes can be linked to using {* linkables}. 387 | 388 | |example 389 | ^ Single Footnote 390 | Optional footnote content. 391 | |end 392 | 393 | To create longer footnotes, use the ranged footnote syntax instead: 394 | |example 395 | ^^ Ranged Footnote 396 | Content of the footnote. 397 | 398 | Which scans up to the closing modifier. 399 | ^^ 400 | |end 401 | 402 | *** Table Cells 403 | Table cells are used to procedurally build up a table. 
Here are a few examples of table cells: 404 | |example 405 | : A1 406 | Content of table cell at `A1`. 407 | :: A2 408 | > Content of table cell at `A2` (in a quote). 409 | :: 410 | |end 411 | Their semantics are described in more detail in the {:1.0-semantics:* Tables}[semantics] document, 412 | which we recommend reading if you are interested in the behavior of objects as opposed to how 413 | they are represented using just syntax. 414 | 415 | *NOTE*: In order to make tables more aesthetically pleasing, they're commonly mixed with the 416 | {* intersecting modifiers}[intersecting modifier] syntax to produce the following: 417 | |example 418 | : A1 : Content of table cell at `A1`. 419 | |end 420 | 421 | ** Grouping 422 | Both nestable and range-able detached modifiers have a unique quality - when several consecutive 423 | modifiers /of the same type/ are encountered (by consecutive we mean *not* separated via a 424 | {$ paragraph break}), they are treated as one whole. This is crucial to understand as 425 | it is required for the many types of {** carryover tags} to function. 426 | 427 | *** Examples 428 | |example 429 | The following items naturally group because they are range-able, for example forming a 430 | definition list: 431 | $ Term 1 432 | Definition 1! 433 | $ Term 2 434 | Definition 2! 435 | |end 436 | 437 | |example 438 | Together, these form one whole unordered list: 439 | - List item 1 440 | - List item 2 441 | |end 442 | 443 | |example 444 | - List item in one list 445 | 446 | - This item is in another list, because we used a {$ paragraph break} to split these items 447 | |end 448 | 449 | ** Delimiting Modifiers 450 | In Norg, {** structural detached modifiers} and {*** indent segment}s may be terminated by a 451 | delimiting modifier. This allows one to prematurely terminate e.g. a heading. 452 | 453 | This kind of modifier must abide by the following rules: 454 | - A delimiting modifier can _only_ occur at the beginning of the line.
455 | - A delimiting modifier must consist of two or more consecutive characters of the same type. 456 | - A delimiting modifier must be followed by an immediate {*** line endings}[line ending]. 457 | 458 | *** Weak Delimiting Modifier 459 | This modifier uses the `-` character and immediately closes the /current/ nesting level 460 | (decreases the current nesting level by one). 461 | |example 462 | * Heading level 1 463 | Text under first level heading. 464 | 465 | ** Heading level 2 466 | Text under second level heading. 467 | --- 468 | 469 | Text under first level heading again. 470 | |end 471 | 472 | *** Strong Delimiting Modifier 473 | This modifier uses the `=` character and immediately closes all nesting levels. 474 | |example 475 | * Heading level 1 476 | Text under first level heading. 477 | 478 | ** Heading level 2 479 | Text under second level heading. 480 | === 481 | 482 | Text belonging to the document's root. 483 | |end 484 | 485 | *** Horizontal Rule 486 | This modifier uses the `_` character and simply renders a horizontal line. It does *not* 487 | affect the heading level but immediately terminates any {** paragraphs}[paragraph]. 488 | |example 489 | * Heading level 1 490 | Text under first level heading. 491 | ___ 492 | This is a new paragraph separated from the previous one by a horizontal line. 493 | This text still belongs to the first level heading. 494 | |end 495 | 496 | ** Detached Modifier Extensions 497 | {* Detached modifiers} support extensions which must immediately follow the detached modifier (or 498 | another extension). These are used to attach general metadata to the detached modifier (i.e. TODO 499 | statuses, due dates etc.). Note that {* detached modifiers} must be succeeded with {# whitespace}, 500 | therefore by "immediately followed" we mean /after/ the whitespace character in the 501 | detached modifier, e.g. `- (x) List item`(lang:norg). 
502 | 503 | The syntax is as follows: 504 | - An extension starts with a `(` char 505 | - Immediately a special character must follow. This character determines the type of extension. 506 | - Some extensions can support parameters - if this is the case, the special character must be 507 | followed with {# whitespace} after which the parameters (a sequence of {** words} and/or 508 | {*** line endings}) ensue. Not all extensions support parameters and for good reason. There is no need 509 | to attach extra metadata to a done or undone state for instance. Several extensions should be 510 | delimited with the `\|` character. 511 | - A `\|` character may be matched, which allows the user to chain many extensions together, e.g. 512 | `(x|# A)`(lang:norg) (done with a priority of A). 513 | - Finally a `)` char closes the extension. 514 | 515 | NOTE: The whole detached modifier extension /must/ be followed by whitespace. 516 | 517 | *** TODO Status Extension 518 | The TODO item extension assigns a task status to a certain modifier. You probably know this 519 | concept from Org or Markdown where unordered lists can become tasks. In Norg we take this 520 | concept to the next level because any detached modifier can be assigned a task status. This can 521 | for example be useful for the author of a document to keep track of the status of certain 522 | sections. 
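The extension syntax listed above can be sketched as a small parser over the text that follows a detached modifier and its whitespace. This is a minimal Python sketch under several assumptions: the function name `parse_extensions` and its return shape are hypothetical, parameters are assumed not to contain `)` or `|`, and reaching the end of the input right after the closing `)` is accepted in place of trailing whitespace.

```python
def parse_extensions(text: str):
    """Parse a leading "(...)" extension group.

    Returns ([(type_char, parameters), ...], remaining_text) or None
    if the text does not start with a well-formed extension group.
    """
    if not text.startswith("("):
        return None
    close = text.find(")")
    if close == -1:
        return None
    # The whole extension group must be followed by whitespace.
    after = text[close + 1:close + 2]
    if after and not after.isspace():
        return None
    extensions = []
    # A "|" chains several extensions inside one pair of parentheses.
    for part in text[1:close].split("|"):
        if not part:
            return None
        kind, rest = part[0], part[1:]
        # Parameters, if present, must be separated by whitespace.
        if rest and not rest[0].isspace():
            return None
        extensions.append((kind, rest.strip()))
    return extensions, text[close + 1:].lstrip()
```

With this sketch, the undone state is reported with a literal space as its type character, and an input such as `(xy) text` is rejected because `y` is not separated from the special character by whitespace.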
523 | 524 | The following characters are reserved for the TODO status extension: 525 | -- `| |`: undone (a literal space) 526 | -- `x`: done 527 | -- `?`: needs further input/clarification 528 | -- `!`: urgent 529 | -- `+`: recurring (with an optional {**** timestamp extension}[timestamp]) 530 | -- `-`: in-progress/pending 531 | -- `=`: on hold 532 | -- `_`: put down/cancelled 533 | 534 | Some examples include: 535 | |example 536 | - ( ) Undone 537 | - (x) Done 538 | 539 | - (# B| ) Undone with a priority of B 540 | - (+) Recurring 541 | - (+ 5th Jan) Recurring every 5th of January 542 | |end 543 | 544 | *** Advanced Detached Modifier Extensions 545 | Apart from just being able to assign a TODO state it is also possible to apply more complex 546 | states with parameters to certain indicators. Such examples are the {**** timestamp extension} 547 | and the {**** priority extension}. 548 | In the following sections you will find descriptions for a few other extensions supported within 549 | Norg. 550 | 551 | **** Timestamp Extension 552 | The timestamp extension allows you to associate a {* detached modifiers}[detached modifier] 553 | with a date/time. 554 | 555 | The syntax for a timestamp is as {^ note to parser developers}[follows]:\ 556 | `?,? -?