├── .hgtags ├── irishsea ├── doc │ ├── environment.markdown │ ├── model.markdown │ ├── language.markdown │ └── original-notes.markdown └── README.markdown ├── README.markdown ├── opus-2 └── opus-2.markdown ├── tamerlane └── tamerlane.markdown ├── turkey-bomb └── turkey-bomb.markdown ├── didigm └── didigm.markdown ├── star-w └── star-w.markdown ├── sartre └── sartre.markdown ├── mdpn └── mdpn.markdown ├── you-are-reading-the-name-of-this-esolang └── you-are-reading-the-name-of-this-esolang.markdown ├── sampo └── Practical_Matters.markdown ├── oozlybub-and-murphy └── oozlybub-and-murphy.markdown └── madison └── Madison.markdown /.hgtags: -------------------------------------------------------------------------------- 1 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa checkpoint_1 2 | 0f16ac518ce82490f51e0b5d87bb655836ae019e checkpoint_2 3 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa 0.1 4 | 0f16ac518ce82490f51e0b5d87bb655836ae019e 0.2 5 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa checkpoint_1 6 | 0000000000000000000000000000000000000000 checkpoint_1 7 | 0f16ac518ce82490f51e0b5d87bb655836ae019e checkpoint_2 8 | 0000000000000000000000000000000000000000 checkpoint_2 9 | ba14f1a39f11ae8fb0ac1262677cb7be562459d7 0.3 10 | -------------------------------------------------------------------------------- /irishsea/doc/environment.markdown: -------------------------------------------------------------------------------- 1 | Irishsea: Environment 2 | ===================== 3 | 4 | The Irishsea environment is a user interface (UI) for interacting with the 5 | Irishsea model. It consists of two main parts: 6 | 7 | * the _monitor_, which graphically depicts all the processes active in the 8 | model, what they are currently doing, and what they will be doing in the 9 | near future (insofar as that is predictable); and 10 | * the _entry area_, where commands in the Irishsea language can be entered 11 | to affect these processes. 12 | 13 | ... 14 | -------------------------------------------------------------------------------- /irishsea/doc/model.markdown: -------------------------------------------------------------------------------- 1 | Irishsea: Model 2 | =============== 3 | 4 | The Irishsea model (communications/control model, or execution environment) 5 | comprises a set of concurrently-executing processes. 6 | 7 | Each process may receive messages, and may send messages to other processes. 8 | 9 | Devices look like any other processes in this model. Input from them is 10 | sent to some other process as a message. Sending them messages causes them 11 | to produce output, or to engage in other activities, possibly observable. 12 | 13 | The Irishsea environment is one such "input device". Instructions in the 14 | Irishsea language are entered into it; such instructions typically command 15 | it to send a specified message to a specified process. 16 | 17 | Irishsea processes may be implemented in any programming language; it is not 18 | necessary for it to be the Irishsea language. However, such processes must 19 | conform to the semantics of (i.e. expected behaviour of) Irishsea processes, 20 | which will now be described. 21 | 22 | ... 
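
(An illustrative aside, not part of this spec: the following is a minimal Python sketch of the sort of model described above, with concurrently-existing processes owning mailboxes, and a "device" that is just another process. All names in it, such as `Process` and `relay_behaviour`, are assumptions made purely for illustration.)

    import queue

    class Process:
        def __init__(self, name, behaviour):
            self.name = name
            self.mailbox = queue.Queue()
            self.behaviour = behaviour   # how this process reacts to a message

        def send(self, message):
            self.mailbox.put(message)

        def step(self, world):
            """Handle one pending message, possibly sending others onward."""
            if not self.mailbox.empty():
                self.behaviour(self, world, self.mailbox.get())

    # A "device" is just another process: this one prints whatever it receives.
    def console_behaviour(proc, world, msg):
        print("[%s] %s" % (proc.name, msg))

    # An ordinary process that forwards messages to the console device.
    def relay_behaviour(proc, world, msg):
        world["console"].send("forwarded: " + msg)

    world = {"console": Process("console", console_behaviour),
             "relay": Process("relay", relay_behaviour)}
    world["relay"].send("hello")          # e.g. what the environment might do
    for _ in range(2):
        for p in world.values():
            p.step(world)                 # prints: [console] forwarded: hello
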
--------------------------------------------------------------------------------
/irishsea/doc/language.markdown:
--------------------------------------------------------------------------------
Irishsea: Language
==================

The Irishsea language provides a syntax and semantics for instructing an Irishsea process to send messages to other processes, and for specifying how it reacts to messages it receives.

The Irishsea language is extremely terse. It is designed to be economical to enter on a US QWERTY computer keyboard layout, rather than to be readable. (However, many constructs may have an alternate "long form" for readability, when speed of entry is not an issue.)

(TODO the language should probably be somewhat flexible on the point of keyboard layout, for non-US keyboards, perhaps by letting symbols be redefined.)

It assumes the following characters can be entered with a single keystroke, with no modifier keys: lower-case Latin letters from `a` to `z`, decimal digits from `0` to `9`, and the following symbols: `` ` `` (backquote), `-` (hyphen), `=` (equals sign), `[` and `]` (square brackets), `\\` (backslash), `;` (semicolon), `'` (apostrophe), `,` (comma), `.` (period), `/` (forward slash), and ` ` (blank space).

(A keyboard with a numeric keypad may also provide `+` and `*`, but the presence of a numeric keypad should not be assumed.)

It reserves these characters for use in language constructs which must be issued frequently and quickly. Other characters are relegated to those constructs which are less common or which are rarely issued "on-line".

...

--------------------------------------------------------------------------------
/README.markdown:
--------------------------------------------------------------------------------
Specs on Spec
=============

This is a collection of specifications for programming languages that have not been implemented. Indeed, many of them may well be unimplementable.

Most of them were designed, and their specs written, by Chris Pressey of Cat's Eye Technologies; the exceptions are:

* **Sartre** and **\*W**, which were designed and written by John Colagioia; and
* **TURKEY BOMB**, which (I baldly assert) was found unexpectedly one day under a stack of _Byte_ magazines at a charity shop.

Also, I say "programming language", but of course that term is rather flexible 'round these parts:

* **Madison** is a language for writing formal proofs;
* **MDPN** is a (two-dimensional) parser-definition language; and
* **Opus-2** is a "spoken" language, for some rather exceptional meaning of "speaking".

Most of these specifications are "finished" in the sense that there is nothing obviously more to add to them. (Of course, an implementation, or some really brow-furrowing thought experiments, could always turn up problems with a specification.) The exceptions, which can be considered "works in progress", are:

* **Irishsea**, which is largely a set of notes for a livecoding language.
* **Sampo**, which is largely a set of notes for a production language.

The specification documents are copyrighted by their respective authors.
Not 33 | that I mind if you fork this repo and submit pull requests to fix errors or 34 | the like, for such is the nature of the distributed version control beast. 35 | 36 | Note on the name: in the dialect of English where I come from, "spec" is short 37 | for "specification" but "on spec" is short for "on *speculation*." Thus the 38 | name is trying to convey the idea of specifications that were just kind of 39 | pulled out of the air. 40 | 41 | -------------------------------------------------------------------------------- /opus-2/opus-2.markdown: -------------------------------------------------------------------------------- 1 | Opus-2 2 | ====== 3 | 4 | Opus-2 is an abstract artlang composed by Chris Pressey at or around 5 | March 10, 2001. 6 | 7 | ### Design Goals 8 | 9 | Eliminate word order entirely. Despite the appearance of the resulting 10 | language, this was the only real design goal to begin with. 11 | 12 | ### Grammatical Overview 13 | 14 | Verbs in Opus-2 take the form of colours. Nouns take the form of sounds. 15 | Adjectives take the form of smells. Adverbs take the form of inner-ear 16 | sensations. Certain tenses and phrasings are indicated by tastes. 17 | 18 | To distinguish between the roles of the nouns in a sentence, objects are 19 | quieter, *sotto voce* sounds, and subjects are foreground sounds. It is 20 | important to remember that the sensations corresponding to object, 21 | subject, and verb all occur at the same time in an event termed an 22 | *sentence-experience*. 23 | 24 | This dominant-recessive relationship is also present in strong and weak 25 | scents which indicate whether an adjective describes the subject or the 26 | object, and in intense and gentle inner-ear sensations (feelings of 27 | sudden or gradual acceleration) to determine the target of an adverb. 28 | 29 | ### Vocabulary Overview 30 | 31 | Sample dictionary: 32 | 33 | **verbs** 34 | flee *pale green* 35 | approach *deep orange* 36 | examine *medium grey* 37 | glorify *deep red* 38 | 39 | **nouns** 40 | man *Eb below middle C, trombone* 41 | woman *F above middle C, french horn* 42 | world *car door slamming* 43 | child *middle C, tubular bells* 44 | building *F, tympani roll* 45 | radio *harp sweep* 46 | 47 | **adjectives** 48 | fast *burning rubber* 49 | dangerous *mothballs* 50 | 51 | **adverbs** 52 | quickly *leaning 40 degrees left* 53 | dangerously *leaning 25 degrees right* 54 | 55 | ### Context Overview 56 | 57 | While each sentence is "instantaneous" in the sense that there is no 58 | internal word order, sentence-experiences still follow one another, and 59 | each sentence-experience does take a certain amount of time to perceive. 60 | Tense is thus implied by context between successive sentences and the 61 | duration of each sentence. The shorter a sentence is, the further into 62 | the future it is presumed to refer to. (Thanks to Rob Norman and Panu 63 | Kalliokosi for suggesting these ideas.) 64 | 65 | ### Examples of Usage 66 | 67 | Example sentence-experience: "The building glorifies the woman": 68 | 69 | *deep red* 70 | *F, tympani roll, forte* 71 | *F, french horn, piano* 72 | 73 | Example sentence-experience: "The man quickly flees the dangerous 74 | child": 75 | 76 | *pale green* 77 | *Eb, trombone, forte* 78 | *leaning 40 degrees left (sudden)* 79 | *C, tubular bells, piano* 80 | *mothballs (gentle whiff)* 81 | 82 | ### Who Speaks Opus-2? 83 | 84 | This language was designed purely as an abstract exercise in language 85 | design. 
Thus it was not designed for any preconceived group of speakers, and little consideration was given to their culture and capabilities. It is neither specifically a conversation language, nor a formalized language (e.g. a programming language.)

Most of the problem of finding speakers of Opus-2 lies in finding creatures that can create smells and inner-ear sensations as easily as humans can create complex sounds. However, some popular opinions of who or what might speak Opus-2 have been suggested since its unveiling:

- An efficient-yet-entertaining form of future communication using direct neural jacks.
- A code used by e.g. Neo (from *The Matrix*) to communicate to subjects unknowingly trapped in a virtual reality.
- A pidgin spoken between highly-telepathic beings and marginally-telepathic beings.

--------------------------------------------------------------------------------
/irishsea/README.markdown:
--------------------------------------------------------------------------------
Irishsea
========

Irishsea is an experiment in "live coding" or "cyber-physical programming" (or maybe "cyberphysical livecoding", why not?)

It is vapourware. I haven't even gotten as far as deciding what I want to implement, or what language/environment to implement it in. It is mostly, for the time being, a collection of thoughts on the subject. And they're not even very coherent thoughts! Don't try to make sense of them!

Let's back up.

Many of the ideas behind "cyber-physical programming" do not seem to be actually very new. Consider...

* [Sketchpad][]
* Front-panel lights on the [Altair 8800][]
* Interactive debugging capabilities of [LISP machines][]
* Turtle Logo
* Smalltalk
* Even most 8-bit BASICs let you interrupt a running program, change some variables, then issue a `CONT` to continue execution
* "Expression Watch" windows in various IDE's
* etc.

The main new idea seems to be:

* _the effects of the operator's interactive reprogramming of the system are thought of as a kind of **performance**_.

This may be a literal performance, in the case of, say, a musical livecoding concert. Or, it may be something more informal, or more abstract, or of supposedly practical value; but it is still some kind of... I don't know, _experience_, for lack of a better word; potentially a shared experience.

Goals
-----

One of the goals of the Irishsea project is to come up with answers to the question: _how do you play a computer like you play a musical instrument?_

Actually, I should say "performance instrument". I said "musical instrument" only because (a) you probably know what musical instruments look like, and how they generally work, and (b) you probably don't have a good idea of what a "performance instrument" would look like or how it would work. (I know I don't.)

Or, another way to arrange the confusion in the above two paragraphs: Irishsea is a performance instrument, made out of a computer, using the concept of a musical performance to *frame*, but not *limit*, what we mean by "performance".
Working towards these goals might include:

* define a model for programs that can "do performance" and/or whose executions *are* performances
* define a language (protocol) for reprogramming the model
* define an environment (UI) in which that language can be "spoken"

(Why do I keep saying "reprogramming?" Because you almost never program a computer from scratch. Other people have already programmed it a lot before you got your hands on it -- they built the OS, the text editors, the compilers and interpreters that you use... You can think of yourself as just "programming" it, because you are adding new code to the existing code, but you still have to admit that your code does not live in a vacuum. Unless maybe you like to hand-assemble your own operating systems.)

### Ideas for Model ###

* process- and messaging-based (like e.g. Erlang)
* input/output devices look like processes and send/receive messages
* see `doc/model.markdown` for more info

### Ideas for Language ###

* also process- and messaging-based
* terse, very terse, because you are to play it like an instrument
* see `doc/language.markdown` for more info

### Ideas for Environment ###

* visibility into what all the processes are doing, and what they will be doing (insofar as that can be predicted)
* see `doc/environment.markdown` for more info

Motivation
----------

Having done all of the following things:

* performed music
* composed music
* written software
* used software

I have a hard time reconciling musical performance with writing software. They're very different activities, for me. But I have less of a problem reconciling

* performing music with composing it (they call this _improvisation_)
* composing music with writing software (they're not dissimilar)
* writing software with using it (you can call this _bootstrapping_)

So it seems theoretically possible. So I'd like to try. So this is me trying.

Links
-----

* [extempore](https://github.com/digego/extempore)
* [circa](https://github.com/paulhodge/circa)
* [vivace](https://github.com/automata/vivace)
* [live unit tests demo](http://livecoding.staticloud.com/)

[Altair 8800]: http://en.wikipedia.org/wiki/Altair_8800
[Lisp machines]: http://en.wikipedia.org/wiki/Lisp_machine
[Sketchpad]: http://en.wikipedia.org/wiki/Sketchpad

--------------------------------------------------------------------------------
/tamerlane/tamerlane.markdown:
--------------------------------------------------------------------------------
Tamerlane
=========

Chris Pressey
Created Jan 29 2000

### Introduction to Tamerlane

Tamerlane is a "constraint flow" language. The point of its creation is to attempt to break as many paradigmatic stereotypes and idioms as I'm aware of; at least, to make it tricky and confounding to pigeonhole easily.

It has some concepts in it from imperative languages, some from functional languages, some from dataflow and object oriented languages, and some from graph rewriting and other constraint-based languages, and they're all muddled together into a ridiculous *potpourri*.
Despite being such a mutt, Tamerlane might actually make some algorithms dreadfully easy to write.

### Overview of Tamerlane

A Tamerlane program consists of a mutably weighted directed graph.

Each node is considered to be an independent updatable store. The data held in each store is represented by the weights of the arcs exiting the node.

An arc of weight zero is functionally equivalent to the absence of an arc.

### Description and Example

At this point we may introduce the syntax in a sample ASCII notation for a simple, almost pathological Tamerlane program:

    Point-A: 1 Point-B,
    Point-B: 1 Point-C,
    Point-C: 1 Point-A.

The user of a Tamerlane program may submit *messages* to the program at runtime. In this sense the user and the program are both objects which share the symmetrical relationship **user-of/used-by**. The program object's interface exposes a `query` method to the user, which is to be considered runtime-polymorphic.

Using this message-passing mechanism, queries are submitted by the user to a running Tamerlane program, much as queries would be submitted to a running Prolog program.

Queries have their own syntax and semantics. Unlike Prolog, the user's queries are interpreted as *rules*, perhaps accompanied by information about where and when the rules are "introduced" into the graph.

As an example of a query that could be submitted to the above Tamerlane program:

    1 Point-A -> 0 Point-A @ Point-A

This would introduce the rewriting rule `1 Point-A -> 0 Point-A` into the graph.

This rule is applied to the weights of the nodes in the graph starting with the node specified after the `@` symbol. In this instance it would start by trying to apply the rewrite to the node `Point-A`, but finding `Point-A` to contain `1 Point-B`, nothing would change.

Each time a further query is submitted, each rule which has been introduced into the graph disappears from the node it was working on, and is transmitted to the adjacent node with the lowest positive weight value. If there is a tie for lowest weight, the rule is transmitted to all of the adjacent nodes with the same lowest weight.

For efficacy we can consider the user able to submit a `nop` query. This would not introduce any new rules into the graph, but it would cause all active rules to propagate to new nodes on the graph.

So assume the user submits `nop`. The rule that was introduced by the last query 'moves' from the node labelled `Point-A` to the node labelled `Point-B` (since it's the only positive route out of `Point-A`.) It tries to rewrite `Point-B`, but finding only `1 Point-C` in `Point-B`, nothing happens.

Assume the user `nop`s again. The rule is propagated to `Point-C`. Finally the pattern match succeeds, and `Point-C` is rewritten to a new `Point-C`:

    Point-C: 0 Point-A.

After one more `nop`, the engine will generate a

    Rule stopped at Point-C (no adjacent nodes)

message back to the user. This uses the operation that the user object's interface supplies to the running program, called `messageback`, which the user must supply, but is, like `query`, considered runtime-polymorphic.
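
(An illustrative aside, not part of the original spec: the following minimal Python sketch mirrors the walkthrough above, treating nodes as stores of exit-arc weights, introducing a rule with a query, and propagating it along the lowest positive weight on each subsequent `query` or `nop`. The names `Program` and `Rule` are assumptions for illustration only; ties, priorities, and the advanced features described below are omitted.)

    class Rule:
        def __init__(self, lhs, rhs, at):
            self.lhs = lhs          # pattern, e.g. {"Point-A": 1}
            self.rhs = rhs          # replacement, e.g. {"Point-A": 0}
            self.at = at            # node the rule is currently working on
            self.stopped = False

    class Program:
        def __init__(self, arcs):
            self.arcs = arcs        # {"Point-A": {"Point-B": 1}, ...}
            self.rules = []

        def query(self, rule=None):
            self._propagate()       # existing rules move first
            if rule is not None:
                self.rules.append(rule)
                self._try(rule)     # a new rule tries its starting node at once

        def nop(self):
            self.query(None)

        def _try(self, rule):
            store = self.arcs[rule.at]
            if all(store.get(t, 0) == w for t, w in rule.lhs.items()):
                store.update(rule.rhs)          # pattern matched: rewrite

        def _propagate(self):
            for rule in self.rules:
                if rule.stopped:
                    continue
                exits = {t: w for t, w in self.arcs[rule.at].items() if w > 0}
                if not exits:
                    # the spec delivers this via `messageback`; we just print
                    print("Rule stopped at %s (no adjacent nodes)" % rule.at)
                    rule.stopped = True
                    continue
                rule.at = min(exits, key=exits.get)  # lowest positive weight
                self._try(rule)

    # The almost-pathological example program, and the example query:
    p = Program({"Point-A": {"Point-B": 1},
                 "Point-B": {"Point-C": 1},
                 "Point-C": {"Point-A": 1}})
    p.query(Rule({"Point-A": 1}, {"Point-A": 0}, at="Point-A"))
    p.nop(); p.nop(); p.nop()       # rule walks A -> B -> C, rewrites C, stops
    print(p.arcs["Point-C"])        # {'Point-A': 0}

Running this ends with the same "Rule stopped at Point-C" message and leaves `Point-C: 0 Point-A`, as in the walkthrough.
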
Advanced Topics
---------------

### Rule Priority

Obviously, the user can enter more than one query in succession; s/he doesn't need to explicitly `nop` to wait for rules to resolve. Since the graph can contain cycles, this would lead to a form of inherent, synchronous concurrency: each time the `query` or `nop` methods are invoked, more than one rule may be applied to the same node.

The order in which these competing rules are applied is based on the rule's *priority*. Rules can be submitted with a specific priority in the following manner:

    1 Point-A -> 0 Point-A @ Point-A ! 10

If there is a tie in priority when two rules are competing to rewrite the same node, the outcome is guaranteed to be not simply undefined, but rather, non-deterministic, or at least probabilistic.

The user can also specify a delay, measured in number of method calls (`query`s or `nop`s) from the present query, at which the rule will be introduced into the graph, like so:

    1 Point-A -> 0 Point-A @ Point-A in 10

And, for the sake of efficacy, the `nop` method on the program object has overloaded syntaxes whose semantics are 'keep `nop`ing until some or all rules have stopped.'

### Negative Weights

A negative weight from one node to another is interpreted in a rather negative fashion with respect to the running program.

A negative weight is selected for propagation when it is closer to zero (while still being non-zero) than any other weight (absolute value.) When a negative weight is selected, however, the semantics of propagation are different.

When a rule encounters an arc of the form:

    X: -1 Y.

the rule is not "just" copied to Y like "usual". Instead, it "rolls back" the rule to Y in the fashion of a functional continuation.

All of the nodes that have been "touched" by this rule, since it was last at node Y, are reset to their condition when the rule was last at node Y. The rule itself is deleted from X and propagated to Y, and rewriting continues from there.

If the rule has never visited Y, however, then such an exit arc is not considered a valid candidate. It has an effective weight of 0 (non-existent.)

### Lambda Graphs

Lambda abstraction is a powerful form of referring to non-numeric calculatory items such as functions and, in the case of Tamerlane, graphs.

A user may submit a rule in the form

    1 X -> 1 Y ? Y: 1 Z, Z: 1 Y

Y is like a 'local variable' in this respect. When the rule is applied to the node, and the pattern match succeeds, the nodes named Y and Z need not actually exist, as they are simply created dynamically and 'attached' to the program graph.

Lambda graphs are subject to garbage collection, since they can only be used by the rule that created them.

### Pigeonholes (Updatable)

Variables may be supplied which are explicitly updatable, and these are termed pigeonholes. Pigeonholes acknowledge events. Their assignment is associated with both nodes and arcs.
They can occur when:

- a rule enters a node
- a rule leaves a node
- a rule chooses an arc
- a rule passes through an arc
- a rule rewrites an arc

Updatable variables are always named after the node (or node.arc) they're associated with, preceded by a `$` symbol. The data in the variable is, of course, the arc weight (or in the case of nodes, the node's internal key.)

If the pigeonhole is assigned the special value `%Cancel`, the rule is cancelled from moving to the node/through the arc/rewriting the arc.

### Placeholders (Unification)

Variables may also be supplied as "placeholders". These are the same as bindable (unifiable) variables in logical languages like Prolog. An example of such might be:

    1 X 7 ^Y -> 7 X 1 ^Y

which would replace "1 X 7 Foo" with "7 X 1 Foo", "1 X 7 Bar" with "7 X 1 Bar", etc.

### Horn Rules

Horn rules only succeed if all of their rules succeed. A Horn rule is specified like:

    1 X -> 1 Y + 1 Z 0 G -> 1 G

(Remember that 0 G is equivalent to "the absence of an arc to G." If you have a pattern like

    ^a G ^b G 0 G -> ^b Q

it will only match when there are exactly two different exit arcs to the same node G, which is not disallowed (nor is a node with an exit arc which points to itself.))

Note also that arcs are unordered amongst themselves. The rule

    1 A 1 B 1 C -> 1 E 3 D

is the same as

    1 B 1 A 1 C -> 3 D 1 E

--------------------------------------------------------------------------------
/turkey-bomb/turkey-bomb.markdown:
--------------------------------------------------------------------------------
TURKEY BOMB
===========

Anonymous

Introduction
------------

TURKEY BOMB, the first known programming-language-cum-drinking-game, evolved independently on four separate continents and was widely used as an implementation base for computer operating systems for several centuries.

Later, when computers were proven beyond a shadow of a doubt to be the malevolent work of unseen evil forces, and digital technology was banned on punishment of bodily disintegration, TURKEY BOMB thrived on, only slightly modified, as a popular drinking game.

Now that the treaty negotiations between the UN and the world of unseen evil forces have been signed, however, computers are back (and in full force, now billions upon billions of times more efficient than they once were,) and TURKEY BOMB's popularity as a computer programming language may just be making a comeback!

Archaeologists have recently uncovered the largest known collection of TURKEY BOMB articles. Dating from A.D. 2014 and apparently an almanac of black magic of some sort, with the cryptic title "Communications of the ACM," the remains of an almost-four-hundred-year-old periodical are practically all historians have to go on.

Even then, this obscure grimoire talks of this language as if it were an already-established phenomenon - indeed, perhaps dating back to the tail end of the second millennium A.D. For this reason, some scholars attribute the widespread popularity of this language to one CLIN\_TON, a great leader of the time.
It is said that CLIN\_TON was such an 36 | excellent and heroic TURKEY BOMB jockey that "he never inhaled", 37 | although that is largely deemed myth, perhaps connected to the 38 | allegorical story of his close followers who "blew him good". 39 | 40 | Our knowledge of the time in which TURKEY BOMB originated is slim, 41 | indeed. As such, the remainder of this document is by no means a 42 | complete reconstruction of the language, for that is surely impossible. 43 | However, it *is* an attempt to organize, apparently for the first time 44 | ever, the elements of TURKEY BOMB in a human-referencable fashion. 45 | 46 | Description 47 | ----------- 48 | 49 | To fully and exactly understand TURKEY BOMB one must first grok the 50 | ancient art of computer programming while under the influence of 51 | recreational consumables. If you are already under the influence (and 52 | why else would you be reading a document about a language named TURKEY 53 | BOMB,) congratulations, you've already taken your first steps on the 54 | road of becoming an expert TURKEY BOMB programmer/jockey. 55 | 56 | But do not be lulled into thinking that simply ingesting an amusing 57 | chemical will ensure your vainglorious TURKEY BOMB hobby or career! No 58 | indeed, for it is the most wise and experienced programmer who needs no 59 | more buzz than to program in TURKEY BOMB itself. Many teatotalling 60 | hackers excel at the Annual International TURKEY BOMB Open in Maui for 61 | this very reason. (See you there this fall!) 62 | 63 | Data Types 64 | ---------- 65 | 66 | Name 67 | 68 | Description 69 | 70 | Size 71 | 72 | ` ZILCH` 73 | 74 | "A little slice of Nirvana." 75 | 76 | Zero. 77 | 78 | ` BI_IT` 79 | 80 | A composite quantum state of information. 81 | 82 | Two thirds of a bit plus half a trit. 83 | 84 | ` AMICED` 85 | 86 | A conceptual quantum state of information. 87 | 88 | Negative six sevenths of a decimal digit. 89 | 90 | ` TRIVIA CONCERNING type` 91 | 92 | Three references: one to an object of the named type, two to TRIVIA 93 | objects. 94 | 95 | Exactly fifteen bytes, no exceptions. 96 | 97 | ` ADVISORY PERTAINING TO type` 98 | 99 | A quarter of a reference to a object of the given type. 100 | 101 | A quarter of the platform-defined pointer size. 102 | 103 | ` GRUBSTEAK` 104 | 105 | A fraction whose numerator is a perfect square and whose denominator is 106 | a prime number. 107 | 108 | No bigger than necessary. 109 | 110 | ` IMPROPER GRUBSTEAK` 111 | 112 | A GRUBSTEAK whose denominator is less than the square root of the 113 | numerator. 114 | 115 | Same as GRUBSTEAK. 116 | 117 | ` INDECENT GRUBSTEAK` 118 | 119 | A fraction whose numerator is a perfect square of a perfect square and 120 | whose denominator is a prime number whose ordinal position in the 121 | counting list of prime numbers is also prime. 122 | 123 | In the drinking game, whenever an INDECENT GRUBSTEAK is involved in an 124 | expression, everyone starts chanting "BANG BANG BANG!!!" until the 125 | player holding the TURKEY BOMB either finishes their drink and starts 126 | another, or falls down (in which case someone who hasn't been playing 127 | should take him or her home). 128 | 129 | Same as GRUBSTEAK. 130 | 131 | ` NOMENCLATURE` 132 | 133 | A set of variable names, defined by an EBNF expression that must contain 134 | at least one { } (repeated 0 or more times) term. 135 | 136 | As big as possible. 137 | 138 | ` PUDDING` 139 | 140 | An unknowable value. 141 | 142 | Infinite. 
143 | 144 | ` HUMIDOR BUILT UP FROM type, type & type` 145 | 146 | A structure containing three other types, specified at compile-time, all 147 | of which must be different, one of which must be another HUMIDOR. 148 | 149 | Infinite. 150 | 151 | ` HYBRID OBTAINED BY COMBINING type & type [WITH GUSTO]` 152 | 153 | A unified structure containing data from two different types, specified 154 | at compile time. 155 | 156 | The average size of the two types... which may present problems when 157 | accuracy of representation is desired, which is why the WITH GUSTO 158 | clause is made available to pad the size of a HYBRID to the larger of 159 | the sizes of it's two consitituent data types. 160 | 161 | ` TURKEY BOMB` 162 | 163 | A mysterious and shadowy type, suspected to be a reference to itself. 164 | There can only be one (no more, no less) variable of type TURKEY BOMB, 165 | and it is predeclared under the variable name TURKEY BOMB. A variable of 166 | type TURKEY BOMB (that is to say, the variable named TURKEY BOMB) can 167 | only take on one value, that value being TURKEY BOMB. 168 | 169 | Exactly 1 TURKEY BOMB. 170 | 171 | Paradigm 172 | -------- 173 | 174 | When TURKEY BOMB is played as a drinking game, the TURKEY BOMB is 175 | represented by a real object - usually something convenient, like a 176 | shoe, when an impromptu game is played for fun, but the real hardcore 177 | TURKEY BOMB junkies insist on using either a real live turkey, or a real 178 | live time bomb, or ideally, both (tied *securely* together). 179 | 180 | The TURKEY BOMB is then passed from player to player while the referee 181 | (operating system) designates challengers (tasks). The chosen challenger 182 | takes a deep breath (inhales) and shouts an expression at the player 183 | holding the TURKEY BOMB. If the player can produce the correct result 184 | before the referee can, they only have to take a sip of their drink 185 | before passing off the TURKEY BOMB. Otherwise, it's the whole thing, 186 | down the hatch. 187 | 188 | If the player holding the TURKEY BOMB makes an error, they must down 189 | their drink and get another before trying again. If the referee makes an 190 | error, *everyone*, especially the referee, must down their drink, and 191 | get another. 192 | 193 | Variables are also declared by any player spontaneously standing up and 194 | shouting out a name that hasn't been mentioned yet, and a type to go 195 | with it, at any time. 196 | 197 | For these reasons, TURKEY BOMB should not be considered as much an 198 | imperative language as a "peer-pressure" one. 199 | 200 | Syntax 201 | ------ 202 | 203 | There are no comments in TURKEY BOMB; it's entire content is considered 204 | a comment on those who program/play it. 205 | 206 | Operators 207 | --------- 208 | 209 | Syntax 210 | 211 | Description 212 | 213 | ` BI_IT BI_IT BI_IT ! BI_IT BI_IT BI_IT` 214 | 215 | 2-bit NAND, rotate known trit left. 216 | 217 | ` BI_IT BI_IT ? BI_IT BI_IT ? BI_IT BI_IT` 218 | 219 | 3-argument trit operation; unfortunately the Ancient Texts seem unclear 220 | on what it actually does. (The closest English translation appears to be 221 | "take these trits three and meditate soundly upon them.") 222 | 223 | ` $ BI_IT BI_IT BI_IT BI_IT BI_IT BI_IT $` 224 | 225 | Attempt to make a GRUBSTEAK. 226 | 227 | ` TRIVIA Y EXPR Y TRIVIA` 228 | 229 | Attempt to make a TRIVIA. 230 | 231 | ` TRIVIA BI_IT //` 232 | 233 | Attempt to connect a TRIVIA to itself and return it. 
The BI\_IT argument 234 | is required, but serves no detectably useful purpose (hardcore followers 235 | of the drinking game tradition insist that it's for good luck.) 236 | 237 | ` & EXPR` 238 | 239 | Do not evaluate EXPR. Not particularly useful when programming in TURKEY 240 | BOMB, but wow, can one of these ever screw you up in the middle of a 241 | game. 242 | 243 | ` \ ADVISORY ADVISORY ADVISORY ADVISORY` 244 | 245 | Returns the type thus pointed to. Also, the player holding the TURKEY 246 | BOMB must pass it off. 247 | 248 | ` HYBRID.type` 249 | 250 | Casts a HYBRID to either type it was defined with. 251 | 252 | ` HUMIDOR.type` 253 | 254 | Retrieves an element from a HUMIDOR. 255 | 256 | ` @ HUMIDOR` 257 | 258 | Retrieves a PUDDING which represents the entire HUMIDOR. 259 | 260 | ` PUDDING!!!!!` 261 | 262 | Attempts to deduce the existance of a HUMIDOR in the given PUDDING. The 263 | player to the left of the player holding the TURKEY BOMB has to keep 264 | drinking continuously while the computer/referee does their deducing. 265 | 266 | ` ALL BUT EXPR` 267 | 268 | Returns a PUDDING indicating everything but EXPR. 269 | 270 | ` WHEREFORE ART EXPR` 271 | 272 | Returns a PUDDING indicating the entire metaphysical nature of EXPR. 273 | 274 | ` WHEREFORE AIN'T EXPR` 275 | 276 | Short for WHEREFORE ART ALL BUT EXPR. 277 | 278 | ` WHEREFOREN'T EXPR` 279 | 280 | Short for ALL BUT WHEREFORE ART EXPR. 281 | 282 | ` GARNISH PUDDING` 283 | 284 | Convolutes the PUDDING with recent context drawn from the program. The 285 | player holding the TURKEY BOMB must pass it off. 286 | 287 | ` IMAGINE PUDDING, PUDDING!` 288 | 289 | Returns a NOMENCLATURE indicating all the variables unchanged between 290 | two PUDDINGs. 291 | 292 | ` EXPR :-> NOMENCLATURE` 293 | 294 | Mass-assign the set of variables. 295 | 296 | ` < NOMENCLATURE` 297 | 298 | Mass-retrieve the set of variables. 299 | 300 | ` NOMENCLATURE % GRUBSTEAK GRUBSTEAK` 301 | 302 | Perform iterative cypher transformation of set of names. 303 | 304 | Notes 305 | ----- 306 | 307 | The drinking game can also be played in an asylum, replacing 'drink' 308 | with 'medication'. Do *not* play this game with LSD. 309 | -------------------------------------------------------------------------------- /didigm/didigm.markdown: -------------------------------------------------------------------------------- 1 | The Didigm Reflective Cellular Automaton 2 | ======================================== 3 | 4 | November 2007, Chris Pressey, Cat's Eye Technologies 5 | 6 | Introduction 7 | ------------ 8 | 9 | Didigm is a *reflective cellular automaton*. What I mean to impart by 10 | this phrase is that it is a cellular automaton where the transition 11 | rules are given by the very patterns of cells that exist in the 12 | playfield at any given time. 13 | 14 | Perhaps another way to think of Didigm is: Didigm = [ALPACA][] + [Ypsilax][]. 15 | 16 | [ALPACA]: http://catseye.tc/node/ALPACA.html 17 | [Ypsilax]: http://catseye.tc/node/Ypsilax.html 18 | 19 | Didigm as Parameterized Language 20 | -------------------------------- 21 | 22 | Didigm is actually a parameterized language. A parameterized language is 23 | a schema for specifying a set of languages, where a specific language 24 | can be obtained by supplying one or more parameters. For example, 25 | [Xigxag][] is a parameterized language, where the 26 | direction of the scanning and the direction of the building of new 27 | states are parameters. Didigm "takes" a single parameter, an integer. 
28 | This parameter determines how many *colours* (number of possible states 29 | for any given cell) the cellular automaton has. This parameter defaults 30 | to 8. So when we say Didigm, we actually mean Didigm(8), but there are 31 | an infinite number of other possible languages, such as Didigm(5) and 32 | Didigm(70521). 33 | 34 | [Xigxag]: http://catseye.tc/node/Xigxag.html 35 | 36 | The languages Didigm(0), Didigm(-1), Didigm(-2) and so forth are 37 | probably nonsensical abberations, but I'll leave that question for the 38 | philosophers to ponder. Didigm(1) is at least well-defined, but it's 39 | trivial. Didigm(2) and Didigm(3) are semantically problematic for more 40 | technically interesting reasons (Didigm(3) might be CA-universal, but 41 | Didigm(2) probably isn't.) Didigm(4) and above are easily shown to be 42 | CA-universal. 43 | 44 | (I say CA-universal, and not Turing-complete, because technically 45 | cellular automata cannot simulate Turing machines without some extra 46 | machinery: TMs can halt, but CAs can't. Since I don't want to deal with 47 | defining that extra machinery in Didigm, it's simpler to avoid it for 48 | now.) 49 | 50 | Colours are typically numbered. However, this is not meant to imply an 51 | ordering between colours. The eight colours of Didigm are typically 52 | referred to as 0 to 7. 53 | 54 | Language Description 55 | -------------------- 56 | 57 | ### Playfield 58 | 59 | The Didigm playfield, called *le monde*, is considered unbounded, like 60 | most cellular automaton playfields, but there is one significant 61 | difference. There is a horizontal division in this playfield, splitting 62 | it into regions called *le ciel*, on top, and *la terre*, below. This 63 | division is distinguishable — meaning, it must be possible to tell which 64 | region a given cell is in — but it need not have a presence beyond that. 65 | Specifically, this division lies on the edges between cells, rather than 66 | in the cells themselves. It has no "substance" and need not be visible 67 | to the user. (The Didigm Input Format, below, describes how it may be 68 | specified in textual input files.) 69 | 70 | ### Magic Colours 71 | 72 | Each region of the division has a distinguished colour which is called 73 | the *magic colour* of that region. The magic colour of le ciel is colour 74 | 0. The magic colour of la terre is colour 7. (In Didigm(n), the magic 75 | colour of la terre is colour n-1.) 76 | 77 | ### Transition Rules 78 | 79 | #### Definition 80 | 81 | Each transition rule of the cellular automaton is not fixed, rather, it 82 | is given by certain forms that are present in the playfield. 83 | 84 | Such a form is called *une salle* and has the following configuration. 85 | Two horizontally-adjacent cells of the magic colour abut a cell of the 86 | *destination colour* to the right. Two cells below the rightmost 87 | magic-colour cell is the cell of the *source colour*; it is surrounded 88 | by cells of any colour called the *determiners*. 89 | 90 | This is perhaps better illustrated than explained. In the following 91 | diagram, the magic colour is 0 (this salle is in le ciel,) the source 92 | colour is 1, the destination colour is 2, and the determiners are 93 | indicated by D's. 94 | 95 | 002 96 | DDD 97 | D1D 98 | DDD 99 | 100 | #### Application 101 | 102 | Salles are interpreted as transition rules as follows. 
When the colour 103 | of a given cell is the same as the source colour of some salle, and when 104 | the colours of all the cells surrounding that cell are the exact same 105 | colours (in the exact same pattern) as the determininers of that salle, 106 | we say that that salle *matches* that cell. When any cell is matched by 107 | some salle in the other region, we say that that salle *applies* to that 108 | cell, and that cell is replaced by a cell of the destination colour of 109 | that salle. 110 | 111 | "The other region" refers, of course, to the region that is not the 112 | region in which the cell being transformed resides. Salles in la terre 113 | only apply to cells in le ciel and vice-versa. This complementarity 114 | serves to limit the amount of chaos: if there was some salle that 115 | applied to *all* cells, it would apply directly to the cells that made 116 | up that salle, and that salle would be immediately transformed. 117 | 118 | On each "tick" of the cellular automaton, all cells are checked to find 119 | the salle that applies to them, and then all are transformed, 120 | simultaneously, resulting in the next configuration of le monde. 121 | 122 | There is a "default" transition rule which also serves to limit the 123 | amount of chaos: if no salle applies to a cell, the colour of that cell 124 | does not change. 125 | 126 | Salles may overlap. However, no salle may straddle the horizon. (That 127 | is, each salle must be either completely in le ciel or completely in la 128 | terre.) 129 | 130 | Salles may conflict (i.e. two salles may have the same source colour and 131 | determiners, but different destination colours.) The behaviour in this 132 | case is defined to be uniformly random: if there are n conflicting 133 | salles, each has a 1/n chance of being the one that applies. 134 | 135 | Didigm Input Format 136 | ------------------- 137 | 138 | I'd like to give some examples, but first I need a format to given them 139 | in. 140 | 141 | A Didigm Input File is a text file. The textual digit symbols `0` 142 | through `9` indicate cells of colours 0 through 9. Further colours may 143 | be indicated by enclosing a decimal digit string in square brackets, for 144 | example `[123]`. This digit string may contain leading zeros, in order 145 | for columns to line up nicely in the file. 146 | 147 | A line containing only a `,` symbol in the leftmost column indicates the 148 | division between le ciel and la terre. This line does not become part of 149 | the playfield. 150 | 151 | A line beginning with a `=` is a directive of some sort. 152 | 153 | A line beginning with `=C` followed by a colour indicator indicates how 154 | many colours (the n in Didigm(n)) this playfield contains. This 155 | directive may only occur once. 156 | 157 | A line beginning with `=F` followed by a colour indicator as described 158 | above, indicates that the unspecified (and unbounded) remainder of le 159 | ciel or la terre (whichever side of `,` the directive is on) is to be 160 | considered filled with cells of the given colour. 161 | 162 | Of course, an application which implements Didigm with some alternate 163 | means of specifying le monde, for example a graphical user interface, 164 | need not understand the Didigm Input Format. 165 | 166 | Examples 167 | -------- 168 | 169 | Didigm is immediately seen to be CA-universal, in that you can readily 170 | (and stably) express a number of known CA-universal cellular automata in 171 | it. 
For example, to express John Conway's Life, you could say that 172 | colour 1 means "alive" and colour 2 means "dead", and compose something 173 | like 174 | 175 | 002002001001 176 | 222122112212 ... and so on ... 177 | 212212212121 ... for all 256 ... 178 | 222222222222 ... rules of Life ... 179 | =F3 180 | , 181 | =F2 182 | 22222 183 | 21222 184 | 21212 185 | 21122 186 | 22222 187 | 188 | Because the magic colour 7 never appears in la terre, il n'y a aucune 189 | salle dans la terre et donc tout le ciel est toujours la meme chose. 190 | 191 | There are of course simpler CA's that are apparently CA-universal that 192 | would be possible to describe more compactly. But more interesting (to 193 | me) is the possibility for making reflective CA's. 194 | 195 | To do this in an uncontrolled fashion is easy. We just stick some salles 196 | in le ciel, some salles in la terre, and let 'er rip. Unfortunately, in 197 | general, les salles in each region will probably quickly damage enough 198 | of the salles in the other region that le monde will become sterile soon 199 | enough. 200 | 201 | A rudimentary example of something a little more orchestrated follows. 202 | 203 | 3333333333333 204 | 3002300230073 205 | 3111311132113 206 | 3311321131573 207 | 3111311131333 208 | 3333333333333 209 | =F3 210 | , 211 | =F1 212 | 111111111111111 213 | 111111131111111 214 | 111111111111574 215 | 111111111111333 216 | 311111111111023 217 | 111111111111113 218 | 219 | The intent of this is that the 3's in la terre initially grow streams of 220 | 2's to their right, due to the leftmost two salles in le ciel. However, 221 | when the top stream of 2's reaches the cell just above and to the left 222 | of the 5, the third salle in le ciel matches and turns the 5 into a 7, 223 | forming une salle dans la terre. This salle turns every 2 to the right 224 | of a 0 in le ciel into a 4, thus modifying two of les salles in le ciel. 225 | The result of these modified salles is to turn the bottom stream of 2's 226 | into a stream of 4's halfway along. 227 | 228 | This is at least predictable, but it still becomes uninteresting fairly 229 | quickly. Also note that it's not just the isolated 3's in la terre that 230 | grow streams of 2's to the right: the 3's on the right side of la salle 231 | would, too. This could be rectified by having a wall of some other 232 | colour on that side of la salle, and I'm sure you could extend this 233 | example by having something else happen when the stream of 4's hit the 0 234 | in that salle, but you get the picture. Creating a neat and tidy and 235 | long-lived reflective cellular automaton requires at least as much care 236 | as constructing a "normal" cellular automaton, and probably in general 237 | more. 238 | 239 | History 240 | ------- 241 | 242 | I came up with the concept of a reflective cellular automaton (which is, 243 | as far as I'm aware, a novel concept) independently on November 1st, 244 | 2007, while walking on Felix Avenue, in Windsor, Ontario. 245 | 246 | No reference implementation exists yet. Therefore, all Didigm runs have 247 | been thought experiments, and it's entirely possible that I've missed 248 | something in its definition that having a working simulator would 249 | reveal. 250 | 251 | Happy magic colouring! 
252 | Chris Pressey 253 | Chicago, Illinois 254 | November 17, 2007 255 | -------------------------------------------------------------------------------- /star-w/star-w.markdown: -------------------------------------------------------------------------------- 1 | The \*W Programming Language 2 | ============================ 3 | 4 | John Colagioia, 199? 5 | 6 | Introduction 7 | ------------ 8 | 9 | The \*W language should be based on the W language which, of course, 10 | does not exist. Instead it is based on an assortment of odds and ends 11 | which could be useful in languages, but never seem to have been 12 | implemented (and definitely shouldn't have been implemented in the same 13 | language), combined with some patching added to, firstly, make \*W truly 14 | bizarre and, secondly, to make it as functionally complete a language as 15 | C and C++ (from which some concepts such as casting have been borrowed, 16 | as well as the name convention), for example. The data types should 17 | provide enough of a range that any structure may be built up, and 18 | includes arbitrary bitstrings, machine independant pointers (useful for 19 | a pass-by-reference), name bindings of data (useful for a pass-by-name), 20 | allows for homogenous array-like compositions, and has a semi-structured 21 | composite data type. All arithmetic expressions are constructed in 22 | prefix to preserve continuity with subroutine calls, and include a 23 | fairly complete set of arithmetic and bitwise operators and inbuilt 24 | functions to execute any calculation. 25 | 26 | The language is also quite robust in flow control, allowing for 27 | conditionals, iteration (bounded and unbounded), function calls, 28 | interrupt-driven, and even random execution. To enforce structured 29 | programming, however, neither a "go to" or a "come from" statement has 30 | been implemented in \*W. 31 | 32 | To minimize readability, of course, \*W is fully case insensitive, so 33 | that Count, COUNT, count, and COunT are all indistinct except under 34 | fairly confusing circumstances (which may or may not exist, depending on 35 | the implementation). Data instances must begin with an alphabetic 36 | character (`a`-`z` or `A`-`Z`), an underscore (`_`), or a hyphen (`-`). 37 | The remainder of the name may then be made up of any alphanumeric 38 | characters and the underscore, hyphen, and, of course, the right bracket 39 | (`]`). Comments are also possible in \*W (though not necessarily 40 | suggested as the language is fairly confusing without misspelled and 41 | incorrect descriptions of the program to botch things), and may be 42 | included in text by placing a double pipe (`||`) at the beginning of a 43 | comment, terminated by the doubled end-of-statement marker (`!!`). Such 44 | comments may be placed anywhere in the program, and should be completely 45 | ignored by the compiler (just as they are by most programmers). This 46 | does not mean that comments are equivalent to whitespace. On the 47 | contrary, the compiler considers comments to simply not exist, 48 | essentially concatenating the strings to either side of the comment. 49 | 50 | \*W, like most modern languages, is entirely freeform, meaning that 51 | statements are not constrained to the dimensions of, say, a punch card, 52 | teletype, computer monitor, or three-dimensional, virtual reality 53 | programmers' editor. 
Program format, therefore, is entirely dependant on 54 | input device, host computer's character set, and lack of programmer 55 | style, though the compiler is permitted (actually somewhat encouraged) 56 | to mock poor format style. 57 | 58 | \*W Data Types 59 | -------------- 60 | 61 | The \*W data types are designed with maximum versatility in mind. With 62 | them, any other known (and several unknown) types may be built. In 63 | addition, several predefined instances of these types are provided to 64 | enhance the language. 65 | 66 | `bits` A bitstring of arbitrary length. 67 | `cplx` A complex number in the mathematical form (A + Bj) where A and B are integers and j is the square root of (-1). Each component of a cplx is specified to have a minimum precision of {-32768 ... 32767}, but may be more, depending on the implementation. 68 | `sack` A (semi)structured data type consisting of a collection of elements which can be packed, unpacked, and checked with other data. 69 | `dref` A reference to an instance of some data type. 70 | `name` A name of another datum. 71 | `chrs` A character string of arbitrary length. 72 | `hole` A data type with no value. May pose as any type. 73 | 74 | Predefined \*W Instances 75 | ------------------------ 76 | 77 | The following data instances are provided to the \*W programming 78 | environment to facilitate programming certain concepts which would be 79 | nearly impossible otherwise. 80 | 81 | `WORLD` (`bits`) The \*W representation of the outside world. Assigning an expression to WORLD (see below) causes the character represented by the expression to be appended to the computer display. Likewise, using WORLD in an expression represents the value of the next character in the input buffer (if any). 82 | `NOWHERE` (`hole`).. NOWHERE is a place to discard things as well as a place to get nothing. Can also be used for comparison purposes. 83 | `NL` (`chrs`) NL is a newline character. 84 | `POCKET` (`sack`) Data instances local to each subroutine. Both may be used to store any data, but RESULT will be available for the calling routine to read. 85 | `RESULT` (`bits`) 86 | 87 | \*W Program Parts 88 | ----------------- 89 | 90 | Each \*W program is made of several parts. The functions part, which 91 | defines any user functions, the stuff part, which defines any instances 92 | of data for the program, and the text part, which contains the program 93 | instructions, themselves. 94 | 95 | ### Functions 96 | 97 | Subprograms which can be used from the Text, in the form. 98 | 99 | @ name = Stuff Text 100 | 101 | ### Stuff 102 | 103 | Declarations of data to be used by the Text portion of the program, with 104 | an optional constant initializer. The initial number allows multiple 105 | indexed instances to exist. 106 | 107 | num/name IS type [const] ! 108 | num/name , num/name ... ARE ALL type [const] ! 109 | AUTOPACKED SACK name [, ... name] HAS type [, ... type] ! 110 | 111 | ### Text 112 | 113 | A list of statements, appearing as: 114 | 115 | TEXT: {statements} :ENDTEXT 116 | 117 | W Statement Types 118 | ----------------- 119 | 120 | statement % expr ! 121 | 122 | Runs statement with a probability of expr. If expr is less than 100, the 123 | statement is executed that percentage of time. If it is greater, the 124 | statement is executed (expr/100) more times, each time decrementing the 125 | value of expr by 100, and, if expr ever falls below 100, is subject to 126 | the first rule. 
A negative value for expression works just like a 127 | positive value, except only under conditions where program execution 128 | runs backward; otherwise, it is treated as a zero. 129 | 130 | statment UNLESS expr ! 131 | 132 | The statment is executed whenever encountered except in any cases when 133 | expr evaluates to non-zero. 134 | 135 | statement WHEN expr ! 136 | 137 | The statement is not executed when encountered, but is instead executed 138 | after any statement where expr currently evaluates to non-zero. 139 | 140 | lval < expr ! 141 | expr > lval ! 142 | 143 | Takes the value of expr and copies it into lval. 144 | 145 | function (parameters) ! 146 | 147 | Calls a function with the appropriate parameters. The parameter list 148 | must correspond one-to-one with the Stuff list for that function. 149 | 150 | The scoping rules in \*W are much more simplified than they would have 151 | been in W, had it existed: A function may only access data instances 152 | declared within itself, including those implicitly defined. 153 | 154 | -|- (expr) ! 155 | 156 | If expr is 0, jumps to the end of the current block (see below), 157 | otherwise, terminates the current expr blocks. If expr is greater than 158 | the current block nesting, it does nothing. If expr is negative, the 159 | program terminates. 160 | 161 | & statement & statement & ... statement && 162 | 163 | A blocking mechanism for multiple statments. 164 | 165 | \*W Mathematical Operations 166 | --------------------------- 167 | 168 | `^ X` And the bits of X (yielding a single bit). 169 | `. X` Or the bits of X. 170 | `? X` Xor the bits of X. 171 | `* X` Butterfly the bits of X, i.e., 11001100 becomes 10100101. 172 | `- A B` If A and B are simple (bits, cplx, chrs, hole), identical types, subtracts B from A. 173 | `/ A B` If A and B are numeric (cplx), divides A by B. 174 | `# A B` If A and B are numeric (cplx), takes A to the B power. 175 | `$ A B` Mingles B with A. 176 | `~ A B` Selects the B bits from A. 177 | `SIZE X` Returns the size of X, in full bytes (rounded up if X is a bitstring). 178 | 179 | \*W Sack Operations 180 | ------------------- 181 | 182 | PACK sack data: Add data to sack. 183 | UNPACK sack data: Remove a data-like element from sack. 184 | UNPACK sack: Remove an element from sack. 185 | CHECK sack data: Examine sack for data. 186 | WEIGH sack: Returns the weight of the sack (in bits). 187 | 188 | Other \*W Operations 189 | -------------------- 190 | 191 | NAME name AFTER data: Assigns the name of data to name. 192 | WHOIS (data): Returns name suggested by data. 193 | REF (data): Returns a reference to data. 194 | DATA (dref): Returns the data referred to by ref. 195 | FCHRS (chrs): Returns the first character of the string. 196 | LCHRS (chrs): Returns the last character of the string. 197 | FBIT (bits): Returns the first (lowest) bit of the bits. 198 | LBIT (bits): Returns the last (highest) bit of the bits. 199 | 200 | Sample \*W Programs 201 | ------------------- 202 | 203 | 1. Functions: 204 | || No functions for this program !! 205 | Stuff: 206 | 1/Hello is chrs! 207 | 1/Sz, 1/Total are all cplx! 208 | Text: 209 | || Initialize the data !! 210 | Hello < "Hello, World!"! 211 | Size Hello > Sz! 212 | Total < 0! 213 | || Take the string length and multiply by 100 !! 214 | - Size - 0 Total > Total %10000! 215 | || Print and delete a character that many times !! 216 | & WORLD < FCHRS (Hello)! 217 | & Hello < - Hello FCHRS (Hello)! 218 | && %Total! 219 | || Add a newline !! 220 | WORLD < nl! 
221 | :Endtext 222 | Result: Prints "Hello, World!" to the screen, followed by a 223 | newline. 224 | 2. Functions: 225 | @ mult = 226 | Stuff: 227 | 1/A, 1/B are all cplx! 228 | Text: 229 | cplx (RESULT) < 0! 230 | cplx (RESULT) < - cplx (RESULT) - 0 B 231 | %10000! 232 | bits (B) < RESULT! 233 | cplx (RESULT) < 0! 234 | cplx (RESULT) < - cplx (RESULT) - 0 A 235 | %B! 236 | :Endtext 237 | @ fact = 238 | Stuff: 239 | 1/n is cplx! 240 | Text: 241 | RESULT < bits (1)! 242 | RESULT < bits (mult (n, - n 1)) 243 | unless - 1 ?n! 244 | :Endtext 245 | Stuff: 246 | 1/Input, 1/Output are all chrs! 247 | 1/SR is chrs "0"! 248 | 1/Num, 1/Out, 1/Place, 1/Index, 1/Mod are all cplx 0! 249 | Text: 250 | WORLD > Input! 251 | Place < 1! 252 | Index < cplx (mult (SIZE Input, 100))! 253 | & Num < - Num - 0 cplx (mult (Place, 254 | - LCHRS (Input) ZR))! 255 | & Input < - Input LCHRS (Input)! 256 | & cplx (mult (10, Place)) > Place! 257 | && %Index! 258 | cplx (fact (Num)) > Out! 259 | & - Out cplx (mult (/ Out 10, 10)) > Mod! 260 | & + 100 / Out 10 > Out! 261 | & + Output chrs (+ Mod Zr) > Output! 262 | && %%Out! 263 | Size Output > Index! 264 | Index < cplx (mult (100, Index))! 265 | & WORLD < LCHRS (Hello)! 266 | & Hello < - Hello LCHRS (Hello)! 267 | && %Index! 268 | WORLD < nl! 269 | :Endtext 270 | Result: Accepts a positive integer (n) as input, then outputs 271 | the factorial of n (n!). 272 | -------------------------------------------------------------------------------- /sartre/sartre.markdown: -------------------------------------------------------------------------------- 1 | The Sartre Programming Language 2 | =============================== 3 | 4 | John Colagioia, 199? 5 | 6 | Introduction 7 | ------------ 8 | 9 | The Sartre programming language is named for the late existential 10 | philosopher, Jean-Paul Sartre. Sartre is an extremely unstructured 11 | language. Statements in Sartre have essentially no philosophical 12 | purpose; they just are. Thus, Sartre programs are often left to define 13 | their own functions. 14 | 15 | Unlike traditional programming languages (or maybe very much like them), 16 | nothing in Sartre is guaranteed, except maybe for the fact that nothing 17 | is guaranteed. The Sartre compiler, therefore, must be case insensitive 18 | (technically, it requires all capital letters, but since nothing matters 19 | anyway, why should this?). 20 | 21 | Names in Sartre may only contain letters (and only capital letters, at 22 | that, but nobody really cares that much), and maybe some trailing 23 | digits, but nothing else. 24 | 25 | No standard mathematical functionality is supplied in Sartre. Instead, 26 | nihilists are created, which may damage the properties inherent to the 27 | data, and nihilators are executed to reclaim storage dynamically. 28 | 29 | Sartre programmers, perhaps somewhat predictably, tend to be boring and 30 | depressed, and are no fun at parties. No comments will be made on the 31 | level of contrast between the Sartre programmer and any other 32 | programmer. 33 | 34 | In the words of a Sartre programmer who worked intensely for months, 35 | eating whatever junkfood wandered near his cubicle, "I have been gaining 36 | twenty-five pounds a week for two months, and I am now experiencing 37 | light tides. It is stupid to be so fat. My pain, ultimate solitude, and 38 | collection of Dilbert cartoons are still as authentic as they were when 39 | I was thin, but seem to impress girls far less. 
From now on, I will live 40 | on cigarettes and black coffee," which is the general diet of the Sartre 41 | programmer, for obvious reasons. 42 | 43 | Comments are available in Sartre, though not at all suggested, since 44 | nobody really wants to listen to you, anyway, by typing the comment at 45 | the beginning of the line, and terminating it (on the same line) with a 46 | squiggle (brace) pair ("{}"). Valid Sartre text may be placed after the 47 | comment if desired. If comments are absolutely necessary, they should 48 | adequately describe the futility of the program and the plight of 49 | programmer and computer in a world ruled by an unfeeling God and His 50 | compilers, as well as providing explanation of the surrounding program 51 | statements. 52 | 53 | They should not be misspelled, as some compilers may check. 54 | 55 | Admittedly, while it is not hard to string Sartre statements together to 56 | create a Sartre-compilable text file, it can be quite hard to program in 57 | the Sartre paradigm. To wit, one may keep creating programs, one after 58 | another, like soldiers marching into the sea, but each one may seem 59 | empty, hollow, like stone. One may want to create a program that 60 | expresses the meaninglessness of existence, and instead they average two 61 | numbers. 62 | 63 | Sartre Data Types 64 | ----------------- 65 | 66 | The Sartre language has two basic data types, the EN-SOI and the 67 | POUR-SOI. The en-soi is a completely filled heap of a specified rank, 68 | whereas the pour-soi is a dynamic structure which never has the same 69 | value. An integer may also be used in Sartre, but it may only take the 70 | value of zero (the Dada extensions to Sartre allow integers to also take 71 | on the value of "duck sauce", but that's neither here nor there--unless 72 | you happen to like duck sauce, of course). 73 | 74 | en-soi 75 | 76 | The en-soi, as mentioned before, is a full heap of a specified rank. As 77 | Sartre does not allow pre-initialized data, the actual data in the heap 78 | is non-specified (but the heap is "pre-heapified"). Data (en-sois of 79 | rank 0) may be "deconstruct"ed from the en-soi, or "rotate"d through. At 80 | all times, the en-soi remains a full heap, however. En-sois of rank zero 81 | are 32 bits with no inherent meaning. En-sois of higher rank may be 82 | defined as each element being of another (same) data type (an en-soi of 83 | integers, all with a value of zero (or duck sauce), for example). 84 | 85 | pour-soi 86 | 87 | The pour-soi is only, and precisely, what it is not. It may be 88 | "unassigned from" a certain value, thereby exactly increasing the number 89 | of things it isn't. It is specified to be a two-bit value, so it 90 | probably isn't. 91 | 92 | integer 93 | 94 | Unlike the integers in most programming languages, Sartre integers all 95 | have a value of zero (again, unless the Dada extensions are being used). 96 | Like the rest of the dreary universe (duck sauce included), this is 97 | something that must be lived with. 98 | 99 | orthograph 100 | 101 | The orthograph is a special type of pictogram used in the specification 102 | of lexicographic elements. The set of orthographs varies from Sartre 103 | implementation to Sartre implementation, but is guaranteed to contain 104 | all the so-called "letters" from at least one modern language, 105 | transliterated to the closest element of the ASCII set and ordered as 106 | the bit-reversed EBCDIC value, assuming those bit-patterns were 107 | integers, which they probably aren't. 
Unavailable orthographs in a given 108 | implementation are represented by "frowny faces". As defined, the 109 | orthograph is possibly the simplest and most convenient data type to 110 | work with. 111 | 112 | const 113 | 114 | Not an actual data type, but somewhat useful, is the introduction of the 115 | symbolic constant into the Sartre language. Since Sartre does not allow 116 | for unconventional, potentially confusing symbols to be strewn about a 117 | program (for example, "17" meant to represent a certain quantity of 118 | items), this allows the programmer to define a set of symbolic constants 119 | he plans to use. To avoid confusion, symbolic constants are defined in 120 | unary, using the "wow" (!) as the unit (i.e., !, !!, !!!, !!!!, ...). 121 | 122 | Symbolic constants must be surrounded in "rabbit ears" (") in use during 123 | the action section of the program. 124 | 125 | Predefined Sartre Instances 126 | --------------------------- 127 | 128 | The following data instances are provided to the Sartre programming 129 | environment to facilitate programming certain concepts which would be 130 | (and probably should be) nearly impossible otherwise. 131 | 132 | MAXINT This is the maximum integer value allowed by the 133 | particular Sartre implementation: zero. 134 | MININT This is the minimum integer value allowed by the 135 | particular Sartre implementation. If using the Dada 136 | extensions, MININT is duck sauce; if not, it is zero. 137 | ORTH0 This is the "initial orthograph" of the Sartre 138 | implementation. 139 | ORTHL8 This is the "final orthograph" of the implementation. 140 | The name is properly pronounced "Orthograph: Lazy 141 | Eight". 142 | 143 | Sartre Program Segments 144 | ----------------------- 145 | 146 | The Sartre program is broken into simple, logical portions. In essence, 147 | all things must be declared before usage, and the declaration section 148 | comes before the action section (if any). Since the Sartre language has 149 | a recursive structure, the Sartre nihilist has the same structure as the 150 | main nihilator which is about to be described: 151 | 152 | Nihilator ; 153 | {Nihilist;} 154 | Const = ; 155 | Consts = ..; 156 | Matter { {, } : ;} 157 | Act 158 | { ;} 159 | No more ; 160 | . 161 | 162 | The only difference between a nihilist and the nihilator is that a 163 | nihilist does not use the trailing one-spot. 164 | 165 | Example Matter definitions (data declaration) might be: 166 | 167 | Const 3 = !!; 168 | Matter dooM: integer; 169 | Rniqqlj:en-soi, rank "3", of pour-soi; 170 | 171 | which creates an integer (with a value of 0, since we are not invoking 172 | Dada extensions), and a heapified en-soi with seven pour-soi storage 173 | locations. 174 | 175 | Sartre Statement Types 176 | ---------------------- 177 | 178 | `IF ;` 179 | 180 | The Sartre conditional takes no arguments and then alters program flow 181 | accordingly. Put simply, on the condition where the most recently 182 | executed nihilator was successful, program execution is transferred to 183 | just beyond the next conditional, restarting the search from the 184 | beginning, if necessary. 185 | 186 | `LVAL := expr ;` 187 | 188 | The Sartre assignment statement takes the bourgois-perceived value of 189 | the damned expression and places it in LVAL. If LVAL is a pour-soi, this 190 | unassigns a value from the pour-soi. 191 | 192 | `No Exit ;` 193 | 194 | A reminder to the program that none of us can escape what we have 195 | wrought, or even escape what others have wrought. 
In programming terms, 196 | this may either cause the machine to hang or cause the program not to 197 | terminate, depending on the implementation. 198 | 199 | `Life Is Meaningless ;` 200 | 201 | A special command which, due to the resignation of the programmer, is 202 | permitted to perform a wide variety of tasks, among them, alter the 203 | direction of program flow, execute a random function, terminate the 204 | program, or positionally invert the bits in the data region. Since the 205 | programmer doesn't care anyway, this doesn't really matter. In the 206 | (tee-hee) ordinary version of Sartre, this operation is defined at 207 | compile-time, and is constant at that statement for each incidence of 208 | execution. The Dada extensions, however, redefine meaninglessness (since 209 | everything under Dada is meaningless to begin with) to be determined at 210 | run- time. Further, it may also logically negate each bit in the 211 | dataspace under Dada. 212 | 213 | `{ data } ;` 214 | 215 | This invokes the named nihilist and allows it to accomplish its goal. 216 | 217 | The Sartre scoping rules are somewhat complex in that it may only 218 | utilize data which has been accessed previously or any data which it 219 | makes up itself. Data which has not yet been accessed is unknown to the 220 | Sartre nihilist, however. 221 | 222 | `again ;` 223 | 224 | Repeats the last statement, for the computationally-impaired. 225 | 226 | `Act { ; } No more ;` 227 | 228 | Allows certain statement-sets to be considered a single, atomic 229 | statment. A conditional cannot jump to within such an atomic structure. 230 | 231 | Predefined Sartre Nihilists 232 | --------------------------- 233 | 234 | ` which` 235 | 236 | Gets replaced with a zero-rank en-soi with the bit pattern of the 237 | orthograph. The orthograph may be replaced by a symbolic constant, and 238 | returns the bit-pattern that would be associated if the symbolic 239 | constant were an integer, which it isn't, otherwise it would be zero. 240 | 241 | ` that` 242 | 243 | Gets replaced with the orthograph that matches the bit- pattern in the 244 | zero-rank en-soi. 245 | 246 | ` ` 247 | 248 | \ is one of "and", "or", or "xor", and returns the bitwise 249 | logical operation between the two zero-rank en-sois. 250 | 251 | `NOT ` 252 | 253 | This makes the pour-soi what it isn't, even if it is. 254 | 255 | `annihilate ;` 256 | 257 | Clears the values (sets to arbitrary values) of any data in the program. 258 | Proceeds to destroy any dynamically allocated storage. If no dynamic 259 | storage exists, causes a "Bad Faith" error in the program. 260 | 261 | ` Dump ;` 262 | 263 | Prints the zero-rank en-soi to the screen as up to four orthographs. The 264 | en-soi may be replaced with an orthograph, in which case the orthograph 265 | itself is printed. 266 | 267 | ` Get ;` 268 | 269 | Allows an orthograph to be input from the keyboard and stored in the 270 | specified orthograph. The orthograph may be replaced by any-rank en-soi 271 | of orthographs, in which case the statement will read in enough 272 | orthographs to fill the en-soi, then "heapify" the en-soi. 273 | 274 | ` Zip` 275 | 276 | First, removes an element from the en-soi, then adds the new element and 277 | "heapifies," returning the removed element. 278 | 279 | ` Dir` 280 | 281 | Flips direction of the en-soi's "heapification." If the Dir nihilist is 282 | never executed, the heapification of the en-soi is, by default, 283 | descending. 
284 | 285 | ` Flip ;` 286 | 287 | Flips the bit represented by \ in \. 288 | 289 | ` Concat` 290 | 291 | Returns the concatenated bitpatterns of the two \ elements. 292 | 293 | Sample Sartre Programs 294 | ---------------------- 295 | 296 | Nihilator SartreExample1; 297 | Act 298 | No more ; . 299 | 300 | This program can be appreciated for its ability to not sort the input 301 | list of values in ascending order. It's elegance and simplicity in not 302 | accomplishing this goal are admirable. To fully appreciate that sorting 303 | is the activity being denied the user, as opposed to, say, searching or 304 | some sort of filtering, one should stare at the (lack of) program output 305 | forever and not turn on the lights when it gets dark. 306 | 307 | Nihilator SartreExample2; 308 | Act 309 | IF ; 310 | again ; 311 | No more ; . 312 | 313 | This program fully considers the implications of its existance. It 314 | begins by questioning itself and, if successful, control flow moves to 315 | find the next conditional, of which there is none, so flow wraps to 316 | itself. If unsuccessful, it defaults to the "again" statement, which 317 | makes the same consideration (of which the condition is now true), 318 | wrapping the control flow to the previously executed "IF". 319 | -------------------------------------------------------------------------------- /mdpn/mdpn.markdown: -------------------------------------------------------------------------------- 1 | Multi-Directional Pattern Notation 2 | ================================== 3 | 4 | Final - Sep 6 1999 5 | 6 | * * * * * 7 | 8 | Introduction 9 | ------------ 10 | 11 | MDPN is an extension to EBNF, which attributes it for the purposes of 12 | scanning and parsing input files which assume a non-unidirectional form. 13 | A familiarity with EBNF is assumed for the remainder of this document. 14 | 15 | MDPN was developed by Chris Pressey in late 1998, built on an earlier, 16 | less successful attempt at a "2D EBNF" devised to fill a void that the 17 | mainstream literature on parsing seemed to rarely if ever approach, with 18 | much help provided by John Colagioia throughout 1998. 19 | 20 | MDPN has possible uses in the construction of parsers and subsequently 21 | compilers for multi-directional and multi-dimensional languages such as 22 | Orthogonal, Befunge, Wierd, Blank, Plankalkül, and even less contrived 23 | notations like structured Flowchart and Object models of systems. 24 | 25 | As the name indicates, MDPN provides a notation for describing 26 | multidimensional patterns by extending the concept of linear scanning 27 | and matching with geometric attributes in a given number of dimensions. 28 | 29 | Preconditions for Multidirectional Parsing 30 | ------------------------------------------ 31 | 32 | The multidirectional parsing that MDPN concerns itself with assumes that 33 | any portion of the input file is accessable at any time. Concepts such 34 | as LL(1) are fairly meaningless in a non-unidirectional parsing system 35 | of this sort. The unidirectional input devices such as paper tape and 36 | punch cards that were the concern of original parsing methods have been 37 | superceded by modern devices such as hard disk drives and ample, cheap 38 | RAM. 39 | 40 | In addition, MDPN is limited to an orthogonal representation of the 41 | input file, and this document is generally less concerned about forms of 42 | four or higher dimensions, to reduce unnecessary complexity. 
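As an illustration only (none of this is part of the MDPN notation itself), here is roughly what satisfying this precondition might look like in practice: the whole input is read up front into a structure that can be consulted at any coordinate, in any order. The coordinate mapping and the "beyond the input means whitespace" behaviour anticipate the "Deviations from EBNF" section below; details such as the first-printable-character rule and tab handling are glossed over, and the function names are made up for this sketch.

    def load_scan_space(text):
        """Map input text to a dict keyed by (x, y), one character per point."""
        space = {}
        for y, line in enumerate(text.splitlines()):
            for x, ch in enumerate(line):
                space[(x, y)] = ch
        return space

    def symbol_at(space, x, y):
        return space.get((x, y), " ")   # anywhere beyond the input: whitespace

    space = load_scan_space("+--+\n|  |\n|  |\n+--+")
    assert symbol_at(space, 0, 0) == "+"
    assert symbol_at(space, 50, 50) == " "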
43 | 44 | Notation from EBNF 45 | ------------------ 46 | 47 | Syntax is drawn from EBNF. It is slightly modified, but should not 48 | surprise anyone who is familiar with EBNF. 49 | 50 | A freely-chosen unadorned ('bareword') alphabetic multicharacter 51 | identifier indicates the name of a nonterminal (named pattern) in the 52 | grammar. e.g. `foo`. (Single characters have special meaning as 53 | operators.) Case matters: `foo` is not the same name as `Foo` or `FOO`. 54 | 55 | Double quotes begin and end literal terminals (symbols.) e.g. `"bar"`. 56 | 57 | A double-colon-equals-sign (`::=`) describes a production (pattern 58 | match) by associating a single nonterminal on the left with a pattern on 59 | the right, terminated with a period. e.g. `foo ::= "bar".` 60 | 61 | A pattern is a series of terminals, nonterminals, operators, and 62 | parenthetics. 63 | 64 | The `|` operator denotes alternatives. e.g. `"foo" | "bar"` 65 | 66 | The `(` and `)` parentheses denote precedence and grouping. 67 | 68 | The `[` and `]` brackets denote that the contents may be omitted, that 69 | is, they may occur zero or one times. e.g. `"bar" ["baz"]` 70 | 71 | The `{` and `}` braces denote that the contents may be omitted or may be 72 | repeated any number of times. e.g. `"bar" {baz "quuz"}` 73 | 74 | Deviations from EBNF 75 | -------------------- 76 | 77 | The input file is spatially related to a coordinate system and it is 78 | useful to think of the input mapped to an orthogonally distributed 79 | (Cartesian) form with no arbitrary limit imposed on its size, 80 | hereinafter referred to as *scan-space*. 81 | 82 | The input file is mapped to scan-space. The first printable character in 83 | the input file always maps to the *origin* of scan-space regardless of 84 | the number of dimensions. The origin is enumerated with coordinates (0) 85 | in one dimension, (0,0) in two dimensions, (0,0,0) in three dimensions, 86 | etc. 87 | 88 | Scan-space follows the 'computer storage' co-ordinate system so that *x* 89 | coordinates increase to the 'east' (rightwards), *y* coordinates 90 | increase to the 'south' (downwards), and *z* coordinates increase on 91 | each successive 'page'. 92 | 93 | Successive characters in the input file indicate successive coordinate 94 | (*x*) values in scan-space. For two and three dimensions, end-of-line 95 | markers are assumed to indicate "reset the *x* dimension and increment 96 | the *y* dimension", and end-of-page markers indicate "reset the *y* 97 | dimension and increment the *z* dimension", thus following the 98 | commonplace mapping of computer text printouts. 99 | 100 | Whitespace in the input file are **not** ignored. The terminal `" "`, 101 | however, will match any whitespace (including tabs, which are **not** 102 | expanded.) The pattern `{" "}` may be used to indicate any number of 103 | whitespaces; `" " {" "}` may be used to indicate one or more 104 | whitespaces. Areas of scan-space beyond the bounds of the input file are 105 | considered to be filled with whitespaces. 106 | 107 | Therefore, `"hello"` as a terminal is exactly the same as 108 | `"h" "e" "l" "l" "o"` as an pattern of terminals. 109 | 110 | A `}` closing brace can be followed by a `^` (*constraint*) operator, 111 | which is followed by an expression in parentheses. 112 | 113 | This expression is actually in a subnotation which supports a very 114 | simple form of algebra. 
The expression (built with terms connected by 115 | infix `+-*/%` operators with their C language meanings) can either 116 | reduce to 117 | 118 | - a constant value, as in `{"X"} ^ (5)`, which would match five `X` 119 | terminals in a line; or 120 | - an unknown value, which can involve any single lowercase letters, 121 | which indicate variables local to the production, as in 122 | `{"+"}^(x) {"-"}^(x*2)`, which would match only twice as many minus 123 | signs as plus signs. 124 | 125 | Complex algebraic expressions in constraints can and probably should be 126 | avoided when constructing a MDPN grammar for a real (non-contrived) 127 | compiler. MDPN-based compiler-compilers aren't expected to support more 128 | than one or two unknowns per expression, for example. There is no such 129 | restriction, of course, when using MDPN as a guide for hand-coding a 130 | multidimensional parser, or otherwise using it as a more sophisticated 131 | pattern-matching tool. 132 | 133 | The Scan Pointer 134 | ---------------- 135 | 136 | It is useful to imagine a *scan pointer* (SP, not to be confused with a 137 | *stack pointer*, which is not the concern of this document) which is 138 | analogous to the current token in a single-dimensional parser, but 139 | exists in MDPN as a free spatial relationship to the input file, and 140 | thus also has associated geometric attributes such as direction. 141 | 142 | The SP's *location* is advanced through scan-space by its *heading* as 143 | terminals in the productions are successfully matched with symbols in 144 | the input buffer. 145 | 146 | The following geometric attribution operators modify the properties of 147 | the SP. Note that injudicious use of any of these operators *can* result 148 | in an infinite loop during scanning. There is no built-in contingency 149 | measure to escape from an infinite parsing loop in MDPN (but see 150 | exclusivity, below, for a possible way to overcome this.) 151 | 152 | `t` is the relative translation operator. It is followed by a vector, in 153 | parentheses, which is added to the location of the SP. This does not 154 | change its heading. 155 | 156 | For example, `t (0,-1)` moves the SP one symbol above the current symbol 157 | (the symbol which was *about* to be matched.) 158 | 159 | As a more realistic example of how this works, consider that the pattern 160 | `"." t(-1,1) "!" t(0,-1)` will match a period with an exclamation point 161 | directly below it, like: 162 | 163 | . 164 | ! 165 | 166 | `r` is the relative rotation operator. It is followed by an axis 167 | identifier (optional: see below) and an orthogonal angle (an angle *a* 168 | such that |*a*| **mod** 90 degrees = 0) assumed to be measured in 169 | degrees, both in parentheses. The angle is added to the SP's heading. 170 | Negative angle arguments are allowed. 171 | 172 | Described in two dimensions, the (default) heading 0 denotes 'east,' 173 | that is, parsing character by character in a rightward direction, where 174 | the SP's *x* axis coordinate increases and all other axes coordinates 175 | stay the same. Increasing angles ascend counterclockwise (90 = 'north', 176 | 180 = 'west', 270 = 'south'.) 177 | 178 | For example, `">" r(-90) "+^"` would match 179 | 180 | >+ 181 | ^ 182 | 183 | The axis identifier indicates which axis this rotation occurs around. If 184 | the axis identifier is omitted, the *z* axis is to be assumed, since 185 | this is certainly the most common axis to rotate about, in two 186 | dimensions. 
187 | 188 | If the axis identifier is present, it may be a single letter in the set 189 | `xyz` (these unsurprisingly indicate the *x*, *y*, and *z* dimensions 190 | respectively), or it may be a non-negative integer value, where 0 191 | corresponds to the *x* dimension, 1 corresponds to the *y* dimension, 192 | etc. (Implementation note: in more than two dimensions, the SP's heading 193 | property should probably be broken up internally into theta, rho, &c 194 | components as appropriate.) 195 | 196 | For example, `r(z,180)` rotates the SP's heading 180 degrees about the 197 | *z* (dimension \#2) axis, as does `r(2,180)` or even just `r(180)`. 198 | 199 | `<` and `>` are the push and pop state-stack operators, respectively. 200 | Alternately, they can be viewed as lookahead-assertion parenthetics, 201 | since the stack is generally assumed to be local to the production. 202 | (Compiler-compilers should probably notify the user, but not necessarily 203 | panic, if they find unbalanced `<>`'s.) 204 | 205 | All properties of the SP (including location and heading, and scale 206 | factor if supported) are pushed as a group onto the stack during `<` and 207 | popped as a group off the stack during `>`. 208 | 209 | Advanced SP Features 210 | -------------------- 211 | 212 | These features are not absolutely necessary for most non-contrived 213 | multi-directional grammars. MDPN compiler-compilers are not expected to 214 | support them. 215 | 216 | `T` is the absolute translation operator. It is followed by a vector 217 | which is assigned to the location of the SP. e.g. `T (0,0)` will 'home' 218 | the scan. 219 | 220 | `R` is the absolute rotation operator. It is followed by an optional 221 | axis identifier, and an orthogonal angle assumed to be measured in 222 | degrees. The SP's heading is set to this angle. e.g. `R(270)` sets the 223 | SP scanning line after line down the input text, downwards. See the `r` 224 | operator, above, for how the axis identifier functions. 225 | 226 | `S` is the absolute scale operator. It is followed by an orthogonal 227 | *scaling factor* (a scalar *s* such that *s* = **int**(*s*) and *s* \>= 228 | 1). The SP's scale factor is set to this value. The finest possible 229 | scale, 1, indicates a 1:1 map with the input file; for each one input 230 | symbol matched, the SP advances one symbol in its path. When the scale 231 | factor is two, then for each one input symbol matched, the SP advances 232 | two symbols, skipping over an interim symbol. Etc. 233 | 234 | `s` is the relative scale operator. It is followed by a scalar integer 235 | which is added to the SP's scaling factor (so long as it does not cause 236 | the scaling factor to be zero or negative.) 237 | 238 | Scale operators may also take an optional axis identifier (as in 239 | `S(y,2)`), but when the axis identifier is omitted, all axes are assumed 240 | (non-distortional scaling). 241 | 242 | `!>` is a state-assertion alternative to `>`, for the purpose of 243 | determining that the SP successfully and completely reverted to a 244 | previous state that was pushed onto the stack ('came full circle'). This 245 | operator is something of a luxury; a grammar which uses constraints 246 | correctly should never *need* it, but it can come in handy. 
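To make the geometric attribution operators more concrete, here is a small Python sketch (purely illustrative, not part of MDPN) of the state a two-dimensional scan pointer might carry, and how `t`/`T`, `r`/`R`, `s`/`S` and the state stack might act on it. Rotation about axes other than *z*, per-axis scaling, and the `!>` assertion are omitted, and the class and method names are my own.

    import math

    class ScanPointer:
        """2D scan pointer: location, heading (degrees, 0 = east) and scale."""

        def __init__(self):
            self.x, self.y = 0, 0    # location; (0, 0) is the origin of scan-space
            self.heading = 0         # 0 = east, 90 = north, 180 = west, 270 = south
            self.scale = 1           # symbols advanced per symbol matched
            self.stack = []          # state stack for '<' and '>'

        def step(self):
            """Advance by one matched symbol along the heading, scaled."""
            rad = math.radians(self.heading)
            self.x += self.scale * round(math.cos(rad))
            self.y -= self.scale * round(math.sin(rad))   # y grows southward

        def t(self, dx, dy): self.x += dx; self.y += dy                    # relative translation
        def T(self, x, y):   self.x, self.y = x, y                         # absolute translation
        def r(self, angle):  self.heading = (self.heading + angle) % 360   # relative rotation (about z)
        def R(self, angle):  self.heading = angle % 360                    # absolute rotation
        def s(self, ds):     self.scale = max(1, self.scale + ds)          # relative scale (clamped at 1)
        def S(self, k):      self.scale = k                                # absolute scale

        def push(self):      self.stack.append((self.x, self.y, self.heading, self.scale))   # '<'
        def pop(self):       self.x, self.y, self.heading, self.scale = self.stack.pop()     # '>'

Clamping the relative scale at 1 is just one reading of the rule that `s` must not drive the scaling factor to zero or below; treating such an `s` as an error would be equally defensible.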
247 | 248 | Other Advanced Features: Exclusivity 249 | ------------------------------------ 250 | 251 | Lastly, in the specification of a production, the *exclusivity* applying 252 | to that production can be given between a hyphen following the name of 253 | the nonterminal, and the `::=` operator. 254 | 255 | Exclusivity is a list of productions, named by their nonterminals, and 256 | comes into play at any particular *instance* of the production (i.e. 257 | when the production successfully matches specific symbols at specific 258 | points in scan-space during a parse, called the *domain*.) The 259 | exclusivity describes how the domain of each instance is protected from 260 | being the domain of any further instances. The domain of any subsequent 261 | instances of any productions listed in the exclusivity is restricted 262 | from sharing points in scan-space with the established domain. 263 | 264 | Exclusivity is a measure to prevent so-called *crossword grammars* - 265 | that is, where instances of productions can *overlap* and share common 266 | symbols - if desired. Internally it's generally considered a list of 267 | 'used-by-this-production' references associated with each point in 268 | scan-space. An example of the syntax to specify exclusivity is 269 | `bar - bar quuz ::= foo {"a"} baz`. Note that the domain of an instance 270 | of `bar` is the sum of the domains `foo`, `baz` and the chain of "`a`" 271 | terminals, and that neither a subsequent instance of `quuz` nor `bar` 272 | again can overlap it. 273 | 274 | Examples of MDPN-described Grammars 275 | ----------------------------------- 276 | 277 | **Example 1.** A grammar for describing boxes. 278 | 279 | The task of writing a translator to recognize a two-dimensional 280 | construct such as a box can easily be realized using a tool such as 281 | MDPN. 282 | 283 | An input file might contain a of box drawn in ASCII characters, such as 284 | 285 | +------+ 286 | | | 287 | | | 288 | +------+ 289 | 290 | Let's also say that boxes have a minimum height of four (they must 291 | contain at least two rows), but no minimum width. Also, edge characters 292 | must match up with which edge they are on. So, the following forms are 293 | both illegal inputs: 294 | 295 | +-+ 296 | +-+ 297 | 298 | +-|-+ 299 | | | 300 | * 301 | | | 302 | +-|-+ 303 | 304 | The MDPN production used to describe this box might be 305 | 306 | Box ::= "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90) 307 | "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90). 308 | 309 | **Example 2.** A simplified grammar for Plankalkül's assignments. 310 | 311 | An input file might contain an ASCII approximation of something Zuse 312 | might have jotted down on paper: 313 | 314 | |Z + Z => Z 315 | V|1 2 3 316 | S|1.n 1.n 1.n 317 | 318 | Simplified MDPN productions used to describe this might be 319 | 320 | Staff ::= Spaces "|" TempVar AddOp TempVar Assign TempVar. 321 | TempVar ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces. 322 | Index ::= . 323 | Digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9". 324 | Structure ::= . 325 | AddOp ::= ("+" | "-") Spaces. 326 | Assign ::= "=>" Spaces. 327 | Spaces ::= {" "}. 
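For comparison with the grammar in Example 1, here is roughly what "hand-coding a multidimensional parser, using MDPN as a guide" might look like for the `Box` production, reusing the dictionary representation of scan-space sketched earlier. It is only an illustration (the helper names are invented): it walks the perimeter clockwise from a candidate top-left `+`, checking that the two horizontal runs share the same width and the two vertical runs the same height, which is what the `^(w)` and `^(h)` constraints demand.

    def match_box(space, x0, y0):
        """Is there a Box (as in Example 1) with its top-left corner at (x0, y0)?"""
        def run(x, y, dx, dy, ch):
            # Count consecutive ch symbols starting at (x, y), stepping by (dx, dy).
            n = 0
            while space.get((x, y), " ") == ch:
                n, x, y = n + 1, x + dx, y + dy
            return n

        if space.get((x0, y0), " ") != "+":
            return False
        w = run(x0 + 1, y0, 1, 0, "-")                   # top edge: {"-"}^(w), no minimum width
        if space.get((x0 + 1 + w, y0), " ") != "+":
            return False
        side = run(x0 + 1 + w, y0 + 1, 0, 1, "|")        # right edge: "||" {"|"}^(h), so at least two rows
        if side < 2:
            return False
        if space.get((x0 + 1 + w, y0 + 1 + side), " ") != "+":
            return False
        if run(x0 + 1, y0 + 1 + side, 1, 0, "-") != w:   # bottom edge must match w
            return False
        if run(x0, y0 + 1, 0, 1, "|") != side:           # left edge must match h
            return False
        return space.get((x0, y0 + 1 + side), " ") == "+"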
328 | -------------------------------------------------------------------------------- /you-are-reading-the-name-of-this-esolang/you-are-reading-the-name-of-this-esolang.markdown: -------------------------------------------------------------------------------- 1 | You are Reading the Name of this Esolang 2 | ======================================== 3 | 4 | November 2007, Chris Pressey, Cat's Eye Technologies 5 | 6 | Introduction 7 | ------------ 8 | 9 | This programming language, called **You are Reading the Name of this 10 | Esolang**, is my first foray into the design space of programming 11 | languages whose programs contain undecidable elements. In the case of 12 | You are Reading the Name of this Esolang, these elements are the 13 | instructions themselves — or rather, the symbols that the instructions 14 | are composed of. 15 | 16 | Before we begin, some lexical notes. The name of this language is not 17 | pronounced exactly how it looks; rather, it is pronounced as an English 18 | speaker would pronounce the phrase "you are hearing the name of this 19 | esolang." In addition, it is strongly discouraged to refer to this 20 | language by the name "Yartnote", whether spoken or in writing, and in 21 | any capitalization scheme. After all, there may actually be a completely 22 | unrelated esolang called Yartnote one day, Zeus willing. A similar logic 23 | applies to the taboo on calling it "YRNE". 24 | 25 | Program structure 26 | ----------------- 27 | 28 | A You are Reading the Name of this Esolang program is a string of 29 | symbols drawn from the alphabet `0`, `1`, `[`, and `]`. 30 | 31 | `0` and `1` are interpreted as they are in the programming language 32 | Spoon, or rather, the slightly more clearly-specified version of Spoon 33 | that follows. The Spoon tape is considered unbounded in both directions. 34 | Each cell of the Spoon tape may contain any non-negative integer (again, 35 | unbounded.) In addition, attempting to decrement a cell below 0 results 36 | in an immediate termination of the program. Oh, and there's no way to 37 | change what the symbols are. But that's the extent of the difference, I 38 | think. 39 | 40 | For your convenience, the Spoon instructions are repeated here (taken 41 | from the public-domain [Spoon](http://esolangs.org/wiki/Spoon) entry on 42 | the [Esolang wiki](http://esolangs.org/wiki/)): 43 | 44 | 1 Increment the memory cell under the pointer 45 | 000 Decrement the memory cell under the pointer 46 | 010 Move the pointer to the right 47 | 011 Move the pointer to the left 48 | 0011 Jump back to the matching 00100 49 | 00100 Jump past the matching 0011 if the cell under the pointer is zero 50 | 001010 Output the character signified by the cell at the pointer 51 | 0010110 Input a character and store it at the cell in the pointer 52 | 00101110 Output the entire memory array 53 | 00101111 Immediately terminate program execution 54 | 55 | Each `[` must be matched with a `]`; between them lies a subprogram with 56 | the same structure as a general You are Reading the Name of this Esolang 57 | program. The meaning of this subprogram is determined from its structure 58 | as follows. The subprogram is considered to be given the same input as 59 | the entire program. If the subprogram halts on this input, it is reduced 60 | to a `1`, and if it loops forever on this input, it is reduced to a `0`. 61 | These reduced instructions are interpreted as they are in Spoon, as 62 | described above. Any output produced by the subprogram is simply 63 | discarded. 
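The reduction just described is straightforward to sketch, provided the hard part is hand-waved into a helper. The following Python is illustrative only: it assumes a hypothetical `halts(subprogram, input_string)` predicate (what such a predicate can realistically do is the subject of the next section), and it glosses over unmatched brackets.

    def reduce_to_spoon(program, input_string, halts):
        """Reduce bracketed subprograms, from the inside out, to plain 0/1 Spoon code."""
        while "[" in program:
            close = program.index("]")              # the first ']' closes an innermost group
            open_ = program.rindex("[", 0, close)   # its matching '[' is the last one before it
            sub = program[open_ + 1:close]
            # The innermost subprogram contains no brackets, so halts() only ever
            # sees plain Spoon.  An ill-formed subprogram counts as halting, so
            # halts() is expected to answer True for it.
            bit = "1" if halts(sub, input_string) else "0"
            program = program[:open_] + bit + program[close + 1:]
        return program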
64 | 
65 | Subprograms may themselves contain subprograms, nested to arbitrary
66 | depth, in which case the reduction above is recursively applied, from
67 | the inside out, until a string of only `0` and `1` symbols remains. This
68 | is then executed as if it were a Spoon program. Any syntactically
69 | ill-formed program or subprogram is considered to halt immediately,
70 | producing no output. Note however that, as a consequence of this, a
71 | subprogram can be syntactically ill-formed (for example consisting of a
72 | single `0`) while the parent program can still be syntactically OK (the
73 | subprogram just reduces to a `1` in the parent program.)
74 | 
75 | Implementation notes
76 | --------------------
77 | 
78 | An implementation will determine if a particular subprogram halts, or
79 | not, if it can. Implementations may vary in the power of their proof
80 | methods used for this, but at a minimum must be able to recognize at
81 | least one subprogram that halts on any input, and one subprogram that
82 | loops forever on any input. This implementation-dependence should not
83 | strike anyone as too bizarre, I don't think — it is quite similar to how
84 | different implementations of a traditional systems-programming language
85 | can, for example, provide different levels of support for sizes of
86 | numerical data types like integers.
87 | 
88 | Recall that the problem of telling if an arbitrary program in some given
89 | Turing-complete language halts on some input is *undecidable*, or
90 | equivalently, the set of all programs that halt on some input is
91 | *recursively enumerable*. The set of all programs that loop forever on
92 | some given input is the complement of this set, and it is called
93 | *co-recursively enumerable* (*co-r.e.* for short.)
94 | 
95 | Despite this, there are many methods, ranging from simplistic to
96 | sophisticated, that can be used to prove in *specific* circumstances
97 | that a given program, on a given input, will either halt or fail to
98 | halt. These methods can be used in a You are Reading the Name of this
99 | Esolang implementation.
100 | 
101 | The simplest method for proving that a subprogram halts is probably just
102 | to simulate it on the given input indefinitely, returning `1` if it
103 | halts. If the subprogram does indeed halt, this technique will
104 | (eventually) reveal that fact. The simulation can be performed
105 | concurrently with other subprograms, so that if no proof of halting for
106 | one subprogram is ever found, this will not prevent other subprograms
107 | from being checked.
108 | 
109 | The simplest method for proving that a subprogram loops forever is
110 | probably to check it against a library of subprograms known to loop
111 | forever. For example, it can check if the program is
112 | `0010000000111001000011` (in Brainfuck: `[-]+[]`. You can readily assure
113 | yourself that this program loops forever, on any input.) This technique
114 | of course limits the number of recognizably looping subprograms that the
115 | implementation can handle to a finite number. Checking the general case,
116 | that is, recognizing an infinite number of forever-looping programs, is
117 | more difficult, however. Such an implementation will require techniques
118 | such as automatically finding a proof by induction, or abstract
119 | interpretation. However, ultimately, we know from the Halting Problem
120 | that there is no perfectly general technique that will recognize *every*
121 | program that loops forever.
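Put into code, these two minimal techniques might look something like the following toy (not a serious prover). `run_spoon` is assumed to be a step-limited Spoon simulator returning whether the program halted within the budget; the fixed budget stands in for the open-ended, concurrent simulation described above, and all the names here are mine.

    KNOWN_FOREVER_LOOPERS = {
        "0010000000111001000011",   # the example above ([-]+[] in Brainfuck terms)
    }

    def try_to_decide_halting(subprogram, input_string, run_spoon, max_steps=10**6):
        """Return True (proved halting), False (proved looping) or None (no proof found)."""
        if subprogram in KNOWN_FOREVER_LOOPERS:
            return False
        if run_spoon(subprogram, input_string, max_steps):
            return True
        return None   # budget exhausted: no proof either way

An interpreter would treat `None` as "keep trying" (or, eventually, as a reason to give up on that particular program) rather than as an answer.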
122 | 123 | Computability class 124 | ------------------- 125 | 126 | You are Reading the Name of this Esolang can be trivially shown to be as 127 | powerful as Spoon, since valid Spoon programs are valid You are Reading 128 | the Name of this Esolang programs. (Modulo the absence of 129 | negative-valued tape cells and the other little variations mentioned 130 | above. Since these issues have been dealt with extensively in the 131 | Brainfuck "literature", such as it is, I'm not going to worry about 132 | them.) 133 | 134 | What about the other way around? 135 | 136 | Well, take as a starting point the fact (in classical logic, at least) 137 | that every Spoon program either halts at some point or loops forever. 138 | (Whether we can *discover* which of these is the case is a different 139 | story.) This means that every You are Reading the Name of this Esolang 140 | program has a "canonical" form consisting of just `1`'s and `0`'s — 141 | again, whether we have an interpreter powerful enough to discover it or 142 | not. At this level, Spoon is as powerful as "canonical" You are Reading 143 | the Name of this Esolang, because "canonical" You are Reading the Name 144 | of this Esolang programs are valid Spoon programs. 145 | 146 | Now we can go in the other direction. We can start with any "canonical" 147 | You are Reading the Name of this Esolang program, and replace each `1` 148 | with any You are Reading the Name of this Esolang subprogram that always 149 | halts. Since even the simple method, described above, of proving that a 150 | subprogram halts will always resolve to `1` if the subprogram does 151 | indeed halt, this subset of You are Reading the Name of this Esolang 152 | programs is still executable in Spoon (or any other Turing-complete 153 | language). We only need to add the halting proof mechanism to rewrite 154 | the program into "canonical" form, before executing or simulating it. 155 | 156 | This extends recursively to any arbitrary level of nesting, too: we can 157 | replace each `1` in each subprogram with subprograms that always halt, 158 | with no bound. We only have to test these subprograms recursively, from 159 | the inside out, to eventually recover a "canonical" program. 160 | 161 | However, something strange happens when we turn our attention to `0`'s. 162 | If we replace even one `0` with a subprogram that loops forever on some 163 | input, there is always a possibility that: a) the You are Reading the 164 | Name of this Esolang program will be run with that input, and that b) 165 | the interpreter cannot prove that the subprogram loops forever with that 166 | input. Because the set of programs that loop forever is co-r.e., there 167 | is no Turing machine (or Spoon program) that can look at any one Spoon 168 | program and say, yep, I'm certain that this here program loops forever 169 | on this input, so it should darn well be rewritten into a `0` symbol. 170 | 171 | Thus it seems that there are You are Reading the Name of this Esolang 172 | programs which no Spoon interpreter — indeed, not any interpreter for 173 | any Turing-complete language — is able to interpret. 174 | 175 | Since the case of `0`'s seems to mirror the case of `1`'s when it comes 176 | to expanding them into subprograms, we may conjecture that, instead of 177 | remaining basically the same as we consider more and more deeply nested 178 | subprograms (as it was with expanding `1`'s,) maybe the problem becomes 179 | more intractable the deeper we go in expanding `0`'s. 
Perhaps we are 180 | climbing up the arithmetic hierarchy? 181 | 182 | At any rate, *I* certainly initially conjectured that that was the case, 183 | but it appears to be off the mark. Say you have a Spoon interpreter 184 | that's equipped with an oracle. You can feed an input string and a Spoon 185 | program into the oracle, and the oracle tells you whether or not that 186 | program halts on that input. You could then use that oracle to resolve a 187 | given You are Reading the Name of this Esolang subprogram into a `0` or 188 | `1`. But, you could also do this recursively, resolving them from the 189 | inside outward. A Spoon interpreter with such an oracle would be able to 190 | simulate any You are Reading the Name of this Esolang program — no more 191 | powerful oracle is needed, no matter how deep the subprograms are 192 | nested. 193 | 194 | Discussion 195 | ---------- 196 | 197 | I started designing You are Reading the Name of this Esolang shortly 198 | after reading about the programming language Gravity, while trying to 199 | determine the sense in which it is "non-computable." In particular, I 200 | noticed that these two statements (taken from the 201 | [Gravity](http://esolangs.org/wiki/Gravity) entry on the [Esolang 202 | wiki](http://esolangs.org/wiki/),) on which the claim of Gravity's 203 | "non-computability" apparently rests, have ready analogies to problems 204 | the world of Turing machines: 205 | 206 | - "Although [Gravity's] behavior is well-defined and deterministic, 207 | the evolution of its space is in general non-computable [...]" 208 | 209 | The evolution of the state-space (set of successive configurations) 210 | of a universal Turing machine is also in general non-computable 211 | (there's no Turing machine that can tell you that some given 212 | configuration will never be reached.) 213 | 214 | - "It can be shown that a Turing machine cannot compute, in the 215 | general case, whether even a single collision [in a given Gravity 216 | program] will ever happen." 217 | 218 | It can also be shown that a Turing machine cannot compute, in the 219 | general case, whether or not even a single given state of another 220 | Turing machine's finite control will ever be reached. (Just make 221 | that state a halt state, and you have the Halting Problem right 222 | there.) 223 | 224 | Because of this, I am skeptical that Gravity is any more 225 | "non-computable" than a universal Turing machine. (I am, however, far 226 | from an expert on the computability of differential equations; it could 227 | be that the rather nonspecific term "non-computable", as used in that 228 | subfield, means something stronger than simply "undecidable".) 229 | 230 | At any rate, the idea interested me enough to spur me into designing a 231 | language that I *could* be reasonably certain was non-computable, in 232 | some sense that I could explain. The name You are Reading the Name of 233 | this Esolang was drifting around nearby in the æther at that moment, and 234 | seemed fitting enough for this monstrosity. 235 | 236 | The general approach was to simply force the language interpreter to 237 | decide — that is, to reduce to either a `0` or a `1` — some problem that 238 | is undecidable. This led to looking for something that needed 239 | specifically either a `0` or a `1` to specify something necessary, and 240 | that in turn led to the choice of Spoon as a base language. 
(Of course, 241 | I could have picked just about any language which is "its own binary 242 | Gödel numbering"; there are plenty to choose from there, but Spoon had a 243 | cool name. What can I say — I like The Tick.) 244 | 245 | The obvious choice of undecidable problem was whether another program 246 | halts or not. Making the subject of this problem a *subprogram* with the 247 | same structure as the general program let me examine the case of 248 | unbounded recursive descent. This turned out to be not quite as 249 | interesting as I hoped, but perhaps still somewhat illuminating. (Just 250 | what *would* it take, to require that a Spoon interpreter have a more 251 | powerful oracle than HP, to run every You are Reading the Name of this 252 | Esolang program? Perhaps [Banana 253 | Scheme](http://esolangs.org/wiki/Banana_Scheme) could provide some 254 | inspiration, here. *It* certainly seems to be climbing the arithmetic 255 | hierarchy, although I can't quite say how far. Possibly "damned far.") 256 | 257 | I suppose one or two other things can be said about You are Reading the 258 | Name of this Esolang. 259 | 260 | Unlike both Gravity and Banana Scheme, You are Reading the Name of this 261 | Esolang has a recursively enumerable *syntax*: the problem of whether or 262 | not a given string over the alphabet `0`, `1`, `[`, and `]` is even a 263 | well-formed You are Reading the Name of this Esolang program is 264 | undecidable! 265 | 266 | It's not entirely clear how to interpret the instruction `00101110`, 267 | "Output the entire memory array," in the context of having a tape 268 | unbounded in both directions. I suppose I ought to stipulate that we are 269 | to just output the portion of the tape that has been "touched", i.e. 270 | that the tape head has moved over. But really, it's not so important for 271 | the goals of You are Reading the Name of this Esolang, so maybe I should 272 | just leave it undefined, for kicks. 273 | 274 | The fact that every subprogram takes the *same* input (same as the main 275 | program) might lead to some interesting programs — programs which are 276 | unduly sensitive to changes in the input, I imagine. Of course, this 277 | doesn't affect the undecidability of subprograms, since they are always 278 | free to ignore input completely. 279 | 280 | There is no implementation, yet, but constructing an efficient one would 281 | be a good exercise in static program analysis. 282 | 283 | Happy undeciding! 284 | 285 | -Chris Pressey 286 | November 5, 2007 287 | Chicago, Illinois, USA 288 | -------------------------------------------------------------------------------- /irishsea/doc/original-notes.markdown: -------------------------------------------------------------------------------- 1 | Thoughts about livecoding and related activities 2 | ================================================ 3 | 4 | This is a random collection of notes (not necessarily particularly 5 | intelligent ones, either) comparing some technical and creative activities. 6 | 7 | * Here are a few of the technical and creative activities I've undertaken: 8 | 9 | * Computer programming 10 | * Programming language design 11 | * Musical composition 12 | * Musical performance 13 | 14 | * These can be combined. 15 | 16 | Doing musical composition during a musical performance can be called 17 | _musical improvisation_. I've done this. 18 | 19 | Doing computer programming as a means of musical improvisation can be 20 | called _musical livecoding_. I've never done this. 
21 | 22 | * Livecoding, eh? 23 | 24 | I only recently encountered livecoding. These notes are largely the 25 | result of me trying to come to grips with the concept. I'm coming at 26 | it in a very raw way; I basically stumbled upon it on Github. I've 27 | never experienced a livecoding performance. But I think that inexperience 28 | might let me think about it in a less biased way, too. 29 | 30 | * I have a hard time reconciling computer programming with musical improvisation. 31 | 32 | My experience with computer programming is similar to my experience with musical 33 | composition. 34 | 35 | I know I am able to musically improvise. So it must be theoretically possible to 36 | reconcile the two. But there are several things to account for. Here are a few. 37 | 38 | * Engineering is not an eager algorithm. But improvisation needs to be be. 39 | 40 | Any software project of any non-trifling size requires thought and planning and 41 | structure and being broken up into components and interfaces and invariants and 42 | ideally has a test suite, too. 43 | 44 | This can't be done effectively with an eager algorithm — the closest you can 45 | come to that is probably a quick prototype followed by a series of incremental 46 | improvements and refactorings. But even then, if you're not willing to do an 47 | occasional rewrite (i.e. significant rethink/refactoring of the design/code) at 48 | some point, you can paint yourself into a corner. And the rewrites require 49 | thought and planning and consideration and all that stuff that takes some time, 50 | i.e. not an eager algorithm. 51 | 52 | But improvisation, for the most part, requires that you compose something 53 | "on the fly" — you don't have time to sit down and plan it. But, in practice, 54 | that is not quite true; jazz musicians *practice* improvisation, and one of the 55 | results is they build up a stock of experience about improvisation that they can 56 | draw on, "on the fly". Some of this is accumulating a library of "licks", but 57 | much of it is also about building an intuition of what "works" with what. 58 | 59 | Also — when I played jazz, and had to play a solo for actual performance, I 60 | tended to refine a particular improvisation during practice, and play that 61 | during performance. But that's just an instance of composition. (I see nothing 62 | wrong with this approach, but some other musicians might feel that a solo 63 | played this way isn't "from the heart", or something. It might be interesting 64 | to know just how much of their solo your average musician works out beforehand, 65 | and how much they actually make up during performance. I'm sure it varies.) 66 | 67 | * Music is a performance art. Programming isn't. 68 | 69 | Generally, no one is watching you program. (Pair programming excepted, I 70 | suppose.) 71 | 72 | * Musical performance is "write-only". Composition isn't. Programming isn't. 73 | 74 | In a live performance, as soon as you hit a piano key, or as soon as you blow 75 | into the mouthpiece of your horn, the note you played is committed to the 76 | sound waves in the air. There's no going back. If you cacked it, if it was 77 | the wrong note, it was the wrong note and *it already happened*. Sorry! 78 | 79 | Putting together a composition on with a sequencer (which is how the vast 80 | majority of synthesized music is done) is not like that at all. You can 81 | arrange a few events, try it out, erase it, modify it until it's how you want 82 | it. 
83 | 84 | Livecoding seems to necessarily fall under the first category. Even though 85 | you can perhaps backspace to correct a line of code before it causes anything 86 | to happen, as soon as you do commit it, it's committed, and it's going 87 | to start having an affect on the performance, as soon as those events are 88 | queued up to go. 89 | 90 | (I suppose a livecoding language could allow you to cancel changes you 91 | recently made, if the events they queued up haven't happened yet.) 92 | 93 | * I have ten fingers, two hands, two arms, two feet, and a mouth and lungs. 94 | 95 | Almost every instrument I can think of uses some combination of those to 96 | control it. Some examples: 97 | 98 | * piano: 10 fingers, 2 feet 99 | * drumkit: 2 hands, 2 feet 100 | * tuba: 4 fingers, mouth, lungs 101 | * trombone: 1 hand/arm, mouth, lungs 102 | * penny whistle: 8 fingers, lungs 103 | 104 | etc. 105 | 106 | A computer keyboard would be classified: 10 fingers. 107 | 108 | In combination with the idea that musical performance is "write-only", though: 109 | typically those 10 fingers must produce some combination of symbols, which 110 | is then comitted only after it is complete. Also, it is generally assumed 111 | there is *visual feedback* involved in letting the operator compose such a 112 | sequence (i.e. I need to see the words I am typing, or I won't be able to 113 | type anything without making a typo. I won't be able to know what I typed.) 114 | 115 | This all means the computer keyboard, as a musical instrument interface, is 116 | somewhat similar to, but also significantly different from, all other musical 117 | instruments' interfaces. 118 | 119 | * Livecoding is about external events. With programs, it varies. 120 | 121 | I'm assuming that the livecoding we're dealing with here (so far) is a 122 | performance art technique in that the events that are caused by the coding 123 | have an observable effect outside of the computer. They set up sounds to 124 | be played as part of music, lights, video, animation, dance, etc. etc. 125 | 126 | Programs, on the other hand, don't have to be about external events. In 127 | practice, they are: even a simple program produces output when it's finished, 128 | and that's an external event. 129 | 130 | But programs (especially in esolang circles) can also be seen merely as 131 | _embodiments of computations_. Many esolangs don't even have input and 132 | output — they just do a computation, and you examine the state of the 133 | virtual machine (or whatever) when it is finished executing, to see the 134 | result. 135 | 136 | Nevertheless, speaking *operationally*, the execution of a program can also 137 | be seen entirely as a sequence of events — assign a value to a variable, 138 | add two values, etc. These are all events, albeit _internal_ events which 139 | are usually not detectable (unless you are debugging the program.) 140 | 141 | * "Pure" livecoding? 142 | 143 | Examining the previous point, if the internal events of an executing program 144 | are exposed and made observable to an audience... livecoding with that 145 | program would not be tied to some other media like music or dance. This would 146 | be "pure" livecoding. 147 | 148 | It would obviously have a much smaller audience. 149 | 150 | But there are some interesting potentials here. If I assign sounds to be 151 | played when certain things happen in the internal event model of my program, 152 | might that help me debug it? 
Might certain algorithms produce inherently 153 | interesting patterns of events? (cf bytebeat. And things like John Conway's 154 | Game of Life, which is basically a computation, but often pleasing to watch.) 155 | 156 | * Computer is so much faster than me! What chance do I stand! 157 | 158 | If a program is producing a series of events at a musical time scale, then it 159 | is not unlikely that I could write a line of code and execute it in time, before 160 | the next musical sequence comes up, so I can affect the song in real-time. 161 | 162 | If the events are at a computational time scale — that is SO much faster than 163 | even the musical time scale that there is no chance that I could, for example, 164 | modify the computation of a function before it is done. (Only if it is operating 165 | on a very large amount of data, or is an inherently complex function, such as 166 | the Ackermann function, is it really even conceivable.) 167 | 168 | * Coding isn't necessarily general programming. 169 | 170 | "Coding" refers to writing a program in a Turing-complete programming language, 171 | but it can also refer to writing things in much weaker languages. For example 172 | it is possible to say "coding an HTML page". A purist might not want to call 173 | this "programming", maybe instead preferring "configuration". 174 | 175 | I will guess that most livecoding is configuration, not general programming. 176 | This is very reasonable, and a natural consequence of many of the things I've 177 | already mentioned. General programs require engineering, which is slow, and 178 | are hard to debug in part because they're done in "powerful" languages. 179 | 180 | You really, really wouldn't want to have to debug a program during a live 181 | performance! 182 | 183 | At the same time, it would be really neat to have livecoding be about more 184 | than just reconfiguring event streams — it would be awesome to have some kind 185 | of "higher" computational aspect to it too. 186 | 187 | What this all might mean for the design of a livecoding language/environment 188 | ============================================================================ 189 | 190 | * The operator ought to be able to enter a change quickly. But they also ought 191 | to be able to edit it easily before they commit it. 192 | 193 | Assuming we're sticking with the classical computer keyboard as the input 194 | device, this suggests to me that the most frequently used alterations should 195 | be performed with _short sequences of keystrokes which do not require multiple 196 | key combinations_ (so, no "shifted" symbols, like capital letters — of course, 197 | this varies internationally from one keyboard layout to another.) 198 | 199 | * Assuming there are event "generators" or "streams" which produce events that you 200 | want, until told to do otherwise, (for example, a drum track loop,) it seems 201 | like the most obvious operation is **tell a given generator to change what it 202 | will generate**. If we think of the generators as processes (a la Erlang), 203 | this operation is just sending a message to a generator. 204 | 205 | If we are content with 26 generators, we can use lowercase letters to identify 206 | them. Sending a message to a generator is a matter of naming the generator 207 | and then giving it the message you want to send to it. 208 | 209 | * Each generator should be pre-programmed (and perhaps re-programmable on the fly) 210 | to understand a certain set of messages. 
211 | 212 | That is, each generator runs some "code" which reacts to events — a _handler_. 213 | Multiple generators could run the same handler. 214 | 215 | If a generator receives a message it doesn't understand, nothing should happen 216 | (well, maybe some feedback should be given to the operator, but other than that, 217 | nothing.) 218 | 219 | * So as an example, to send the message `p5` to the generator `a`, the operator 220 | might type: 221 | 222 | ap5↵ 223 | 224 | (where ↵ is the Enter key) 225 | 226 | Presumably the handler that generator `a` is running knows what `p5` means. 227 | It might be a command that says "switch to pattern number five at the next 228 | transition." In this case, `p` is the command, and `5` is a parameter to it. 229 | And assuming we have some rules for parsing commands and arguments like this 230 | (the simplest is that a space ends a command sequence) the operator could send 231 | several messages to one or more generators in one action, like 232 | 233 | ap5 bp7 br0↵ 234 | 235 | (Maybe `r0` means "turn off reverb" or something.) Or, if we have even more 236 | knowledge of the syntax, we might be able to shorten that to 237 | 238 | ap5bp7r0↵ 239 | 240 | but this gets tricky, as we might want to have more complex arguments. This 241 | syntax, obviously, needs formalizing. But the point is to demonstrate that you 242 | could have such a syntax. And you could type it quickly enough to do it during 243 | a performance — even a fast tempo performance, if you got practiced at it. 244 | 245 | * It might be useful for the editor to know about the syntax and do "structured 246 | editing", too; so that if, for example, `d` was not a valid message for the 247 | handler running in generator `a`, the editor would not even let you type `ad`. 248 | 249 | * Taking an idea from the design of the piano: maybe SHIFT and CTRL modifier keys 250 | could be foot pedals! (Or is that too weird?) 251 | 252 | * Most changes take effect "at the next transition point". (Computer is so much 253 | faster than me! Music is too, for that matter! But transitions should be 254 | kept smooth.) There might be some exceptions. 255 | 256 | * Programming the generators to know what kind of commands to accept at runtime is 257 | a large part of this. It's akin to accumulating those "licks" when practicing 258 | improvisation — teach the generator a set of tricks, and tell it what tricks to 259 | perform during the performance. Each trick can be arbitrarily complicated (and 260 | you don't have to debug it during the performance!) But the set of known tricks 261 | is limited. *Unless* you have good ways to sensibly combine tricks into new 262 | tricks on the fly (without everything falling apart.) This is where it seems 263 | very interesting. 264 | 265 | * Following that up, generators probably ought to be able to send messages to other 266 | generators. This would let them act as proxies (send a message to one generator, 267 | it translates/filters it, and passes it on to one or more other generators) and 268 | other things, e.g. periodically send "increase amplitude" events to all other 269 | generators to achieve a global crescendo. 270 | 271 | * For that matter, the output devices can be thought of as processes which receive 272 | messages, too; so a generator for producing a percussion track would periodically 273 | be sending "bass" and "snare" messages to the "drum machine output device". Would 274 | this use the same syntax as the operator's syntax for sending messages? 
Ideally, 275 | yes, or at least something similar/compatible. 276 | 277 | * Operator ought to be able to start up a generator with a particular handler (i.e. 278 | the "code" that the generator is running, which knows what to do with particular 279 | events). Switching handlers, and "tearing down" a generator could be handled by 280 | responding to messages, so the language probably doesn't need built-in operations 281 | for those. (But maybe forcibly terminating a generator would be one. And clearing 282 | a generator's message queue could be another — effectively, "cancel" what I've 283 | previously told this generator to do.) 284 | 285 | * This process/message model is very similar to Erlang's in many ways. 286 | 287 | * Multiple video monitors would probably be useful... but regardless, it would be useful 288 | for the operator to see a summary of **which generators are running what handlers** in 289 | real-time. 290 | -------------------------------------------------------------------------------- /sampo/Practical_Matters.markdown: -------------------------------------------------------------------------------- 1 | Practical Matters 2 | ================= 3 | 4 | This document is a collection of notes I've made over the years about the 5 | practical matters of production programming languages — usually stemming 6 | from being irked by some existing programming language's lack of adequate 7 | (in my opinion) support for them. As such, these thoughts may be overblown 8 | and sophistry-laden. But it is nice to have a place to put them. 9 | 10 | Fundamental Abstractions 11 | ------------------------ 12 | 13 | The following facilities should be either built-in to the language, or part 14 | of the standard (highly standardized) libraries: 15 | 16 | * Tracing. Ideally, the programmer should be able to easily browse all the 17 | relevant reduction steps, and the relevant data being manipulated therein, 18 | in the part of the program's execution that interests them. In addition, 19 | this should be something that can be enabled without polluting the source 20 | code (overmuch). 21 | 22 | This could be done, and fairly well, with techniques from aspect-oriented 23 | programming. The rules to describe what to trace (or to highlight in a 24 | full trace) could be specified in what amounts to a configuration file, 25 | and thus be an implementation issue rather than a language issue. 26 | 27 | Unfortunately, this ideal is hard to achieve, so the system should also 28 | support... 29 | 30 | * Logging. Logging is basically an ad-hoc way to explicitly achieve 31 | selective tracing: the programmer knows what points in the program, and 32 | what data, are of interest to them, and outputs that data to the log at 33 | those points. 34 | 35 | Whether this is "debug logging" during development, or to support post- 36 | mortem analysis of issues in production, it amounts to the same thing: 37 | debugging, just on different time scales. 38 | 39 | The use of a "log level" is mostly just a way to filter the trace built 40 | up in the log files. This is not necessarily a bad idea, but it should 41 | probably not be linear; information should be logged based on the reason 42 | that it is being logged, probably in the form of some sort of "tag", and 43 | filterable on that (whether at the time the log is being recorded, or 44 | being read.) 45 | 46 | Logging should not count as a side-effect. 
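To make the "tag" idea concrete, here is a minimal sketch in Python (purely illustrative; the `log` helper, the tag names, and the JSON-lines output format are assumptions of this sketch, not a proposed interface) of logging that is recorded, and later filtered, by tag rather than by a linear level:

    import json, sys, time

    ACTIVE_TAGS = {"billing", "retry"}   # hypothetical: the tags we currently care about

    def log(tags, **values):
        # Record the given values under the given tags; do nothing if no tag is active.
        if not ACTIVE_TAGS.intersection(tags):
            return
        entry = {"time": time.time(), "tags": sorted(tags), "values": values}
        sys.stderr.write(json.dumps(entry) + "\n")

    def read_log(lines, wanted):
        # Filtering happens again at read time, still by tag rather than by severity.
        entries = [json.loads(line) for line in lines]
        return [e for e in entries if wanted.intersection(e["tags"])]

    log({"billing"}, invoice_id=42, state="queued")

Note that a helper written this way still evaluates its keyword arguments eagerly, which is exactly the sort of thing the properties listed next are meant to rule out.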
47 | 48 | The logging function itself should have some properties: 49 | 50 | - Should not have side-effects (for example from evaluating its arguments), 51 | so that if it is not executed (because we are not interested in that 52 | part of the execution trace) the behaviour of the program is not changed. 53 | 54 | - In fact, should ensure that its arguments have no side-effects, and 55 | ideally, be total, with no chance of hanging or crashing. 56 | 57 | - Should pretty-print the relevant values, include the type and other 58 | metadata of the values, and put clearly visible delimeters around the 59 | values so printed. 60 | 61 | - Should include the source filename and line number. 62 | 63 | - Should not be overridable (shadowed? not sure what I meant here.) 64 | 65 | * History. This is more relevant in a language with mutable values, but 66 | as part of tracing, it is useful to know the history of mutations of a 67 | value. With immutable values, it would be useful to be able to view 68 | all the reductions which fed into the computation of the value at a 69 | point. Either way, however, this is expensive, so should be specified 70 | selectively. Again, an external, aspect-like configuration language 71 | for specifying which values to watch makes this an implementation issue. 72 | 73 | * Command-line option parsing. This should not rely on the Unix or DOS 74 | idea of a command line, and it should be unified with parameter passing 75 | in the language itself; calling an executable built in the language with 76 | arguments `a b c` should be no different from calling a function from 77 | within the language with the arguments `a b c` (probably as string values.) 78 | 79 | Reflection 80 | ---------- 81 | 82 | * First-class tracebacks. When a program, for example, encounters an error 83 | parsing an external file such as a configuration file, it should be able to 84 | report the position in that file that caused the error as part of the 85 | traceback, for consistency. Java has some limited facilities for this, and 86 | some Python libraries do this (Jinja2? werkzeug?) using frame hacks, but 87 | a less clumsy solution would be nice. 88 | 89 | * Tracebacks are *not* a special case of logging, or an artefact of throwing 90 | exceptions. Since the traceback is basically a formatted version of the 91 | current continuation, this suggests the two facilities should be unified, 92 | perhaps not totally, but to a high degree. 93 | 94 | Abstractions, not Wrappers 95 | -------------------------- 96 | 97 | The basic principle here is that the existing APIs of most libraries are 98 | (let's be polite) less than ideal, especially when they were designed for 99 | some other language (such as C), and instead of blindly wrapping them in a 100 | new language, the designer should at least *try* to make something nicer. 101 | 102 | The abstractions should also recognize that modern computer systems are 103 | generally not resource-starved (or at least that truly high-level 104 | programming languages should not treat them that way.) 105 | 106 | This applies to very basic facilities as well as what are usually thought 107 | of as external libraries. Specifically, 108 | 109 | * Date and time: We can do better than simply copycatting interfaces like 110 | `strftime`. All time data should be stored consistently, in GMT, always 111 | with a time zone. 112 | 113 | * String formatting: We can do better than simply copycatting interfaces 114 | like `printf`. 
We can use visual formatting strings, where fixed-size 115 | slots appear as fixed-sized placeholders (of the same size) in the 116 | formatting string. (See also the scathing prog21 criticism of the 117 | vertical tab character.) 118 | 119 | * Line-oriented communication: We can look at line-oriented communication 120 | more generally, as a form of record-oriented communication where the 121 | "delimiter set" for each record is {LF, CR, CRLF}. 122 | 123 | The programmer who really wants atavistic interfaces like those mentioned 124 | above can always implement them as "compatibility modules" if they wish. 125 | 126 | Separation from the Implementation 127 | ---------------------------------- 128 | 129 | This is just a repeat of the above section in slightly different terms. 130 | 131 | A language should avoid tying any language construct (e.g. imports, 132 | include files) to the file system or the operating system. Instead, 133 | have mappings between e.g. module names and where they live in the file 134 | system, and between our model of a running computer and a real OS. 135 | These mappings could be specified in configuration files which are 136 | in the domain of the implementation and outside the domain of the 137 | language, i.e. they never appear in programs. 138 | 139 | Standard modules supplied with the language should expose *models* of 140 | commonplace artefacts out in the world, for example operating systems. 141 | The models are similar to the artefacts, in order that the burden of 142 | implementing an interface from the model to any given artefact is not 143 | too great. However, the models are *not* the artefacts. Programs 144 | should be written to the model, not to the artefact. 145 | 146 | People who construct bindings to the language should be encouraged 147 | (only because they can't effectively be required) to create models 148 | more abstract than the libraries that they are binding. 149 | 150 | Insofar as possible, we can have a compiler optimize things so that they 151 | match the underlying architecture. The language should allow and even 152 | encourage definitions in the most general sense; special cases are to be 153 | detected and optimized when they occur, instead of instituting those 154 | special cases into the language itself. 155 | 156 | Another aspect of this point of philosophy is that it should be possible 157 | to specify and change the performance characteristics of the program 158 | (but ideally not its behaviour) from outside the program, using 159 | configuration files. 160 | 161 | This counts as a practical matter because maintaining code which is 162 | cluttered with implementation-specific artefacts is burdensome. 163 | 164 | Serialization 165 | ------------- 166 | 167 | (This section needs to be rewritten) 168 | 169 | - All primitive values must be serializable 170 | - All primitive values must be round-trippable 171 | - All primitive values must thus have an order to them (like Ruby 1.9's 172 | hashes) because in this world of representations, orderless things don't 173 | really exist 174 | - When building user-defined values from primitive values it must be 175 | easy to retain these serialization properties in the composite value 176 | - This is actually fairly agnostic of the particular serialization format 177 | (yaml, xml, binary, etc) 178 | - S-expressions are trivially serializable, except for functions 179 | 180 | Formatting 181 | ---------- 182 | 183 | Closely related to serialization.
184 | 185 | Many languages support a "standard" operation to convert an arbitrary value to 186 | a string. Some even have two (e.g. Python's `str` and `repr`). 187 | 188 | But in reality, there are any number of ways to convert a value to a string. 189 | Why should the string representation of 16 necessarily be `"16"` — why not 190 | `"0xf"` or `"XVI"`? `"16"` is fine, but it should be explicitly noted to be 191 | the default for the reason that it's the most convenient for the audience of 192 | humans who use the decimal Arabic notation when dealing with numbers. 193 | 194 | How can we support both a reasonable (and possibly configurable) default 195 | formatting, as well as any number of other ways to format values which would 196 | be more appropriate in different contexts? 197 | 198 | Can we pass a "style" argument to the string-conversion function? 199 | 200 | Should we establish a "design pattern" for writing formatting functions, and 201 | provide support for implementing such patterns? 202 | 203 | (Also, `format` is probably a better name for this function than `str`.) 204 | 205 | Multiple Environments 206 | --------------------- 207 | 208 | (This section needs to be rewritten) 209 | 210 | - Lots of software runs in multiple environments - "development", "qa", 211 | "production" 212 | - Inherently support that idea 213 | 214 | Assertions 215 | ---------- 216 | 217 | (This section needs to be rewritten) 218 | 219 | - Software engineering is more about defining invariants than writing code. 220 | - An "assert" command which produces details errors in development, but only 221 | logs warnings in production environments 222 | - Very lightweight so that programmers use it without thinking 223 | (Python's `self.assertEqual()` is *not* lightweight) 224 | (Erlang's `A = {foo,B}` IS lightweight) 225 | - So a conditional, by itself, is an assertion. (?) 226 | 227 | Interfaces 228 | ---------- 229 | 230 | (This section needs to be rewritten) 231 | 232 | One way or another, it should be possible to discover (programmatically, 233 | through reflection of some sort) the set of operations that a value supports — 234 | its interface. Each operation has a name and a signature of some sort. 235 | 236 | Collections are interfaces. 237 | 238 | Some parts of an interface might be "private". This — information hiding — 239 | is obviously a somewhat complex topic. The obvious bit is that information 240 | hiding is useful to prevent unintended changes to program state, but it also 241 | hinders debugging and testing. 242 | 243 | Usability 244 | --------- 245 | 246 | Memorization is not a good thing to make programmers do. This can be 247 | addressed by either copying things from an existing language that the 248 | programmer base can be expected to already have memorized, or by providing 249 | a more orthogonal set of things which maps to the culture which programmers, 250 | as people, already live in. (For example, few people in the Western world 251 | do not know that `&` means "and".) 252 | 253 | Non-alphabetic symbols should, idealy, have the same meaning regardless of 254 | the context they're used in — in other words, the language should avoid 255 | using the same symbol for different purposes in different contexts. 256 | 257 | (Lots of languages are lacking here. In C, `*` is both multiplication and 258 | dereferencing. In Python, `.` is both object attribute access and package 259 | hierarchy — although packages are, at least, kind of like objects. 
In Lua, 260 | `=` is both assignment and key value association.) 261 | 262 | Programming Languages vs. Operating Systems 263 | ------------------------------------------- 264 | 265 | (this section needs to be cleaned up — not sure where to put it, and it 266 | arguably doesn't belong here) 267 | 268 | What you see before you in this distribution can be described as a 269 | programming language, but many of the ideas took root while thinking about 270 | operating systems. 271 | 272 | What's the difference between a programming language and an operating system? 273 | 274 | Well, maybe less than you think. 275 | 276 | Programming languages do need to define the environment in which they can 277 | express programs. Sometimes this is a specific OS (like early C on Unix) -- 278 | or they claim to be "portable", but then they're really just defining an 279 | abstraction against all the possible OS'es they think they'll run on. Often 280 | this abstract is clumsy, but some languages put a lot of thought into it, 281 | like Smalltalk. 282 | 283 | Operating systems, on the other hand, don't tell you what programming 284 | language to use -- or do they? A modern OS insists everything is, at some 285 | point, in native machine language, and a running instance will almost always 286 | be limited to a single machine language of a single architecture. Somewhat 287 | more alternative OS'es define a virtual machine language to abstract away 288 | from the concrete machine language. Usually this virtual machine language 289 | looks like a machine language, but sometimes it's a tad more high-level, 290 | like Lisp. Any way you slice it, the OS does sanction a particular, albeit 291 | usually low-level, programming language. 292 | 293 | Where PL's and OS's seem to meet more-or-less neatly is in the idea of the 294 | VM, so let's examine that. 295 | 296 | Most modern virtual machines are designed to implement high-level languages 297 | in a modern operating system environment. The JVM was specifically designed 298 | for running Java, and while .NET was ostensibly designed for multiple 299 | languages, the bytecode is pretty closely tuned to C\#. 300 | 301 | What these VMs were not designed to do, but what a VM "should really" be 302 | designed to do (if it, at least, wants to live up to the name "virtual 303 | machine") is to abstract the *hardware* and provide virtualizations 304 | (abstractions) of the available devices. 305 | 306 | An environment contains zero or more devices. A device exposes zero 307 | or more services. Each service conforms to one or more interfaces. 308 | Each service may additionally require one or more services be available 309 | (by interface). 310 | 311 | At one point I was calling this place where programming language and 312 | operating system meet a "CE" (Computational Environment) because 313 | "operating system" is far too generic-sounding and "programming language" 314 | doesn't address the important environmental aspect here. Whether I would 315 | continue to use the term CE or not, I'm not sure — it could just add to the 316 | confusion. 317 | 318 | How do most programming languages deal with the abstraction of available 319 | (or virtual) devices? Terribly, I would say. Take, as a simple example, 320 | an addressable character screen device. Someone writes a library, in C, 321 | to access it (e.g. `ncurses`,) providing an API comprising C functions 322 | and C structs. Someone then writes a binding or a wrapper (e.g. 
using 323 | `swig`) or otherwise foreign-function interfaces it to the language, usually 324 | exposing the exact same C-level API naively adapted to the programming 325 | language. Then you, the programmer in this language, wrestle with working 326 | with the device almost exactly as a C programmer would, initializing and 327 | releasing it as a C programmer would, with limitations on how you may or 328 | may not use it from multithreaded code like a C programmer would (which 329 | might be brutally different from how the runtime for your programming 330 | language implementation assumes that its world works.) All this, with the 331 | added hassle of having to make sure you have all these bindings for the 332 | device for your chosen implementation of your language built and installed 333 | correctly. 334 | -------------------------------------------------------------------------------- /oozlybub-and-murphy/oozlybub-and-murphy.markdown: -------------------------------------------------------------------------------- 1 | Oozlybub and Murphy 2 | =================== 3 | 4 | Language version 1.1 5 | 6 | Overview 7 | -------- 8 | 9 | This document describes a new programming language. The name of this 10 | language is Oozlybub and Murphy. Despite appearances, this name refers 11 | to a single language. The majority of the language is named Oozlybub. 12 | The fact that the language is not entirely named Oozlybub is named 13 | Murphy. Deal with it. 14 | 15 | For the sake of providing an "olde tyme esoterickal de-sign", the 16 | language combines several unusual features, including multiple 17 | interleaved parse streams, infinitely long variable names, gratuitously 18 | strong typing, and only-conjectural Turing completeness. While no 19 | implementation of the language exists as of this writing, it is thought 20 | to be sufficiently consistent to be implementable, modulo any errors in 21 | this docunemt. 22 | 23 | In places the language may resemble [SMITH][] and [Quylthulg][], but 24 | this was not intended, and the similarities are purely emergent. 25 | 26 | [SMITH]: http://catseye.tc/node/SMITH.html 27 | [Quylthulg]: http://catseye.tc/node/Quylthulg.html 28 | 29 | Program Structure 30 | ----------------- 31 | 32 | A Oozlybub and Murphy program consists of a number of variables and a 33 | number of objects called _dynasts_. A Oozlybub and Murphy program text 34 | consists of multiple parse streams. Each parse stream contains zero or 35 | more variable declarations, and optionally a single dynast. 36 | 37 | ### Parse Streams 38 | 39 | A parse stream is just a segment, possibly non-contiguous, of the text 40 | of a Oozlybub and Murphy program. A program starts out with a single 41 | parse stream, but certain parse stream manipulation pragmas can change 42 | this. These pragmas have the form `{@x}` and have a similar syntactic 43 | status as comments; they can appear anywhere except inside a lexeme. 44 | 45 | Parse streams are arranged as a ring (a cyclic doubly linked list.) When 46 | parsing of the program text begins initially, there is already a single 47 | pre-created parse stream. When the program text ends, all parse streams 48 | which may be active are deleted. 49 | 50 | The meanings of the pragmas are: 51 | 52 | - `{@+}` Create a new parse stream to the right of the current one. 53 | - `{@>}` Switch to the parse stream to the right of the current one. 54 | - `{@<}` Switch to the parse stream to the left of the current one. 55 | - `{@-}` Delete the current parse stream. 
The parse stream to the left 56 | of the deleted parse stream will become the new current parse 57 | stream. 58 | 59 | Deleting a parse stream while it contains an unfinished syntactic 60 | construct is a syntax error, just as an end-of-file in that circumstance 61 | would be in most other languages. 62 | 63 | Providing a concrete example of parse streams in action will be 64 | difficult in the absence of defined syntax for the rest of Oozlybub and 65 | Murphy, so we will, for the purposes of the following demonstration 66 | only, pretend that the contents of a parse stream is a sentence of 67 | English. Here is how three parse streams might be managed: 68 | 69 | `The quick {@+}brown{@>}Now is the time{@<}fox{@<} for all good men to {@+}{@>}Wherefore art thou {@>} jumped over {@>}{@>}Romeo?{@-} come to the aid of {@>}the lazy dog's tail.{@-}their country.{@-}` 70 | 71 | ### Variables 72 | 73 | All variables are declared in a block at the beginning of a parse 74 | stream. If there is also a dynast in that stream, the variables are 75 | private to that dynast; otherwise they are global and shared by all 76 | dynasts. (*Defined in 1.1*) Any dynamically created dynast gets its own 77 | private copies of any private variables the original dynast had; they 78 | will initially hold the values they had in the original, but they are 79 | not shared. 80 | 81 | The name of a variable in Oozlybub and Murphy is not a fixed, 82 | finite-length string of symbols, as you would find in other programming 83 | languages. No sir! In Oozlybub and Murphy, each variable is named by a 84 | possibly-infinite set of strings (over the alphanumeric-plus-spaces 85 | alphabet `[a-zA-Z0-9 ]`), at least one of which must be infinitely long. 86 | (*New in 1.1*: spaces [but no other kinds of whitespace] are allowed in 87 | these strings.) 88 | 89 | To accomodate this method of identifying a variable, in Oozlybub and 90 | Murphy programs, which are finite, variables are identified using 91 | regular expressions which match their set of names. An equivalence class 92 | of regular expressions is a set of all regular expressions which accept 93 | exactly the same set of strings; each equivalence class of regular 94 | expressions refers to the same, unique Oozlybub and Murphy variable. 95 | 96 | (In case you wonder about the implementability of this: Checking that 97 | two regular expressions are equivalent is decidable: we convert them 98 | both to NFAs, then to DFAs, then minimize those DFAs, then check if the 99 | transition graphs of those DFAs are isomorphic. Checking that the 100 | regular expression accepts at least one infinitely-long string is also 101 | decidable: just look for a cycle in the DFA's graph.) 102 | 103 | Note that these identifier-sets need not be disjoint. `/ma*/` and 104 | `/mb*/` are distinct variables, even though both contain the string `m`. 105 | (Note also that we are fudging slightly on how we consider to have 106 | described an infinitely long name; technically we would want to have a 107 | Büchi automaton that specifies an unending repetition with ^ω^ instead 108 | of \*. But the distinction is subtle enough in this context that we're 109 | gonna let it slide.) 
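As a rough illustration of the second of those checks (not part of the language; the DFA encoding used here, a dict of per-state transition dicts, is purely an assumption of this sketch), a name pattern denotes an infinitely long name, in the fudged sense above, exactly when its DFA has a state that is reachable from the start, can still reach an accepting state, and lies on a cycle:

    def live_states(start, delta, accepting):
        # States reachable from the start state...
        reached, stack = set(), [start]
        while stack:
            q = stack.pop()
            if q not in reached:
                reached.add(q)
                stack.extend(delta.get(q, {}).values())
        # ...and from which an accepting state can still be reached.
        useful, changed = set(accepting), True
        while changed:
            changed = False
            for q, moves in delta.items():
                if q not in useful and any(r in useful for r in moves.values()):
                    useful.add(q)
                    changed = True
        return reached & useful

    def has_infinite_name(start, delta, accepting):
        # Depth-first search for a cycle confined to the live states.
        live = live_states(start, delta, accepting)
        colour = dict.fromkeys(live, "white")
        def dfs(q):
            colour[q] = "grey"
            for r in delta.get(q, {}).values():
                if r in live and (colour[r] == "grey" or
                                  (colour[r] == "white" and dfs(r))):
                    return True
            colour[q] = "black"
            return False
        return any(colour[q] == "white" and dfs(q) for q in live)

    # /pp*/ as a DFA: state 0 --p--> 1, state 1 --p--> 1, accepting state {1}
    print(has_infinite_name(0, {0: {"p": 1}, 1: {"p": 1}}, {1}))   # True

The equivalence check described above (determinize, minimize, compare transition graphs) could be sketched in the same style, but is longer; the point is only that both checks are mechanical.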
110 | 111 | Syntax for giving a variable name is fairly straightforward: it is 112 | delimited on either side by `/` symbols; the alphanumeric symbols are 113 | literals; textual concatenation is regular expression sequencing, `|` is 114 | alteration, `(` and `)` increase precedence, and `*` is Kleene 115 | asteration (zero or more occurrences). Asteration has higher precedence 116 | than sequencing, which has higher precedence than alteration. Because 117 | none of these operators is alphanumeric nor a space, no escaping scheme 118 | needs to be installed. 119 | 120 | Variables are declared with the following syntax (`i` and `a` are the 121 | types of the variables, described in the next section): 122 | 123 | VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/. 124 | 125 | This declares an integer variable identified by the names {`p`, `pp`, 126 | `ppp`, ...}, an integer variable identified by the names {`q`, `qq`, 127 | `qqq`, ...}, and an array variable identified by the names of all 128 | strings of `0`'s and `1`'s. 129 | 130 | When not in wimpmode (see below), any regular expression which denotes a 131 | variable may not be literally repeated anywhere else in the program. So 132 | in the above example, it would not be legal to refer to `/pp*/` further 133 | down in the program; an equivalent regular expression, such as 134 | `/p|ppp*/` or `/p*p/` or `/pp*|pp*|pp*/` would have to be used instead. 135 | 136 | ### Types 137 | 138 | Oozlybub and Murphy is a statically-typed language, in that variables as 139 | well as values have types, and a value of one type cannot be stored in a 140 | variable of another type. The types of values, however, are not entirely 141 | disjoint, as we will see, and special considerations may arise for 142 | checking and conversion because of this. 143 | 144 | The basic types are: 145 | 146 | - `i`, the type of integers. 147 | 148 | These are integers of unbounded extent, both positive and negative. 149 | Literal constants of type `i` are given in the usual decimal format. 150 | Variables of this type initially contain the value 0. 151 | 152 | - `p`, the type of prime numbers. 153 | 154 | All prime numbers are integers but not all integers are prime 155 | numbers. Thus, values of prime number type will automatically be 156 | coerced to integers in contexts that require integers; however the 157 | reverse is not true, and in the other direction a conversion 158 | function (`P?`) must be used. There are no literal constants of type 159 | `p`. Variables of this type initially contain the value 2. 160 | 161 | - `a`, the type of arrays of integers. 162 | 163 | An integer array has an integer index which is likewise of unbounded 164 | extent, both positive and negative. Variables of this type initially 165 | contain an empty array value, where all of the entries are 0. 166 | 167 | - `b`, the type of booleans. 168 | 169 | A boolean has two possible values, `true` and `false`. Note that 170 | there are no literal constants of type `b`; these must be specified 171 | by constructing a tautology or contradiction with boolean (or other) 172 | operators. It is illegal to retrieve the value of a variable of this 173 | type before first assigning it, except to construct a tautology or 174 | contradiction. 175 | 176 | - `t`, the type of truth-values. 177 | 178 | A truth-value has two possible values, `yes` and `no`. There are no 179 | literal constants of type `t`. 
It is illegal to retrieve the value 180 | of a variable of this type before first assigning it, except to 181 | construct a tautology or contradiction. 182 | 183 | - `z`, the type of bits. 184 | 185 | A bit has two possible values, `one` and `zero`. There are no 186 | literal constants of type `z`. It is illegal to retrieve the value 187 | of a variable of this type before first assigning it, except to 188 | construct a tautology or contradiction. 189 | 190 | - `c`, the type of conditions. 191 | 192 | A condition has two possible values, `go` and `nogo`. There are no 193 | literal constants of type `c`. It is illegal to retrieve the value 194 | of a variable of this type before first assigning it, except to 195 | construct a tautology or contradiction. 196 | 197 | ### Wimpmode 198 | 199 | (*New in 1.1*) An Oozlybub and Murphy program is in wimpmode if it 200 | declares a global variable of integer type which matches the string 201 | `am a wimp`, for example: 202 | 203 | VARIABLES ARE i /am *a *wimp/. 204 | 205 | Certain language constructs, noted in this document as such, are only 206 | permissible in wimpmode. If they are used in a program in which wimpmode 207 | is not in effect, a compile-time error shall occur and the program shall 208 | not be executed. 209 | 210 | ### Dynasts 211 | 212 | Each dynast is labeled with a positive integer and contains an 213 | expression. Only one dynast may be denoted in any given parse stream, 214 | but dynasts may also be created dynamically during program execution. 215 | 216 | Program execution begins at the lowest-numbered dynast that exists in 217 | the initial program. When a dynast is executed, the expression of that 218 | dynast is evaluated for its side-effects. If there is a dynast labelled 219 | with the next higher integer (i.e. the successor of the label of the 220 | current dynast), execution continues with that dynast; otherwise, the 221 | program halts. Once a dynast has been executed, it continues to exist 222 | until the program halts, but it may never be executed again. 223 | 224 | Evaluation of an expression may have side-effects, including writing 225 | characters to an output channel, reading characters from an input 226 | channel, altering the value of a variable, and creating a new dynast. 227 | 228 | Dynasts are written with the syntax `dynast(label) <-> expr`. A concrete 229 | example follows: 230 | 231 | dynast(100) <-> for each prime /p*/ below 1000 do write (./p*|p/+1.) 232 | 233 | ### TRIVIA PORTION OF SHOW 234 | 235 | WHO WAS IT FAMOUS MAN THAT SAID THIS? 236 | 237 | - A) RONALD REAGAN 238 | - B) RONALD REAGAN 239 | - B) RONALD STEWART 240 | - C) RENALDO 241 | 242 | contestant enters lightning round now 243 | 244 | ### Expressions 245 | 246 | In the following, the letter preceding -expr or -var indicates the 247 | expected type, if any, of that expression or variable. Where the 248 | expressions listed below are infix expressions, they are listed from 249 | highest to lowest precedence. Unless noted otherwise, subexpressions are 250 | evaluated left to right. 251 | 252 | - `(.expr.)` 253 | 254 | Surrounding an expression with dotted parens gives it that 255 | precedence boost that's just the thing to have it be evaluated 256 | before the expression it's in, but there is a catch. 
The number of 257 | parens in the dotted parens expression must match the nesting depth 258 | in the following way: if a set of dotted parens is nested within n 259 | dotted parens, it must contain fib(n) parens, where fib(n) is the 260 | nth member of the Fibonacci sequence. For example, `(.(.0.).)` and 261 | `(.(.((.(((.(((((.0.))))).))).)).).)` are syntactically well-formed 262 | expressions (when not nested in any other dotted paren expression), 263 | but `(.(((.0.))).)` and `(.(.(.0.).).)` are not. 264 | 265 | - `var` 266 | 267 | A variable evaluates to the value it contains at that point in 268 | execution. 269 | 270 | - `0`, `1`, `2`, `3`, etc. 271 | 272 | Decimal literals evaluate to the expected value of type `i`. 273 | 274 | - `#myself#` 275 | 276 | This special nullary token evaluates to the numeric label of the 277 | currently executing dynast. 278 | 279 | - `var := expr` 280 | 281 | Evaluates the expr and stores the result in the specified variable. 282 | The variable and the expression must have the same type. Evaluates 283 | to whatever expr evaluated to. 284 | 285 | - `a-expr [i-expr]` 286 | 287 | Evaluates to the `i` stored at the location in the array given by 288 | i-expr. 289 | 290 | - `a-expr [i-expr] := i-expr` 291 | 292 | Evaluates the second i-expr and stores the result in the location in 293 | the array given by the first i-expr. Evaluates to whatever the 294 | second i-expr evaluated to. 295 | 296 | - `a-expr ? i-expr` 297 | 298 | Evaluates to `go` if `a-expr [i-expr]` and `i-expr` evaluate to the 299 | same thing, `nogo` otherwise. The i-expr is only evaluated once. 300 | 301 | - `minus i-expr` 302 | 303 | Evaluate to the integer that is zero minus the result of evaluating 304 | i-expr. 305 | 306 | - `write i-expr` 307 | 308 | Write the Unicode code point whose number is obtained by evaluating 309 | i-expr, to the standard output channel. Writing a negative number 310 | shall produce one of a number of amusing and informative messages 311 | which are not defined by this document. 312 | 313 | - `#read#` 314 | 315 | Wait for a Unicode character to become available on the standard 316 | input channel and evaluate to its integer code point value. 317 | 318 | - `not? z-expr` 319 | 320 | Converts a bit value to a boolean value (`zero` becomes `true` and 321 | `one` becomes `false`). 322 | 323 | - `if? b-expr` 324 | 325 | Converts a boolean value to condition value (true becomes go and 326 | false becomes nogo). 327 | 328 | - `cvt? c-expr` 329 | 330 | Converts a condition value to a truth-value (`go` becomes `yes` and 331 | `nogo` becomes `no`). 332 | 333 | - `to? t-expr` 334 | 335 | Converts a truth-value to a bit value (`yes` becomes `one` and `no` 336 | becomes `zero`). 337 | 338 | - `P? i-expr [t-var]` 339 | 340 | If the result of evaluating i-expr is a prime number, evaluates to 341 | that prime number (and has the type `p`). If it is not prime, stores 342 | the value `no` into t-var and evaluates to 2. 343 | 344 | - `i-expr * i-expr` 345 | 346 | Evaluates to the product of the two i-exprs. The result is never of 347 | type `p`, but the implementation doesn't need to do anything based 348 | on that fact. 349 | 350 | - `i-expr + i-expr` 351 | 352 | Evaluates to the sum of the two i-exprs. 353 | 354 | - `exists/dynast i-expr` 355 | 356 | Evaluates to `one` if a dynast exists with the given label, or 357 | `zero` if one does not. 358 | 359 | - `copy/dynast i-expr, p-expr, p-expr` 360 | 361 | Creates a new dynast based on an existing one. 
The existing one is 362 | identified by the label given in the i-expr. The new dynast is a 363 | copy of the existing dynast, but with a new label. The new label is 364 | the sum of the two p-exprs. If a dynast with that label already 365 | exists, the program terminates. (*Defined in 1.1*) This expression 366 | evaluates to the value of the given i-expr. 367 | 368 | - `create/countably/many/dynasts i-expr, i-expr` 369 | 370 | Creates a countably infinite number of dynasts based on an existing 371 | one. The existing one is identified by the label given in the first 372 | i-expr. The new dynasts are copies of the existing dynast, but with 373 | new labels. The new labels start at the first odd integer greater 374 | than the second i-expr, and consist of every odd integer greater 375 | than that. If any dynast with such a label already exists, the 376 | program terminates. (*Defined in 1.1*) This expression evaluates to 377 | the value of the first given i-expr. 378 | 379 | - `b-expr and b-expr` 380 | 381 | Evaluates to `one` if both b-exprs are `true`, `zero` otherwise. 382 | Note that this is not short-circuting; both b-exprs are evaluated. 383 | 384 | - `c-expr or c-expr` 385 | 386 | Evaluates to `yes` if either or both c-exprs are `go`, `no` 387 | otherwise. Note that this is not short-circuting; both c-exprs are 388 | evaluated. 389 | 390 | - `do expr` 391 | 392 | Evaluates the expr, throws away the result, and evaluates to `go`. 393 | 394 | - `c-expr then expr` 395 | 396 | **Wimpmode only.** Evaluates the c-expr on the left-hand side for 397 | its side-effects only, throwing away the result, then evaluates to 398 | the result of evaluating the right-hand side expr. 399 | 400 | - `c-expr ,then i-expr` 401 | 402 | (*New in 1.1*) Evaluates the c-expr on the left-hand side; if it is 403 | `go`, evaluates to the result of evaluating the right-hand side 404 | i-expr; if it is `nogo`, evaluates to an unspecified and quite 405 | possibly random integer between 1 and 1000000 inclusive, without 406 | evaluating the right-hand side. Note that this operator has the same 407 | precedence as `then`. 408 | 409 | - `for each prime var below i-expr do i-expr` 410 | 411 | The var must be a declared variable of type `p`. The first i-expr 412 | must evaluate to an integer, which we will call k. The second i-expr 413 | is evaluated once for each prime number between k and 2, inclusive; 414 | each time it is evaluated, var is bound to a successively smaller 415 | prime number between k and 2. (*Defined in 1.1*) Evaluates to the 416 | result of the final evaluation of the second i-expr. 417 | 418 | ### Grammar 419 | 420 | This section attempts to capture and summarize the syntax rules (for a 421 | single parse stream) described above, using an EBNF-like syntax extended 422 | with a few ad-hoc annotations that I don't feel like explaining right 423 | now. 424 | 425 | ParseStream ::= VarDeclBlock {DynastLit}. 426 | VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".". 427 | VarDecl ::= TypeSpec VarName. 428 | TypeSpec ::= "i" | "p" | "a" | "b" | "t" | "z" | "c". 429 | VarName ::= "/" Pattern "/". 430 | Pattern ::= {[a-zA-Z0-9 ]} 431 | | Pattern "|" Pattern /* ignoring precedence here */ 432 | | Pattern "*" /* and here */ 433 | | "(" Pattern ")". 434 | DynastLit ::= "dynast" "(" Gumber ")" "<->" Expr. 435 | Expr ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}. 436 | Expr1 ::= Expr2[c] {"or" Expr2[c]}. 437 | Expr2 ::= Expr3[b] {"and" Expr3[b]}. 438 | Expr3 ::= Expr4[i] {"+" Expr4[i]}. 
439 | Expr4 ::= Expr5[i] {"*" Expr5[i]}. 440 | Expr5 ::= Expr6[a] {"?" Expr6[i]}. 441 | Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]]. 442 | Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */ 443 | | VarName [":=" Expr] 444 | | Gumber 445 | | "#myself#" 446 | | "minus" Expr[i] 447 | | "write" Expr[i] 448 | | "#read#" 449 | | "not?" Expr[z] 450 | | "if?" Expr[b] 451 | | "cvt?" Expr[c] 452 | | "to?" Expr[t] 453 | | "P?" Expr[i] 454 | | "exists/dynast" Expr[i] 455 | | "copy/dynast" Expr[i] "," Expr[p] "," Expr[p] 456 | | "create/countably/many/dynasts" 457 | Expr[i] "," Expr[i] 458 | | "do" Expr 459 | | "for" "each" "prime" VarName "below" 460 | Expr[i] "do" Expr[i]. 461 | Gumber ::= {[0-9]}. 462 | 463 | ### Boolean Idioms 464 | 465 | Here we show how we can get any value of any of the `b`, `t`, `z`, and 466 | `c` types, without any constants or variables with known values of these 467 | types. 468 | 469 | VARIABLES ARE b /b*/. 470 | zero = /b*|b/ and not? to? cvt? if? /b*|b*/ 471 | true = not? zero 472 | go = if? true 473 | yes = cvt? go 474 | one = to? yes 475 | false = not? one 476 | nogo = if? false 477 | no = cvt? nogo 478 | 479 | ### Computational Class 480 | 481 | Because the single in-dynast looping construct, `for each prime below`, 482 | is always a finite loop, the execution of any fixed number of dynasts 483 | cannot be Turing-complete. We must create new dynasts at runtime, and 484 | continue execution in them, if we want any chance at being 485 | Turing-complete. We demonstrate this by showing an example of a 486 | (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which 487 | will doubtless come in handy in real programs. 488 | 489 | VARIABLES ARE p /p*/, p /q*/. 490 | dynast(3) <-> 491 | (. do (. if? not? exists/dynast 5 ,then 492 | create/countably/many/dynasts #myself#, 5 .) .) ,then 493 | (. for each prime /p*|p/ below #myself#+2 do 494 | for each prime /q*|q/ below /p*|pp/+1 do 495 | if? not? exists/dynast /p*|p|p/+/q*|q|q/ ,then 496 | copy/dynast #myself#, /p*|ppp/, /q*|qqq/ .) 497 | 498 | As you can see, the ability to loop indefinitely in Oozlybub and Murphy 499 | hinges on whether Goldbach's Conjecture is true or not. Looping forever 500 | requires creating an unbounded number of new dynasts. We can create all 501 | the odd-numbered dynasts at once, but that won't be enough to loop 502 | forever, as we must proceed to the next highest numbered dynast after 503 | executing a dynast. So we must create new dynasts with successively 504 | higher even integer labels, and these can only be created by summing two 505 | primes. So, if Goldbach's conjecture is false, then there is some even 506 | number greater than two which is not the sum of two primes; thus there 507 | is some dynast that cannot be created by a running Oozlybub and Murphy 508 | program, thus it is not possible to loop forever in Oozlybub and Murphy, 509 | thus Oozlybub and Murphy is not Turing-complete (because it cannot 510 | simulate any Turing machine that loops forever.) 511 | 512 | It should not however be difficult to show that Oozlybub and Murphy is 513 | Turing-complete under the assumption that Goldbach's Conjecture is true. 514 | If Goldbach's Conjecture is true, then the above program is an infinite 515 | loop. We need only add to it appropriate conditional instructions to, 516 | say, simulate the execution of an arbitrarily-chosen Turing machine. An 517 | array can serve as the tape, and an integer can serve as the head. 
518 | Another integer can serve as the state of the finite control. The 519 | integer can be tested against various fixed integers by establishing an 520 | array for each of these fixed integers and using the `?` operator 521 | against each in turn; each branch can mutate the tape, tape head, and 522 | finite control as desired. The program can halt by neglecting to create 523 | a new even dynast to execute next, or by trying to create a dynast with 524 | a label that already exists. 525 | 526 | Happy FLIMPING, 527 | Chris Pressey 528 | December 1, 2010 529 | Evanston, Illinois, USA 530 | -------------------------------------------------------------------------------- /madison/Madison.markdown: -------------------------------------------------------------------------------- 1 | Madison 2 | ======= 3 | 4 | Version 0.1 5 | December 2011, Chris Pressey, Cat's Eye Technologies 6 | 7 | Abstract 8 | -------- 9 | 10 | Madison is a language in which one can state proofs of properties 11 | of term-rewriting systems. Classical methods of automated reasoning, 12 | such as resolution, are not used; indeed, term-rewriting itself is 13 | used to check the proofs. Both direct proof and proof by induction 14 | are supported. Induction in a proof must be across a structure which 15 | has a well-founded inductive definition. Such structures can be 16 | thought of as types, although this is largely nominal; the traditional 17 | typelessness of term-rewriting systems is largely retained. 18 | 19 | Term-rewriting 20 | -------------- 21 | 22 | Madison has at its core a simple term-rewriting language. It is of 23 | a common form which should be unsurprising to anyone who has worked 24 | at all with term rewriting. A typical simple program contains 25 | a set of rules and a term on which to apply those rules. Each 26 | rule is a pair of terms; either term of the pair may contain 27 | variables, but any variable that appears on the r.h.s. must also 28 | appear on the l.h.s. A rule matches a term if it is the same as 29 | the term with the exception of the variables, which are bound 30 | to its subterms; applying a matching rule replaces the term 31 | with the r.h.s. of the rule, with the variables expanded approp- 32 | riately. Rules are applied in an innermost, leftmost fashion to 33 | the term, corresponding to eager evaluation. Rewriting terminates 34 | when there is no rule whose l.h.s. matches the current incarnation 35 | of the term being rewritten. 36 | 37 | A term is either an atom, which is a symbol that stands alone, 38 | or a constructor, which is a symbol followed by a comma-separated list 39 | of subterms enclosed in parentheses. Symbols may consist of letters, 40 | digits, and hyphens, with no intervening whitespace. A symbol is 41 | a variable symbol if it begins with a capital letter. Variable 42 | symbols may also begin with underscores, but these may only occur 43 | in the l.h.s. of a rewrite rule, to indicate that we don't care 44 | what value is bound to the variable and we won't be using it on 45 | the r.h.s. 46 | 47 | (The way we are using the term "constructor" may be slightly non- 48 | standard; in some other sources, this is called a "function symbol", 49 | and a "constructor" is a subtly different thing.) 50 | 51 | Because the rewriting language is merely a component (albeit the 52 | core component) of a larger system, the aforementioned typical 53 | simple program must be cast into some buttressing syntax.
A full 54 | program consists of a `let` block which contains the rules 55 | and a `rewrite` admonition which specifies the term to be re- 56 | written. An example follows. 57 | 58 | | let 59 | | leftmost(tree(X,Y)) -> leftmost(X) 60 | | leftmost(leaf(X)) -> X 61 | | in 62 | | rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan))) 63 | = alice 64 | 65 | In the above example, there are two rules for the constructor 66 | `leftmost/1`. The first is applied to the outer tree to obtain 67 | a new leftmost constructor containing the inner tree; the first 68 | is applied again to obtain a new leftmost constructor containing 69 | the leaf containing `alice`; and the second is applied to that 70 | leaf term to obtain just `alice`. At that point, no more rules 71 | apply, so rewriting terminates, yielding `alice`. 72 | 73 | Madison is deterministic; if rules overlap, the first one given 74 | (syntactically) is used. For this reason, it is a good idea 75 | to order rules from most specific to least specific. 76 | 77 | I used the phrase "typical simple program" above because I was 78 | trying intentionally to avoid saying "simplest program". In fact, 79 | technically no `let` block is required, so you can write some 80 | really trivial Madison programs, like the following: 81 | 82 | | rewrite cat 83 | = cat 84 | 85 | I think that just about covers the core term-rewriting language. 86 | Term-rewriting is Turing-complete, so Madison is too. If you 87 | wish to learn more about term rewriting, there are several good 88 | books and webpages on the subject; I won't go into it further 89 | here. 90 | 91 | Proof-Checking 92 | -------------- 93 | 94 | My desire with Madison was to design a language in which you 95 | can prove things. Not a full-blown theorem prover -- just a 96 | proof checker, where you supply a proof and it confirms either 97 | that the proof holds or doesn't hold. (Every theorem prover 98 | has at its core a proof checker, but it comes bundled with a lot of 99 | extra machinery to search the space of possible proofs cleverly, 100 | looking for one which will pass the proof-checking phase.) 101 | 102 | It's no coincidence that Madison is built on top of a term-rewriting 103 | language. For starters, a proof is very similar to the execution 104 | trace of a term being rewritten. In each of the steps of the proof, 105 | the statement to be proved is transformed by replacing some part 106 | of it with some equally true thing -- in other words, rewritten. 107 | In fact, Post Canonical Systems were an early kind of rewriting 108 | system, devised by Emil Post to (as I understand it) illustrate this 109 | similarity, and to show that proofs could be mechanically carried out 110 | in a rewriting system. 111 | 112 | So: given a term-rewriting language, we can give a trivial kind 113 | of proof simply by stating the rewrite steps that *should* occur 114 | when a term is rewritten, and check that proof by rewriting the term 115 | and confirming that those were in fact the steps that occurred. 116 | 117 | For the purpose of stating these sequences of rewrite steps to be 118 | checked, Madison has a `theorem..proof..qed` form. To demonstrate 119 | this form, let's use Madison to prove that 2 + 2 = 4, using Peano 120 | arithmetic. 
121 | 122 | | let 123 | | add(s(X),Y) -> add(X,s(Y)) 124 | | add(z,Y) -> Y 125 | | in theorem 126 | | add(s(s(z)),s(s(z))) ~> s(s(s(s(z)))) 127 | | proof 128 | | add(s(s(z)),s(s(z))) 129 | | -> add(s(z),s(s(s(z)))) [by add.1] 130 | | -> add(z,s(s(s(s(z))))) [by add.1] 131 | | -> s(s(s(s(z)))) [by add.2] 132 | | qed 133 | = true 134 | 135 | The basic syntax should be fairly apparent. The `theorem` block 136 | contains the statement to be proved. The `~>` means "rewrites 137 | in zero or more steps to". So, here, we are saying that 2 + 2 138 | (in Peano notation) rewrites, in zero or more steps, to 4. 139 | 140 | The `proof` block contains the actual series of rewrite steps that 141 | should be carried out. For elucidation, each step may name the 142 | particular rule which is applied to arrive at the transformed term 143 | at that step. Rules are named by their outermost constructor, 144 | followed by a dot and the ordinal position of the rule in the list 145 | of rules. These rule-references are optional, but the fact that 146 | the rule so named was actually used to rewrite the term at that step 147 | could be checked too, of course. The `qed` keyword ends the proof 148 | block. 149 | 150 | Naturally, you can also write a proof which does not hold, and 151 | Madison should inform you of this fact. 2 + 3, for example, 152 | does not equal 4, and it can pinpoint exactly where you went 153 | wrong should you come to this conclusion: 154 | 155 | | let 156 | | add(s(X),Y) -> add(X,s(Y)) 157 | | add(z,Y) -> Y 158 | | in theorem 159 | | add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z)))) 160 | | proof 161 | | add(s(s(z)),s(s(s(z)))) 162 | | -> add(s(z),s(s(s(s(z))))) [by add.1] 163 | | -> add(z,s(s(s(s(z))))) [by add.1] 164 | | -> s(s(s(s(z)))) [by add.2] 165 | | qed 166 | ? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step 167 | 168 | Now, while these *are* proofs, they don't tell us much about the 169 | properties of the terms and rules involved, because they are not 170 | *generalized*. They say something about a few fixed values, like 171 | 2 and 4, but they do not say anything about any *infinite* 172 | sets of values, like the natural numbers. Now, that would be *really* 173 | useful. And, while I could say that what you've seen of Madison so far 174 | is a proof checker, it is not a very impressive one. So let's take 175 | this further. 176 | 177 | Quantification 178 | -------------- 179 | 180 | To state a generalized proof, we will need to introduce variables, 181 | and to have variables, we will need to be able to say what those 182 | variables can range over; in short, we need *quantification*. Since 183 | we're particularly interested in making statements about infinite 184 | sets of values (like the natural numbers), we specifically want 185 | *universal quantification*: 186 | 187 | For all x, ... 188 | 189 | But to have universal quantification, we first need a *universe* 190 | over which to quantify. When we say "for all /x/", we generally 191 | don't mean "any and all things of any kind which we could 192 | possibly name /x/". Rather, we think of /x/ as having a type of 193 | some kind: 194 | 195 | For all natural numbers x, ... 196 | 197 | Then, if our proof holds, it holds for all natural numbers. 198 | No matter what integer value greater than or equal to zero 199 | we choose for /x/, the truism contained in the proof remains true. 200 | This is the sort of thing we want in Madison. 
201 | 202 | Well, to start, there is one glaringly obvious type in any 203 | term-rewriting language, namely, the term. We could say 204 | 205 | For all terms t, ... 206 | 207 | But it would not actually be very interesting, because terms 208 | are so general and basic that there's not actually very much you 209 | can say about them that you don't already know. You sort of need 210 | to know the basic properties of terms just to build a term-rewriting 211 | language (like the one at Madison's core) in the first place. 212 | 213 | The most useful property of terms as far as Madison is concerned is 214 | that the subterm relationship is _well-founded_. In other words, 215 | in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are 216 | finite, any series of rewrites which always results in "smaller" terms 217 | will eventually terminate. For completeness, we should probably prove 218 | that rigorously, but for expediency we will simply take it as a given 219 | fact for our proofs. 220 | 221 | Anyway, to get at something actually interesting, we must look further 222 | than the terms themselves. 223 | 224 | Types 225 | ----- 226 | 227 | What's actually interesting is when you define a restricted 228 | set of forms that terms can take, and you distinguish terms inside 229 | this set of forms from the terms outside the set. For example, 230 | 231 | | let 232 | | boolean(true) -> true 233 | | boolean(false) -> true 234 | | boolean(_) -> false 235 | | in 236 | | rewrite boolean(false) 237 | = true 238 | 239 | We call a set of forms like this a _type_. As you can see, we 240 | have basically written a predicate that defines our type. If any 241 | of the rewrite rules in the definition of this predicate rewrite 242 | a given term to `true`, that term is of our type; if it rewrites 243 | to `false`, it is not. 244 | 245 | Once we have types, any constructor may be said to have a type. 246 | By this we mean that no matter what subterms the constructor has, 247 | the predicate of the type of which we speak will always reduce to 248 | `true` when that term is inserted in it. 249 | 250 | Note that using predicates like this allows our types to be 251 | non-disjoint; the same term may reduce to true in two different 252 | predicates. My first sketches for Madison had disjoint types, 253 | described by rules which reduced each term to an atom which named 254 | the type of that term. (So the above would have been written with 255 | rules `true -> boolean` and `false -> boolean` instead.) However, 256 | while that method may be, on the surface, more elegant, I believe 257 | this approach better reflects how types are actually used in 258 | programming. At the end of the day, every type is just a predicate, 259 | and there is nothing stopping 2 from being both a natural number and 260 | an integer. And, for that matter, a rational number and a real 261 | number. 262 | 263 | In theory, every predicate is a type, too, but that's where things 264 | get interesting. Is 2 not also an even number, and a prime number? 265 | And in an appropriate (albeit contrived) language, is it not a 266 | description of a computation which may or may not always halt? 267 | 268 | The Type Syntax 269 | --------------- 270 | 271 | The above considerations motivate us to be careful when dealing 272 | with types. We should establish some ground rules so that we 273 | know that our types are useful to universally quantify over. 
274 | 
275 | Unfortunately, this introduces something of a chicken-and-egg
276 | situation, as our ground rules will be using logical connectives,
277 | while at the same time they will be applied to those logical
278 | connectives to ensure that they are sound.  This is not, actually,
279 | a big deal; I mention it here more because it is interesting.
280 | 
281 | So, the rules which define our type must conform to certain
282 | rules, themselves.  While it would be possible to allow the
283 | Madison programmer to use any old bunch of rewrite rules as a
284 | type, and to check that these rules make for a "good" type when
285 | such a usage is seen -- and while this would be somewhat
286 | attractive from the standpoint of proving properties of term-
287 | rewriting systems using term-rewriting systems -- it's not strictly
288 | necessary to use a descriptive approach such as this, and there are
289 | certain organizational benefits we can achieve by taking a more
290 | prescriptive tack.
291 | 
292 | Viz., we introduce a special syntax for defining a type with a
293 | set of rules which function collectively as a type predicate.
294 | Again, it's not strictly necessary to do this, but it does
295 | help organize our code and perhaps our thoughts, and perhaps make
296 | an implementation easier to build.  It's nice to be able to say,
297 | yes, what it means to be a `boolean` is defined right here and
298 | nowhere else.
299 | 
300 | So, to define a type, we write our type rules in a `type..in`
301 | block, like the following.
302 | 
303 |     | type boolean is
304 |     |   boolean(true) -> true
305 |     |   boolean(false) -> true
306 |     | in
307 |     |   rewrite boolean(false)
308 |     = true
309 | 
310 | As you can see, the wildcard reduction to false can be omitted for
311 | brevity.  (In other words, "Nothing else is a boolean" is implied.)
312 | And, the `boolean` constructor can be used for rewriting in a term
313 | just like any other plain, non-`type`-blessed rewrite rule.
314 | 
315 |     | type boolean is
316 |     |   boolean(true) -> true
317 |     |   boolean(false) -> true
318 |     | in
319 |     |   rewrite boolean(tree(leaf(sabrina),leaf(joe)))
320 |     = false
321 | 
322 | Here are the rules that the type-defining rules must conform to.
323 | If any of these rules are violated in the `type` block, the Madison
324 | implementation must complain, and not proceed to try to prove anything
325 | from them.
326 | 
327 | Once a type is defined, it cannot be defined further in a regular,
328 | non-type-defining rewriting rule.
329 | 
330 |     | type boolean is
331 |     |   boolean(true) -> true
332 |     |   boolean(false) -> true
333 |     | in let
334 |     |   boolean(red) -> green
335 |     | in
336 |     |   rewrite boolean(red)
337 |     ? Constructor "boolean" used in rule but already defined as a type
338 | 
339 | The constructor in the l.h.s. must be the same in all rules.
340 | 
341 |     | type foo is
342 |     |   foo(bar) -> true
343 |     |   baz(bar) -> true
344 |     | in
345 |     |   rewrite cat
346 |     ? In type "foo", constructor "baz" used on l.h.s. of rule
347 | 
348 | The constructor used in the rules must have arity 1 (i.e. exactly
349 | one subterm.)
350 | 
351 |     | type foo is
352 |     |   foo(bar,X) -> true
353 |     | in
354 |     |   rewrite cat
355 |     ? In type "foo", constructor has arity greater than one
356 | 
357 | It is considered an error if the predicate rules ever rewrite, inside
358 | the `type` block, to anything besides the atoms `true` or `false`.
359 | 
360 |     | type foo is
361 |     |   foo(bar) -> true
362 |     |   foo(tree(X)) -> bar
363 |     | in
364 |     |   rewrite cat
365 |     ?
In type "foo", rule reduces to "bar" instead of true or false 366 | 367 | The r.h.s.'s of the rules of the type predicate must *always* 368 | rewrite to `true` or `false`. That means, if we can't prove that 369 | the rules always rewrite to something, we can't use them as type 370 | predicate rules. In practice, there are a few properties that 371 | we insist that they have. 372 | 373 | They may involve type predicates that have previously been 374 | established. 375 | 376 | | type boolean is 377 | | boolean(true) -> true 378 | | boolean(false) -> true 379 | | in type boolbox is 380 | | boolbox(box(X)) -> boolean(X) 381 | | in 382 | | rewrite boolbox(box(true)) 383 | = true 384 | 385 | They may involve certain, pre-defined rewriting rules which can 386 | be thought of as operators on values of boolean type (which, honestly, 387 | is probably built-in to the language.) For now there is only one 388 | such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are 389 | booleans, and which rewrites to a boolean, using the standard truth 390 | table rules for boolean conjunction. 391 | 392 | | type boolean is 393 | | boolean(true) -> true 394 | | boolean(false) -> true 395 | | in type boolpair is 396 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) 397 | | in 398 | | rewrite boolpair(pair(true,false)) 399 | = true 400 | 401 | | type boolean is 402 | | boolean(true) -> true 403 | | boolean(false) -> true 404 | | in type boolpair is 405 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) 406 | | in 407 | | rewrite boolpair(pair(true,cheese)) 408 | = false 409 | 410 | Lastly, the r.h.s. of a type predicate rule can refer to the self-same 411 | type being defined, but *only* under certain conditions. Namely, 412 | the rewriting must "shrink" the term being rewritten. This is what 413 | lets us inductively define types. 414 | 415 | | type nat is 416 | | nat(z) -> true 417 | | nat(s(X)) -> nat(X) 418 | | in 419 | | rewrite nat(s(s(z))) 420 | = true 421 | 422 | | type nat is 423 | | nat(z) -> true 424 | | nat(s(X)) -> nat(s(X)) 425 | | in 426 | | rewrite nat(s(s(z))) 427 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] 428 | 429 | | type nat is 430 | | nat(z) -> true 431 | | nat(s(X)) -> nat(s(s(X))) 432 | | in 433 | | rewrite nat(s(s(z))) 434 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] 435 | 436 | | type bad 437 | | bad(leaf(X)) -> true 438 | | bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y)) 439 | | in 440 | | rewrite whatever 441 | ? Type not well-founded: recursive rewrite does not decrease in [bad.2] 442 | 443 | We can check this by looking at all the rewrite rules in the 444 | definition of the type that are recursive, i.e. that contain on 445 | on their r.h.s. the constructor being defined as a type predicate. 446 | For every such occurrence on the r.h.s. of a recursive rewrite, 447 | the contents of the constructor must be "smaller" than the contents 448 | of the constructor on the l.h.s. What it means to be smaller 449 | should be fairly obvious: it just has fewer subterms. If all the 450 | rules conform to this pattern, rewriting will eventually terminate, 451 | because it will run out of subterms to rewrite. 452 | 453 | Application of Types in Proofs 454 | ------------------------------ 455 | 456 | Now, aside from these restrictions, type predicates are basically 457 | rewrite rules, just like any other. 
The main difference is that 458 | we know they are well-defined enough to be used to scope the 459 | universal quantification in a proof. 460 | 461 | Simply having a definition for a `boolean` type allows us to construct 462 | a simple proof with variables. Universal quantification over the 463 | universe of booleans isn't exactly impressive; we don't cover an infinite 464 | range of values, like we would with integers, or lists. But it's 465 | a starting point on which we can build. We will give some rewrite rules 466 | for a constructor `not`, and prove that this constructor always reduces 467 | to a boolean when given a boolean. 468 | 469 | | type boolean is 470 | | boolean(true) -> true 471 | | boolean(false) -> true 472 | | in let 473 | | not(true) -> false 474 | | not(false) -> true 475 | | not(_) -> undefined 476 | | in theorem 477 | | forall X where boolean(X) 478 | | boolean(not(X)) ~> true 479 | | proof 480 | | case X = true 481 | | boolean(not(true)) 482 | | -> boolean(true) [by not.1] 483 | | -> true [by boolean.1] 484 | | case X = false 485 | | boolean(not(false)) 486 | | -> boolean(false) [by not.2] 487 | | -> true [by boolean.2] 488 | | qed 489 | = true 490 | 491 | As you can see, proofs using universally quantified variables 492 | need to make use of _cases_. We know this proof is sound, because 493 | it shows the rewrite steps for all the possible values of the 494 | variable -- and we know they are all the possible values, from the 495 | definition of the type. 496 | 497 | In this instance, the cases are just the two possible values 498 | of the boolean type, but if the type was defined inductively, 499 | they would need to cover the base and inductive cases. In both 500 | matters, each case in a complete proof maps to exactly one of 501 | the possible rewrite rules for the type predicate. (and vice versa) 502 | 503 | Let's prove the type of a slightly more complex rewrite rule, 504 | one which has multiple subterms which can vary. (This `and` 505 | constructor has already been introduced, and we've claimed we 506 | can use it in the definition of well-founded inductive types; 507 | but this code proves that it is indeed well-founded, and it 508 | doesn't rely on it already being defined.) 509 | 510 | | let 511 | | and(true,true) -> true 512 | | and(_,_) -> false 513 | | in theorem 514 | | forall X where boolean(X) 515 | | forall Y where boolean(Y) 516 | | boolean(and(X,Y)) ~> true 517 | | proof 518 | | case X = true 519 | | case Y = true 520 | | boolean(and(true,true)) 521 | | -> boolean(true) [by and.1] 522 | | -> true [by boolean.1] 523 | | case Y = false 524 | | boolean(and(true,false)) 525 | | -> boolean(false) [by and.2] 526 | | -> true [by boolean.2] 527 | | case X = false 528 | | case Y = true 529 | | boolean(and(false,true)) 530 | | -> boolean(false) [by and.2] 531 | | -> true [by boolean.2] 532 | | case Y = false 533 | | boolean(and(false,false)) 534 | | -> boolean(false) [by and.2] 535 | | -> true [by boolean.2] 536 | | qed 537 | = true 538 | 539 | Unwieldy, you say! And you are correct. But making something 540 | easy to use was never my goal. 541 | 542 | Note that the definition of `and()` is a bit more open-ended than 543 | `not()`. `and.2` allows terms like `and(dog,cat)` to rewrite to `false`. 544 | But our proof only shows that the result of reducing `and(A,B)` is 545 | a boolean *when both A and B are booleans*. 
So it, in fact,
546 | tells us nothing about the type of `and(dog,cat)`, nor anything
547 | at all about the properties of `and(A,B)` when one or more of `A` and
548 | `B` are not of boolean type.  So be it.
549 | 
550 | Anyway, since we were speaking of inductively defined types
551 | previously, let's define one now.  With the help of `and()`, here is
552 | a type for binary trees.
553 | 
554 |     | type tree is
555 |     |   tree(leaf) -> true
556 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
557 |     | in
558 |     |   rewrite tree(branch(leaf,leaf))
559 |     = true
560 | 
561 | We can define some rewrite rules on trees.  To start small,
562 | let's define a simple predicate on trees.
563 | 
564 |     | type tree is
565 |     |   tree(leaf) -> true
566 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
567 |     | in let
568 |     |   empty(leaf) -> true
569 |     |   empty(branch(_,_)) -> false
570 |     | in rewrite empty(branch(branch(leaf,leaf),leaf))
571 |     = false
572 | 
573 |     | type tree is
574 |     |   tree(leaf) -> true
575 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
576 |     | in let
577 |     |   empty(leaf) -> true
578 |     |   empty(branch(_,_)) -> false
579 |     | in rewrite empty(leaf)
580 |     = true
581 | 
582 | Now let's prove that our predicate always rewrites to a boolean
583 | (i.e. that it has boolean type) when its argument is a tree.
584 | 
585 |     | type tree is
586 |     |   tree(leaf) -> true
587 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
588 |     | in let
589 |     |   empty(leaf) -> true
590 |     |   empty(branch(_,_)) -> false
591 |     | in theorem
592 |     |   forall X where tree(X)
593 |     |     boolean(empty(X)) ~> true
594 |     | proof
595 |     |   case X = leaf
596 |     |     boolean(empty(leaf))
597 |     |     -> boolean(true) [by empty.1]
598 |     |     -> true [by boolean.1]
599 |     |   case X = branch(S,T)
600 |     |     boolean(empty(branch(S,T)))
601 |     |     -> boolean(false) [by empty.2]
602 |     |     -> true [by boolean.2]
603 |     | qed
604 |     = true
605 | 
606 | This isn't really a proof by induction yet, but it's getting closer.
607 | This is still really us examining the cases to determine the type.
608 | But, we have an extra guarantee here; in `case X = branch(S,T)`, we
609 | know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`.
610 | This is one more reason why `and(X,Y)` is built into Madison; Madison
611 | needs to know what `and` means in order to make use of this information
612 | in a proof.  We don't really use that extra information in this proof,
613 | but we will later on.
614 | 
615 | Structural Induction
616 | --------------------
617 | 
618 | Let's try something stronger, and get into something that could be
619 | described as real structural induction.  This time, we won't just prove
620 | something's type.  We'll prove something that actually walks and talks
621 | like a real (albeit simple) theorem: the reflection of the reflection
622 | of any binary tree is the same as the original tree.
623 | 
624 |     | type tree is
625 |     |   tree(leaf) -> true
626 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
627 |     | in let
628 |     |   reflect(leaf) -> leaf
629 |     |   reflect(branch(A,B)) -> branch(reflect(B),reflect(A))
630 |     | in theorem
631 |     |   forall X where tree(X)
632 |     |     reflect(reflect(X)) ~> X
633 |     | proof
634 |     |   case X = leaf
635 |     |     reflect(reflect(leaf))
636 |     |     -> reflect(leaf) [by reflect.1]
637 |     |     -> leaf [by reflect.1]
638 |     |   case X = branch(S, T)
639 |     |     reflect(reflect(branch(S, T)))
640 |     |     -> reflect(branch(reflect(T),reflect(S))) [by reflect.2]
641 |     |     -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2]
642 |     |     -> branch(S,reflect(reflect(T))) [by IH]
643 |     |     -> branch(S,T) [by IH]
644 |     | qed
645 |     = true
646 | 
647 | Finally, this is a proof using induction!  In the [by IH] clauses,
648 | IH stands for "inductive hypothesis", the hypothesis that we may
649 | assume in making the proof; namely, that the property holds for
650 | "smaller" instances of the type of X -- in this case, the "smaller"
651 | trees S and T that are used to construct the tree `branch(S, T)`.
652 | 
653 | Relying on the IH is valid only after we have proved the base case.
654 | After having proved `reflect(reflect(S)) -> S` for the base cases of
655 | the type of S, we are free to assume that `reflect(reflect(S)) -> S`
656 | in the induction cases.  And we do so, to rewrite the last two steps.
657 | 
658 | Like cases, the induction in a proof maps directly to the
659 | induction in the definition of the type of the variable being
660 | universally quantified upon.  If the induction in the type is well-
661 | founded, so too will be the induction in the proof.  (Indeed, the
662 | relationship between induction and cases is implicit in the
663 | concepts of the "base case" and "inductive case (or step)".)
664 | 
665 | Stepping Back
666 | -------------
667 | 
668 | So, we have given a simple term-rewriting-based language for proofs,
669 | and shown that it can handle a proof of a property over an infinite
670 | universe of things (trees.)  That was basically my goal in designing
671 | this language.  Now let's step back and consider some of the
672 | implications of this system.
673 | 
674 | We have, here, a typed programming language.  We can define types
675 | that look an awful lot like algebraic data types.  But instead of
676 | glibly declaring the type of any given term, like we would in most
677 | functional languages, we actually have to *prove* that our terms
678 | always rewrite to a value of that type.  That's more work, of
679 | course, but it's also stronger: in proving that the term always
680 | rewrites to a value of the type, we have, naturally, proved that
681 | it *always* rewrites -- that its rewrite sequence is terminating.
682 | There is no possibility that its rewrite sequence will enter an
683 | infinite loop.  Often, we establish this with the help of the previously
684 | established fact that our inductively-defined types are well-founded,
685 | which is itself justified on the basis that the subterm relationship is
686 | well-founded.
687 | 
688 | Much like we can prove termination in the course of proving a type,
689 | we can prove a type in the course of proving a property -- such
690 | as the type of `reflect(reflect(T))` above.  (This does not directly
691 | lead to a proof of the type of `reflect`, but whatever.)
692 | 
693 | And, of course, we are only proving the type of a term on the
694 | assumption that its subterms have specific types.
These proofs
695 | say nothing about the other cases.  This may provide flexibility
696 | for extending rewrite systems -- or it might not; I'm not sure.
697 | It might be nice to prove that all other types result in some
698 | error term.  (One of the more annoying things about term-rewriting
699 | languages is how an error can result in a half-rewritten program
700 | instead of a recognizable error code.  There seems to be a tradeoff
701 | between extensibility and producing recognizable errors.)
702 | 
703 | Grammar so Far
704 | --------------
705 | 
706 | I think I've described everything I want in the language above, so
707 | the grammar should, modulo tweaks, look something like this:
708 | 
709 |     Madison ::= Block.
710 |     Block ::= LetBlock | TypeBlock | ProofBlock | RewriteBlock.
711 |     LetBlock ::= "let" {Rule} "in" Block.
712 |     TypeBlock ::= "type" Symbol "is" {Rule} "in" Block.
713 |     RewriteBlock ::= "rewrite" Term.
714 |     Rule ::= Term "->" Term.
715 |     Term ::= Atom | Constructor | Variable.
716 |     Atom ::= Symbol.
717 |     Constructor ::= Symbol "(" Term {"," Term} ")".
718 |     ProofBlock ::= "theorem" Statement "proof" Proof "qed".
719 |     Statement ::= Quantifier Statement | MultiStep.
720 |     Quantifier ::= "forall" Variable "where" Term.
721 |     MultiStep ::= Term "~>" Term.
722 |     Proof ::= Case Proof {Case Proof} | Trace.
723 |     Trace ::= Term {"->" Term [RuleRef]}.
724 |     RuleRef ::= "[" "by" (Symbol "." Integer | "IH") "]".
725 | 
726 | Discussion
727 | ----------
728 | 
729 | I think that basically covers it.  This document is still a little
730 | rough, but that's what major version zeroes are for, right?
731 | 
732 | I have essentially convinced myself that the above-described system
733 | is sufficient for simple proof checking.  There are three significant
734 | things I had to convince myself of to get to this point, which I'll
735 | describe here.
736 | 
737 | One is that types have to be well-founded in order for them to serve
738 | as scopes for universal quantification.  This is obvious in
739 | retrospect, but getting them into the language in a way where it was
740 | clear they could be checked for well-foundedness took a little
741 | effort.  The demarcation of type-predicate rewrite rules was a big
742 | step, and a little disappointing because it introduces the loaded
743 | term `type` into Madison's vernacular, which I wanted to avoid.
744 | But it made it much easier to think about, and to formulate the
745 | rules for checking that a type is well-founded.  As I mentioned, it
746 | could go away -- Madison could just as easily check that any
747 | constructor used to scope a universal quantification is well-founded.
748 | But that would probably muddy the presentation of the idea in this
749 | document somewhat.  It would be something to keep in mind for a
750 | subsequent version of Madison that further distances itself from the
751 | notion of "types".
752 | 
753 | Also, it would probably be possible to extend the notion of well-
754 | founded rewriting rules to mutually-recursive rewriting rules.
755 | However, this would complicate the procedure for checking that a
756 | type predicate is well-founded.
757 | 
758 | The second thing I had to accept to get to this conviction is that
759 | `and(X,Y)` is built into the language.  It can't just be defined
760 | in Madison code, because while this would be wonderful from a
761 | standpoint of minimalism, Madison has to know what it means to let
762 | you write non-trivial inductive proofs.
In a nutshell, it has to
763 | know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is
764 | true, then `bar(X)` is also true, and `baz(X)` is true as well.
765 | 
766 | I considered making `or(X,Y)` a built-in as well, but after some
767 | thought, wasn't convinced that it was that valuable in the kinds
768 | of proofs I wanted to write.
769 | 
770 | Lastly, the third thing I had to come to terms with was, in general,
771 | how we know a stated proof is complete.  As I've tried to describe
772 | above, we know it's complete because each of the cases maps to a
773 | possible rewrite rule, and induction maps to the inductive definition
774 | of a type predicate, which we know is well-founded because of the
775 | checks Madison does on it (ultimately based on the assumption that
776 | the subterm relationship is well-founded.)  There Is Nothing Else.
777 | 
778 | This gets a little more complicated when you get into proofs by
779 | induction.  The thing there is that we can assume the property
780 | we want to prove, in one of the cases (the inductive case) of the
781 | proof, so long as we have already proved all the other cases (the
782 | base case.)  This is perfectly sound in proofs by hand, so it is
783 | likewise perfectly sound in a formal proof checker like Madison;
784 | the question is how Madison "knows" that it is sound, i.e. how it
785 | can be programmed to reject proofs which are not structured this
786 | way.  Well, if we limit it to what I've just described above --
787 | check that the scope of the universal quantification is well-
788 | founded, check that there are two cases, and check that we've already
789 | proved one case, then allow the inductive hypothesis to be used as a
790 | rewrite rule in the other case of the proof -- it is not difficult
791 | to see how this could be mechanized.
792 | 
793 | However, this is also very limited.  Let's talk about limitations.
794 | 
795 | For real data structures, you might well have multiple base cases;
796 | for example, a tree with two kinds of leaf nodes.  Does this start
797 | breaking down?  Probably.  It probably breaks down with multiple
798 | inductive cases, as well, although you might be able to get around
799 | that by breaking the proof into multiple proofs, and having
800 | subsequent proofs rely on properties proved in previous proofs.
801 | 
802 | I discovered another limitation when trying to write a proof that
803 | addition in Peano arithmetic is commutative.  It seemingly can't
804 | be done in Madison as it currently stands, as Madison only knows
805 | how to rewrite something into something else, and cannot express
806 | the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite
807 | to the same thing.  Such a facility would be easy enough to add,
808 | and may appear in a future version of Madison, possibly with a
809 | syntax like:
810 | 
811 |     theorem
812 |       forall A where nat(A)
813 |       forall B where nat(B)
814 |         add(A,B) ~=~ add(B,A)
815 |     proof ...
816 | 
817 | You would then show that `add(A,B)` reduces to something, and
818 | that `add(B,A)` reduces to something, and Madison would check
819 | that the two somethings are in fact the same thing.  This is
820 | a fairly standard method in the world of term rewriting.
821 | 
822 | As a historical note, Madison is one of the pieces of fallout from
823 | the overly-ambitious project I started a year and a half ago called
824 | Rho.
Rho was a homoiconic rewriting language with several very 825 | general capabilities, and it wasn't long before I decided it was 826 | possible to write proofs in it, as well as the other things it was 827 | designed for. Of course, this stretched it to about the limit of 828 | what I could keep track of in a single project, and it was soon 829 | afterwards abandoned. Other fallout from Rho made it into other 830 | projects of mine, including Pail (having `let` bindings within 831 | the names of other `let` bindings), Falderal (the test suite from 832 | the Rho implementation), and Q-expressions (a variant of 833 | S-expressions, with better quoting capabilities, still forthcoming.) 834 | 835 | Happy proof-checking! 836 | Chris Pressey 837 | December 2, 2011 838 | Evanston, Illinois 839 | --------------------------------------------------------------------------------