├── .hgtags ├── irishsea ├── doc │ ├── environment.markdown │ ├── model.markdown │ ├── language.markdown │ └── original-notes.markdown └── README.markdown ├── README.markdown ├── opus-2 └── opus-2.markdown ├── tamerlane └── tamerlane.markdown ├── turkey-bomb └── turkey-bomb.markdown ├── didigm └── didigm.markdown ├── star-w └── star-w.markdown ├── sartre └── sartre.markdown ├── mdpn └── mdpn.markdown ├── you-are-reading-the-name-of-this-esolang └── you-are-reading-the-name-of-this-esolang.markdown ├── sampo └── Practical_Matters.markdown ├── oozlybub-and-murphy └── oozlybub-and-murphy.markdown └── madison └── Madison.markdown /.hgtags: -------------------------------------------------------------------------------- 1 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa checkpoint_1 2 | 0f16ac518ce82490f51e0b5d87bb655836ae019e checkpoint_2 3 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa 0.1 4 | 0f16ac518ce82490f51e0b5d87bb655836ae019e 0.2 5 | fd0f61445aef8f6368a3b74dcfb42d1b635c2cfa checkpoint_1 6 | 0000000000000000000000000000000000000000 checkpoint_1 7 | 0f16ac518ce82490f51e0b5d87bb655836ae019e checkpoint_2 8 | 0000000000000000000000000000000000000000 checkpoint_2 9 | ba14f1a39f11ae8fb0ac1262677cb7be562459d7 0.3 10 | -------------------------------------------------------------------------------- /irishsea/doc/environment.markdown: -------------------------------------------------------------------------------- 1 | Irishsea: Environment 2 | ===================== 3 | 4 | The Irishsea environment is a user interface (UI) for interacting with the 5 | Irishsea model. It consists of two main parts: 6 | 7 | * the _monitor_, which graphically depicts all the processes active in the 8 | model, what they are currently doing, and what they will be doing in the 9 | near future (insofar as that is predictable); and 10 | * the _entry area_, where commands in the Irishsea language can be entered 11 | to affect these processes. 12 | 13 | ... 14 | -------------------------------------------------------------------------------- /irishsea/doc/model.markdown: -------------------------------------------------------------------------------- 1 | Irishsea: Model 2 | =============== 3 | 4 | The Irishsea model (communications/control model, or execution environment) 5 | comprises a set of concurrently-executing processes. 6 | 7 | Each process may receive messages, and may send messages to other processes. 8 | 9 | Devices look like any other processes in this model. Input from them is 10 | sent to some other process as a message. Sending them messages causes them 11 | to produce output, or to engage in other activities, possibly observable. 12 | 13 | The Irishsea environment is one such "input device". Instructions in the 14 | Irishsea language are entered into it; such instructions typically command 15 | it to send a specified message to a specified process. 16 | 17 | Irishsea processes may be implemented in any programming language; it is not 18 | necessary for it to be the Irishsea language. However, such processes must 19 | conform to the semantics of (i.e. expected behaviour of) Irishsea processes, 20 | which will now be described. 21 | 22 | ... 
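
(An illustrative aside, not part of this spec: the following is a minimal Python sketch of the sort of model described above, with concurrently-existing processes owning mailboxes, and a "device" that is just another process. All names in it, such as `Process` and `relay_behaviour`, are assumptions made purely for illustration.)

    import queue

    class Process:
        def __init__(self, name, behaviour):
            self.name = name
            self.mailbox = queue.Queue()
            self.behaviour = behaviour   # how this process reacts to a message

        def send(self, message):
            self.mailbox.put(message)

        def step(self, world):
            """Handle one pending message, possibly sending others onward."""
            if not self.mailbox.empty():
                self.behaviour(self, world, self.mailbox.get())

    # A "device" is just another process: this one prints whatever it receives.
    def console_behaviour(proc, world, msg):
        print("[%s] %s" % (proc.name, msg))

    # An ordinary process that forwards messages to the console device.
    def relay_behaviour(proc, world, msg):
        world["console"].send("forwarded: " + msg)

    world = {"console": Process("console", console_behaviour),
             "relay": Process("relay", relay_behaviour)}
    world["relay"].send("hello")          # e.g. what the environment might do
    for _ in range(2):
        for p in world.values():
            p.step(world)                 # prints: [console] forwarded: hello
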
--------------------------------------------------------------------------------
/irishsea/doc/language.markdown:
--------------------------------------------------------------------------------
Irishsea: Language
==================

The Irishsea language provides a syntax and semantics for instructing an Irishsea process to send messages to other processes, and for specifying how it reacts to messages it receives.

The Irishsea language is extremely terse. It is designed to be economical to enter on a US QWERTY computer keyboard layout, rather than to be readable. (However, many constructs may have an alternate "long form" for readability, when speed of entry is not an issue.)

(TODO the language should probably be somewhat flexible on the point of keyboard layout, for non-US keyboards, perhaps by letting symbols be redefined.)

It assumes the following characters can be entered with a single keystroke, with no modifier keys: lower-case Latin letters from `a` to `z`, decimal digits from `0` to `9`, and the following symbols: `` ` `` (backquote), `-` (hyphen), `=` (equals sign), `[` and `]` (square brackets), `\\` (backslash), `;` (semicolon), `'` (apostrophe), `,` (comma), `.` (period), `/` (forward slash), and ` ` (blank space).

(A keyboard with a numeric keypad may also provide `+` and `*`, but the presence of a numeric keypad should not be assumed.)

It reserves these characters for use in language constructs which must be issued frequently and quickly. Other characters are relegated to those constructs which are less common or which are rarely issued "on-line".

...

--------------------------------------------------------------------------------
/README.markdown:
--------------------------------------------------------------------------------
Specs on Spec
=============

This is a collection of specifications for programming languages that have not been implemented. Indeed, many of them may well be unimplementable.

Most of them were designed, and their specs written, by Chris Pressey of Cat's Eye Technologies; the exceptions are:

* **Sartre** and **\*W**, which were designed and written by John Colagioia; and
* **TURKEY BOMB**, which (I baldly assert) was found unexpectedly one day under a stack of _Byte_ magazines at a charity shop.

Also, I say "programming language", but of course that term is rather flexible 'round these parts:

* **Madison** is a language for writing formal proofs;
* **MDPN** is a (two-dimensional) parser-definition language; and
* **Opus-2** is a "spoken" language, for some rather exceptional meaning of "speaking".

Most of these specifications are "finished" in the sense that there is nothing obviously more to add to them. (Of course, an implementation, or some really brow-furrowing thought experiments, could always turn up problems with a specification.) The exceptions, which can be considered "works in progress", are:

* **Irishsea**, which is largely a set of notes for a livecoding language.
* **Sampo**, which is largely a set of notes for a production language.

The specification documents are copyrighted by their respective authors.
Not 33 | that I mind if you fork this repo and submit pull requests to fix errors or 34 | the like, for such is the nature of the distributed version control beast. 35 | 36 | Note on the name: in the dialect of English where I come from, "spec" is short 37 | for "specification" but "on spec" is short for "on *speculation*." Thus the 38 | name is trying to convey the idea of specifications that were just kind of 39 | pulled out of the air. 40 | 41 | -------------------------------------------------------------------------------- /opus-2/opus-2.markdown: -------------------------------------------------------------------------------- 1 | Opus-2 2 | ====== 3 | 4 | Opus-2 is an abstract artlang composed by Chris Pressey at or around 5 | March 10, 2001. 6 | 7 | ### Design Goals 8 | 9 | Eliminate word order entirely. Despite the appearance of the resulting 10 | language, this was the only real design goal to begin with. 11 | 12 | ### Grammatical Overview 13 | 14 | Verbs in Opus-2 take the form of colours. Nouns take the form of sounds. 15 | Adjectives take the form of smells. Adverbs take the form of inner-ear 16 | sensations. Certain tenses and phrasings are indicated by tastes. 17 | 18 | To distinguish between the roles of the nouns in a sentence, objects are 19 | quieter, *sotto voce* sounds, and subjects are foreground sounds. It is 20 | important to remember that the sensations corresponding to object, 21 | subject, and verb all occur at the same time in an event termed an 22 | *sentence-experience*. 23 | 24 | This dominant-recessive relationship is also present in strong and weak 25 | scents which indicate whether an adjective describes the subject or the 26 | object, and in intense and gentle inner-ear sensations (feelings of 27 | sudden or gradual acceleration) to determine the target of an adverb. 28 | 29 | ### Vocabulary Overview 30 | 31 | Sample dictionary: 32 | 33 | **verbs** 34 | flee *pale green* 35 | approach *deep orange* 36 | examine *medium grey* 37 | glorify *deep red* 38 | 39 | **nouns** 40 | man *Eb below middle C, trombone* 41 | woman *F above middle C, french horn* 42 | world *car door slamming* 43 | child *middle C, tubular bells* 44 | building *F, tympani roll* 45 | radio *harp sweep* 46 | 47 | **adjectives** 48 | fast *burning rubber* 49 | dangerous *mothballs* 50 | 51 | **adverbs** 52 | quickly *leaning 40 degrees left* 53 | dangerously *leaning 25 degrees right* 54 | 55 | ### Context Overview 56 | 57 | While each sentence is "instantaneous" in the sense that there is no 58 | internal word order, sentence-experiences still follow one another, and 59 | each sentence-experience does take a certain amount of time to perceive. 60 | Tense is thus implied by context between successive sentences and the 61 | duration of each sentence. The shorter a sentence is, the further into 62 | the future it is presumed to refer to. (Thanks to Rob Norman and Panu 63 | Kalliokosi for suggesting these ideas.) 64 | 65 | ### Examples of Usage 66 | 67 | Example sentence-experience: "The building glorifies the woman": 68 | 69 | *deep red* 70 | *F, tympani roll, forte* 71 | *F, french horn, piano* 72 | 73 | Example sentence-experience: "The man quickly flees the dangerous 74 | child": 75 | 76 | *pale green* 77 | *Eb, trombone, forte* 78 | *leaning 40 degrees left (sudden)* 79 | *C, tubular bells, piano* 80 | *mothballs (gentle whiff)* 81 | 82 | ### Who Speaks Opus-2? 83 | 84 | This language was designed purely as an abstract exercise in language 85 | design. 
Thus it was not designed for any preconceived group of speakers, and little consideration was given to their culture and capabilities. It is neither specifically a conversation language, nor a formalized language (e.g. a programming language.)

Most of the problem of finding speakers of Opus-2 lies in finding creatures that can create smells and inner-ear sensations as easily as humans can create complex sounds. However, some popular opinions of who or what might speak Opus-2 have been suggested since its unveiling:

- An efficient-yet-entertaining form of future communication using direct neural jacks.
- A code used by e.g. Neo (from *The Matrix*) to communicate to subjects unknowingly trapped in a virtual reality.
- A pidgin spoken between highly-telepathic beings and marginally-telepathic beings.

--------------------------------------------------------------------------------
/irishsea/README.markdown:
--------------------------------------------------------------------------------
Irishsea
========

Irishsea is an experiment in "live coding" or "cyber-physical programming" (or maybe "cyberphysical livecoding", why not?)

It is vapourware. I haven't even gotten as far as deciding what I want to implement, or what language/environment to implement it in. It is mostly, for the time being, a collection of thoughts on the subject. And they're not even very coherent thoughts! Don't try to make sense of them!

Let's back up.

Many of the ideas behind "cyber-physical programming" do not seem to be actually very new. Consider...

* [Sketchpad][]
* Front-panel lights on the [Altair 8800][]
* Interactive debugging capabilities of [LISP machines][]
* Turtle Logo
* Smalltalk
* Even most 8-bit BASICs let you interrupt a running program, change some variables, then issue a `CONT` to continue execution
* "Expression Watch" windows in various IDE's
* etc.

The main new idea seems to be:

* _the effects of the operator's interactive reprogramming of the system are thought of as a kind of **performance**_.

This may be a literal performance, in the case of, say, a musical livecoding concert. Or, it may be something more informal, or more abstract, or of supposedly practical value; but it is still some kind of... I don't know, _experience_, for lack of a better word; potentially a shared experience.

Goals
-----

One of the goals of the Irishsea project is to come up with answers to the question: _how do you play a computer like you play a musical instrument?_

Actually, I should say "performance instrument". I said "musical instrument" only because (a) you probably know what musical instruments look like, and how they generally work, and (b) you probably don't have a good idea of what a "performance instrument" would look like or how it would work. (I know I don't.)

Or, another way to arrange the confusion in the above two paragraphs: Irishsea is a performance instrument, made out of a computer, using the concept of a musical performance to *frame*, but not *limit*, what we mean by "performance".
Working towards these goals might include:

* define a model for programs that can "do performance" and/or whose executions *are* performances
* define a language (protocol) for reprogramming the model
* define an environment (UI) in which that language can be "spoken"

(Why do I keep saying "reprogramming?" Because you almost never program a computer from scratch. Other people have already programmed it a lot before you got your hands on it -- they built the OS, the text editors, the compilers and interpreters that you use... You can think of yourself as just "programming" it, because you are adding new code to the existing code, but you still have to admit that your code does not live in a vacuum. Unless maybe you like to hand-assemble your own operating systems.)

### Ideas for Model ###

* process- and messaging-based (like e.g. Erlang)
* input/output devices look like processes and send/receive messages
* see `doc/model.markdown` for more info

### Ideas for Language ###

* also process- and messaging-based
* terse, very terse, because you are to play it like an instrument
* see `doc/language.markdown` for more info

### Ideas for Environment ###

* visibility into what all the processes are doing, and what they will be doing (insofar as that can be predicted)
* see `doc/environment.markdown` for more info

Motivation
----------

Having done all of the following things:

* performed music
* composed music
* written software
* used software

I have a hard time reconciling musical performance with writing software. They're very different activities, for me. But I have less of a problem reconciling

* performing music with composing it (they call this _improvisation_)
* composing music with writing software (they're not dissimilar)
* writing software with using it (you can call this _bootstrapping_)

So it seems theoretically possible. So I'd like to try. So this is me trying.

Links
-----

* [extempore](https://github.com/digego/extempore)
* [circa](https://github.com/paulhodge/circa)
* [vivace](https://github.com/automata/vivace)
* [live unit tests demo](http://livecoding.staticloud.com/)

[Altair 8800]: http://en.wikipedia.org/wiki/Altair_8800
[Lisp machines]: http://en.wikipedia.org/wiki/Lisp_machine
[Sketchpad]: http://en.wikipedia.org/wiki/Sketchpad

--------------------------------------------------------------------------------
/tamerlane/tamerlane.markdown:
--------------------------------------------------------------------------------
Tamerlane
=========

Chris Pressey
Created Jan 29 2000

### Introduction to Tamerlane

Tamerlane is a "constraint flow" language. The point of its creation is to attempt to break as many paradigmatic stereotypes and idioms as I'm aware of; at least, to make it tricky and confounding to pigeonhole easily.

It has some concepts in it from imperative languages, some from functional languages, some from dataflow and object oriented languages, and some from graph rewriting and other constraint-based languages, and they're all muddled together into a ridiculous *potpourri*.
Despite being such a mutt, Tamerlane might actually make some algorithms dreadfully easy to write.

### Overview of Tamerlane

A Tamerlane program consists of a mutably weighted directed graph.

Each node is considered to be an independent updatable store. The data held in each store is represented by the weights of the arcs exiting the node.

An arc of weight zero is functionally equivalent to the absence of an arc.

### Description and Example

At this point we may introduce the syntax in a sample ASCII notation for a simple, almost pathological Tamerlane program:

    Point-A: 1 Point-B,
    Point-B: 1 Point-C,
    Point-C: 1 Point-A.

The user of a Tamerlane program may submit *messages* to the program at runtime. In this sense the user and the program are both objects which share the symmetrical relationship **user-of/used-by**. The program object's interface exposes a `query` method to the user, which is to be considered runtime-polymorphic.

Using this message-passing mechanism, queries are submitted by the user to a running Tamerlane program, much as queries would be submitted to a running Prolog program.

Queries have their own syntax and semantics. Unlike Prolog, the user's queries are interpreted as *rules*, perhaps accompanied by information about where and when the rules are "introduced" into the graph.

As an example of a query that could be submitted to the above Tamerlane program:

    1 Point-A -> 0 Point-A @ Point-A

This would introduce the rewriting rule `1 Point-A -> 0 Point-A` into the graph.

This rule is applied to the weights of the nodes in the graph starting with the node specified after the `@` symbol. In this instance it would start by trying to apply the rewrite to the node `Point-A`, but finding `Point-A` to contain `1 Point-B`, nothing would change.

Each time a further query is submitted, each rule which has been introduced into the graph disappears from the node it was working on, and is transmitted to the adjacent node with the lowest positive weight value. If there is a tie for lowest weight, the rule is transmitted to all of the adjacent nodes with the same lowest weight.

For efficacy we can consider the user able to submit a `nop` query. This would not introduce any new rules into the graph, but it would cause all active rules to propagate to new nodes on the graph.

So assume the user submits `nop`. The rule that was introduced by the last query 'moves' from the node labelled `Point-A` to the node labelled `Point-B` (since it's the only positive route out of `Point-A`.) It tries to rewrite `Point-B`, but finding only `1 Point-C` in `Point-B`, nothing happens.

Assume the user `nop`s again. The rule is propagated to `Point-C`. Finally the pattern match succeeds, and `Point-C` is rewritten to a new `Point-C`:

    Point-C: 0 Point-A.

After one more `nop`, the engine will generate a

    Rule stopped at Point-C (no adjacent nodes)

message back to the user. This uses the operation that the user object's interface supplies to the running program, called `messageback`, which the user must supply, but is, like `query`, considered runtime-polymorphic.
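
(An illustrative aside, not part of the original spec: the following minimal Python sketch mirrors the walkthrough above, treating nodes as stores of exit-arc weights, introducing a rule with a query, and propagating it along the lowest positive weight on each subsequent `query` or `nop`. The names `Program` and `Rule` are assumptions for illustration only; ties, priorities, and the advanced features described below are omitted.)

    class Rule:
        def __init__(self, lhs, rhs, at):
            self.lhs = lhs          # pattern, e.g. {"Point-A": 1}
            self.rhs = rhs          # replacement, e.g. {"Point-A": 0}
            self.at = at            # node the rule is currently working on
            self.stopped = False

    class Program:
        def __init__(self, arcs):
            self.arcs = arcs        # {"Point-A": {"Point-B": 1}, ...}
            self.rules = []

        def query(self, rule=None):
            self._propagate()       # existing rules move first
            if rule is not None:
                self.rules.append(rule)
                self._try(rule)     # a new rule tries its starting node at once

        def nop(self):
            self.query(None)

        def _try(self, rule):
            store = self.arcs[rule.at]
            if all(store.get(t, 0) == w for t, w in rule.lhs.items()):
                store.update(rule.rhs)          # pattern matched: rewrite

        def _propagate(self):
            for rule in self.rules:
                if rule.stopped:
                    continue
                exits = {t: w for t, w in self.arcs[rule.at].items() if w > 0}
                if not exits:
                    # the spec delivers this via `messageback`; we just print
                    print("Rule stopped at %s (no adjacent nodes)" % rule.at)
                    rule.stopped = True
                    continue
                rule.at = min(exits, key=exits.get)  # lowest positive weight
                self._try(rule)

    # The almost-pathological example program, and the example query:
    p = Program({"Point-A": {"Point-B": 1},
                 "Point-B": {"Point-C": 1},
                 "Point-C": {"Point-A": 1}})
    p.query(Rule({"Point-A": 1}, {"Point-A": 0}, at="Point-A"))
    p.nop(); p.nop(); p.nop()       # rule walks A -> B -> C, rewrites C, stops
    print(p.arcs["Point-C"])        # {'Point-A': 0}

Running this ends with the same "Rule stopped at Point-C" message and leaves `Point-C: 0 Point-A`, as in the walkthrough.
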
Advanced Topics
---------------

### Rule Priority

Obviously, the user can enter more than one query in succession; s/he doesn't need to explicitly `nop` to wait for rules to resolve. Since the graph can contain cycles, this would lead to a form of inherent, synchronous concurrency: each time the `query` or `nop` methods are invoked, more than one rule may be applied to the same node.

The order in which these competing rules are applied is based on the rule's *priority*. Rules can be submitted with a specific priority in the following manner:

    1 Point-A -> 0 Point-A @ Point-A ! 10

If there is a tie in priority when two rules are competing to rewrite the same node, the outcome is guaranteed to be not simply undefined, but rather, non-deterministic, or at least probabilistic.

The user can also specify a delay, measured in number of method calls (`query`s or `nop`s) from the present query, at which the rule will be introduced into the graph, like so:

    1 Point-A -> 0 Point-A @ Point-A in 10

And, for the sake of efficacy, the `nop` method on the program object has overloaded syntaxes whose semantics are 'keep `nop`ing until some or all rules have stopped.'

### Negative Weights

A negative weight from one node to another is interpreted in a rather negative fashion with respect to the running program.

A negative weight is selected for propagation when it is closer to zero (while still being non-zero) than any other weight (absolute value.) When a negative weight is selected, however, the semantics of propagation are different.

When a rule encounters an arc of the form:

    X: -1 Y.

the rule is not "just" copied to Y like "usual". Instead, it "rolls back" the rule to Y in the fashion of a functional continuation.

All of the nodes that have been "touched" by this rule, since it was last at node Y, are reset to their condition when the rule was last at node Y. The rule itself is deleted from X and propagated to Y, and rewriting continues from there.

If the rule has never visited Y, however, then such an exit arc is not considered a valid candidate. It has an effective weight of 0 (non-existent.)

### Lambda Graphs

Lambda abstraction is a powerful form of referring to non-numeric calculatory items such as functions and, in the case of Tamerlane, graphs.

A user may submit a rule in the form

    1 X -> 1 Y ? Y: 1 Z, Z: 1 Y

Y is like a 'local variable' in this respect. When the rule is applied to the node, and the pattern match succeeds, the nodes named Y and Z need not actually exist, as they are simply created dynamically and 'attached' to the program graph.

Lambda graphs are subject to garbage collection, since they can only be used by the rule that created them.

### Pigeonholes (Updatable)

Variables may be supplied which are explicitly updatable, and these are termed pigeonholes. Pigeonholes acknowledge events. Their assignment is associated with both nodes and arcs.
They can occur when:

- a rule enters a node
- a rule leaves a node
- a rule chooses an arc
- a rule passes through an arc
- a rule rewrites an arc

Updatable variables are always named after the node (or node.arc) they're associated with, preceded by a `$` symbol. The data in the variable is, of course, the arc weight (or in the case of nodes, the node's internal key.)

If the pigeonhole is assigned the special value `%Cancel`, the rule is cancelled from moving to the node/through the arc/rewriting the arc.

### Placeholders (Unification)

Variables may also be supplied as "placeholders". These are the same as bindable (unifiable) variables in logical languages like Prolog. An example of such might be:

    1 X 7 ^Y -> 7 X 1 ^Y

which would replace "1 X 7 Foo" with "7 X 1 Foo", "1 X 7 Bar" with "7 X 1 Bar", etc.

### Horn Rules

Horn rules only succeed if all of their rules succeed. A Horn rule is specified like:

    1 X -> 1 Y + 1 Z 0 G -> 1 G

(Remember that 0 G is equivalent to "the absence of an arc to G." If you have a pattern like

    ^a G ^b G 0 G -> ^b Q

it will only match when there are exactly two different exit arcs to the same node G, which is not disallowed (nor is a node with an exit arc which points to itself.))

Note also that arcs are unordered amongst themselves. The rule

    1 A 1 B 1 C -> 1 E 3 D

is the same as

    1 B 1 A 1 C -> 3 D 1 E

--------------------------------------------------------------------------------
/turkey-bomb/turkey-bomb.markdown:
--------------------------------------------------------------------------------
TURKEY BOMB
===========

Anonymous

Introduction
------------

TURKEY BOMB, the first known programming-language-cum-drinking-game, evolved independently on four separate continents and was widely used as an implementation base for computer operating systems for several centuries.

Later, when computers were proven beyond a shadow of a doubt to be the malevolent work of unseen evil forces, and digital technology was banned on punishment of bodily disintegration, TURKEY BOMB thrived on, only slightly modified, as a popular drinking game.

Now that the treaty negotiations between the UN and the world of unseen evil forces have been signed, however, computers are back (and in full force, now billions upon billions of times more efficient than they once were,) and TURKEY BOMB's popularity as a computer programming language may just be making a comeback!

Archaeologists have recently uncovered the largest known collection of TURKEY BOMB articles. Dating from A.D. 2014 and apparently an almanac of black magic of some sort, with the cryptic title "Communications of the ACM," the remains of an almost-four-hundred-year-old periodical are practically all historians have to go on.

Even then, this obscure grimoire talks of this language as if it were an already-established phenomenon - indeed, perhaps dating back to the tail end of the second millennium A.D. For this reason, some scholars attribute the widespread popularity of this language to one CLIN\_TON, a great leader of the time.
It is said that CLIN\_TON was such an 36 | excellent and heroic TURKEY BOMB jockey that "he never inhaled", 37 | although that is largely deemed myth, perhaps connected to the 38 | allegorical story of his close followers who "blew him good". 39 | 40 | Our knowledge of the time in which TURKEY BOMB originated is slim, 41 | indeed. As such, the remainder of this document is by no means a 42 | complete reconstruction of the language, for that is surely impossible. 43 | However, it *is* an attempt to organize, apparently for the first time 44 | ever, the elements of TURKEY BOMB in a human-referencable fashion. 45 | 46 | Description 47 | ----------- 48 | 49 | To fully and exactly understand TURKEY BOMB one must first grok the 50 | ancient art of computer programming while under the influence of 51 | recreational consumables. If you are already under the influence (and 52 | why else would you be reading a document about a language named TURKEY 53 | BOMB,) congratulations, you've already taken your first steps on the 54 | road of becoming an expert TURKEY BOMB programmer/jockey. 55 | 56 | But do not be lulled into thinking that simply ingesting an amusing 57 | chemical will ensure your vainglorious TURKEY BOMB hobby or career! No 58 | indeed, for it is the most wise and experienced programmer who needs no 59 | more buzz than to program in TURKEY BOMB itself. Many teatotalling 60 | hackers excel at the Annual International TURKEY BOMB Open in Maui for 61 | this very reason. (See you there this fall!) 62 | 63 | Data Types 64 | ---------- 65 | 66 | Name 67 | 68 | Description 69 | 70 | Size 71 | 72 | ` ZILCH` 73 | 74 | "A little slice of Nirvana." 75 | 76 | Zero. 77 | 78 | ` BI_IT` 79 | 80 | A composite quantum state of information. 81 | 82 | Two thirds of a bit plus half a trit. 83 | 84 | ` AMICED` 85 | 86 | A conceptual quantum state of information. 87 | 88 | Negative six sevenths of a decimal digit. 89 | 90 | ` TRIVIA CONCERNING type` 91 | 92 | Three references: one to an object of the named type, two to TRIVIA 93 | objects. 94 | 95 | Exactly fifteen bytes, no exceptions. 96 | 97 | ` ADVISORY PERTAINING TO type` 98 | 99 | A quarter of a reference to a object of the given type. 100 | 101 | A quarter of the platform-defined pointer size. 102 | 103 | ` GRUBSTEAK` 104 | 105 | A fraction whose numerator is a perfect square and whose denominator is 106 | a prime number. 107 | 108 | No bigger than necessary. 109 | 110 | ` IMPROPER GRUBSTEAK` 111 | 112 | A GRUBSTEAK whose denominator is less than the square root of the 113 | numerator. 114 | 115 | Same as GRUBSTEAK. 116 | 117 | ` INDECENT GRUBSTEAK` 118 | 119 | A fraction whose numerator is a perfect square of a perfect square and 120 | whose denominator is a prime number whose ordinal position in the 121 | counting list of prime numbers is also prime. 122 | 123 | In the drinking game, whenever an INDECENT GRUBSTEAK is involved in an 124 | expression, everyone starts chanting "BANG BANG BANG!!!" until the 125 | player holding the TURKEY BOMB either finishes their drink and starts 126 | another, or falls down (in which case someone who hasn't been playing 127 | should take him or her home). 128 | 129 | Same as GRUBSTEAK. 130 | 131 | ` NOMENCLATURE` 132 | 133 | A set of variable names, defined by an EBNF expression that must contain 134 | at least one { } (repeated 0 or more times) term. 135 | 136 | As big as possible. 137 | 138 | ` PUDDING` 139 | 140 | An unknowable value. 141 | 142 | Infinite. 
143 | 144 | ` HUMIDOR BUILT UP FROM type, type & type` 145 | 146 | A structure containing three other types, specified at compile-time, all 147 | of which must be different, one of which must be another HUMIDOR. 148 | 149 | Infinite. 150 | 151 | ` HYBRID OBTAINED BY COMBINING type & type [WITH GUSTO]` 152 | 153 | A unified structure containing data from two different types, specified 154 | at compile time. 155 | 156 | The average size of the two types... which may present problems when 157 | accuracy of representation is desired, which is why the WITH GUSTO 158 | clause is made available to pad the size of a HYBRID to the larger of 159 | the sizes of it's two consitituent data types. 160 | 161 | ` TURKEY BOMB` 162 | 163 | A mysterious and shadowy type, suspected to be a reference to itself. 164 | There can only be one (no more, no less) variable of type TURKEY BOMB, 165 | and it is predeclared under the variable name TURKEY BOMB. A variable of 166 | type TURKEY BOMB (that is to say, the variable named TURKEY BOMB) can 167 | only take on one value, that value being TURKEY BOMB. 168 | 169 | Exactly 1 TURKEY BOMB. 170 | 171 | Paradigm 172 | -------- 173 | 174 | When TURKEY BOMB is played as a drinking game, the TURKEY BOMB is 175 | represented by a real object - usually something convenient, like a 176 | shoe, when an impromptu game is played for fun, but the real hardcore 177 | TURKEY BOMB junkies insist on using either a real live turkey, or a real 178 | live time bomb, or ideally, both (tied *securely* together). 179 | 180 | The TURKEY BOMB is then passed from player to player while the referee 181 | (operating system) designates challengers (tasks). The chosen challenger 182 | takes a deep breath (inhales) and shouts an expression at the player 183 | holding the TURKEY BOMB. If the player can produce the correct result 184 | before the referee can, they only have to take a sip of their drink 185 | before passing off the TURKEY BOMB. Otherwise, it's the whole thing, 186 | down the hatch. 187 | 188 | If the player holding the TURKEY BOMB makes an error, they must down 189 | their drink and get another before trying again. If the referee makes an 190 | error, *everyone*, especially the referee, must down their drink, and 191 | get another. 192 | 193 | Variables are also declared by any player spontaneously standing up and 194 | shouting out a name that hasn't been mentioned yet, and a type to go 195 | with it, at any time. 196 | 197 | For these reasons, TURKEY BOMB should not be considered as much an 198 | imperative language as a "peer-pressure" one. 199 | 200 | Syntax 201 | ------ 202 | 203 | There are no comments in TURKEY BOMB; it's entire content is considered 204 | a comment on those who program/play it. 205 | 206 | Operators 207 | --------- 208 | 209 | Syntax 210 | 211 | Description 212 | 213 | ` BI_IT BI_IT BI_IT ! BI_IT BI_IT BI_IT` 214 | 215 | 2-bit NAND, rotate known trit left. 216 | 217 | ` BI_IT BI_IT ? BI_IT BI_IT ? BI_IT BI_IT` 218 | 219 | 3-argument trit operation; unfortunately the Ancient Texts seem unclear 220 | on what it actually does. (The closest English translation appears to be 221 | "take these trits three and meditate soundly upon them.") 222 | 223 | ` $ BI_IT BI_IT BI_IT BI_IT BI_IT BI_IT $` 224 | 225 | Attempt to make a GRUBSTEAK. 226 | 227 | ` TRIVIA Y EXPR Y TRIVIA` 228 | 229 | Attempt to make a TRIVIA. 230 | 231 | ` TRIVIA BI_IT //` 232 | 233 | Attempt to connect a TRIVIA to itself and return it. 
The BI\_IT argument 234 | is required, but serves no detectably useful purpose (hardcore followers 235 | of the drinking game tradition insist that it's for good luck.) 236 | 237 | ` & EXPR` 238 | 239 | Do not evaluate EXPR. Not particularly useful when programming in TURKEY 240 | BOMB, but wow, can one of these ever screw you up in the middle of a 241 | game. 242 | 243 | ` \ ADVISORY ADVISORY ADVISORY ADVISORY` 244 | 245 | Returns the type thus pointed to. Also, the player holding the TURKEY 246 | BOMB must pass it off. 247 | 248 | ` HYBRID.type` 249 | 250 | Casts a HYBRID to either type it was defined with. 251 | 252 | ` HUMIDOR.type` 253 | 254 | Retrieves an element from a HUMIDOR. 255 | 256 | ` @ HUMIDOR` 257 | 258 | Retrieves a PUDDING which represents the entire HUMIDOR. 259 | 260 | ` PUDDING!!!!!` 261 | 262 | Attempts to deduce the existance of a HUMIDOR in the given PUDDING. The 263 | player to the left of the player holding the TURKEY BOMB has to keep 264 | drinking continuously while the computer/referee does their deducing. 265 | 266 | ` ALL BUT EXPR` 267 | 268 | Returns a PUDDING indicating everything but EXPR. 269 | 270 | ` WHEREFORE ART EXPR` 271 | 272 | Returns a PUDDING indicating the entire metaphysical nature of EXPR. 273 | 274 | ` WHEREFORE AIN'T EXPR` 275 | 276 | Short for WHEREFORE ART ALL BUT EXPR. 277 | 278 | ` WHEREFOREN'T EXPR` 279 | 280 | Short for ALL BUT WHEREFORE ART EXPR. 281 | 282 | ` GARNISH PUDDING` 283 | 284 | Convolutes the PUDDING with recent context drawn from the program. The 285 | player holding the TURKEY BOMB must pass it off. 286 | 287 | ` IMAGINE PUDDING, PUDDING!` 288 | 289 | Returns a NOMENCLATURE indicating all the variables unchanged between 290 | two PUDDINGs. 291 | 292 | ` EXPR :-> NOMENCLATURE` 293 | 294 | Mass-assign the set of variables. 295 | 296 | ` < NOMENCLATURE` 297 | 298 | Mass-retrieve the set of variables. 299 | 300 | ` NOMENCLATURE % GRUBSTEAK GRUBSTEAK` 301 | 302 | Perform iterative cypher transformation of set of names. 303 | 304 | Notes 305 | ----- 306 | 307 | The drinking game can also be played in an asylum, replacing 'drink' 308 | with 'medication'. Do *not* play this game with LSD. 309 | -------------------------------------------------------------------------------- /didigm/didigm.markdown: -------------------------------------------------------------------------------- 1 | The Didigm Reflective Cellular Automaton 2 | ======================================== 3 | 4 | November 2007, Chris Pressey, Cat's Eye Technologies 5 | 6 | Introduction 7 | ------------ 8 | 9 | Didigm is a *reflective cellular automaton*. What I mean to impart by 10 | this phrase is that it is a cellular automaton where the transition 11 | rules are given by the very patterns of cells that exist in the 12 | playfield at any given time. 13 | 14 | Perhaps another way to think of Didigm is: Didigm = [ALPACA][] + [Ypsilax][]. 15 | 16 | [ALPACA]: http://catseye.tc/node/ALPACA.html 17 | [Ypsilax]: http://catseye.tc/node/Ypsilax.html 18 | 19 | Didigm as Parameterized Language 20 | -------------------------------- 21 | 22 | Didigm is actually a parameterized language. A parameterized language is 23 | a schema for specifying a set of languages, where a specific language 24 | can be obtained by supplying one or more parameters. For example, 25 | [Xigxag][] is a parameterized language, where the 26 | direction of the scanning and the direction of the building of new 27 | states are parameters. Didigm "takes" a single parameter, an integer. 
28 | This parameter determines how many *colours* (number of possible states 29 | for any given cell) the cellular automaton has. This parameter defaults 30 | to 8. So when we say Didigm, we actually mean Didigm(8), but there are 31 | an infinite number of other possible languages, such as Didigm(5) and 32 | Didigm(70521). 33 | 34 | [Xigxag]: http://catseye.tc/node/Xigxag.html 35 | 36 | The languages Didigm(0), Didigm(-1), Didigm(-2) and so forth are 37 | probably nonsensical abberations, but I'll leave that question for the 38 | philosophers to ponder. Didigm(1) is at least well-defined, but it's 39 | trivial. Didigm(2) and Didigm(3) are semantically problematic for more 40 | technically interesting reasons (Didigm(3) might be CA-universal, but 41 | Didigm(2) probably isn't.) Didigm(4) and above are easily shown to be 42 | CA-universal. 43 | 44 | (I say CA-universal, and not Turing-complete, because technically 45 | cellular automata cannot simulate Turing machines without some extra 46 | machinery: TMs can halt, but CAs can't. Since I don't want to deal with 47 | defining that extra machinery in Didigm, it's simpler to avoid it for 48 | now.) 49 | 50 | Colours are typically numbered. However, this is not meant to imply an 51 | ordering between colours. The eight colours of Didigm are typically 52 | referred to as 0 to 7. 53 | 54 | Language Description 55 | -------------------- 56 | 57 | ### Playfield 58 | 59 | The Didigm playfield, called *le monde*, is considered unbounded, like 60 | most cellular automaton playfields, but there is one significant 61 | difference. There is a horizontal division in this playfield, splitting 62 | it into regions called *le ciel*, on top, and *la terre*, below. This 63 | division is distinguishable — meaning, it must be possible to tell which 64 | region a given cell is in — but it need not have a presence beyond that. 65 | Specifically, this division lies on the edges between cells, rather than 66 | in the cells themselves. It has no "substance" and need not be visible 67 | to the user. (The Didigm Input Format, below, describes how it may be 68 | specified in textual input files.) 69 | 70 | ### Magic Colours 71 | 72 | Each region of the division has a distinguished colour which is called 73 | the *magic colour* of that region. The magic colour of le ciel is colour 74 | 0. The magic colour of la terre is colour 7. (In Didigm(n), the magic 75 | colour of la terre is colour n-1.) 76 | 77 | ### Transition Rules 78 | 79 | #### Definition 80 | 81 | Each transition rule of the cellular automaton is not fixed, rather, it 82 | is given by certain forms that are present in the playfield. 83 | 84 | Such a form is called *une salle* and has the following configuration. 85 | Two horizontally-adjacent cells of the magic colour abut a cell of the 86 | *destination colour* to the right. Two cells below the rightmost 87 | magic-colour cell is the cell of the *source colour*; it is surrounded 88 | by cells of any colour called the *determiners*. 89 | 90 | This is perhaps better illustrated than explained. In the following 91 | diagram, the magic colour is 0 (this salle is in le ciel,) the source 92 | colour is 1, the destination colour is 2, and the determiners are 93 | indicated by D's. 94 | 95 | 002 96 | DDD 97 | D1D 98 | DDD 99 | 100 | #### Application 101 | 102 | Salles are interpreted as transition rules as follows. 
When the colour 103 | of a given cell is the same as the source colour of some salle, and when 104 | the colours of all the cells surrounding that cell are the exact same 105 | colours (in the exact same pattern) as the determininers of that salle, 106 | we say that that salle *matches* that cell. When any cell is matched by 107 | some salle in the other region, we say that that salle *applies* to that 108 | cell, and that cell is replaced by a cell of the destination colour of 109 | that salle. 110 | 111 | "The other region" refers, of course, to the region that is not the 112 | region in which the cell being transformed resides. Salles in la terre 113 | only apply to cells in le ciel and vice-versa. This complementarity 114 | serves to limit the amount of chaos: if there was some salle that 115 | applied to *all* cells, it would apply directly to the cells that made 116 | up that salle, and that salle would be immediately transformed. 117 | 118 | On each "tick" of the cellular automaton, all cells are checked to find 119 | the salle that applies to them, and then all are transformed, 120 | simultaneously, resulting in the next configuration of le monde. 121 | 122 | There is a "default" transition rule which also serves to limit the 123 | amount of chaos: if no salle applies to a cell, the colour of that cell 124 | does not change. 125 | 126 | Salles may overlap. However, no salle may straddle the horizon. (That 127 | is, each salle must be either completely in le ciel or completely in la 128 | terre.) 129 | 130 | Salles may conflict (i.e. two salles may have the same source colour and 131 | determiners, but different destination colours.) The behaviour in this 132 | case is defined to be uniformly random: if there are n conflicting 133 | salles, each has a 1/n chance of being the one that applies. 134 | 135 | Didigm Input Format 136 | ------------------- 137 | 138 | I'd like to give some examples, but first I need a format to given them 139 | in. 140 | 141 | A Didigm Input File is a text file. The textual digit symbols `0` 142 | through `9` indicate cells of colours 0 through 9. Further colours may 143 | be indicated by enclosing a decimal digit string in square brackets, for 144 | example `[123]`. This digit string may contain leading zeros, in order 145 | for columns to line up nicely in the file. 146 | 147 | A line containing only a `,` symbol in the leftmost column indicates the 148 | division between le ciel and la terre. This line does not become part of 149 | the playfield. 150 | 151 | A line beginning with a `=` is a directive of some sort. 152 | 153 | A line beginning with `=C` followed by a colour indicator indicates how 154 | many colours (the n in Didigm(n)) this playfield contains. This 155 | directive may only occur once. 156 | 157 | A line beginning with `=F` followed by a colour indicator as described 158 | above, indicates that the unspecified (and unbounded) remainder of le 159 | ciel or la terre (whichever side of `,` the directive is on) is to be 160 | considered filled with cells of the given colour. 161 | 162 | Of course, an application which implements Didigm with some alternate 163 | means of specifying le monde, for example a graphical user interface, 164 | need not understand the Didigm Input Format. 165 | 166 | Examples 167 | -------- 168 | 169 | Didigm is immediately seen to be CA-universal, in that you can readily 170 | (and stably) express a number of known CA-universal cellular automata in 171 | it. 
For example, to express John Conway's Life, you could say that 172 | colour 1 means "alive" and colour 2 means "dead", and compose something 173 | like 174 | 175 | 002002001001 176 | 222122112212 ... and so on ... 177 | 212212212121 ... for all 256 ... 178 | 222222222222 ... rules of Life ... 179 | =F3 180 | , 181 | =F2 182 | 22222 183 | 21222 184 | 21212 185 | 21122 186 | 22222 187 | 188 | Because the magic colour 7 never appears in la terre, il n'y a aucune 189 | salle dans la terre et donc tout le ciel est toujours la meme chose. 190 | 191 | There are of course simpler CA's that are apparently CA-universal that 192 | would be possible to describe more compactly. But more interesting (to 193 | me) is the possibility for making reflective CA's. 194 | 195 | To do this in an uncontrolled fashion is easy. We just stick some salles 196 | in le ciel, some salles in la terre, and let 'er rip. Unfortunately, in 197 | general, les salles in each region will probably quickly damage enough 198 | of the salles in the other region that le monde will become sterile soon 199 | enough. 200 | 201 | A rudimentary example of something a little more orchestrated follows. 202 | 203 | 3333333333333 204 | 3002300230073 205 | 3111311132113 206 | 3311321131573 207 | 3111311131333 208 | 3333333333333 209 | =F3 210 | , 211 | =F1 212 | 111111111111111 213 | 111111131111111 214 | 111111111111574 215 | 111111111111333 216 | 311111111111023 217 | 111111111111113 218 | 219 | The intent of this is that the 3's in la terre initially grow streams of 220 | 2's to their right, due to the leftmost two salles in le ciel. However, 221 | when the top stream of 2's reaches the cell just above and to the left 222 | of the 5, the third salle in le ciel matches and turns the 5 into a 7, 223 | forming une salle dans la terre. This salle turns every 2 to the right 224 | of a 0 in le ciel into a 4, thus modifying two of les salles in le ciel. 225 | The result of these modified salles is to turn the bottom stream of 2's 226 | into a stream of 4's halfway along. 227 | 228 | This is at least predictable, but it still becomes uninteresting fairly 229 | quickly. Also note that it's not just the isolated 3's in la terre that 230 | grow streams of 2's to the right: the 3's on the right side of la salle 231 | would, too. This could be rectified by having a wall of some other 232 | colour on that side of la salle, and I'm sure you could extend this 233 | example by having something else happen when the stream of 4's hit the 0 234 | in that salle, but you get the picture. Creating a neat and tidy and 235 | long-lived reflective cellular automaton requires at least as much care 236 | as constructing a "normal" cellular automaton, and probably in general 237 | more. 238 | 239 | History 240 | ------- 241 | 242 | I came up with the concept of a reflective cellular automaton (which is, 243 | as far as I'm aware, a novel concept) independently on November 1st, 244 | 2007, while walking on Felix Avenue, in Windsor, Ontario. 245 | 246 | No reference implementation exists yet. Therefore, all Didigm runs have 247 | been thought experiments, and it's entirely possible that I've missed 248 | something in its definition that having a working simulator would 249 | reveal. 250 | 251 | Happy magic colouring! 
252 | Chris Pressey 253 | Chicago, Illinois 254 | November 17, 2007 255 | -------------------------------------------------------------------------------- /star-w/star-w.markdown: -------------------------------------------------------------------------------- 1 | The \*W Programming Language 2 | ============================ 3 | 4 | John Colagioia, 199? 5 | 6 | Introduction 7 | ------------ 8 | 9 | The \*W language should be based on the W language which, of course, 10 | does not exist. Instead it is based on an assortment of odds and ends 11 | which could be useful in languages, but never seem to have been 12 | implemented (and definitely shouldn't have been implemented in the same 13 | language), combined with some patching added to, firstly, make \*W truly 14 | bizarre and, secondly, to make it as functionally complete a language as 15 | C and C++ (from which some concepts such as casting have been borrowed, 16 | as well as the name convention), for example. The data types should 17 | provide enough of a range that any structure may be built up, and 18 | includes arbitrary bitstrings, machine independant pointers (useful for 19 | a pass-by-reference), name bindings of data (useful for a pass-by-name), 20 | allows for homogenous array-like compositions, and has a semi-structured 21 | composite data type. All arithmetic expressions are constructed in 22 | prefix to preserve continuity with subroutine calls, and include a 23 | fairly complete set of arithmetic and bitwise operators and inbuilt 24 | functions to execute any calculation. 25 | 26 | The language is also quite robust in flow control, allowing for 27 | conditionals, iteration (bounded and unbounded), function calls, 28 | interrupt-driven, and even random execution. To enforce structured 29 | programming, however, neither a "go to" or a "come from" statement has 30 | been implemented in \*W. 31 | 32 | To minimize readability, of course, \*W is fully case insensitive, so 33 | that Count, COUNT, count, and COunT are all indistinct except under 34 | fairly confusing circumstances (which may or may not exist, depending on 35 | the implementation). Data instances must begin with an alphabetic 36 | character (`a`-`z` or `A`-`Z`), an underscore (`_`), or a hyphen (`-`). 37 | The remainder of the name may then be made up of any alphanumeric 38 | characters and the underscore, hyphen, and, of course, the right bracket 39 | (`]`). Comments are also possible in \*W (though not necessarily 40 | suggested as the language is fairly confusing without misspelled and 41 | incorrect descriptions of the program to botch things), and may be 42 | included in text by placing a double pipe (`||`) at the beginning of a 43 | comment, terminated by the doubled end-of-statement marker (`!!`). Such 44 | comments may be placed anywhere in the program, and should be completely 45 | ignored by the compiler (just as they are by most programmers). This 46 | does not mean that comments are equivalent to whitespace. On the 47 | contrary, the compiler considers comments to simply not exist, 48 | essentially concatenating the strings to either side of the comment. 49 | 50 | \*W, like most modern languages, is entirely freeform, meaning that 51 | statements are not constrained to the dimensions of, say, a punch card, 52 | teletype, computer monitor, or three-dimensional, virtual reality 53 | programmers' editor. 
Program format, therefore, is entirely dependant on 54 | input device, host computer's character set, and lack of programmer 55 | style, though the compiler is permitted (actually somewhat encouraged) 56 | to mock poor format style. 57 | 58 | \*W Data Types 59 | -------------- 60 | 61 | The \*W data types are designed with maximum versatility in mind. With 62 | them, any other known (and several unknown) types may be built. In 63 | addition, several predefined instances of these types are provided to 64 | enhance the language. 65 | 66 | `bits` A bitstring of arbitrary length. 67 | `cplx` A complex number in the mathematical form (A + Bj) where A and B are integers and j is the square root of (-1). Each component of a cplx is specified to have a minimum precision of {-32768 ... 32767}, but may be more, depending on the implementation. 68 | `sack` A (semi)structured data type consisting of a collection of elements which can be packed, unpacked, and checked with other data. 69 | `dref` A reference to an instance of some data type. 70 | `name` A name of another datum. 71 | `chrs` A character string of arbitrary length. 72 | `hole` A data type with no value. May pose as any type. 73 | 74 | Predefined \*W Instances 75 | ------------------------ 76 | 77 | The following data instances are provided to the \*W programming 78 | environment to facilitate programming certain concepts which would be 79 | nearly impossible otherwise. 80 | 81 | `WORLD` (`bits`) The \*W representation of the outside world. Assigning an expression to WORLD (see below) causes the character represented by the expression to be appended to the computer display. Likewise, using WORLD in an expression represents the value of the next character in the input buffer (if any). 82 | `NOWHERE` (`hole`).. NOWHERE is a place to discard things as well as a place to get nothing. Can also be used for comparison purposes. 83 | `NL` (`chrs`) NL is a newline character. 84 | `POCKET` (`sack`) Data instances local to each subroutine. Both may be used to store any data, but RESULT will be available for the calling routine to read. 85 | `RESULT` (`bits`) 86 | 87 | \*W Program Parts 88 | ----------------- 89 | 90 | Each \*W program is made of several parts. The functions part, which 91 | defines any user functions, the stuff part, which defines any instances 92 | of data for the program, and the text part, which contains the program 93 | instructions, themselves. 94 | 95 | ### Functions 96 | 97 | Subprograms which can be used from the Text, in the form. 98 | 99 | @ name = Stuff Text 100 | 101 | ### Stuff 102 | 103 | Declarations of data to be used by the Text portion of the program, with 104 | an optional constant initializer. The initial number allows multiple 105 | indexed instances to exist. 106 | 107 | num/name IS type [const] ! 108 | num/name , num/name ... ARE ALL type [const] ! 109 | AUTOPACKED SACK name [, ... name] HAS type [, ... type] ! 110 | 111 | ### Text 112 | 113 | A list of statements, appearing as: 114 | 115 | TEXT: {statements} :ENDTEXT 116 | 117 | W Statement Types 118 | ----------------- 119 | 120 | statement % expr ! 121 | 122 | Runs statement with a probability of expr. If expr is less than 100, the 123 | statement is executed that percentage of time. If it is greater, the 124 | statement is executed (expr/100) more times, each time decrementing the 125 | value of expr by 100, and, if expr ever falls below 100, is subject to 126 | the first rule. 
A negative value for expression works just like a 127 | positive value, except only under conditions where program execution 128 | runs backward; otherwise, it is treated as a zero. 129 | 130 | statment UNLESS expr ! 131 | 132 | The statment is executed whenever encountered except in any cases when 133 | expr evaluates to non-zero. 134 | 135 | statement WHEN expr ! 136 | 137 | The statement is not executed when encountered, but is instead executed 138 | after any statement where expr currently evaluates to non-zero. 139 | 140 | lval < expr ! 141 | expr > lval ! 142 | 143 | Takes the value of expr and copies it into lval. 144 | 145 | function (parameters) ! 146 | 147 | Calls a function with the appropriate parameters. The parameter list 148 | must correspond one-to-one with the Stuff list for that function. 149 | 150 | The scoping rules in \*W are much more simplified than they would have 151 | been in W, had it existed: A function may only access data instances 152 | declared within itself, including those implicitly defined. 153 | 154 | -|- (expr) ! 155 | 156 | If expr is 0, jumps to the end of the current block (see below), 157 | otherwise, terminates the current expr blocks. If expr is greater than 158 | the current block nesting, it does nothing. If expr is negative, the 159 | program terminates. 160 | 161 | & statement & statement & ... statement && 162 | 163 | A blocking mechanism for multiple statments. 164 | 165 | \*W Mathematical Operations 166 | --------------------------- 167 | 168 | `^ X` And the bits of X (yielding a single bit). 169 | `. X` Or the bits of X. 170 | `? X` Xor the bits of X. 171 | `* X` Butterfly the bits of X, i.e., 11001100 becomes 10100101. 172 | `- A B` If A and B are simple (bits, cplx, chrs, hole), identical types, subtracts B from A. 173 | `/ A B` If A and B are numeric (cplx), divides A by B. 174 | `# A B` If A and B are numeric (cplx), takes A to the B power. 175 | `$ A B` Mingles B with A. 176 | `~ A B` Selects the B bits from A. 177 | `SIZE X` Returns the size of X, in full bytes (rounded up if X is a bitstring). 178 | 179 | \*W Sack Operations 180 | ------------------- 181 | 182 | PACK sack data: Add data to sack. 183 | UNPACK sack data: Remove a data-like element from sack. 184 | UNPACK sack: Remove an element from sack. 185 | CHECK sack data: Examine sack for data. 186 | WEIGH sack: Returns the weight of the sack (in bits). 187 | 188 | Other \*W Operations 189 | -------------------- 190 | 191 | NAME name AFTER data: Assigns the name of data to name. 192 | WHOIS (data): Returns name suggested by data. 193 | REF (data): Returns a reference to data. 194 | DATA (dref): Returns the data referred to by ref. 195 | FCHRS (chrs): Returns the first character of the string. 196 | LCHRS (chrs): Returns the last character of the string. 197 | FBIT (bits): Returns the first (lowest) bit of the bits. 198 | LBIT (bits): Returns the last (highest) bit of the bits. 199 | 200 | Sample \*W Programs 201 | ------------------- 202 | 203 | 1. Functions: 204 | || No functions for this program !! 205 | Stuff: 206 | 1/Hello is chrs! 207 | 1/Sz, 1/Total are all cplx! 208 | Text: 209 | || Initialize the data !! 210 | Hello < "Hello, World!"! 211 | Size Hello > Sz! 212 | Total < 0! 213 | || Take the string length and multiply by 100 !! 214 | - Size - 0 Total > Total %10000! 215 | || Print and delete a character that many times !! 216 | & WORLD < FCHRS (Hello)! 217 | & Hello < - Hello FCHRS (Hello)! 218 | && %Total! 219 | || Add a newline !! 220 | WORLD < nl! 
221 | :Endtext 222 | Result: Prints "Hello, World!" to the screen, followed by a 223 | newline. 224 | 2. Functions: 225 | @ mult = 226 | Stuff: 227 | 1/A, 1/B are all cplx! 228 | Text: 229 | cplx (RESULT) < 0! 230 | cplx (RESULT) < - cplx (RESULT) - 0 B 231 | %10000! 232 | bits (B) < RESULT! 233 | cplx (RESULT) < 0! 234 | cplx (RESULT) < - cplx (RESULT) - 0 A 235 | %B! 236 | :Endtext 237 | @ fact = 238 | Stuff: 239 | 1/n is cplx! 240 | Text: 241 | RESULT < bits (1)! 242 | RESULT < bits (mult (n, - n 1)) 243 | unless - 1 ?n! 244 | :Endtext 245 | Stuff: 246 | 1/Input, 1/Output are all chrs! 247 | 1/SR is chrs "0"! 248 | 1/Num, 1/Out, 1/Place, 1/Index, 1/Mod are all cplx 0! 249 | Text: 250 | WORLD > Input! 251 | Place < 1! 252 | Index < cplx (mult (SIZE Input, 100))! 253 | & Num < - Num - 0 cplx (mult (Place, 254 | - LCHRS (Input) ZR))! 255 | & Input < - Input LCHRS (Input)! 256 | & cplx (mult (10, Place)) > Place! 257 | && %Index! 258 | cplx (fact (Num)) > Out! 259 | & - Out cplx (mult (/ Out 10, 10)) > Mod! 260 | & + 100 / Out 10 > Out! 261 | & + Output chrs (+ Mod Zr) > Output! 262 | && %%Out! 263 | Size Output > Index! 264 | Index < cplx (mult (100, Index))! 265 | & WORLD < LCHRS (Hello)! 266 | & Hello < - Hello LCHRS (Hello)! 267 | && %Index! 268 | WORLD < nl! 269 | :Endtext 270 | Result: Accepts a positive integer (n) as input, then outputs 271 | the factorial of n (n!). 272 | -------------------------------------------------------------------------------- /sartre/sartre.markdown: -------------------------------------------------------------------------------- 1 | The Sartre Programming Language 2 | =============================== 3 | 4 | John Colagioia, 199? 5 | 6 | Introduction 7 | ------------ 8 | 9 | The Sartre programming language is named for the late existential 10 | philosopher, Jean-Paul Sartre. Sartre is an extremely unstructured 11 | language. Statements in Sartre have essentially no philosophical 12 | purpose; they just are. Thus, Sartre programs are often left to define 13 | their own functions. 14 | 15 | Unlike traditional programming languages (or maybe very much like them), 16 | nothing in Sartre is guaranteed, except maybe for the fact that nothing 17 | is guaranteed. The Sartre compiler, therefore, must be case insensitive 18 | (technically, it requires all capital letters, but since nothing matters 19 | anyway, why should this?). 20 | 21 | Names in Sartre may only contain letters (and only capital letters, at 22 | that, but nobody really cares that much), and maybe some trailing 23 | digits, but nothing else. 24 | 25 | No standard mathematical functionality is supplied in Sartre. Instead, 26 | nihilists are created, which may damage the properties inherent to the 27 | data, and nihilators are executed to reclaim storage dynamically. 28 | 29 | Sartre programmers, perhaps somewhat predictably, tend to be boring and 30 | depressed, and are no fun at parties. No comments will be made on the 31 | level of contrast between the Sartre programmer and any other 32 | programmer. 33 | 34 | In the words of a Sartre programmer who worked intensely for months, 35 | eating whatever junkfood wandered near his cubicle, "I have been gaining 36 | twenty-five pounds a week for two months, and I am now experiencing 37 | light tides. It is stupid to be so fat. My pain, ultimate solitude, and 38 | collection of Dilbert cartoons are still as authentic as they were when 39 | I was thin, but seem to impress girls far less. 
From now on, I will live 40 | on cigarettes and black coffee," which is the general diet of the Sartre 41 | programmer, for obvious reasons. 42 | 43 | Comments are available in Sartre, though not at all suggested, since 44 | nobody really wants to listen to you, anyway, by typing the comment at 45 | the beginning of the line, and terminating it (on the same line) with a 46 | squiggle (brace) pair ("{}"). Valid Sartre text may be placed after the 47 | comment if desired. If comments are absolutely necessary, they should 48 | adequately describe the futility of the program and the plight of 49 | programmer and computer in a world ruled by an unfeeling God and His 50 | compilers, as well as providing explanation of the surrounding program 51 | statements. 52 | 53 | They should not be misspelled, as some compilers may check. 54 | 55 | Admittedly, while it is not hard to string Sartre statements together to 56 | create a Sartre-compilable text file, it can be quite hard to program in 57 | the Sartre paradigm. To wit, one may keep creating programs, one after 58 | another, like soldiers marching into the sea, but each one may seem 59 | empty, hollow, like stone. One may want to create a program that 60 | expresses the meaninglessness of existence, and instead they average two 61 | numbers. 62 | 63 | Sartre Data Types 64 | ----------------- 65 | 66 | The Sartre language has two basic data types, the EN-SOI and the 67 | POUR-SOI. The en-soi is a completely filled heap of a specified rank, 68 | whereas the pour-soi is a dynamic structure which never has the same 69 | value. An integer may also be used in Sartre, but it may only take the 70 | value of zero (the Dada extensions to Sartre allow integers to also take 71 | on the value of "duck sauce", but that's neither here nor there--unless 72 | you happen to like duck sauce, of course). 73 | 74 | en-soi 75 | 76 | The en-soi, as mentioned before, is a full heap of a specified rank. As 77 | Sartre does not allow pre-initialized data, the actual data in the heap 78 | is non-specified (but the heap is "pre-heapified"). Data (en-sois of 79 | rank 0) may be "deconstruct"ed from the en-soi, or "rotate"d through. At 80 | all times, the en-soi remains a full heap, however. En-sois of rank zero 81 | are 32 bits with no inherent meaning. En-sois of higher rank may be 82 | defined as each element being of another (same) data type (an en-soi of 83 | integers, all with a value of zero (or duck sauce), for example). 84 | 85 | pour-soi 86 | 87 | The pour-soi is only, and precisely, what it is not. It may be 88 | "unassigned from" a certain value, thereby exactly increasing the number 89 | of things it isn't. It is specified to be a two-bit value, so it 90 | probably isn't. 91 | 92 | integer 93 | 94 | Unlike the integers in most programming languages, Sartre integers all 95 | have a value of zero (again, unless the Dada extensions are being used). 96 | Like the rest of the dreary universe (duck sauce included), this is 97 | something that must be lived with. 98 | 99 | orthograph 100 | 101 | The orthograph is a special type of pictogram used in the specification 102 | of lexicographic elements. The set of orthographs varies from Sartre 103 | implementation to Sartre implementation, but is guaranteed to contain 104 | all the so-called "letters" from at least one modern language, 105 | transliterated to the closest element of the ASCII set and ordered as 106 | the bit-reversed EBCDIC value, assuming those bit-patterns were 107 | integers, which they probably aren't. 
Unavailable orthographs in a given 108 | implementation are represented by "frowny faces". As defined, the 109 | orthograph is possibly the simplest and most convenient data type to 110 | work with. 111 | 112 | const 113 | 114 | Not an actual data type, but somewhat useful, is the introduction of the 115 | symbolic constant into the Sartre language. Since Sartre does not allow 116 | for unconventional, potentially confusing symbols to be strewn about a 117 | program (for example, "17" meant to represent a certain quantity of 118 | items), this allows the programmer to define a set of symbolic constants 119 | he plans to use. To avoid confusion, symbolic constants are defined in 120 | unary, using the "wow" (!) as the unit (i.e., !, !!, !!!, !!!!, ...). 121 | 122 | Symbolic constants must be surrounded in "rabbit ears" (") in use during 123 | the action section of the program. 124 | 125 | Predefined Sartre Instances 126 | --------------------------- 127 | 128 | The following data instances are provided to the Sartre programming 129 | environment to facilitate programming certain concepts which would be 130 | (and probably should be) nearly impossible otherwise. 131 | 132 | MAXINT This is the maximum integer value allowed by the 133 | particular Sartre implementation: zero. 134 | MININT This is the minimum integer value allowed by the 135 | particular Sartre implementation. If using the Dada 136 | extensions, MININT is duck sauce; if not, it is zero. 137 | ORTH0 This is the "initial orthograph" of the Sartre 138 | implementation. 139 | ORTHL8 This is the "final orthograph" of the implementation. 140 | The name is properly pronounced "Orthograph: Lazy 141 | Eight". 142 | 143 | Sartre Program Segments 144 | ----------------------- 145 | 146 | The Sartre program is broken into simple, logical portions. In essence, 147 | all things must be declared before usage, and the declaration section 148 | comes before the action section (if any). Since the Sartre language has 149 | a recursive structure, the Sartre nihilist has the same structure as the 150 | main nihilator which is about to be described: 151 | 152 | Nihilator ; 153 | {Nihilist;} 154 | Const = ; 155 | Consts = ..; 156 | Matter { {, } : ;} 157 | Act 158 | { ;} 159 | No more ; 160 | . 161 | 162 | The only difference between a nihilist and the nihilator is that a 163 | nihilist does not use the trailing one-spot. 164 | 165 | Example Matter definitions (data declaration) might be: 166 | 167 | Const 3 = !!; 168 | Matter dooM: integer; 169 | Rniqqlj:en-soi, rank "3", of pour-soi; 170 | 171 | which creates an integer (with a value of 0, since we are not invoking 172 | Dada extensions), and a heapified en-soi with seven pour-soi storage 173 | locations. 174 | 175 | Sartre Statement Types 176 | ---------------------- 177 | 178 | `IF ;` 179 | 180 | The Sartre conditional takes no arguments and then alters program flow 181 | accordingly. Put simply, on the condition where the most recently 182 | executed nihilator was successful, program execution is transferred to 183 | just beyond the next conditional, restarting the search from the 184 | beginning, if necessary. 185 | 186 | `LVAL := expr ;` 187 | 188 | The Sartre assignment statement takes the bourgois-perceived value of 189 | the damned expression and places it in LVAL. If LVAL is a pour-soi, this 190 | unassigns a value from the pour-soi. 191 | 192 | `No Exit ;` 193 | 194 | A reminder to the program that none of us can escape what we have 195 | wrought, or even escape what others have wrought. 
In programming terms, 196 | this may either cause the machine to hang or cause the program not to 197 | terminate, depending on the implementation. 198 | 199 | `Life Is Meaningless ;` 200 | 201 | A special command which, due to the resignation of the programmer, is 202 | permitted to perform a wide variety of tasks, among them, alter the 203 | direction of program flow, execute a random function, terminate the 204 | program, or positionally invert the bits in the data region. Since the 205 | programmer doesn't care anyway, this doesn't really matter. In the 206 | (tee-hee) ordinary version of Sartre, this operation is defined at 207 | compile-time, and is constant at that statement for each incidence of 208 | execution. The Dada extensions, however, redefine meaninglessness (since 209 | everything under Dada is meaningless to begin with) to be determined at 210 | run- time. Further, it may also logically negate each bit in the 211 | dataspace under Dada. 212 | 213 | `{ data } ;` 214 | 215 | This invokes the named nihilist and allows it to accomplish its goal. 216 | 217 | The Sartre scoping rules are somewhat complex in that it may only 218 | utilize data which has been accessed previously or any data which it 219 | makes up itself. Data which has not yet been accessed is unknown to the 220 | Sartre nihilist, however. 221 | 222 | `again ;` 223 | 224 | Repeats the last statement, for the computationally-impaired. 225 | 226 | `Act { ; } No more ;` 227 | 228 | Allows certain statement-sets to be considered a single, atomic 229 | statment. A conditional cannot jump to within such an atomic structure. 230 | 231 | Predefined Sartre Nihilists 232 | --------------------------- 233 | 234 | ` which` 235 | 236 | Gets replaced with a zero-rank en-soi with the bit pattern of the 237 | orthograph. The orthograph may be replaced by a symbolic constant, and 238 | returns the bit-pattern that would be associated if the symbolic 239 | constant were an integer, which it isn't, otherwise it would be zero. 240 | 241 | ` that` 242 | 243 | Gets replaced with the orthograph that matches the bit- pattern in the 244 | zero-rank en-soi. 245 | 246 | ` ` 247 | 248 | \ is one of "and", "or", or "xor", and returns the bitwise 249 | logical operation between the two zero-rank en-sois. 250 | 251 | `NOT ` 252 | 253 | This makes the pour-soi what it isn't, even if it is. 254 | 255 | `annihilate ;` 256 | 257 | Clears the values (sets to arbitrary values) of any data in the program. 258 | Proceeds to destroy any dynamically allocated storage. If no dynamic 259 | storage exists, causes a "Bad Faith" error in the program. 260 | 261 | ` Dump ;` 262 | 263 | Prints the zero-rank en-soi to the screen as up to four orthographs. The 264 | en-soi may be replaced with an orthograph, in which case the orthograph 265 | itself is printed. 266 | 267 | ` Get ;` 268 | 269 | Allows an orthograph to be input from the keyboard and stored in the 270 | specified orthograph. The orthograph may be replaced by any-rank en-soi 271 | of orthographs, in which case the statement will read in enough 272 | orthographs to fill the en-soi, then "heapify" the en-soi. 273 | 274 | ` Zip` 275 | 276 | First, removes an element from the en-soi, then adds the new element and 277 | "heapifies," returning the removed element. 278 | 279 | ` Dir` 280 | 281 | Flips direction of the en-soi's "heapification." If the Dir nihilist is 282 | never executed, the heapification of the en-soi is, by default, 283 | descending. 
284 | 285 | ` Flip ;` 286 | 287 | Flips the bit represented by \ in \. 288 | 289 | ` Concat` 290 | 291 | Returns the concatenated bitpatterns of the two \ elements. 292 | 293 | Sample Sartre Programs 294 | ---------------------- 295 | 296 | Nihilator SartreExample1; 297 | Act 298 | No more ; . 299 | 300 | This program can be appreciated for its ability to not sort the input 301 | list of values in ascending order. It's elegance and simplicity in not 302 | accomplishing this goal are admirable. To fully appreciate that sorting 303 | is the activity being denied the user, as opposed to, say, searching or 304 | some sort of filtering, one should stare at the (lack of) program output 305 | forever and not turn on the lights when it gets dark. 306 | 307 | Nihilator SartreExample2; 308 | Act 309 | IF ; 310 | again ; 311 | No more ; . 312 | 313 | This program fully considers the implications of its existance. It 314 | begins by questioning itself and, if successful, control flow moves to 315 | find the next conditional, of which there is none, so flow wraps to 316 | itself. If unsuccessful, it defaults to the "again" statement, which 317 | makes the same consideration (of which the condition is now true), 318 | wrapping the control flow to the previously executed "IF". 319 | -------------------------------------------------------------------------------- /mdpn/mdpn.markdown: -------------------------------------------------------------------------------- 1 | Multi-Directional Pattern Notation 2 | ================================== 3 | 4 | Final - Sep 6 1999 5 | 6 | * * * * * 7 | 8 | Introduction 9 | ------------ 10 | 11 | MDPN is an extension to EBNF, which attributes it for the purposes of 12 | scanning and parsing input files which assume a non-unidirectional form. 13 | A familiarity with EBNF is assumed for the remainder of this document. 14 | 15 | MDPN was developed by Chris Pressey in late 1998, built on an earlier, 16 | less successful attempt at a "2D EBNF" devised to fill a void that the 17 | mainstream literature on parsing seemed to rarely if ever approach, with 18 | much help provided by John Colagioia throughout 1998. 19 | 20 | MDPN has possible uses in the construction of parsers and subsequently 21 | compilers for multi-directional and multi-dimensional languages such as 22 | Orthogonal, Befunge, Wierd, Blank, Plankalkül, and even less contrived 23 | notations like structured Flowchart and Object models of systems. 24 | 25 | As the name indicates, MDPN provides a notation for describing 26 | multidimensional patterns by extending the concept of linear scanning 27 | and matching with geometric attributes in a given number of dimensions. 28 | 29 | Preconditions for Multidirectional Parsing 30 | ------------------------------------------ 31 | 32 | The multidirectional parsing that MDPN concerns itself with assumes that 33 | any portion of the input file is accessable at any time. Concepts such 34 | as LL(1) are fairly meaningless in a non-unidirectional parsing system 35 | of this sort. The unidirectional input devices such as paper tape and 36 | punch cards that were the concern of original parsing methods have been 37 | superceded by modern devices such as hard disk drives and ample, cheap 38 | RAM. 39 | 40 | In addition, MDPN is limited to an orthogonal representation of the 41 | input file, and this document is generally less concerned about forms of 42 | four or higher dimensions, to reduce unnecessary complexity. 
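As an illustration only (none of this is part of the MDPN notation itself), here is roughly what satisfying this precondition might look like in practice: the whole input is read up front into a structure that can be consulted at any coordinate, in any order. The coordinate mapping and the "beyond the input means whitespace" behaviour anticipate the "Deviations from EBNF" section below; details such as the first-printable-character rule and tab handling are glossed over, and the function names are made up for this sketch.

    def load_scan_space(text):
        """Map input text to a dict keyed by (x, y), one character per point."""
        space = {}
        for y, line in enumerate(text.splitlines()):
            for x, ch in enumerate(line):
                space[(x, y)] = ch
        return space

    def symbol_at(space, x, y):
        return space.get((x, y), " ")   # anywhere beyond the input: whitespace

    space = load_scan_space("+--+\n|  |\n|  |\n+--+")
    assert symbol_at(space, 0, 0) == "+"
    assert symbol_at(space, 50, 50) == " "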
43 | 44 | Notation from EBNF 45 | ------------------ 46 | 47 | Syntax is drawn from EBNF. It is slightly modified, but should not 48 | surprise anyone who is familiar with EBNF. 49 | 50 | A freely-chosen unadorned ('bareword') alphabetic multicharacter 51 | identifier indicates the name of a nonterminal (named pattern) in the 52 | grammar. e.g. `foo`. (Single characters have special meaning as 53 | operators.) Case matters: `foo` is not the same name as `Foo` or `FOO`. 54 | 55 | Double quotes begin and end literal terminals (symbols.) e.g. `"bar"`. 56 | 57 | A double-colon-equals-sign (`::=`) describes a production (pattern 58 | match) by associating a single nonterminal on the left with a pattern on 59 | the right, terminated with a period. e.g. `foo ::= "bar".` 60 | 61 | A pattern is a series of terminals, nonterminals, operators, and 62 | parenthetics. 63 | 64 | The `|` operator denotes alternatives. e.g. `"foo" | "bar"` 65 | 66 | The `(` and `)` parentheses denote precedence and grouping. 67 | 68 | The `[` and `]` brackets denote that the contents may be omitted, that 69 | is, they may occur zero or one times. e.g. `"bar" ["baz"]` 70 | 71 | The `{` and `}` braces denote that the contents may be omitted or may be 72 | repeated any number of times. e.g. `"bar" {baz "quuz"}` 73 | 74 | Deviations from EBNF 75 | -------------------- 76 | 77 | The input file is spatially related to a coordinate system and it is 78 | useful to think of the input mapped to an orthogonally distributed 79 | (Cartesian) form with no arbitrary limit imposed on its size, 80 | hereinafter referred to as *scan-space*. 81 | 82 | The input file is mapped to scan-space. The first printable character in 83 | the input file always maps to the *origin* of scan-space regardless of 84 | the number of dimensions. The origin is enumerated with coordinates (0) 85 | in one dimension, (0,0) in two dimensions, (0,0,0) in three dimensions, 86 | etc. 87 | 88 | Scan-space follows the 'computer storage' co-ordinate system so that *x* 89 | coordinates increase to the 'east' (rightwards), *y* coordinates 90 | increase to the 'south' (downwards), and *z* coordinates increase on 91 | each successive 'page'. 92 | 93 | Successive characters in the input file indicate successive coordinate 94 | (*x*) values in scan-space. For two and three dimensions, end-of-line 95 | markers are assumed to indicate "reset the *x* dimension and increment 96 | the *y* dimension", and end-of-page markers indicate "reset the *y* 97 | dimension and increment the *z* dimension", thus following the 98 | commonplace mapping of computer text printouts. 99 | 100 | Whitespace in the input file are **not** ignored. The terminal `" "`, 101 | however, will match any whitespace (including tabs, which are **not** 102 | expanded.) The pattern `{" "}` may be used to indicate any number of 103 | whitespaces; `" " {" "}` may be used to indicate one or more 104 | whitespaces. Areas of scan-space beyond the bounds of the input file are 105 | considered to be filled with whitespaces. 106 | 107 | Therefore, `"hello"` as a terminal is exactly the same as 108 | `"h" "e" "l" "l" "o"` as an pattern of terminals. 109 | 110 | A `}` closing brace can be followed by a `^` (*constraint*) operator, 111 | which is followed by an expression in parentheses. 112 | 113 | This expression is actually in a subnotation which supports a very 114 | simple form of algebra. 
The expression (built with terms connected by 115 | infix `+-*/%` operators with their C language meanings) can either 116 | reduce to 117 | 118 | - a constant value, as in `{"X"} ^ (5)`, which would match five `X` 119 | terminals in a line; or 120 | - an unknown value, which can involve any single lowercase letters, 121 | which indicate variables local to the production, as in 122 | `{"+"}^(x) {"-"}^(x*2)`, which would match only twice as many minus 123 | signs as plus signs. 124 | 125 | Complex algebraic expressions in constraints can and probably should be 126 | avoided when constructing a MDPN grammar for a real (non-contrived) 127 | compiler. MDPN-based compiler-compilers aren't expected to support more 128 | than one or two unknowns per expression, for example. There is no such 129 | restriction, of course, when using MDPN as a guide for hand-coding a 130 | multidimensional parser, or otherwise using it as a more sophisticated 131 | pattern-matching tool. 132 | 133 | The Scan Pointer 134 | ---------------- 135 | 136 | It is useful to imagine a *scan pointer* (SP, not to be confused with a 137 | *stack pointer*, which is not the concern of this document) which is 138 | analogous to the current token in a single-dimensional parser, but 139 | exists in MDPN as a free spatial relationship to the input file, and 140 | thus also has associated geometric attributes such as direction. 141 | 142 | The SP's *location* is advanced through scan-space by its *heading* as 143 | terminals in the productions are successfully matched with symbols in 144 | the input buffer. 145 | 146 | The following geometric attribution operators modify the properties of 147 | the SP. Note that injudicious use of any of these operators *can* result 148 | in an infinite loop during scanning. There is no built-in contingency 149 | measure to escape from an infinite parsing loop in MDPN (but see 150 | exclusivity, below, for a possible way to overcome this.) 151 | 152 | `t` is the relative translation operator. It is followed by a vector, in 153 | parentheses, which is added to the location of the SP. This does not 154 | change its heading. 155 | 156 | For example, `t (0,-1)` moves the SP one symbol above the current symbol 157 | (the symbol which was *about* to be matched.) 158 | 159 | As a more realistic example of how this works, consider that the pattern 160 | `"." t(-1,1) "!" t(0,-1)` will match a period with an exclamation point 161 | directly below it, like: 162 | 163 | . 164 | ! 165 | 166 | `r` is the relative rotation operator. It is followed by an axis 167 | identifier (optional: see below) and an orthogonal angle (an angle *a* 168 | such that |*a*| **mod** 90 degrees = 0) assumed to be measured in 169 | degrees, both in parentheses. The angle is added to the SP's heading. 170 | Negative angle arguments are allowed. 171 | 172 | Described in two dimensions, the (default) heading 0 denotes 'east,' 173 | that is, parsing character by character in a rightward direction, where 174 | the SP's *x* axis coordinate increases and all other axes coordinates 175 | stay the same. Increasing angles ascend counterclockwise (90 = 'north', 176 | 180 = 'west', 270 = 'south'.) 177 | 178 | For example, `">" r(-90) "+^"` would match 179 | 180 | >+ 181 | ^ 182 | 183 | The axis identifier indicates which axis this rotation occurs around. If 184 | the axis identifier is omitted, the *z* axis is to be assumed, since 185 | this is certainly the most common axis to rotate about, in two 186 | dimensions. 
187 | 188 | If the axis identifier is present, it may be a single letter in the set 189 | `xyz` (these unsurprisingly indicate the *x*, *y*, and *z* dimensions 190 | respectively), or it may be a non-negative integer value, where 0 191 | corresponds to the *x* dimension, 1 corresponds to the *y* dimension, 192 | etc. (Implementation note: in more than two dimensions, the SP's heading 193 | property should probably be broken up internally into theta, rho, &c 194 | components as appropriate.) 195 | 196 | For example, `r(z,180)` rotates the SP's heading 180 degrees about the 197 | *z* (dimension \#2) axis, as does `r(2,180)` or even just `r(180)`. 198 | 199 | `<` and `>` are the push and pop state-stack operators, respectively. 200 | Alternately, they can be viewed as lookahead-assertion parenthetics, 201 | since the stack is generally assumed to be local to the production. 202 | (Compiler-compilers should probably notify the user, but not necessarily 203 | panic, if they find unbalanced `<>`'s.) 204 | 205 | All properties of the SP (including location and heading, and scale 206 | factor if supported) are pushed as a group onto the stack during `<` and 207 | popped as a group off the stack during `>`. 208 | 209 | Advanced SP Features 210 | -------------------- 211 | 212 | These features are not absolutely necessary for most non-contrived 213 | multi-directional grammars. MDPN compiler-compilers are not expected to 214 | support them. 215 | 216 | `T` is the absolute translation operator. It is followed by a vector 217 | which is assigned to the location of the SP. e.g. `T (0,0)` will 'home' 218 | the scan. 219 | 220 | `R` is the absolute rotation operator. It is followed by an optional 221 | axis identifier, and an orthogonal angle assumed to be measured in 222 | degrees. The SP's heading is set to this angle. e.g. `R(270)` sets the 223 | SP scanning line after line down the input text, downwards. See the `r` 224 | operator, above, for how the axis identifier functions. 225 | 226 | `S` is the absolute scale operator. It is followed by an orthogonal 227 | *scaling factor* (a scalar *s* such that *s* = **int**(*s*) and *s* \>= 228 | 1). The SP's scale factor is set to this value. The finest possible 229 | scale, 1, indicates a 1:1 map with the input file; for each one input 230 | symbol matched, the SP advances one symbol in its path. When the scale 231 | factor is two, then for each one input symbol matched, the SP advances 232 | two symbols, skipping over an interim symbol. Etc. 233 | 234 | `s` is the relative scale operator. It is followed by a scalar integer 235 | which is added to the SP's scaling factor (so long as it does not cause 236 | the scaling factor to be zero or negative.) 237 | 238 | Scale operators may also take an optional axis identifier (as in 239 | `S(y,2)`), but when the axis identifier is omitted, all axes are assumed 240 | (non-distortional scaling). 241 | 242 | `!>` is a state-assertion alternative to `>`, for the purpose of 243 | determining that the SP successfully and completely reverted to a 244 | previous state that was pushed onto the stack ('came full circle'). This 245 | operator is something of a luxury; a grammar which uses constraints 246 | correctly should never *need* it, but it can come in handy. 
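To make the geometric attribution operators more concrete, here is a small Python sketch (purely illustrative, not part of MDPN) of the state a two-dimensional scan pointer might carry, and how `t`/`T`, `r`/`R`, `s`/`S` and the state stack might act on it. Rotation about axes other than *z*, per-axis scaling, and the `!>` assertion are omitted, and the class and method names are my own.

    import math

    class ScanPointer:
        """2D scan pointer: location, heading (degrees, 0 = east) and scale."""

        def __init__(self):
            self.x, self.y = 0, 0    # location; (0, 0) is the origin of scan-space
            self.heading = 0         # 0 = east, 90 = north, 180 = west, 270 = south
            self.scale = 1           # symbols advanced per symbol matched
            self.stack = []          # state stack for '<' and '>'

        def step(self):
            """Advance by one matched symbol along the heading, scaled."""
            rad = math.radians(self.heading)
            self.x += self.scale * round(math.cos(rad))
            self.y -= self.scale * round(math.sin(rad))   # y grows southward

        def t(self, dx, dy): self.x += dx; self.y += dy                    # relative translation
        def T(self, x, y):   self.x, self.y = x, y                         # absolute translation
        def r(self, angle):  self.heading = (self.heading + angle) % 360   # relative rotation (about z)
        def R(self, angle):  self.heading = angle % 360                    # absolute rotation
        def s(self, ds):     self.scale = max(1, self.scale + ds)          # relative scale (clamped at 1)
        def S(self, k):      self.scale = k                                # absolute scale

        def push(self):      self.stack.append((self.x, self.y, self.heading, self.scale))   # '<'
        def pop(self):       self.x, self.y, self.heading, self.scale = self.stack.pop()     # '>'

Clamping the relative scale at 1 is just one reading of the rule that `s` must not drive the scaling factor to zero or below; treating such an `s` as an error would be equally defensible.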
247 | 248 | Other Advanced Features: Exclusivity 249 | ------------------------------------ 250 | 251 | Lastly, in the specification of a production, the *exclusivity* applying 252 | to that production can be given between a hyphen following the name of 253 | the nonterminal, and the `::=` operator. 254 | 255 | Exclusivity is a list of productions, named by their nonterminals, and 256 | comes into play at any particular *instance* of the production (i.e. 257 | when the production successfully matches specific symbols at specific 258 | points in scan-space during a parse, called the *domain*.) The 259 | exclusivity describes how the domain of each instance is protected from 260 | being the domain of any further instances. The domain of any subsequent 261 | instances of any productions listed in the exclusivity is restricted 262 | from sharing points in scan-space with the established domain. 263 | 264 | Exclusivity is a measure to prevent so-called *crossword grammars* - 265 | that is, where instances of productions can *overlap* and share common 266 | symbols - if desired. Internally it's generally considered a list of 267 | 'used-by-this-production' references associated with each point in 268 | scan-space. An example of the syntax to specify exclusivity is 269 | `bar - bar quuz ::= foo {"a"} baz`. Note that the domain of an instance 270 | of `bar` is the sum of the domains `foo`, `baz` and the chain of "`a`" 271 | terminals, and that neither a subsequent instance of `quuz` nor `bar` 272 | again can overlap it. 273 | 274 | Examples of MDPN-described Grammars 275 | ----------------------------------- 276 | 277 | **Example 1.** A grammar for describing boxes. 278 | 279 | The task of writing a translator to recognize a two-dimensional 280 | construct such as a box can easily be realized using a tool such as 281 | MDPN. 282 | 283 | An input file might contain a of box drawn in ASCII characters, such as 284 | 285 | +------+ 286 | | | 287 | | | 288 | +------+ 289 | 290 | Let's also say that boxes have a minimum height of four (they must 291 | contain at least two rows), but no minimum width. Also, edge characters 292 | must match up with which edge they are on. So, the following forms are 293 | both illegal inputs: 294 | 295 | +-+ 296 | +-+ 297 | 298 | +-|-+ 299 | | | 300 | * 301 | | | 302 | +-|-+ 303 | 304 | The MDPN production used to describe this box might be 305 | 306 | Box ::= "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90) 307 | "+" {"-"}^(w) r(-90) "+" "||" {"|"}^(h) r(-90). 308 | 309 | **Example 2.** A simplified grammar for Plankalkül's assignments. 310 | 311 | An input file might contain an ASCII approximation of something Zuse 312 | might have jotted down on paper: 313 | 314 | |Z + Z => Z 315 | V|1 2 3 316 | S|1.n 1.n 1.n 317 | 318 | Simplified MDPN productions used to describe this might be 319 | 320 | Staff ::= Spaces "|" TempVar AddOp TempVar Assign TempVar. 321 | TempVar ::= "Z" t(-1,1) Index t(-1,1) Structure t(0,-2) Spaces. 322 | Index ::= . 323 | Digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9". 324 | Structure ::= . 325 | AddOp ::= ("+" | "-") Spaces. 326 | Assign ::= "=>" Spaces. 327 | Spaces ::= {" "}. 
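For comparison with the grammar in Example 1, here is roughly what "hand-coding a multidimensional parser, using MDPN as a guide" might look like for the `Box` production, reusing the dictionary representation of scan-space sketched earlier. It is only an illustration (the helper names are invented): it walks the perimeter clockwise from a candidate top-left `+`, checking that the two horizontal runs share the same width and the two vertical runs the same height, which is what the `^(w)` and `^(h)` constraints demand.

    def match_box(space, x0, y0):
        """Is there a Box (as in Example 1) with its top-left corner at (x0, y0)?"""
        def run(x, y, dx, dy, ch):
            # Count consecutive ch symbols starting at (x, y), stepping by (dx, dy).
            n = 0
            while space.get((x, y), " ") == ch:
                n, x, y = n + 1, x + dx, y + dy
            return n

        if space.get((x0, y0), " ") != "+":
            return False
        w = run(x0 + 1, y0, 1, 0, "-")                   # top edge: {"-"}^(w), no minimum width
        if space.get((x0 + 1 + w, y0), " ") != "+":
            return False
        side = run(x0 + 1 + w, y0 + 1, 0, 1, "|")        # right edge: "||" {"|"}^(h), so at least two rows
        if side < 2:
            return False
        if space.get((x0 + 1 + w, y0 + 1 + side), " ") != "+":
            return False
        if run(x0 + 1, y0 + 1 + side, 1, 0, "-") != w:   # bottom edge must match w
            return False
        if run(x0, y0 + 1, 0, 1, "|") != side:           # left edge must match h
            return False
        return space.get((x0, y0 + 1 + side), " ") == "+"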
328 | -------------------------------------------------------------------------------- /you-are-reading-the-name-of-this-esolang/you-are-reading-the-name-of-this-esolang.markdown: -------------------------------------------------------------------------------- 1 | You are Reading the Name of this Esolang 2 | ======================================== 3 | 4 | November 2007, Chris Pressey, Cat's Eye Technologies 5 | 6 | Introduction 7 | ------------ 8 | 9 | This programming language, called **You are Reading the Name of this 10 | Esolang**, is my first foray into the design space of programming 11 | languages whose programs contain undecidable elements. In the case of 12 | You are Reading the Name of this Esolang, these elements are the 13 | instructions themselves — or rather, the symbols that the instructions 14 | are composed of. 15 | 16 | Before we begin, some lexical notes. The name of this language is not 17 | pronounced exactly how it looks; rather, it is pronounced as an English 18 | speaker would pronounce the phrase "you are hearing the name of this 19 | esolang." In addition, it is strongly discouraged to refer to this 20 | language by the name "Yartnote", whether spoken or in writing, and in 21 | any capitalization scheme. After all, there may actually be a completely 22 | unrelated esolang called Yartnote one day, Zeus willing. A similar logic 23 | applies to the taboo on calling it "YRNE". 24 | 25 | Program structure 26 | ----------------- 27 | 28 | A You are Reading the Name of this Esolang program is a string of 29 | symbols drawn from the alphabet `0`, `1`, `[`, and `]`. 30 | 31 | `0` and `1` are interpreted as they are in the programming language 32 | Spoon, or rather, the slightly more clearly-specified version of Spoon 33 | that follows. The Spoon tape is considered unbounded in both directions. 34 | Each cell of the Spoon tape may contain any non-negative integer (again, 35 | unbounded.) In addition, attempting to decrement a cell below 0 results 36 | in an immediate termination of the program. Oh, and there's no way to 37 | change what the symbols are. But that's the extent of the difference, I 38 | think. 39 | 40 | For your convenience, the Spoon instructions are repeated here (taken 41 | from the public-domain [Spoon](http://esolangs.org/wiki/Spoon) entry on 42 | the [Esolang wiki](http://esolangs.org/wiki/)): 43 | 44 | 1 Increment the memory cell under the pointer 45 | 000 Decrement the memory cell under the pointer 46 | 010 Move the pointer to the right 47 | 011 Move the pointer to the left 48 | 0011 Jump back to the matching 00100 49 | 00100 Jump past the matching 0011 if the cell under the pointer is zero 50 | 001010 Output the character signified by the cell at the pointer 51 | 0010110 Input a character and store it at the cell in the pointer 52 | 00101110 Output the entire memory array 53 | 00101111 Immediately terminate program execution 54 | 55 | Each `[` must be matched with a `]`; between them lies a subprogram with 56 | the same structure as a general You are Reading the Name of this Esolang 57 | program. The meaning of this subprogram is determined from its structure 58 | as follows. The subprogram is considered to be given the same input as 59 | the entire program. If the subprogram halts on this input, it is reduced 60 | to a `1`, and if it loops forever on this input, it is reduced to a `0`. 61 | These reduced instructions are interpreted as they are in Spoon, as 62 | described above. Any output produced by the subprogram is simply 63 | discarded. 
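The reduction just described is straightforward to sketch, provided the hard part is hand-waved into a helper. The following Python is illustrative only: it assumes a hypothetical `halts(subprogram, input_string)` predicate (what such a predicate can realistically do is the subject of the next section), and it glosses over unmatched brackets.

    def reduce_to_spoon(program, input_string, halts):
        """Reduce bracketed subprograms, from the inside out, to plain 0/1 Spoon code."""
        while "[" in program:
            close = program.index("]")              # the first ']' closes an innermost group
            open_ = program.rindex("[", 0, close)   # its matching '[' is the last one before it
            sub = program[open_ + 1:close]
            # The innermost subprogram contains no brackets, so halts() only ever
            # sees plain Spoon.  An ill-formed subprogram counts as halting, so
            # halts() is expected to answer True for it.
            bit = "1" if halts(sub, input_string) else "0"
            program = program[:open_] + bit + program[close + 1:]
        return program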
64 | 
65 | Subprograms may themselves contain subprograms, nested to arbitrary
66 | depth, in which case the reduction above is recursively applied, from
67 | the inside out, until a string of only `0` and `1` symbols remains. This
68 | is then executed as if it were a Spoon program. Any syntactically
69 | ill-formed program or subprogram is considered to halt immediately,
70 | producing no output. Note however that, as a consequence of this, a
71 | subprogram can be syntactically ill-formed (for example consisting of a
72 | single `0`) while the parent program can still be syntactically OK (the
73 | subprogram just reduces to a `1` in the parent program.)
74 | 
75 | Implementation notes
76 | --------------------
77 | 
78 | An implementation will determine if a particular subprogram halts, or
79 | not, if it can. Implementations may vary in the power of their proof
80 | methods used for this, but at a minimum must be able to recognize at
81 | least one subprogram that halts on any input, and one subprogram that
82 | loops forever on any input. This implementation-dependence should not
83 | strike anyone as too bizarre, I don't think — it is quite similar to how
84 | different implementations of a traditional systems-programming language
85 | can, for example, provide different levels of support for sizes of
86 | numerical data types like integers.
87 | 
88 | Recall that the problem of telling if an arbitrary program in some given
89 | Turing-complete language halts on some input is *undecidable*, or
90 | equivalently, the set of all programs that halt on some input is
91 | *recursively enumerable*. The set of all programs that loop forever on
92 | some given input is the complement of this set, and it is called
93 | *co-recursively enumerable* (*co-r.e.* for short.)
94 | 
95 | Despite this, there are many methods, ranging from simplistic to
96 | sophisticated, that can be used to prove in *specific* circumstances
97 | that a given program, on a given input, will either halt or fail to
98 | halt. These methods can be used in a You are Reading the Name of this
99 | Esolang implementation.
100 | 
101 | The simplest method for proving that a subprogram halts is probably just
102 | to simulate it on the given input indefinitely, returning `1` if it
103 | halts. If the subprogram does indeed halt, this technique will
104 | (eventually) reveal that fact. The simulation can be performed
105 | concurrently with other subprograms, so that if no proof of halting for
106 | one subprogram is ever found, this will not prevent other subprograms
107 | from being checked.
108 | 
109 | The simplest method for proving that a subprogram loops forever is
110 | probably to check it against a library of subprograms known to loop
111 | forever. For example, it can check if the program is
112 | `0010000000111001000011` (in Brainfuck: `[-]+[]`. You can readily assure
113 | yourself that this program loops forever, on any input.) This technique
114 | of course limits the number of recognizably looping subprograms that the
115 | implementation can handle to a finite number. Checking the general case,
116 | that is, recognizing an infinite number of forever-looping programs, is
117 | more difficult, however. Such an implementation will require techniques
118 | such as automatically finding a proof by induction, or abstract
119 | interpretation. However, ultimately, we know from the Halting Problem
120 | that there is no perfectly general technique that will recognize *every*
121 | program that loops forever.
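Put into code, these two minimal techniques might look something like the following toy (not a serious prover). `run_spoon` is assumed to be a step-limited Spoon simulator returning whether the program halted within the budget; the fixed budget stands in for the open-ended, concurrent simulation described above, and all the names here are mine.

    KNOWN_FOREVER_LOOPERS = {
        "0010000000111001000011",   # the example above ([-]+[] in Brainfuck terms)
    }

    def try_to_decide_halting(subprogram, input_string, run_spoon, max_steps=10**6):
        """Return True (proved halting), False (proved looping) or None (no proof found)."""
        if subprogram in KNOWN_FOREVER_LOOPERS:
            return False
        if run_spoon(subprogram, input_string, max_steps):
            return True
        return None   # budget exhausted: no proof either way

An interpreter would treat `None` as "keep trying" (or, eventually, as a reason to give up on that particular program) rather than as an answer.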
122 | 123 | Computability class 124 | ------------------- 125 | 126 | You are Reading the Name of this Esolang can be trivially shown to be as 127 | powerful as Spoon, since valid Spoon programs are valid You are Reading 128 | the Name of this Esolang programs. (Modulo the absence of 129 | negative-valued tape cells and the other little variations mentioned 130 | above. Since these issues have been dealt with extensively in the 131 | Brainfuck "literature", such as it is, I'm not going to worry about 132 | them.) 133 | 134 | What about the other way around? 135 | 136 | Well, take as a starting point the fact (in classical logic, at least) 137 | that every Spoon program either halts at some point or loops forever. 138 | (Whether we can *discover* which of these is the case is a different 139 | story.) This means that every You are Reading the Name of this Esolang 140 | program has a "canonical" form consisting of just `1`'s and `0`'s — 141 | again, whether we have an interpreter powerful enough to discover it or 142 | not. At this level, Spoon is as powerful as "canonical" You are Reading 143 | the Name of this Esolang, because "canonical" You are Reading the Name 144 | of this Esolang programs are valid Spoon programs. 145 | 146 | Now we can go in the other direction. We can start with any "canonical" 147 | You are Reading the Name of this Esolang program, and replace each `1` 148 | with any You are Reading the Name of this Esolang subprogram that always 149 | halts. Since even the simple method, described above, of proving that a 150 | subprogram halts will always resolve to `1` if the subprogram does 151 | indeed halt, this subset of You are Reading the Name of this Esolang 152 | programs is still executable in Spoon (or any other Turing-complete 153 | language). We only need to add the halting proof mechanism to rewrite 154 | the program into "canonical" form, before executing or simulating it. 155 | 156 | This extends recursively to any arbitrary level of nesting, too: we can 157 | replace each `1` in each subprogram with subprograms that always halt, 158 | with no bound. We only have to test these subprograms recursively, from 159 | the inside out, to eventually recover a "canonical" program. 160 | 161 | However, something strange happens when we turn our attention to `0`'s. 162 | If we replace even one `0` with a subprogram that loops forever on some 163 | input, there is always a possibility that: a) the You are Reading the 164 | Name of this Esolang program will be run with that input, and that b) 165 | the interpreter cannot prove that the subprogram loops forever with that 166 | input. Because the set of programs that loop forever is co-r.e., there 167 | is no Turing machine (or Spoon program) that can look at any one Spoon 168 | program and say, yep, I'm certain that this here program loops forever 169 | on this input, so it should darn well be rewritten into a `0` symbol. 170 | 171 | Thus it seems that there are You are Reading the Name of this Esolang 172 | programs which no Spoon interpreter — indeed, not any interpreter for 173 | any Turing-complete language — is able to interpret. 174 | 175 | Since the case of `0`'s seems to mirror the case of `1`'s when it comes 176 | to expanding them into subprograms, we may conjecture that, instead of 177 | remaining basically the same as we consider more and more deeply nested 178 | subprograms (as it was with expanding `1`'s,) maybe the problem becomes 179 | more intractable the deeper we go in expanding `0`'s. 
Perhaps we are 180 | climbing up the arithmetic hierarchy? 181 | 182 | At any rate, *I* certainly initially conjectured that that was the case, 183 | but it appears to be off the mark. Say you have a Spoon interpreter 184 | that's equipped with an oracle. You can feed an input string and a Spoon 185 | program into the oracle, and the oracle tells you whether or not that 186 | program halts on that input. You could then use that oracle to resolve a 187 | given You are Reading the Name of this Esolang subprogram into a `0` or 188 | `1`. But, you could also do this recursively, resolving them from the 189 | inside outward. A Spoon interpreter with such an oracle would be able to 190 | simulate any You are Reading the Name of this Esolang program — no more 191 | powerful oracle is needed, no matter how deep the subprograms are 192 | nested. 193 | 194 | Discussion 195 | ---------- 196 | 197 | I started designing You are Reading the Name of this Esolang shortly 198 | after reading about the programming language Gravity, while trying to 199 | determine the sense in which it is "non-computable." In particular, I 200 | noticed that these two statements (taken from the 201 | [Gravity](http://esolangs.org/wiki/Gravity) entry on the [Esolang 202 | wiki](http://esolangs.org/wiki/),) on which the claim of Gravity's 203 | "non-computability" apparently rests, have ready analogies to problems 204 | the world of Turing machines: 205 | 206 | - "Although [Gravity's] behavior is well-defined and deterministic, 207 | the evolution of its space is in general non-computable [...]" 208 | 209 | The evolution of the state-space (set of successive configurations) 210 | of a universal Turing machine is also in general non-computable 211 | (there's no Turing machine that can tell you that some given 212 | configuration will never be reached.) 213 | 214 | - "It can be shown that a Turing machine cannot compute, in the 215 | general case, whether even a single collision [in a given Gravity 216 | program] will ever happen." 217 | 218 | It can also be shown that a Turing machine cannot compute, in the 219 | general case, whether or not even a single given state of another 220 | Turing machine's finite control will ever be reached. (Just make 221 | that state a halt state, and you have the Halting Problem right 222 | there.) 223 | 224 | Because of this, I am skeptical that Gravity is any more 225 | "non-computable" than a universal Turing machine. (I am, however, far 226 | from an expert on the computability of differential equations; it could 227 | be that the rather nonspecific term "non-computable", as used in that 228 | subfield, means something stronger than simply "undecidable".) 229 | 230 | At any rate, the idea interested me enough to spur me into designing a 231 | language that I *could* be reasonably certain was non-computable, in 232 | some sense that I could explain. The name You are Reading the Name of 233 | this Esolang was drifting around nearby in the æther at that moment, and 234 | seemed fitting enough for this monstrosity. 235 | 236 | The general approach was to simply force the language interpreter to 237 | decide — that is, to reduce to either a `0` or a `1` — some problem that 238 | is undecidable. This led to looking for something that needed 239 | specifically either a `0` or a `1` to specify something necessary, and 240 | that in turn led to the choice of Spoon as a base language. 
(Of course, 241 | I could have picked just about any language which is "its own binary 242 | Gödel numbering"; there are plenty to choose from there, but Spoon had a 243 | cool name. What can I say — I like The Tick.) 244 | 245 | The obvious choice of undecidable problem was whether another program 246 | halts or not. Making the subject of this problem a *subprogram* with the 247 | same structure as the general program let me examine the case of 248 | unbounded recursive descent. This turned out to be not quite as 249 | interesting as I hoped, but perhaps still somewhat illuminating. (Just 250 | what *would* it take, to require that a Spoon interpreter have a more 251 | powerful oracle than HP, to run every You are Reading the Name of this 252 | Esolang program? Perhaps [Banana 253 | Scheme](http://esolangs.org/wiki/Banana_Scheme) could provide some 254 | inspiration, here. *It* certainly seems to be climbing the arithmetic 255 | hierarchy, although I can't quite say how far. Possibly "damned far.") 256 | 257 | I suppose one or two other things can be said about You are Reading the 258 | Name of this Esolang. 259 | 260 | Unlike both Gravity and Banana Scheme, You are Reading the Name of this 261 | Esolang has a recursively enumerable *syntax*: the problem of whether or 262 | not a given string over the alphabet `0`, `1`, `[`, and `]` is even a 263 | well-formed You are Reading the Name of this Esolang program is 264 | undecidable! 265 | 266 | It's not entirely clear how to interpret the instruction `00101110`, 267 | "Output the entire memory array," in the context of having a tape 268 | unbounded in both directions. I suppose I ought to stipulate that we are 269 | to just output the portion of the tape that has been "touched", i.e. 270 | that the tape head has moved over. But really, it's not so important for 271 | the goals of You are Reading the Name of this Esolang, so maybe I should 272 | just leave it undefined, for kicks. 273 | 274 | The fact that every subprogram takes the *same* input (same as the main 275 | program) might lead to some interesting programs — programs which are 276 | unduly sensitive to changes in the input, I imagine. Of course, this 277 | doesn't affect the undecidability of subprograms, since they are always 278 | free to ignore input completely. 279 | 280 | There is no implementation, yet, but constructing an efficient one would 281 | be a good exercise in static program analysis. 282 | 283 | Happy undeciding! 284 | 285 | -Chris Pressey 286 | November 5, 2007 287 | Chicago, Illinois, USA 288 | -------------------------------------------------------------------------------- /irishsea/doc/original-notes.markdown: -------------------------------------------------------------------------------- 1 | Thoughts about livecoding and related activities 2 | ================================================ 3 | 4 | This is a random collection of notes (not necessarily particularly 5 | intelligent ones, either) comparing some technical and creative activities. 6 | 7 | * Here are a few of the technical and creative activities I've undertaken: 8 | 9 | * Computer programming 10 | * Programming language design 11 | * Musical composition 12 | * Musical performance 13 | 14 | * These can be combined. 15 | 16 | Doing musical composition during a musical performance can be called 17 | _musical improvisation_. I've done this. 18 | 19 | Doing computer programming as a means of musical improvisation can be 20 | called _musical livecoding_. I've never done this. 
21 | 22 | * Livecoding, eh? 23 | 24 | I only recently encountered livecoding. These notes are largely the 25 | result of me trying to come to grips with the concept. I'm coming at 26 | it in a very raw way; I basically stumbled upon it on Github. I've 27 | never experienced a livecoding performance. But I think that inexperience 28 | might let me think about it in a less biased way, too. 29 | 30 | * I have a hard time reconciling computer programming with musical improvisation. 31 | 32 | My experience with computer programming is similar to my experience with musical 33 | composition. 34 | 35 | I know I am able to musically improvise. So it must be theoretically possible to 36 | reconcile the two. But there are several things to account for. Here are a few. 37 | 38 | * Engineering is not an eager algorithm. But improvisation needs to be be. 39 | 40 | Any software project of any non-trifling size requires thought and planning and 41 | structure and being broken up into components and interfaces and invariants and 42 | ideally has a test suite, too. 43 | 44 | This can't be done effectively with an eager algorithm — the closest you can 45 | come to that is probably a quick prototype followed by a series of incremental 46 | improvements and refactorings. But even then, if you're not willing to do an 47 | occasional rewrite (i.e. significant rethink/refactoring of the design/code) at 48 | some point, you can paint yourself into a corner. And the rewrites require 49 | thought and planning and consideration and all that stuff that takes some time, 50 | i.e. not an eager algorithm. 51 | 52 | But improvisation, for the most part, requires that you compose something 53 | "on the fly" — you don't have time to sit down and plan it. But, in practice, 54 | that is not quite true; jazz musicians *practice* improvisation, and one of the 55 | results is they build up a stock of experience about improvisation that they can 56 | draw on, "on the fly". Some of this is accumulating a library of "licks", but 57 | much of it is also about building an intuition of what "works" with what. 58 | 59 | Also — when I played jazz, and had to play a solo for actual performance, I 60 | tended to refine a particular improvisation during practice, and play that 61 | during performance. But that's just an instance of composition. (I see nothing 62 | wrong with this approach, but some other musicians might feel that a solo 63 | played this way isn't "from the heart", or something. It might be interesting 64 | to know just how much of their solo your average musician works out beforehand, 65 | and how much they actually make up during performance. I'm sure it varies.) 66 | 67 | * Music is a performance art. Programming isn't. 68 | 69 | Generally, no one is watching you program. (Pair programming excepted, I 70 | suppose.) 71 | 72 | * Musical performance is "write-only". Composition isn't. Programming isn't. 73 | 74 | In a live performance, as soon as you hit a piano key, or as soon as you blow 75 | into the mouthpiece of your horn, the note you played is committed to the 76 | sound waves in the air. There's no going back. If you cacked it, if it was 77 | the wrong note, it was the wrong note and *it already happened*. Sorry! 78 | 79 | Putting together a composition on with a sequencer (which is how the vast 80 | majority of synthesized music is done) is not like that at all. You can 81 | arrange a few events, try it out, erase it, modify it until it's how you want 82 | it. 
83 | 84 | Livecoding seems to necessarily fall under the first category. Even though 85 | you can perhaps backspace to correct a line of code before it causes anything 86 | to happen, as soon as you do commit it, it's committed, and it's going 87 | to start having an affect on the performance, as soon as those events are 88 | queued up to go. 89 | 90 | (I suppose a livecoding language could allow you to cancel changes you 91 | recently made, if the events they queued up haven't happened yet.) 92 | 93 | * I have ten fingers, two hands, two arms, two feet, and a mouth and lungs. 94 | 95 | Almost every instrument I can think of uses some combination of those to 96 | control it. Some examples: 97 | 98 | * piano: 10 fingers, 2 feet 99 | * drumkit: 2 hands, 2 feet 100 | * tuba: 4 fingers, mouth, lungs 101 | * trombone: 1 hand/arm, mouth, lungs 102 | * penny whistle: 8 fingers, lungs 103 | 104 | etc. 105 | 106 | A computer keyboard would be classified: 10 fingers. 107 | 108 | In combination with the idea that musical performance is "write-only", though: 109 | typically those 10 fingers must produce some combination of symbols, which 110 | is then comitted only after it is complete. Also, it is generally assumed 111 | there is *visual feedback* involved in letting the operator compose such a 112 | sequence (i.e. I need to see the words I am typing, or I won't be able to 113 | type anything without making a typo. I won't be able to know what I typed.) 114 | 115 | This all means the computer keyboard, as a musical instrument interface, is 116 | somewhat similar to, but also significantly different from, all other musical 117 | instruments' interfaces. 118 | 119 | * Livecoding is about external events. With programs, it varies. 120 | 121 | I'm assuming that the livecoding we're dealing with here (so far) is a 122 | performance art technique in that the events that are caused by the coding 123 | have an observable effect outside of the computer. They set up sounds to 124 | be played as part of music, lights, video, animation, dance, etc. etc. 125 | 126 | Programs, on the other hand, don't have to be about external events. In 127 | practice, they are: even a simple program produces output when it's finished, 128 | and that's an external event. 129 | 130 | But programs (especially in esolang circles) can also be seen merely as 131 | _embodiments of computations_. Many esolangs don't even have input and 132 | output — they just do a computation, and you examine the state of the 133 | virtual machine (or whatever) when it is finished executing, to see the 134 | result. 135 | 136 | Nevertheless, speaking *operationally*, the execution of a program can also 137 | be seen entirely as a sequence of events — assign a value to a variable, 138 | add two values, etc. These are all events, albeit _internal_ events which 139 | are usually not detectable (unless you are debugging the program.) 140 | 141 | * "Pure" livecoding? 142 | 143 | Examining the previous point, if the internal events of an executing program 144 | are exposed and made observable to an audience... livecoding with that 145 | program would not be tied to some other media like music or dance. This would 146 | be "pure" livecoding. 147 | 148 | It would obviously have a much smaller audience. 149 | 150 | But there are some interesting potentials here. If I assign sounds to be 151 | played when certain things happen in the internal event model of my program, 152 | might that help me debug it? 
Might certain algorithms produce inherently 153 | interesting patterns of events? (cf bytebeat. And things like John Conway's 154 | Game of Life, which is basically a computation, but often pleasing to watch.) 155 | 156 | * Computer is so much faster than me! What chance do I stand! 157 | 158 | If a program is producing a series of events at a musical time scale, then it 159 | is not unlikely that I could write a line of code and execute it in time, before 160 | the next musical sequence comes up, so I can affect the song in real-time. 161 | 162 | If the events are at a computational time scale — that is SO much faster than 163 | even the musical time scale that there is no chance that I could, for example, 164 | modify the computation of a function before it is done. (Only if it is operating 165 | on a very large amount of data, or is an inherently complex function, such as 166 | the Ackermann function, is it really even conceivable.) 167 | 168 | * Coding isn't necessarily general programming. 169 | 170 | "Coding" refers to writing a program in a Turing-complete programming language, 171 | but it can also refer to writing things in much weaker languages. For example 172 | it is possible to say "coding an HTML page". A purist might not want to call 173 | this "programming", maybe instead preferring "configuration". 174 | 175 | I will guess that most livecoding is configuration, not general programming. 176 | This is very reasonable, and a natural consequence of many of the things I've 177 | already mentioned. General programs require engineering, which is slow, and 178 | are hard to debug in part because they're done in "powerful" languages. 179 | 180 | You really, really wouldn't want to have to debug a program during a live 181 | performance! 182 | 183 | At the same time, it would be really neat to have livecoding be about more 184 | than just reconfiguring event streams — it would be awesome to have some kind 185 | of "higher" computational aspect to it too. 186 | 187 | What this all might mean for the design of a livecoding language/environment 188 | ============================================================================ 189 | 190 | * The operator ought to be able to enter a change quickly. But they also ought 191 | to be able to edit it easily before they commit it. 192 | 193 | Assuming we're sticking with the classical computer keyboard as the input 194 | device, this suggests to me that the most frequently used alterations should 195 | be performed with _short sequences of keystrokes which do not require multiple 196 | key combinations_ (so, no "shifted" symbols, like capital letters — of course, 197 | this varies internationally from one keyboard layout to another.) 198 | 199 | * Assuming there are event "generators" or "streams" which produce events that you 200 | want, until told to do otherwise, (for example, a drum track loop,) it seems 201 | like the most obvious operation is **tell a given generator to change what it 202 | will generate**. If we think of the generators as processes (a la Erlang), 203 | this operation is just sending a message to a generator. 204 | 205 | If we are content with 26 generators, we can use lowercase letters to identify 206 | them. Sending a message to a generator is a matter of naming the generator 207 | and then giving it the message you want to send to it. 208 | 209 | * Each generator should be pre-programmed (and perhaps re-programmable on the fly) 210 | to understand a certain set of messages. 
211 | 212 | That is, each generator runs some "code" which reacts to events — a _handler_. 213 | Multiple generators could run the same handler. 214 | 215 | If a generator receives a message it doesn't understand, nothing should happen 216 | (well, maybe some feedback should be given to the operator, but other than that, 217 | nothing.) 218 | 219 | * So as an example, to send the message `p5` to the generator `a`, the operator 220 | might type: 221 | 222 | ap5↵ 223 | 224 | (where ↵ is the Enter key) 225 | 226 | Presumably the handler that generator `a` is running knows what `p5` means. 227 | It might be a command that says "switch to pattern number five at the next 228 | transition." In this case, `p` is the command, and `5` is a parameter to it. 229 | And assuming we have some rules for parsing commands and arguments like this 230 | (the simplest is that a space ends a command sequence) the operator could send 231 | several messages to one or more generators in one action, like 232 | 233 | ap5 bp7 br0↵ 234 | 235 | (Maybe `r0` means "turn off reverb" or something.) Or, if we have even more 236 | knowledge of the syntax, we might be able to shorten that to 237 | 238 | ap5bp7r0↵ 239 | 240 | but this gets tricky, as we might want to have more complex arguments. This 241 | syntax, obviously, needs formalizing. But the point is to demonstrate that you 242 | could have such a syntax. And you could type it quickly enough to do it during 243 | a performance — even a fast tempo performance, if you got practiced at it. 244 | 245 | * It might be useful for the editor to know about the syntax and do "structured 246 | editing", too; so that if, for example, `d` was not a valid message for the 247 | handler running in generator `a`, the editor would not even let you type `ad`. 248 | 249 | * Taking an idea from the design of the piano: maybe SHIFT and CTRL modifier keys 250 | could be foot pedals! (Or is that too weird?) 251 | 252 | * Most changes take effect "at the next transition point". (Computer is so much 253 | faster than me! Music is too, for that matter! But transitions should be 254 | kept smooth.) There might be some exceptions. 255 | 256 | * Programming the generators to know what kind of commands to accept at runtime is 257 | a large part of this. It's akin to accumulating those "licks" when practicing 258 | improvisation — teach the generator a set of tricks, and tell it what tricks to 259 | perform during the performance. Each trick can be arbitrarily complicated (and 260 | you don't have to debug it during the performance!) But the set of known tricks 261 | is limited. *Unless* you have good ways to sensibly combine tricks into new 262 | tricks on the fly (without everything falling apart.) This is where it seems 263 | very interesting. 264 | 265 | * Following that up, generators probably ought to be able to send messages to other 266 | generators. This would let them act as proxies (send a message to one generator, 267 | it translates/filters it, and passes it on to one or more other generators) and 268 | other things, e.g. periodically send "increase amplitude" events to all other 269 | generators to achieve a global crescendo. 270 | 271 | * For that matter, the output devices can be thought of as processes which receive 272 | messages, too; so a generator for producing a percussion track would periodically 273 | be sending "bass" and "snare" messages to the "drum machine output device". Would 274 | this use the same syntax as the operator's syntax for sending messages? 
Ideally, 275 | yes, or at least something similar/compatible. 276 | 277 | * Operator ought to be able to start up a generator with a particular handler (i.e. 278 | the "code" that the generator is running, which knows what to do with particular 279 | events). Switching handlers, and "tearing down" a generator could be handled by 280 | responding to messages, so the language probably doesn't need built-in operations 281 | for those. (But maybe forcibly terminating a generator would be one. And clearing 282 | a generator's message queue could be another — effectively, "cancel" what I've 283 | previously told this generator to do.) 284 | 285 | * This process/message model is very similar to Erlang's in many ways. 286 | 287 | * Multiple video monitors would probably be useful... but regardless, it would be useful 288 | for the operator to see a summary of **which generators are running what handlers** in 289 | real-time. 290 | -------------------------------------------------------------------------------- /sampo/Practical_Matters.markdown: -------------------------------------------------------------------------------- 1 | Practical Matters 2 | ================= 3 | 4 | This document is a collection of notes I've made over the years about the 5 | practical matters of production programming languages — usually stemming 6 | from being irked by some existing programming language's lack of adequate 7 | (in my opinion) support for them. As such, these thoughts may be overblown 8 | and sophistry-laden. But it is nice to have a place to put them. 9 | 10 | Fundamental Abstractions 11 | ------------------------ 12 | 13 | The following facilities should be either built-in to the language, or part 14 | of the standard (highly standardized) libraries: 15 | 16 | * Tracing. Ideally, the programmer should be able to easily browse all the 17 | relevant reduction steps, and the relevant data being manipulated therein, 18 | in the part of the program's execution that interests them. In addition, 19 | this should be something that can be enabled without polluting the source 20 | code (overmuch). 21 | 22 | This could be done, and fairly well, with techniques from aspect-oriented 23 | programming. The rules to describe what to trace (or to highlight in a 24 | full trace) could be specified in what amounts to a configuration file, 25 | and thus be an implementation issue rather than a language issue. 26 | 27 | Unfortunately, this ideal is hard to achieve, so the system should also 28 | support... 29 | 30 | * Logging. Logging is basically an ad-hoc way to explicitly achieve 31 | selective tracing: the programmer knows what points in the program, and 32 | what data, are of interest to them, and outputs that data to the log at 33 | those points. 34 | 35 | Whether this is "debug logging" during development, or to support post- 36 | mortem analysis of issues in production, it amounts to the same thing: 37 | debugging, just on different time scales. 38 | 39 | The use of a "log level" is mostly just a way to filter the trace built 40 | up in the log files. This is not necessarily a bad idea, but it should 41 | probably not be linear; information should be logged based on the reason 42 | that it is being logged, probably in the form of some sort of "tag", and 43 | filterable on that (whether at the time the log is being recorded, or 44 | being read.) 45 | 46 | Logging should not count as a side-effect. 
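To make the "tag" idea concrete, here is a minimal sketch in Python (purely illustrative; the `log` helper, the tag names, and the JSON-lines output format are assumptions of this sketch, not a proposed interface) of logging that is recorded, and later filtered, by tag rather than by a linear level:

    import json, sys, time

    ACTIVE_TAGS = {"billing", "retry"}   # hypothetical: the tags we currently care about

    def log(tags, **values):
        # Record the given values under the given tags; do nothing if no tag is active.
        if not ACTIVE_TAGS.intersection(tags):
            return
        entry = {"time": time.time(), "tags": sorted(tags), "values": values}
        sys.stderr.write(json.dumps(entry) + "\n")

    def read_log(lines, wanted):
        # Filtering happens again at read time, still by tag rather than by severity.
        entries = [json.loads(line) for line in lines]
        return [e for e in entries if wanted.intersection(e["tags"])]

    log({"billing"}, invoice_id=42, state="queued")

Note that a helper written this way still evaluates its keyword arguments eagerly, which is exactly the sort of thing the properties listed next are meant to rule out.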
47 | 48 | The logging function itself should have some properties: 49 | 50 | - Should not have side-effects (for example from evaluating its arguments), 51 | so that if it is not executed (because we are not interested in that 52 | part of the execution trace) the behaviour of the program is not changed. 53 | 54 | - In fact, should ensure that its arguments have no side-effects, and 55 | ideally, be total, with no chance of hanging or crashing. 56 | 57 | - Should pretty-print the relevant values, include the type and other 58 | metadata of the values, and put clearly visible delimeters around the 59 | values so printed. 60 | 61 | - Should include the source filename and line number. 62 | 63 | - Should not be overridable (shadowed? not sure what I meant here.) 64 | 65 | * History. This is more relevant in a language with mutable values, but 66 | as part of tracing, it is useful to know the history of mutations of a 67 | value. With immutable values, it would be useful to be able to view 68 | all the reductions which fed into the computation of the value at a 69 | point. Either way, however, this is expensive, so should be specified 70 | selectively. Again, an external, aspect-like configuration language 71 | for specifying which values to watch makes this an implementation issue. 72 | 73 | * Command-line option parsing. This should not rely on the Unix or DOS 74 | idea of a command line, and it should be unified with parameter passing 75 | in the language itself; calling an executable built in the language with 76 | arguments `a b c` should be no different from calling a function from 77 | within the language with the arguments `a b c` (probably as string values.) 78 | 79 | Reflection 80 | ---------- 81 | 82 | * First-class tracebacks. When a program, for example, encounters an error 83 | parsing an external file such as a configuration file, it should be able to 84 | report the position in that file that caused the error as part of the 85 | traceback, for consistency. Java has some limited facilities for this, and 86 | some Python libraries do this (Jinja2? werkzeug?) using frame hacks, but 87 | a less clumsy solution would be nice. 88 | 89 | * Tracebacks are *not* a special case of logging, or an artefact of throwing 90 | exceptions. Since the traceback is basically a formatted version of the 91 | current continuation, this suggests the two facilities should be unified, 92 | perhaps not totally, but to a high degree. 93 | 94 | Abstractions, not Wrappers 95 | -------------------------- 96 | 97 | The basic principle here is that the existing APIs of most libraries are 98 | (let's be polite) less than ideal, especially when they were designed for 99 | some other language (such as C), and instead of blindly wrapping them in a 100 | new language, the designer should at least *try* to make something nicer. 101 | 102 | The abstractions should also recognize that modern computer systems are 103 | generally not resource-starved (or at least that truly high-level 104 | programming languages should not treat them that way.) 105 | 106 | This applies to very basic facilities as well as what are usually thought 107 | of as external libraries. Specifically, 108 | 109 | * Date and time: We can do better than simply copycatting interfaces like 110 | `strftime`. All time data should be stored consistently, in GMT, always 111 | with a time zone. 112 | 113 | * String formatting: We can do better than simply copycatting interfaces 114 | like `printf`. 
We can use visual formatting strings, where fixed-size 115 | slots appear as fixed-sized placeholders (of the same size) in the 116 | formatting string. (See also the scathing prog21 criticism of the 117 | vertical tab character.) 118 | 119 | * Line-oriented communication: We can look at line-oriented communication 120 | more generally, as a form of record-oriented communication where the 121 | "delimiter set" for each record is {LF, CR, CRLF}. 122 | 123 | The programmer who really wants atavistic interfaces like those mentioned 124 | above can always implement them as "compatibility modules" if they wish. 125 | 126 | Separation from the Implementation 127 | ---------------------------------- 128 | 129 | This is just a repeat of the above section in slightly different terms. 130 | 131 | A language should avoid tying any language construct (e.g. imports, 132 | include files) to the file system or the operating system. Instead, 133 | have mappings between e.g. module names and where they live in the file 134 | system, and between our model of a running computer and a real OS. 135 | These mappings could be specified in configuration files which are 136 | in the domain of the implementation and outside the domain of the 137 | language, i.e. they never appear in programs. 138 | 139 | Standard modules supplied with the language should expose *models* of 140 | commonplace artefacts out in the world, for example operating systems. 141 | The models are similar to the artefacts, in order that the burden of 142 | implementing an interface from the model to any given artefact is not 143 | too great. However, the models are *not* the artefacts. Programs 144 | should be written to the model, not to the artefact. 145 | 146 | People who construct bindings to the language should be encouraged 147 | (only because they can't effectively be required) to create models 148 | more abstract than the libraries that they are binding. 149 | 150 | Insofar as possible, we can have a compiler optimize things so that they 151 | match the underlying architecture. The language should allow and even 152 | encourage definitions in the most general sense; special cases are to be 153 | detected and optimized when they occur, instead of instituting those 154 | special cases into the language itself. 155 | 156 | Another aspect of this point of philosophy is that it should be possible 157 | to specify and change the performance characteristics of the program 158 | (but ideally not its behaviour) from outside the program, using 159 | configuration files. 160 | 161 | This counts as a practical matter because maintaining code which is 162 | cluttered with implementation-specific artefacts is burdensome. 163 | 164 | Serialization 165 | ------------- 166 | 167 | (This section needs to be rewritten) 168 | 169 | - All primitive values must be serializable 170 | - All primitive values must be round-trippable 171 | - All primitive values must thus have an order to them (like Ruby 1.9's 172 | hashes) because in this world of representations, orderless things don't 173 | really exist 174 | - When building user-defined values from primitive values it must be 175 | easy to retain these serialization properties in the composite value 176 | - This is actually fairly agnostic of the particular serialization format 177 | (yaml, xml, binary, etc) 178 | - S-expressions are trivially serializable, except for functions 179 | 180 | Formatting 181 | ---------- 182 | 183 | Closely related to serialization.
184 | 185 | Many languages support a "standard" operation to convert an arbitrary value to 186 | a string. Some even have two (e.g. Python's `str` and `repr`). 187 | 188 | But in reality, there are any number of ways to convert a value to a string. 189 | Why should the string representation of 16 necessarily be `"16"` — why not 190 | `"0xf"` or `"XVI"`? `"16"` is fine, but it should be explicitly noted to be 191 | the default for the reason that it's the most convenient for the audience of 192 | humans who use the decimal Arabic notation when dealing with numbers. 193 | 194 | How can we support both a reasonable (and possibly configurable) default 195 | formatting, as well as any number of other ways to format values which would 196 | be more appropriate in different contexts? 197 | 198 | Can we pass a "style" argument to the string-conversion function? 199 | 200 | Should we establish a "design pattern" for writing formatting functions, and 201 | provide support for implementing such patterns? 202 | 203 | (Also, `format` is probably a better name for this function than `str`.) 204 | 205 | Multiple Environments 206 | --------------------- 207 | 208 | (This section needs to be rewritten) 209 | 210 | - Lots of software runs in multiple environments - "development", "qa", 211 | "production" 212 | - Inherently support that idea 213 | 214 | Assertions 215 | ---------- 216 | 217 | (This section needs to be rewritten) 218 | 219 | - Software engineering is more about defining invariants than writing code. 220 | - An "assert" command which produces details errors in development, but only 221 | logs warnings in production environments 222 | - Very lightweight so that programmers use it without thinking 223 | (Python's `self.assertEqual()` is *not* lightweight) 224 | (Erlang's `A = {foo,B}` IS lightweight) 225 | - So a conditional, by itself, is an assertion. (?) 226 | 227 | Interfaces 228 | ---------- 229 | 230 | (This section needs to be rewritten) 231 | 232 | One way or another, it should be possible to discover (programmatically, 233 | through reflection of some sort) the set of operations that a value supports — 234 | its interface. Each operation has a name and a signature of some sort. 235 | 236 | Collections are interfaces. 237 | 238 | Some parts of an interface might be "private". This — information hiding — 239 | is obviously a somewhat complex topic. The obvious bit is that information 240 | hiding is useful to prevent unintended changes to program state, but it also 241 | hinders debugging and testing. 242 | 243 | Usability 244 | --------- 245 | 246 | Memorization is not a good thing to make programmers do. This can be 247 | addressed by either copying things from an existing language that the 248 | programmer base can be expected to already have memorized, or by providing 249 | a more orthogonal set of things which maps to the culture which programmers, 250 | as people, already live in. (For example, few people in the Western world 251 | do not know that `&` means "and".) 252 | 253 | Non-alphabetic symbols should, idealy, have the same meaning regardless of 254 | the context they're used in — in other words, the language should avoid 255 | using the same symbol for different purposes in different contexts. 256 | 257 | (Lots of languages are lacking here. In C, `*` is both multiplication and 258 | dereferencing. In Python, `.` is both object attribute access and package 259 | hierarchy — although packages are, at least, kind of like objects. 
In Lua, 260 | `=` is both assignment and key value association.) 261 | 262 | Programming Languages vs. Operating Systems 263 | ------------------------------------------- 264 | 265 | (this section needs to be cleaned up — not sure where to put it, and it 266 | arguably doesn't belong here) 267 | 268 | What you see before you in this distribution can be described as a 269 | programming language, but many of the ideas took root while thinking about 270 | operating systems. 271 | 272 | What's the difference between a programming language and an operating system? 273 | 274 | Well, maybe less than you think. 275 | 276 | Programming languages do need to define the environment in which they can 277 | express programs. Sometimes this is a specific OS (like early C on Unix) -- 278 | or they claim to be "portable", but then they're really just defining an 279 | abstraction against all the possible OS'es they think they'll run on. Often 280 | this abstract is clumsy, but some languages put a lot of thought into it, 281 | like Smalltalk. 282 | 283 | Operating systems, on the other hand, don't tell you what programming 284 | language to use -- or do they? A modern OS insists everything is, at some 285 | point, in native machine language, and a running instance will almost always 286 | be limited to a single machine language of a single architecture. Somewhat 287 | more alternative OS'es define a virtual machine language to abstract away 288 | from the concrete machine language. Usually this virtual machine language 289 | looks like a machine language, but sometimes it's a tad more high-level, 290 | like Lisp. Any way you slice it, the OS does sanction a particular, albeit 291 | usually low-level, programming language. 292 | 293 | Where PL's and OS's seem to meet more-or-less neatly is in the idea of the 294 | VM, so let's examine that. 295 | 296 | Most modern virtual machines are designed to implement high-level languages 297 | in a modern operating system environment. The JVM was specifically designed 298 | for running Java, and while .NET was ostensibly designed for multiple 299 | languages, the bytecode is pretty closely tuned to C\#. 300 | 301 | What these VMs were not designed to do, but what a VM "should really" be 302 | designed to do (if it, at least, wants to live up to the name "virtual 303 | machine") is to abstract the *hardware* and provide virtualizations 304 | (abstractions) of the available devices. 305 | 306 | An environment contains zero or more devices. A device exposes zero 307 | or more services. Each service conforms to one or more interfaces. 308 | Each service may additionally require one or more services be available 309 | (by interface). 310 | 311 | At one point I was calling this place where programming language and 312 | operating system meet a "CE" (Computational Environment) because 313 | "operating system" is far too generic-sounding and "programming language" 314 | doesn't address the important environmental aspect here. Whether I would 315 | continue to use the term CE or not, I'm not sure — it could just add to the 316 | confusion. 317 | 318 | How do most programming languages deal with the abstraction of available 319 | (or virtual) devices? Terribly, I would say. Take, as a simple example, 320 | an addressable character screen device. Someone writes a library, in C, 321 | to access it (e.g. `ncurses`,) providing an API comprising C functions 322 | and C structs. Someone then writes a binding or a wrapper (e.g. 
using 323 | `swig`) or otherwise foreign-function interfaces it to the language, usually 324 | exposing the exact same C-level API naively adapted to the programming 325 | language. Then you, the programmer in this language, wrestle with working 326 | with the device almost exactly as a C programmer would, initializing and 327 | releasing it as a C programmer would, with limitations on how you may or 328 | may not use it from multithreaded code like a C programmer would (which 329 | might be brutally different from how the runtime for your programming 330 | language implementation assumes that its world works.) All this, with the 331 | added hassle of having to make sure you have all these bindings for the 332 | device for your chosen implementation of your language built and installed 333 | correctly. 334 | -------------------------------------------------------------------------------- /oozlybub-and-murphy/oozlybub-and-murphy.markdown: -------------------------------------------------------------------------------- 1 | Oozlybub and Murphy 2 | =================== 3 | 4 | Language version 1.1 5 | 6 | Overview 7 | -------- 8 | 9 | This document describes a new programming language. The name of this 10 | language is Oozlybub and Murphy. Despite appearances, this name refers 11 | to a single language. The majority of the language is named Oozlybub. 12 | The fact that the language is not entirely named Oozlybub is named 13 | Murphy. Deal with it. 14 | 15 | For the sake of providing an "olde tyme esoterickal de-sign", the 16 | language combines several unusual features, including multiple 17 | interleaved parse streams, infinitely long variable names, gratuitously 18 | strong typing, and only-conjectural Turing completeness. While no 19 | implementation of the language exists as of this writing, it is thought 20 | to be sufficiently consistent to be implementable, modulo any errors in 21 | this docunemt. 22 | 23 | In places the language may resemble [SMITH][] and [Quylthulg][], but 24 | this was not intended, and the similarities are purely emergent. 25 | 26 | [SMITH]: http://catseye.tc/node/SMITH.html 27 | [Quylthulg]: http://catseye.tc/node/Quylthulg.html 28 | 29 | Program Structure 30 | ----------------- 31 | 32 | A Oozlybub and Murphy program consists of a number of variables and a 33 | number of objects called _dynasts_. A Oozlybub and Murphy program text 34 | consists of multiple parse streams. Each parse stream contains zero or 35 | more variable declarations, and optionally a single dynast. 36 | 37 | ### Parse Streams 38 | 39 | A parse stream is just a segment, possibly non-contiguous, of the text 40 | of a Oozlybub and Murphy program. A program starts out with a single 41 | parse stream, but certain parse stream manipulation pragmas can change 42 | this. These pragmas have the form `{@x}` and have a similar syntactic 43 | status as comments; they can appear anywhere except inside a lexeme. 44 | 45 | Parse streams are arranged as a ring (a cyclic doubly linked list.) When 46 | parsing of the program text begins initially, there is already a single 47 | pre-created parse stream. When the program text ends, all parse streams 48 | which may be active are deleted. 49 | 50 | The meanings of the pragmas are: 51 | 52 | - `{@+}` Create a new parse stream to the right of the current one. 53 | - `{@>}` Switch to the parse stream to the right of the current one. 54 | - `{@<}` Switch to the parse stream to the left of the current one. 55 | - `{@-}` Delete the current parse stream. 
The parse stream to the left 56 | of the deleted parse stream will become the new current parse 57 | stream. 58 | 59 | Deleting a parse stream while it contains an unfinished syntactic 60 | construct is a syntax error, just as an end-of-file in that circumstance 61 | would be in most other languages. 62 | 63 | Providing a concrete example of parse streams in action will be 64 | difficult in the absence of defined syntax for the rest of Oozlybub and 65 | Murphy, so we will, for the purposes of the following demonstration 66 | only, pretend that the contents of a parse stream is a sentence of 67 | English. Here is how three parse streams might be managed: 68 | 69 | `The quick {@+}brown{@>}Now is the time{@<}fox{@<} for all good men to {@+}{@>}Wherefore art thou {@>} jumped over {@>}{@>}Romeo?{@-} come to the aid of {@>}the lazy dog's tail.{@-}their country.{@-}` 70 | 71 | ### Variables 72 | 73 | All variables are declared in a block at the beginning of a parse 74 | stream. If there is also a dynast in that stream, the variables are 75 | private to that dynast; otherwise they are global and shared by all 76 | dynasts. (*Defined in 1.1*) Any dynamically created dynast gets its own 77 | private copies of any private variables the original dynast had; they 78 | will initially hold the values they had in the original, but they are 79 | not shared. 80 | 81 | The name of a variable in Oozlybub and Murphy is not a fixed, 82 | finite-length string of symbols, as you would find in other programming 83 | languages. No sir! In Oozlybub and Murphy, each variable is named by a 84 | possibly-infinite set of strings (over the alphanumeric-plus-spaces 85 | alphabet `[a-zA-Z0-9 ]`), at least one of which must be infinitely long. 86 | (*New in 1.1*: spaces [but no other kinds of whitespace] are allowed in 87 | these strings.) 88 | 89 | To accomodate this method of identifying a variable, in Oozlybub and 90 | Murphy programs, which are finite, variables are identified using 91 | regular expressions which match their set of names. An equivalence class 92 | of regular expressions is a set of all regular expressions which accept 93 | exactly the same set of strings; each equivalence class of regular 94 | expressions refers to the same, unique Oozlybub and Murphy variable. 95 | 96 | (In case you wonder about the implementability of this: Checking that 97 | two regular expressions are equivalent is decidable: we convert them 98 | both to NFAs, then to DFAs, then minimize those DFAs, then check if the 99 | transition graphs of those DFAs are isomorphic. Checking that the 100 | regular expression accepts at least one infinitely-long string is also 101 | decidable: just look for a cycle in the DFA's graph.) 102 | 103 | Note that these identifier-sets need not be disjoint. `/ma*/` and 104 | `/mb*/` are distinct variables, even though both contain the string `m`. 105 | (Note also that we are fudging slightly on how we consider to have 106 | described an infinitely long name; technically we would want to have a 107 | Büchi automaton that specifies an unending repetition with ^ω^ instead 108 | of \*. But the distinction is subtle enough in this context that we're 109 | gonna let it slide.) 
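As a rough illustration of the second of those checks (not part of the language; the DFA encoding used here, a dict of per-state transition dicts, is purely an assumption of this sketch), a name pattern denotes an infinitely long name, in the fudged sense above, exactly when its DFA has a state that is reachable from the start, can still reach an accepting state, and lies on a cycle:

    def live_states(start, delta, accepting):
        # States reachable from the start state...
        reached, stack = set(), [start]
        while stack:
            q = stack.pop()
            if q not in reached:
                reached.add(q)
                stack.extend(delta.get(q, {}).values())
        # ...and from which an accepting state can still be reached.
        useful, changed = set(accepting), True
        while changed:
            changed = False
            for q, moves in delta.items():
                if q not in useful and any(r in useful for r in moves.values()):
                    useful.add(q)
                    changed = True
        return reached & useful

    def has_infinite_name(start, delta, accepting):
        # Depth-first search for a cycle confined to the live states.
        live = live_states(start, delta, accepting)
        colour = dict.fromkeys(live, "white")
        def dfs(q):
            colour[q] = "grey"
            for r in delta.get(q, {}).values():
                if r in live and (colour[r] == "grey" or
                                  (colour[r] == "white" and dfs(r))):
                    return True
            colour[q] = "black"
            return False
        return any(colour[q] == "white" and dfs(q) for q in live)

    # /pp*/ as a DFA: state 0 --p--> 1, state 1 --p--> 1, accepting state {1}
    print(has_infinite_name(0, {0: {"p": 1}, 1: {"p": 1}}, {1}))   # True

The equivalence check described above (determinize, minimize, compare transition graphs) could be sketched in the same style, but is longer; the point is only that both checks are mechanical.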
110 | 111 | Syntax for giving a variable name is fairly straightforward: it is 112 | delimited on either side by `/` symbols; the alphanumeric symbols are 113 | literals; textual concatenation is regular expression sequencing, `|` is 114 | alteration, `(` and `)` increase precedence, and `*` is Kleene 115 | asteration (zero or more occurrences). Asteration has higher precedence 116 | than sequencing, which has higher precedence than alteration. Because 117 | none of these operators is alphanumeric nor a space, no escaping scheme 118 | needs to be installed. 119 | 120 | Variables are declared with the following syntax (`i` and `a` are the 121 | types of the variables, described in the next section): 122 | 123 | VARIABLES ARE i /pp*/, i /qq*/, a /(0|1)*/. 124 | 125 | This declares an integer variable identified by the names {`p`, `pp`, 126 | `ppp`, ...}, an integer variable identified by the names {`q`, `qq`, 127 | `qqq`, ...}, and an array variable identified by the names of all 128 | strings of `0`'s and `1`'s. 129 | 130 | When not in wimpmode (see below), any regular expression which denotes a 131 | variable may not be literally repeated anywhere else in the program. So 132 | in the above example, it would not be legal to refer to `/pp*/` further 133 | down in the program; an equivalent regular expression, such as 134 | `/p|ppp*/` or `/p*p/` or `/pp*|pp*|pp*/` would have to be used instead. 135 | 136 | ### Types 137 | 138 | Oozlybub and Murphy is a statically-typed language, in that variables as 139 | well as values have types, and a value of one type cannot be stored in a 140 | variable of another type. The types of values, however, are not entirely 141 | disjoint, as we will see, and special considerations may arise for 142 | checking and conversion because of this. 143 | 144 | The basic types are: 145 | 146 | - `i`, the type of integers. 147 | 148 | These are integers of unbounded extent, both positive and negative. 149 | Literal constants of type `i` are given in the usual decimal format. 150 | Variables of this type initially contain the value 0. 151 | 152 | - `p`, the type of prime numbers. 153 | 154 | All prime numbers are integers but not all integers are prime 155 | numbers. Thus, values of prime number type will automatically be 156 | coerced to integers in contexts that require integers; however the 157 | reverse is not true, and in the other direction a conversion 158 | function (`P?`) must be used. There are no literal constants of type 159 | `p`. Variables of this type initially contain the value 2. 160 | 161 | - `a`, the type of arrays of integers. 162 | 163 | An integer array has an integer index which is likewise of unbounded 164 | extent, both positive and negative. Variables of this type initially 165 | contain an empty array value, where all of the entries are 0. 166 | 167 | - `b`, the type of booleans. 168 | 169 | A boolean has two possible values, `true` and `false`. Note that 170 | there are no literal constants of type `b`; these must be specified 171 | by constructing a tautology or contradiction with boolean (or other) 172 | operators. It is illegal to retrieve the value of a variable of this 173 | type before first assigning it, except to construct a tautology or 174 | contradiction. 175 | 176 | - `t`, the type of truth-values. 177 | 178 | A truth-value has two possible values, `yes` and `no`. There are no 179 | literal constants of type `t`. 
It is illegal to retrieve the value 180 | of a variable of this type before first assigning it, except to 181 | construct a tautology or contradiction. 182 | 183 | - `z`, the type of bits. 184 | 185 | A bit has two possible values, `one` and `zero`. There are no 186 | literal constants of type `z`. It is illegal to retrieve the value 187 | of a variable of this type before first assigning it, except to 188 | construct a tautology or contradiction. 189 | 190 | - `c`, the type of conditions. 191 | 192 | A condition has two possible values, `go` and `nogo`. There are no 193 | literal constants of type `c`. It is illegal to retrieve the value 194 | of a variable of this type before first assigning it, except to 195 | construct a tautology or contradiction. 196 | 197 | ### Wimpmode 198 | 199 | (*New in 1.1*) An Oozlybub and Murphy program is in wimpmode if it 200 | declares a global variable of integer type which matches the string 201 | `am a wimp`, for example: 202 | 203 | VARIABLES ARE i /am *a *wimp/. 204 | 205 | Certain language constructs, noted in this document as such, are only 206 | permissible in wimpmode. If they are used in a program in which wimpmode 207 | is not in effect, a compile-time error shall occur and the program shall 208 | not be executed. 209 | 210 | ### Dynasts 211 | 212 | Each dynast is labeled with a positive integer and contains an 213 | expression. Only one dynast may be denoted in any given parse stream, 214 | but dynasts may also be created dynamically during program execution. 215 | 216 | Program execution begins at the lowest-numbered dynast that exists in 217 | the initial program. When a dynast is executed, the expression of that 218 | dynast is evaluated for its side-effects. If there is a dynast labelled 219 | with the next higher integer (i.e. the successor of the label of the 220 | current dynast), execution continues with that dynast; otherwise, the 221 | program halts. Once a dynast has been executed, it continues to exist 222 | until the program halts, but it may never be executed again. 223 | 224 | Evaluation of an expression may have side-effects, including writing 225 | characters to an output channel, reading characters from an input 226 | channel, altering the value of a variable, and creating a new dynast. 227 | 228 | Dynasts are written with the syntax `dynast(label) <-> expr`. A concrete 229 | example follows: 230 | 231 | dynast(100) <-> for each prime /p*/ below 1000 do write (./p*|p/+1.) 232 | 233 | ### TRIVIA PORTION OF SHOW 234 | 235 | WHO WAS IT FAMOUS MAN THAT SAID THIS? 236 | 237 | - A) RONALD REAGAN 238 | - B) RONALD REAGAN 239 | - B) RONALD STEWART 240 | - C) RENALDO 241 | 242 | contestant enters lightning round now 243 | 244 | ### Expressions 245 | 246 | In the following, the letter preceding -expr or -var indicates the 247 | expected type, if any, of that expression or variable. Where the 248 | expressions listed below are infix expressions, they are listed from 249 | highest to lowest precedence. Unless noted otherwise, subexpressions are 250 | evaluated left to right. 251 | 252 | - `(.expr.)` 253 | 254 | Surrounding an expression with dotted parens gives it that 255 | precedence boost that's just the thing to have it be evaluated 256 | before the expression it's in, but there is a catch. 
The number of 257 | parens in the dotted parens expression must match the nesting depth 258 | in the following way: if a set of dotted parens is nested within n 259 | dotted parens, it must contain fib(n) parens, where fib(n) is the 260 | nth member of the Fibonacci sequence. For example, `(.(.0.).)` and 261 | `(.(.((.(((.(((((.0.))))).))).)).).)` are syntactically well-formed 262 | expressions (when not nested in any other dotted paren expression), 263 | but `(.(((.0.))).)` and `(.(.(.0.).).)` are not. 264 | 265 | - `var` 266 | 267 | A variable evaluates to the value it contains at that point in 268 | execution. 269 | 270 | - `0`, `1`, `2`, `3`, etc. 271 | 272 | Decimal literals evaluate to the expected value of type `i`. 273 | 274 | - `#myself#` 275 | 276 | This special nullary token evaluates to the numeric label of the 277 | currently executing dynast. 278 | 279 | - `var := expr` 280 | 281 | Evaluates the expr and stores the result in the specified variable. 282 | The variable and the expression must have the same type. Evaluates 283 | to whatever expr evaluated to. 284 | 285 | - `a-expr [i-expr]` 286 | 287 | Evaluates to the `i` stored at the location in the array given by 288 | i-expr. 289 | 290 | - `a-expr [i-expr] := i-expr` 291 | 292 | Evaluates the second i-expr and stores the result in the location in 293 | the array given by the first i-expr. Evaluates to whatever the 294 | second i-expr evaluated to. 295 | 296 | - `a-expr ? i-expr` 297 | 298 | Evaluates to `go` if `a-expr [i-expr]` and `i-expr` evaluate to the 299 | same thing, `nogo` otherwise. The i-expr is only evaluated once. 300 | 301 | - `minus i-expr` 302 | 303 | Evaluate to the integer that is zero minus the result of evaluating 304 | i-expr. 305 | 306 | - `write i-expr` 307 | 308 | Write the Unicode code point whose number is obtained by evaluating 309 | i-expr, to the standard output channel. Writing a negative number 310 | shall produce one of a number of amusing and informative messages 311 | which are not defined by this document. 312 | 313 | - `#read#` 314 | 315 | Wait for a Unicode character to become available on the standard 316 | input channel and evaluate to its integer code point value. 317 | 318 | - `not? z-expr` 319 | 320 | Converts a bit value to a boolean value (`zero` becomes `true` and 321 | `one` becomes `false`). 322 | 323 | - `if? b-expr` 324 | 325 | Converts a boolean value to condition value (true becomes go and 326 | false becomes nogo). 327 | 328 | - `cvt? c-expr` 329 | 330 | Converts a condition value to a truth-value (`go` becomes `yes` and 331 | `nogo` becomes `no`). 332 | 333 | - `to? t-expr` 334 | 335 | Converts a truth-value to a bit value (`yes` becomes `one` and `no` 336 | becomes `zero`). 337 | 338 | - `P? i-expr [t-var]` 339 | 340 | If the result of evaluating i-expr is a prime number, evaluates to 341 | that prime number (and has the type `p`). If it is not prime, stores 342 | the value `no` into t-var and evaluates to 2. 343 | 344 | - `i-expr * i-expr` 345 | 346 | Evaluates to the product of the two i-exprs. The result is never of 347 | type `p`, but the implementation doesn't need to do anything based 348 | on that fact. 349 | 350 | - `i-expr + i-expr` 351 | 352 | Evaluates to the sum of the two i-exprs. 353 | 354 | - `exists/dynast i-expr` 355 | 356 | Evaluates to `one` if a dynast exists with the given label, or 357 | `zero` if one does not. 358 | 359 | - `copy/dynast i-expr, p-expr, p-expr` 360 | 361 | Creates a new dynast based on an existing one. 
The existing one is 362 | identified by the label given in the i-expr. The new dynast is a 363 | copy of the existing dynast, but with a new label. The new label is 364 | the sum of the two p-exprs. If a dynast with that label already 365 | exists, the program terminates. (*Defined in 1.1*) This expression 366 | evaluates to the value of the given i-expr. 367 | 368 | - `create/countably/many/dynasts i-expr, i-expr` 369 | 370 | Creates a countably infinite number of dynasts based on an existing 371 | one. The existing one is identified by the label given in the first 372 | i-expr. The new dynasts are copies of the existing dynast, but with 373 | new labels. The new labels start at the first odd integer greater 374 | than the second i-expr, and consist of every odd integer greater 375 | than that. If any dynast with such a label already exists, the 376 | program terminates. (*Defined in 1.1*) This expression evaluates to 377 | the value of the first given i-expr. 378 | 379 | - `b-expr and b-expr` 380 | 381 | Evaluates to `one` if both b-exprs are `true`, `zero` otherwise. 382 | Note that this is not short-circuting; both b-exprs are evaluated. 383 | 384 | - `c-expr or c-expr` 385 | 386 | Evaluates to `yes` if either or both c-exprs are `go`, `no` 387 | otherwise. Note that this is not short-circuting; both c-exprs are 388 | evaluated. 389 | 390 | - `do expr` 391 | 392 | Evaluates the expr, throws away the result, and evaluates to `go`. 393 | 394 | - `c-expr then expr` 395 | 396 | **Wimpmode only.** Evaluates the c-expr on the left-hand side for 397 | its side-effects only, throwing away the result, then evaluates to 398 | the result of evaluating the right-hand side expr. 399 | 400 | - `c-expr ,then i-expr` 401 | 402 | (*New in 1.1*) Evaluates the c-expr on the left-hand side; if it is 403 | `go`, evaluates to the result of evaluating the right-hand side 404 | i-expr; if it is `nogo`, evaluates to an unspecified and quite 405 | possibly random integer between 1 and 1000000 inclusive, without 406 | evaluating the right-hand side. Note that this operator has the same 407 | precedence as `then`. 408 | 409 | - `for each prime var below i-expr do i-expr` 410 | 411 | The var must be a declared variable of type `p`. The first i-expr 412 | must evaluate to an integer, which we will call k. The second i-expr 413 | is evaluated once for each prime number between k and 2, inclusive; 414 | each time it is evaluated, var is bound to a successively smaller 415 | prime number between k and 2. (*Defined in 1.1*) Evaluates to the 416 | result of the final evaluation of the second i-expr. 417 | 418 | ### Grammar 419 | 420 | This section attempts to capture and summarize the syntax rules (for a 421 | single parse stream) described above, using an EBNF-like syntax extended 422 | with a few ad-hoc annotations that I don't feel like explaining right 423 | now. 424 | 425 | ParseStream ::= VarDeclBlock {DynastLit}. 426 | VarDeclBlock ::= "VARIABLES ARE" VarDecl {"," VarDecl} ".". 427 | VarDecl ::= TypeSpec VarName. 428 | TypeSpec ::= "i" | "p" | "a" | "b" | "t" | "z" | "c". 429 | VarName ::= "/" Pattern "/". 430 | Pattern ::= {[a-zA-Z0-9 ]} 431 | | Pattern "|" Pattern /* ignoring precedence here */ 432 | | Pattern "*" /* and here */ 433 | | "(" Pattern ")". 434 | DynastLit ::= "dynast" "(" Gumber ")" "<->" Expr. 435 | Expr ::= Expr1[c] {"then" Expr1 | ",then" Expr1[i]}. 436 | Expr1 ::= Expr2[c] {"or" Expr2[c]}. 437 | Expr2 ::= Expr3[b] {"and" Expr3[b]}. 438 | Expr3 ::= Expr4[i] {"+" Expr4[i]}. 
439 | Expr4 ::= Expr5[i] {"*" Expr5[i]}. 440 | Expr5 ::= Expr6[a] {"?" Expr6[i]}. 441 | Expr6 ::= Prim[a] {"[" Expr[i] "]"} [":=" Expr[i]]. 442 | Prim ::= {"("}* "." Expr "." {")"}* /* remember the Fibonacci rule! */ 443 | | VarName [":=" Expr] 444 | | Gumber 445 | | "#myself#" 446 | | "minus" Expr[i] 447 | | "write" Expr[i] 448 | | "#read#" 449 | | "not?" Expr[z] 450 | | "if?" Expr[b] 451 | | "cvt?" Expr[c] 452 | | "to?" Expr[t] 453 | | "P?" Expr[i] 454 | | "exists/dynast" Expr[i] 455 | | "copy/dynast" Expr[i] "," Expr[p] "," Expr[p] 456 | | "create/countably/many/dynasts" 457 | Expr[i] "," Expr[i] 458 | | "do" Expr 459 | | "for" "each" "prime" VarName "below" 460 | Expr[i] "do" Expr[i]. 461 | Gumber ::= {[0-9]}. 462 | 463 | ### Boolean Idioms 464 | 465 | Here we show how we can get any value of any of the `b`, `t`, `z`, and 466 | `c` types, without any constants or variables with known values of these 467 | types. 468 | 469 | VARIABLES ARE b /b*/. 470 | zero = /b*|b/ and not? to? cvt? if? /b*|b*/ 471 | true = not? zero 472 | go = if? true 473 | yes = cvt? go 474 | one = to? yes 475 | false = not? one 476 | nogo = if? false 477 | no = cvt? nogo 478 | 479 | ### Computational Class 480 | 481 | Because the single in-dynast looping construct, `for each prime below`, 482 | is always a finite loop, the execution of any fixed number of dynasts 483 | cannot be Turing-complete. We must create new dynasts at runtime, and 484 | continue execution in them, if we want any chance at being 485 | Turing-complete. We demonstrate this by showing an example of a 486 | (conjecturally) infinite loop in Oozlybub and Murphy, an idiom which 487 | will doubtless come in handy in real programs. 488 | 489 | VARIABLES ARE p /p*/, p /q*/. 490 | dynast(3) <-> 491 | (. do (. if? not? exists/dynast 5 ,then 492 | create/countably/many/dynasts #myself#, 5 .) .) ,then 493 | (. for each prime /p*|p/ below #myself#+2 do 494 | for each prime /q*|q/ below /p*|pp/+1 do 495 | if? not? exists/dynast /p*|p|p/+/q*|q|q/ ,then 496 | copy/dynast #myself#, /p*|ppp/, /q*|qqq/ .) 497 | 498 | As you can see, the ability to loop indefinitely in Oozlybub and Murphy 499 | hinges on whether Goldbach's Conjecture is true or not. Looping forever 500 | requires creating an unbounded number of new dynasts. We can create all 501 | the odd-numbered dynasts at once, but that won't be enough to loop 502 | forever, as we must proceed to the next highest numbered dynast after 503 | executing a dynast. So we must create new dynasts with successively 504 | higher even integer labels, and these can only be created by summing two 505 | primes. So, if Goldbach's conjecture is false, then there is some even 506 | number greater than two which is not the sum of two primes; thus there 507 | is some dynast that cannot be created by a running Oozlybub and Murphy 508 | program, thus it is not possible to loop forever in Oozlybub and Murphy, 509 | thus Oozlybub and Murphy is not Turing-complete (because it cannot 510 | simulate any Turing machine that loops forever.) 511 | 512 | It should not however be difficult to show that Oozlybub and Murphy is 513 | Turing-complete under the assumption that Goldbach's Conjecture is true. 514 | If Goldbach's Conjecture is true, then the above program is an infinite 515 | loop. We need only add to it appropriate conditional instructions to, 516 | say, simulate the execution of an arbitrarily-chosen Turing machine. An 517 | array can serve as the tape, and an integer can serve as the head. 
518 | Another integer can serve as the state of the finite control. The 519 | integer can be tested against various fixed integers by establishing an 520 | array for each of these fixed integers and using the `?` operator 521 | against each in turn; each branch can mutate the tape, tape head, and 522 | finite control as desired. The program can halt by neglecting to create 523 | a new even dynast to execute next, or by trying to create a dynast with 524 | a label that already exists. 525 | 526 | Happy FLIMPING, 527 | Chris Pressey 528 | December 1, 2010 529 | Evanston, Illinois, USA 530 | -------------------------------------------------------------------------------- /madison/Madison.markdown: -------------------------------------------------------------------------------- 1 | Madison 2 | ======= 3 | 4 | Version 0.1 5 | December 2011, Chris Pressey, Cat's Eye Technologies 6 | 7 | Abstract 8 | -------- 9 | 10 | Madison is a language in which one can state proofs of properties 11 | of term-rewriting systems. Classical methods of automated reasoning, 12 | such as resolution, are not used; indeed, term-rewriting itself is 13 | used to check the proofs. Both direct proof and proof by induction 14 | are supported. Induction in a proof must be across a structure which 15 | has a well-founded inductive definition. Such structures can be 16 | thought of as types, although this is largely nominal; the traditional 17 | typelessness of term-rewriting systems is largely retained. 18 | 19 | Term-rewriting 20 | -------------- 21 | 22 | Madison has at its core a simple term-rewriting language. It is of 23 | a common form which should be unsurprising to anyone who has worked 24 | at all with term rewriting. A typical simple program contains 25 | a set of rules and a term on which to apply those rules. Each 26 | rule is a pair of terms; either term of the pair may contain 27 | variables, but any variable that appears on the r.h.s. must also 28 | appear on the l.h.s. A rule matches a term if it is the same as 29 | the term with the exception of the variables, which are bound 30 | to its subterms; applying a matching rule replaces the term 31 | with the r.h.s. of the rule, with the variables expanded approp- 32 | riately. Rules are applied in an innermost, leftmost fashion to 33 | the term, corresponding to eager evaluation. Rewriting terminates 34 | when there is no rule whose l.h.s. matches the current incarnation 35 | of the term being rewritten. 36 | 37 | A term is either an atom, which is a symbol that stands alone, 38 | or a constructor, which is a symbol followed by a comma-separated list 39 | of subterms enclosed in parentheses. Symbols may consist of letters, 40 | digits, and hyphens, with no intervening whitespace. A symbol is 41 | a variable symbol if it begins with a capital letter. Variable 42 | symbols may also begin with underscores, but these may only occur 43 | in the l.h.s. of a rewrite rule, to indicate that we don't care 44 | what value is bound to the variable and we won't be using it on 45 | the r.h.s. 46 | 47 | (The way we are using the term "constructor" may be slightly non- 48 | standard; in some other sources, this is called a "function symbol", 49 | and a "constructor" is a subtly different thing.) 50 | 51 | Because the rewriting language is merely a component (albeit the 52 | core component) of a larger system, the aforementioned typical 53 | simple program must be cast into some buttressing syntax.
A full 54 | program consists of a `let` block which contains the rules 55 | and a `rewrite` admonition which specifies the term to be re- 56 | written. An example follows. 57 | 58 | | let 59 | | leftmost(tree(X,Y)) -> leftmost(X) 60 | | leftmost(leaf(X)) -> X 61 | | in 62 | | rewrite leftmost(tree(tree(leaf(alice),leaf(grace)),leaf(dan))) 63 | = alice 64 | 65 | In the above example, there are two rules for the constructor 66 | `leftmost/1`. The first is applied to the outer tree to obtain 67 | a new leftmost constructor containing the inner tree; the first 68 | is applied again to obtain a new leftmost constructor containing 69 | the leaf containing `alice`; and the second is applied to that 70 | leaf term to obtain just `alice`. At that point, no more rules 71 | apply, so rewriting terminates, yielding `alice`. 72 | 73 | Madison is deterministic; if rules overlap, the first one given 74 | (syntactically) is used. For this reason, it is a good idea 75 | to order rules from most specific to least specific. 76 | 77 | I used the phrase "typical simple program" above because I was 78 | trying intentionally to avoid saying "simplest program". In fact, 79 | technically no `let` block is required, so you can write some 80 | really trivial Madison programs, like the following: 81 | 82 | | rewrite cat 83 | = cat 84 | 85 | I think that just about covers the core term-rewriting language. 86 | Term-rewriting is Turing-complete, so Madison is too. If you 87 | wish to learn more about term rewriting, there are several good 88 | books and webpages on the subject; I won't go into it further 89 | here. 90 | 91 | Proof-Checking 92 | -------------- 93 | 94 | My desire with Madison was to design a language in which you 95 | can prove things. Not a full-blown theorem prover -- just a 96 | proof checker, where you supply a proof and it confirms either 97 | that the proof holds or doesn't hold. (Every theorem prover 98 | has at its core a proof checker, but it comes bundled with a lot of 99 | extra machinery to search the space of possible proofs cleverly, 100 | looking for one which will pass the proof-checking phase.) 101 | 102 | It's no coincidence that Madison is built on top of a term-rewriting 103 | language. For starters, a proof is very similar to the execution 104 | trace of a term being rewritten. In each of the steps of the proof, 105 | the statement to be proved is transformed by replacing some part 106 | of it with some equally true thing -- in other words, rewritten. 107 | In fact, Post Canonical Systems were an early kind of rewriting 108 | system, devised by Emil Post to (as I understand it) illustrate this 109 | similarity, and to show that proofs could be mechanically carried out 110 | in a rewriting system. 111 | 112 | So: given a term-rewriting language, we can give a trivial kind 113 | of proof simply by stating the rewrite steps that *should* occur 114 | when a term is rewritten, and check that proof by rewriting the term 115 | and confirming that those were in fact the steps that occurred. 116 | 117 | For the purpose of stating these sequences of rewrite steps to be 118 | checked, Madison has a `theorem..proof..qed` form. To demonstrate 119 | this form, let's use Madison to prove that 2 + 2 = 4, using Peano 120 | arithmetic. 
121 | 122 | | let 123 | | add(s(X),Y) -> add(X,s(Y)) 124 | | add(z,Y) -> Y 125 | | in theorem 126 | | add(s(s(z)),s(s(z))) ~> s(s(s(s(z)))) 127 | | proof 128 | | add(s(s(z)),s(s(z))) 129 | | -> add(s(z),s(s(s(z)))) [by add.1] 130 | | -> add(z,s(s(s(s(z))))) [by add.1] 131 | | -> s(s(s(s(z)))) [by add.2] 132 | | qed 133 | = true 134 | 135 | The basic syntax should be fairly apparent. The `theorem` block 136 | contains the statement to be proved. The `~>` means "rewrites 137 | in zero or more steps to". So, here, we are saying that 2 + 2 138 | (in Peano notation) rewrites, in zero or more steps, to 4. 139 | 140 | The `proof` block contains the actual series of rewrite steps that 141 | should be carried out. For elucidation, each step may name the 142 | particular rule which is applied to arrive at the transformed term 143 | at that step. Rules are named by their outermost constructor, 144 | followed by a dot and the ordinal position of the rule in the list 145 | of rules. These rule-references are optional, but the fact that 146 | the rule so named was actually used to rewrite the term at that step 147 | could be checked too, of course. The `qed` keyword ends the proof 148 | block. 149 | 150 | Naturally, you can also write a proof which does not hold, and 151 | Madison should inform you of this fact. 2 + 3, for example, 152 | does not equal 4, and it can pinpoint exactly where you went 153 | wrong should you come to this conclusion: 154 | 155 | | let 156 | | add(s(X),Y) -> add(X,s(Y)) 157 | | add(z,Y) -> Y 158 | | in theorem 159 | | add(s(s(z)),s(s(s(z)))) ~> s(s(s(s(z)))) 160 | | proof 161 | | add(s(s(z)),s(s(s(z)))) 162 | | -> add(s(z),s(s(s(s(z))))) [by add.1] 163 | | -> add(z,s(s(s(s(z))))) [by add.1] 164 | | -> s(s(s(s(z)))) [by add.2] 165 | | qed 166 | ? Error in proof [line 6]: step 2 does not follow from applying [add.1] to previous step 167 | 168 | Now, while these *are* proofs, they don't tell us much about the 169 | properties of the terms and rules involved, because they are not 170 | *generalized*. They say something about a few fixed values, like 171 | 2 and 4, but they do not say anything about any *infinite* 172 | sets of values, like the natural numbers. Now, that would be *really* 173 | useful. And, while I could say that what you've seen of Madison so far 174 | is a proof checker, it is not a very impressive one. So let's take 175 | this further. 176 | 177 | Quantification 178 | -------------- 179 | 180 | To state a generalized proof, we will need to introduce variables, 181 | and to have variables, we will need to be able to say what those 182 | variables can range over; in short, we need *quantification*. Since 183 | we're particularly interested in making statements about infinite 184 | sets of values (like the natural numbers), we specifically want 185 | *universal quantification*: 186 | 187 | For all x, ... 188 | 189 | But to have universal quantification, we first need a *universe* 190 | over which to quantify. When we say "for all /x/", we generally 191 | don't mean "any and all things of any kind which we could 192 | possibly name /x/". Rather, we think of /x/ as having a type of 193 | some kind: 194 | 195 | For all natural numbers x, ... 196 | 197 | Then, if our proof holds, it holds for all natural numbers. 198 | No matter what integer value greater than or equal to zero 199 | we choose for /x/, the truism contained in the proof remains true. 200 | This is the sort of thing we want in Madison. 
201 | 202 | Well, to start, there is one glaringly obvious type in any 203 | term-rewriting language, namely, the term. We could say 204 | 205 | For all terms t, ... 206 | 207 | But it would not actually be very interesting, because terms 208 | are so general and basic that there's not actually very much you 209 | can say about them that you don't already know. You sort of need 210 | to know the basic properties of terms just to build a term-rewriting 211 | language (like the one at Madison's core) in the first place. 212 | 213 | The most useful property of terms as far as Madison is concerned is 214 | that the subterm relationship is _well-founded_. In other words, 215 | in the term `c(X)`, `X` is "smaller than" `c(X)`, and since terms are 216 | finite, any series of rewrites which always results in "smaller" terms 217 | will eventually terminate. For completeness, we should probably prove 218 | that rigorously, but for expediency we will simply take it as a given 219 | fact for our proofs. 220 | 221 | Anyway, to get at something actually interesting, we must look further 222 | than the terms themselves. 223 | 224 | Types 225 | ----- 226 | 227 | What's actually interesting is when you define a restricted 228 | set of forms that terms can take, and you distinguish terms inside 229 | this set of forms from the terms outside the set. For example, 230 | 231 | | let 232 | | boolean(true) -> true 233 | | boolean(false) -> true 234 | | boolean(_) -> false 235 | | in 236 | | rewrite boolean(false) 237 | = true 238 | 239 | We call a set of forms like this a _type_. As you can see, we 240 | have basically written a predicate that defines our type. If any 241 | of the rewrite rules in the definition of this predicate rewrite 242 | a given term to `true`, that term is of our type; if it rewrites 243 | to `false`, it is not. 244 | 245 | Once we have types, any constructor may be said to have a type. 246 | By this we mean that no matter what subterms the constructor has, 247 | the predicate of the type of which we speak will always reduce to 248 | `true` when that term is inserted in it. 249 | 250 | Note that using predicates like this allows our types to be 251 | non-disjoint; the same term may reduce to true in two different 252 | predicates. My first sketches for Madison had disjoint types, 253 | described by rules which reduced each term to an atom which named 254 | the type of that term. (So the above would have been written with 255 | rules `true -> boolean` and `false -> boolean` instead.) However, 256 | while that method may be, on the surface, more elegant, I believe 257 | this approach better reflects how types are actually used in 258 | programming. At the end of the day, every type is just a predicate, 259 | and there is nothing stopping 2 from being both a natural number and 260 | an integer. And, for that matter, a rational number and a real 261 | number. 262 | 263 | In theory, every predicate is a type, too, but that's where things 264 | get interesting. Is 2 not also an even number, and a prime number? 265 | And in an appropriate (albeit contrived) language, is it not a 266 | description of a computation which may or may not always halt? 267 | 268 | The Type Syntax 269 | --------------- 270 | 271 | The above considerations motivate us to be careful when dealing 272 | with types. We should establish some ground rules so that we 273 | know that our types are useful to universally quantify over. 
274 | 
275 | Unfortunately, this introduces something of a chicken-and-egg
276 | situation, as our ground rules will be using logical connectives,
277 | while at the same time they will be applied to those logical
278 | connectives to ensure that they are sound.  This is not, actually,
279 | a big deal; I mention it here more because it is interesting.
280 | 
281 | So, the rules which define our type must conform to certain
282 | rules, themselves.  While it would be possible to allow the
283 | Madison programmer to use any old bunch of rewrite rules as a
284 | type, and to check that these rules make for a "good" type when
285 | such a usage is seen -- and while this would be somewhat
286 | attractive from the standpoint of proving properties of term-
287 | rewriting systems using term-rewriting systems -- it's not strictly
288 | necessary to use a descriptive approach such as this, and there are
289 | certain organizational benefits we can achieve by taking a more
290 | prescriptive tack.
291 | 
292 | Viz., we introduce a special syntax for defining a type with a
293 | set of rules which function collectively as a type predicate.
294 | Again, it's not strictly necessary to do this, but it does
295 | help organize our code and perhaps our thoughts, and perhaps make
296 | an implementation easier to build.  It's nice to be able to say,
297 | yes, what it means to be a `boolean` is defined right here and
298 | nowhere else.
299 | 
300 | So, to define a type, we write our type rules in a `type..in`
301 | block, like the following.
302 | 
303 |     | type boolean is
304 |     |   boolean(true) -> true
305 |     |   boolean(false) -> true
306 |     | in
307 |     |   rewrite boolean(false)
308 |     = true
309 | 
310 | As you can see, the wildcard reduction to false can be omitted for
311 | brevity.  (In other words, "Nothing else is a boolean" is implied.)
312 | And, the `boolean` constructor can be used for rewriting in a term
313 | just like any other plain, non-`type`-blessed rewrite rule.
314 | 
315 |     | type boolean is
316 |     |   boolean(true) -> true
317 |     |   boolean(false) -> true
318 |     | in
319 |     |   rewrite boolean(tree(leaf(sabrina),leaf(joe)))
320 |     = false
321 | 
322 | Here are the rules that the type-defining rules must conform to.
323 | If any of these rules are violated in the `type` block, the Madison
324 | implementation must complain, and not proceed to try to prove anything
325 | from them.
326 | 
327 | Once a type is defined, it cannot be defined further in a regular,
328 | non-type-defining rewriting rule.
329 | 
330 |     | type boolean is
331 |     |   boolean(true) -> true
332 |     |   boolean(false) -> true
333 |     | in let
334 |     |   boolean(red) -> green
335 |     | in
336 |     |   rewrite boolean(red)
337 |     ? Constructor "boolean" used in rule but already defined as a type
338 | 
339 | The constructor in the l.h.s. must be the same in all rules.
340 | 
341 |     | type foo is
342 |     |   foo(bar) -> true
343 |     |   baz(bar) -> true
344 |     | in
345 |     |   rewrite cat
346 |     ? In type "foo", constructor "baz" used on l.h.s. of rule
347 | 
348 | The constructor used in the rules must have arity 1 (i.e. exactly
349 | one subterm.)
350 | 
351 |     | type foo is
352 |     |   foo(bar,X) -> true
353 |     | in
354 |     |   rewrite cat
355 |     ? In type "foo", constructor has arity greater than one
356 | 
357 | It is considered an error if the predicate rules ever rewrite, inside
358 | the `type` block, to anything besides the atoms `true` or `false`.
359 | 
360 |     | type foo is
361 |     |   foo(bar) -> true
362 |     |   foo(tree(X)) -> bar
363 |     | in
364 |     |   rewrite cat
365 |     ?
In type "foo", rule reduces to "bar" instead of true or false 366 | 367 | The r.h.s.'s of the rules of the type predicate must *always* 368 | rewrite to `true` or `false`. That means, if we can't prove that 369 | the rules always rewrite to something, we can't use them as type 370 | predicate rules. In practice, there are a few properties that 371 | we insist that they have. 372 | 373 | They may involve type predicates that have previously been 374 | established. 375 | 376 | | type boolean is 377 | | boolean(true) -> true 378 | | boolean(false) -> true 379 | | in type boolbox is 380 | | boolbox(box(X)) -> boolean(X) 381 | | in 382 | | rewrite boolbox(box(true)) 383 | = true 384 | 385 | They may involve certain, pre-defined rewriting rules which can 386 | be thought of as operators on values of boolean type (which, honestly, 387 | is probably built-in to the language.) For now there is only one 388 | such pre-defined rewriting rule: `and(X,Y)`, where `X` and `Y` are 389 | booleans, and which rewrites to a boolean, using the standard truth 390 | table rules for boolean conjunction. 391 | 392 | | type boolean is 393 | | boolean(true) -> true 394 | | boolean(false) -> true 395 | | in type boolpair is 396 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) 397 | | in 398 | | rewrite boolpair(pair(true,false)) 399 | = true 400 | 401 | | type boolean is 402 | | boolean(true) -> true 403 | | boolean(false) -> true 404 | | in type boolpair is 405 | | boolpair(pair(X,Y)) -> and(boolean(X),boolean(Y)) 406 | | in 407 | | rewrite boolpair(pair(true,cheese)) 408 | = false 409 | 410 | Lastly, the r.h.s. of a type predicate rule can refer to the self-same 411 | type being defined, but *only* under certain conditions. Namely, 412 | the rewriting must "shrink" the term being rewritten. This is what 413 | lets us inductively define types. 414 | 415 | | type nat is 416 | | nat(z) -> true 417 | | nat(s(X)) -> nat(X) 418 | | in 419 | | rewrite nat(s(s(z))) 420 | = true 421 | 422 | | type nat is 423 | | nat(z) -> true 424 | | nat(s(X)) -> nat(s(X)) 425 | | in 426 | | rewrite nat(s(s(z))) 427 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] 428 | 429 | | type nat is 430 | | nat(z) -> true 431 | | nat(s(X)) -> nat(s(s(X))) 432 | | in 433 | | rewrite nat(s(s(z))) 434 | ? Type not well-founded: recursive rewrite does not decrease in [foo.2] 435 | 436 | | type bad 437 | | bad(leaf(X)) -> true 438 | | bad(tree(X,Y)) -> and(bad(X),bad(tree(Y,Y)) 439 | | in 440 | | rewrite whatever 441 | ? Type not well-founded: recursive rewrite does not decrease in [bad.2] 442 | 443 | We can check this by looking at all the rewrite rules in the 444 | definition of the type that are recursive, i.e. that contain on 445 | on their r.h.s. the constructor being defined as a type predicate. 446 | For every such occurrence on the r.h.s. of a recursive rewrite, 447 | the contents of the constructor must be "smaller" than the contents 448 | of the constructor on the l.h.s. What it means to be smaller 449 | should be fairly obvious: it just has fewer subterms. If all the 450 | rules conform to this pattern, rewriting will eventually terminate, 451 | because it will run out of subterms to rewrite. 452 | 453 | Application of Types in Proofs 454 | ------------------------------ 455 | 456 | Now, aside from these restrictions, type predicates are basically 457 | rewrite rules, just like any other. 
The main difference is that 458 | we know they are well-defined enough to be used to scope the 459 | universal quantification in a proof. 460 | 461 | Simply having a definition for a `boolean` type allows us to construct 462 | a simple proof with variables. Universal quantification over the 463 | universe of booleans isn't exactly impressive; we don't cover an infinite 464 | range of values, like we would with integers, or lists. But it's 465 | a starting point on which we can build. We will give some rewrite rules 466 | for a constructor `not`, and prove that this constructor always reduces 467 | to a boolean when given a boolean. 468 | 469 | | type boolean is 470 | | boolean(true) -> true 471 | | boolean(false) -> true 472 | | in let 473 | | not(true) -> false 474 | | not(false) -> true 475 | | not(_) -> undefined 476 | | in theorem 477 | | forall X where boolean(X) 478 | | boolean(not(X)) ~> true 479 | | proof 480 | | case X = true 481 | | boolean(not(true)) 482 | | -> boolean(true) [by not.1] 483 | | -> true [by boolean.1] 484 | | case X = false 485 | | boolean(not(false)) 486 | | -> boolean(false) [by not.2] 487 | | -> true [by boolean.2] 488 | | qed 489 | = true 490 | 491 | As you can see, proofs using universally quantified variables 492 | need to make use of _cases_. We know this proof is sound, because 493 | it shows the rewrite steps for all the possible values of the 494 | variable -- and we know they are all the possible values, from the 495 | definition of the type. 496 | 497 | In this instance, the cases are just the two possible values 498 | of the boolean type, but if the type was defined inductively, 499 | they would need to cover the base and inductive cases. In both 500 | matters, each case in a complete proof maps to exactly one of 501 | the possible rewrite rules for the type predicate. (and vice versa) 502 | 503 | Let's prove the type of a slightly more complex rewrite rule, 504 | one which has multiple subterms which can vary. (This `and` 505 | constructor has already been introduced, and we've claimed we 506 | can use it in the definition of well-founded inductive types; 507 | but this code proves that it is indeed well-founded, and it 508 | doesn't rely on it already being defined.) 509 | 510 | | let 511 | | and(true,true) -> true 512 | | and(_,_) -> false 513 | | in theorem 514 | | forall X where boolean(X) 515 | | forall Y where boolean(Y) 516 | | boolean(and(X,Y)) ~> true 517 | | proof 518 | | case X = true 519 | | case Y = true 520 | | boolean(and(true,true)) 521 | | -> boolean(true) [by and.1] 522 | | -> true [by boolean.1] 523 | | case Y = false 524 | | boolean(and(true,false)) 525 | | -> boolean(false) [by and.2] 526 | | -> true [by boolean.2] 527 | | case X = false 528 | | case Y = true 529 | | boolean(and(false,true)) 530 | | -> boolean(false) [by and.2] 531 | | -> true [by boolean.2] 532 | | case Y = false 533 | | boolean(and(false,false)) 534 | | -> boolean(false) [by and.2] 535 | | -> true [by boolean.2] 536 | | qed 537 | = true 538 | 539 | Unwieldy, you say! And you are correct. But making something 540 | easy to use was never my goal. 541 | 542 | Note that the definition of `and()` is a bit more open-ended than 543 | `not()`. `and.2` allows terms like `and(dog,cat)` to rewrite to `false`. 544 | But our proof only shows that the result of reducing `and(A,B)` is 545 | a boolean *when both A and B are booleans*. 
So it, in fact,
546 | tells us nothing about the type of `and(dog,cat)`, nor anything
547 | at all about the properties of `and(A,B)` when one or more of `A` and
548 | `B` are not of boolean type.  So be it.
549 | 
550 | Anyway, since we were speaking of inductively defined types
551 | previously, let's define one now.  With the help of `and()`, here is
552 | a type for binary trees.
553 | 
554 |     | type tree is
555 |     |   tree(leaf) -> true
556 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
557 |     | in
558 |     |   rewrite tree(branch(leaf,leaf))
559 |     = true
560 | 
561 | We can define some rewrite rules on trees.  To start small,
562 | let's define a simple predicate on trees.
563 | 
564 |     | type tree is
565 |     |   tree(leaf) -> true
566 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
567 |     | in let
568 |     |   empty(leaf) -> true
569 |     |   empty(branch(_,_)) -> false
570 |     | in rewrite empty(branch(branch(leaf,leaf),leaf))
571 |     = false
572 | 
573 |     | type tree is
574 |     |   tree(leaf) -> true
575 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
576 |     | in let
577 |     |   empty(leaf) -> true
578 |     |   empty(branch(_,_)) -> false
579 |     | in rewrite empty(leaf)
580 |     = true
581 | 
582 | Now let's prove that our predicate always rewrites to a boolean
583 | (i.e. that it has boolean type) when its argument is a tree.
584 | 
585 |     | type tree is
586 |     |   tree(leaf) -> true
587 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
588 |     | in let
589 |     |   empty(leaf) -> true
590 |     |   empty(branch(_,_)) -> false
591 |     | in theorem
592 |     |   forall X where tree(X)
593 |     |     boolean(empty(X)) ~> true
594 |     | proof
595 |     |   case X = leaf
596 |     |     boolean(empty(leaf))
597 |     |     -> boolean(true) [by empty.1]
598 |     |     -> true [by boolean.1]
599 |     |   case X = branch(S,T)
600 |     |     boolean(empty(branch(S,T)))
601 |     |     -> boolean(false) [by empty.2]
602 |     |     -> true [by boolean.2]
603 |     | qed
604 |     = true
605 | 
606 | This isn't really a proof by induction yet, but it's getting closer.
607 | This is still really us examining the cases to determine the type.
608 | But, we have an extra guarantee here; in `case X = branch(S,T)`, we
609 | know `tree(S) -> true`, and `tree(T) -> true`, because `tree(X) -> true`.
610 | This is one more reason why `and(X,Y)` is built into Madison; Madison
611 | needs to know what `and` means in order to make use of this information
612 | in a proof.  We don't really use that extra information in this proof,
613 | but we will later on.
614 | 
615 | Structural Induction
616 | --------------------
617 | 
618 | Let's try something stronger, and get into something that could be
619 | described as real structural induction.  This time, we won't just prove
620 | something's type.  We'll prove something that actually walks and talks
621 | like a real (albeit simple) theorem: the reflection of the reflection
622 | of any binary tree is the same as the original tree.
623 | 
624 |     | type tree is
625 |     |   tree(leaf) -> true
626 |     |   tree(branch(X,Y)) -> and(tree(X),tree(Y))
627 |     | in let
628 |     |   reflect(leaf) -> leaf
629 |     |   reflect(branch(A,B)) -> branch(reflect(B),reflect(A))
630 |     | in theorem
631 |     |   forall X where tree(X)
632 |     |     reflect(reflect(X)) ~> X
633 |     | proof
634 |     |   case X = leaf
635 |     |     reflect(reflect(leaf))
636 |     |     -> reflect(leaf) [by reflect.1]
637 |     |     -> leaf [by reflect.1]
638 |     |   case X = branch(S, T)
639 |     |     reflect(reflect(branch(S, T)))
640 |     |     -> reflect(branch(reflect(T),reflect(S))) [by reflect.2]
641 |     |     -> branch(reflect(reflect(S)),reflect(reflect(T))) [by reflect.2]
642 |     |     -> branch(S,reflect(reflect(T))) [by IH]
643 |     |     -> branch(S,T) [by IH]
644 |     | qed
645 |     = true
646 | 
647 | Finally, this is a proof using induction!  In the [by IH] clauses,
648 | IH stands for "inductive hypothesis", the hypothesis that we may
649 | assume in making the proof; namely, that the property holds for
650 | "smaller" instances of the type of X -- in this case, the "smaller"
651 | trees S and T that are used to construct the tree `branch(S, T)`.
652 | 
653 | Relying on the IH is valid only after we have proved the base case.
654 | After having proved `reflect(reflect(S)) -> S` for the base cases of
655 | the type of S, we are free to assume that `reflect(reflect(S)) -> S`
656 | in the induction cases.  And we do so, to rewrite the last two steps.
657 | 
658 | Like cases, the induction in a proof maps directly to the
659 | induction in the definition of the type of the variable being
660 | universally quantified upon.  If the induction in the type is well-
661 | founded, so too will be the induction in the proof.  (Indeed, the
662 | relationship between induction and cases is implicit in the
663 | concepts of the "base case" and "inductive case (or step)".)
664 | 
665 | Stepping Back
666 | -------------
667 | 
668 | So, we have given a simple term-rewriting-based language for proofs,
669 | and shown that it can handle a proof of a property over an infinite
670 | universe of things (trees.)  That was basically my goal in designing
671 | this language.  Now let's step back and consider some of the
672 | implications of this system.
673 | 
674 | We have, here, a typed programming language.  We can define types
675 | that look an awful lot like algebraic data types.  But instead of
676 | glibly declaring the type of any given term, like we would in most
677 | functional languages, we actually have to *prove* that our terms
678 | always rewrite to a value of that type.  That's more work, of
679 | course, but it's also stronger: in proving that the term always
680 | rewrites to a value of the type, we have, naturally, proved that
681 | it *always* rewrites -- that its rewrite sequence is terminating.
682 | There is no possibility that its rewrite sequence will enter an
683 | infinite loop.  Often, we establish this with the help of the previously
684 | established fact that our inductively-defined types are well-founded,
685 | which is itself justified on the basis that the subterm relationship is
686 | well-founded.
687 | 
688 | Much like we can prove termination in the course of proving a type,
689 | we can prove a type in the course of proving a property -- such
690 | as the type of `reflect(reflect(T))` above.  (This does not directly
691 | lead to a proof of the type of `reflect`, but whatever.)
692 | 
693 | And, of course, we are only proving the type of a term on the
694 | assumption that its subterms have specific types.
These proofs
695 | say nothing about the other cases.  This may provide flexibility
696 | for extending rewrite systems -- or it might not; I'm not sure.
697 | It might be nice to prove that all other types result in some
698 | error term.  (One of the more annoying things about term-rewriting
699 | languages is how an error can result in a half-rewritten program
700 | instead of a recognizable error code.  There seems to be a tradeoff
701 | between extensibility and producing recognizable errors.)
702 | 
703 | Grammar so Far
704 | --------------
705 | 
706 | I think I've described everything I want in the language above, so
707 | the grammar should, modulo tweaks, look something like this:
708 | 
709 |     Madison ::= Block.
710 |     Block ::= LetBlock | TypeBlock | ProofBlock | RewriteBlock.
711 |     LetBlock ::= "let" {Rule} "in" Block.
712 |     TypeBlock ::= "type" Symbol "is" {Rule} "in" Block.
713 |     RewriteBlock ::= "rewrite" Term.
714 |     Rule ::= Term "->" Term.
715 |     Term ::= Atom | Constructor | Variable.
716 |     Atom ::= Symbol.
717 |     Constructor ::= Symbol "(" Term {"," Term} ")".
718 |     ProofBlock ::= "theorem" Statement "proof" Proof "qed".
719 |     Statement ::= Quantifier Statement | MultiStep.
720 |     Quantifier ::= "forall" Variable "where" Term.
721 |     MultiStep ::= Term "~>" Term.
722 |     Proof ::= Case Proof {Case Proof} | Trace.
723 |     Trace ::= Term {"->" Term [RuleRef]}.
724 |     RuleRef ::= "[" "by" (Symbol "." Integer | "IH") "]".
725 | 
726 | Discussion
727 | ----------
728 | 
729 | I think that basically covers it.  This document is still a little
730 | rough, but that's what major version zeroes are for, right?
731 | 
732 | I have essentially convinced myself that the above-described system
733 | is sufficient for simple proof checking.  There are three significant
734 | things I had to convince myself of to get to this point, which I'll
735 | describe here.
736 | 
737 | One is that types have to be well-founded in order for them to serve
738 | as scopes for universal quantification.  This is obvious in
739 | retrospect, but getting them into the language in a way where it was
740 | clear they could be checked for well-foundedness took a little
741 | effort.  The demarcation of type-predicate rewrite rules was a big
742 | step, and a little disappointing because it introduces the loaded
743 | term `type` into Madison's vernacular, which I wanted to avoid.
744 | But it made it much easier to think about, and to formulate the
745 | rules for checking that a type is well-founded.  As I mentioned, it
746 | could go away -- Madison could just as easily check that any
747 | constructor used to scope a universal quantification is well-founded.
748 | But that would probably muddy the presentation of the idea in this
749 | document somewhat.  It would be something to keep in mind for a
750 | subsequent version of Madison that further distances itself from the
751 | notion of "types".
752 | 
753 | Also, it would probably be possible to extend the notion of well-
754 | founded rewriting rules to mutually-recursive rewriting rules.
755 | However, this would complicate the procedure for checking that a
756 | type predicate is well-founded.
757 | 
758 | The second thing I had to accept to get to this conviction is that
759 | `and(X,Y)` is built into the language.  It can't just be defined
760 | in Madison code, because while this would be wonderful from a
761 | standpoint of minimalism, Madison has to know what it means to let
762 | you write non-trivial inductive proofs.
In a nutshell, it has to
763 | know that `foo(X) -> and(bar(X),baz(X))` means that if `foo(X)` is
764 | true, then `bar(X)` is also true, and `baz(X)` is true as well.
765 | 
766 | I considered making `or(X,Y)` a built-in as well, but after some
767 | thought, wasn't convinced that it was that valuable in the kinds
768 | of proofs I wanted to write.
769 | 
770 | Lastly, the third thing I had to come to terms with was, in general,
771 | how we know a stated proof is complete.  As I've tried to describe
772 | above, we know it's complete because each of the cases maps to a
773 | possible rewrite rule, and induction maps to the inductive definition
774 | of a type predicate, which we know is well-founded because of the
775 | checks Madison does on it (ultimately based on the assumption that
776 | the subterm relationship is well-founded.)  There Is Nothing Else.
777 | 
778 | This gets a little more complicated when you get into proofs by
779 | induction.  The thing there is that we can assume the property
780 | we want to prove, in one of the cases (the inductive case) of the
781 | proof, so long as we have already proved all the other cases (the
782 | base case.)  This is perfectly sound in proofs by hand, so it is
783 | likewise perfectly sound in a formal proof checker like Madison;
784 | the question is how Madison "knows" that it is sound, i.e. how it
785 | can be programmed to reject proofs which are not structured this
786 | way.  Well, if we limit it to what I've just described above --
787 | check that the scope of the universal quantification is well-
788 | founded, check that there are two cases, and check that we've already
789 | proved one case, then allow the inductive hypothesis to be used as a
790 | rewrite rule in the other case of the proof -- it is not difficult
791 | to see how this could be mechanized.
792 | 
793 | However, this is also very limited.  Let's talk about limitations.
794 | 
795 | For real data structures, you might well have multiple base cases;
796 | for example, a tree with two kinds of leaf nodes.  Does this start
797 | breaking down?  Probably.  It probably breaks down with multiple
798 | inductive cases, as well, although you might be able to get around
799 | that by breaking the proof into multiple proofs, and having
800 | subsequent proofs rely on properties proved in previous proofs.
801 | 
802 | I discovered another limitation when trying to write a proof that
803 | addition in Peano arithmetic is commutative.  It seemingly can't
804 | be done in Madison as it currently stands, as Madison only knows
805 | how to rewrite something into something else, and cannot express
806 | the fact that two things (like `add(A,B)` and `add(B,A)`) rewrite
807 | to the same thing.  Such a facility would be easy enough to add,
808 | and may appear in a future version of Madison, possibly with a
809 | syntax like:
810 | 
811 |     theorem
812 |       forall A where nat(A)
813 |       forall B where nat(B)
814 |         add(A,B) ~=~ add(B,A)
815 |     proof ...
816 | 
817 | You would then show that `add(A,B)` reduces to something, and
818 | that `add(B,A)` reduces to something, and Madison would check
819 | that the two somethings are in fact the same thing.  This is
820 | a fairly standard method in the world of term rewriting.
821 | 
822 | As a historical note, Madison is one of the pieces of fallout from
823 | the overly-ambitious project I started a year and a half ago called
824 | Rho.
Rho was a homoiconic rewriting language with several very 825 | general capabilities, and it wasn't long before I decided it was 826 | possible to write proofs in it, as well as the other things it was 827 | designed for. Of course, this stretched it to about the limit of 828 | what I could keep track of in a single project, and it was soon 829 | afterwards abandoned. Other fallout from Rho made it into other 830 | projects of mine, including Pail (having `let` bindings within 831 | the names of other `let` bindings), Falderal (the test suite from 832 | the Rho implementation), and Q-expressions (a variant of 833 | S-expressions, with better quoting capabilities, still forthcoming.) 834 | 835 | Happy proof-checking! 836 | Chris Pressey 837 | December 2, 2011 838 | Evanston, Illinois 839 | --------------------------------------------------------------------------------