├── README.md
├── doc
│   ├── arvind
│   │   ├── Neelakantan-2015-comp-vector-space-models-1504.06662v2.pdf
│   │   ├── Neelkantan-2016-ICLR.pdf
│   │   ├── Neelkantan-2017-ICLR.pdf
│   │   └── neelakantan-thesis-proposal-talk.pdf
│   ├── liang-2015-wikitablequestions.pdf
│   └── notes
│       ├── #meta.txt#
│       ├── README.md
│       ├── aopmath.sty
│       ├── commands.tex
│       ├── history.txt
│       ├── llncs.cls
│       ├── llncs.dem
│       ├── llncs.doc
│       ├── llncs.dvi
│       ├── llncs.ind
│       ├── llncs2e.zip
│       ├── llncsdoc.sty
│       ├── master.bib
│       ├── nips15submit_e.sty
│       ├── notes-2.tex
│       ├── notes.tex
│       ├── notes.txt
│       ├── paper.tex
│       ├── splncs.bst
│       └── sprmindx.sty
└── src
    ├── ProPPR-paper-comments.txt
    ├── example.pl
    ├── my_eval.pl
    ├── my_tables.pl
    ├── np_proppr.pl
    ├── np_proppr_new.pl
    └── what_we_want.txt

/README.md:
--------------------------------------------------------------------------------

# Neural Programmer as Probabilistic Programming, and other approaches

The goal is to work with the examples in [Arvind Neelakantan's work](https://arxiv.org/abs/1611.08945) and understand how to use other semantic parsing techniques to improve performance.

One main line of attack is to use [Probabilistic CC](https://github.com/saraswat/pcc) (and implement via translation to [PRISM](http://rjida.meijo-u.ac.jp/sato-www/prism/)). For now, we will approximate PCC with definite clauses that have a fixed (left-to-right) order of evaluation, and ensure that our programs are such that the atoms `cond` used in a sample operator `X | cond ~ PD` are ground when executed.

The overall problem to be solved: design a system that takes as input an utterance and a table, and computes an answer to the question in the utterance using only the information in the table. The number of columns of the table, and their header and row information, can vary from question to question. Entries in the table (cells) may contain numbers or multiple words.
The training set available is a corpus `(x_i, t_i, a_i)_i` where `x_i` is the utterance, `t_i` a table and `a_i` the answer. The program is latent. The corpus is described in [7]. Note that with state-of-the-art accuracy at around 37%, there is considerable room for improvement!

The basic approach is to augment a probabilistic CCP semantic parser `parse/2` with an evaluator of the logical form.
```prolog
result(Query, Table, Ans) :- parse(Query, Form), eval(Form, Table, Ans).
```
`parse/2` is intended to be a "standard" semantic parser that converts the input query into a logical form, e.g. using Abu's technique (a stochastic definite clause grammar, with learning of terminal productions, and with Hierarchical Dirichlet Process priors). The evaluator, `eval(Form, Table, Ans)`, treats `Form` as a program, evaluated against `Table` to produce `Ans`. The program is deterministic for the most part, but some complications noted below may be handled by letting the evaluator be probabilistic, and learning the probability distribution from data.

Note that `Form` does not occur in the head of the clause -- it is "latent". Training is performed on a collection of ground `result/3` triples. At test time, `Query` and `Table` are instantiated, and `Ans` is unknown and computed by the program. We will use [Cussens's Failure Adjusted Maximisation (FAM) algorithm](http://link.springer.com/article/10.1023/A:1010924021315) (based on EM) for training. We will use the Viterbi algorithm in PRISM to compute the most probable solution. Both these techniques are implemented in [PRISM](http://rjida.meijo-u.ac.jp/sato-www/prism/).

_Q for Abu: Is your implemented system using techniques similar to PRISM's Viterbi training (see [5]) + the generalized inside-outside algorithm (see [2]), or are there different ideas? [5] contains a discussion of statistical parsing in this context.
Note that PRISM implements a number of other inferencing techniques, including Variational Bayes, that may be of interest here._

The key to this approach is the design of the logical form. The language of logical forms should be _expressive_ -- rich enough to express all functions from columns to values that can be specified by users in an utterance. It should also attempt to be _orthogonal_: given a set of column names, and a function from tables with those column names to values, the number of programs in the language that express that function should be very small, preferably one.

A design in the variable-free "FunQL" style is given below. For this language, `eval(Form, Table, ?)` is determinate -- given a `Form` and a `Table`, the query produces at most one answer.

The key to this problem is developing a trainable semantic parser for the given utterance and logical language. Any parser can be used to solve the problem provided that it can generate a small ordered set of candidate parses that contains the "correct" parse, and can improve its performance with feedback (generating fewer parses, while still including the correct parse). Since the input language is conversational (rather than formal), the parser has to exhibit some flexibility in word order. It will also need to exhibit some genericity with respect to its lexicon, because column names will be known only at runtime (and may never have been seen during training).

Parsers such as Li Dong's sequence-to-sequence parsers are not directly applicable, since they need the semantic form for training and cannot generate a set of candidate logical forms. _But perhaps they can be modified?_

Probabilistic semantic parsers that permit productions to be generated on the fly during training, and that can adjust probabilities during training, should be good candidates.
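To make the `result/3` pipeline and the candidate-parse idea concrete, here is a minimal Python sketch (the project itself targets Prolog/PRISM, so everything here -- the toy query, the candidate forms, their probabilities, and all function names -- is an illustrative assumption, not the actual implementation). `result` performs Viterbi-style inference by taking the most probable candidate form whose evaluation succeeds, and `consistent_forms` computes the candidates an EM/FAM-style trainer would reward for a ground training triple.

```python
def parse(query):
    """Stand-in for a probabilistic semantic parser: returns a list of
    (logical form, probability) candidates for the query (all made up here)."""
    if query == "how many events were in athens":
        return [(("card", ("eq", "city", "athens")), 0.6),
                (("proj", "year", ("eq", "city", "athens")), 0.4)]
    return []

def eval_form(form, table):
    """Deterministic evaluator: treats the form as a program run against the table."""
    op = form[0]
    if op == "eq":            # select rows whose cell at a column equals a value
        _, col, val = form
        return [row for row in table if row[col] == val]
    if op == "card":          # number of selected rows
        return len(eval_form(form[1], table))
    if op == "proj":          # project a column over the selected rows
        _, col, sub = form
        return [row[col] for row in eval_form(sub, table)]
    return None               # unknown operator: evaluation fails

def result(query, table):
    """Viterbi-style inference: most probable candidate whose evaluation succeeds."""
    for form, _prob in sorted(parse(query), key=lambda fp: -fp[1]):
        ans = eval_form(form, table)
        if ans is not None:
            return ans
    return None

def consistent_forms(query, table, answer):
    """Training signal for EM/FAM: the candidate parses that yield the observed answer."""
    return [form for form, _ in parse(query) if eval_form(form, table) == answer]
```

With a three-row toy table containing two Athens events, `result` answers the count query with `2`, and `consistent_forms` singles out the `card` parse as the one to reinforce -- the latent-form training signal described above.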
Note that, to help the training process, it may make sense to augment the training set with some `parse/2` pairs and some `eval/3` triples (if `eval/3` is probabilistic). It will be interesting to determine how overall performance improves with these augmentations.

An advantage of this approach is a clean separation between parsing (understanding the nuances of the natural language utterance and generating the logical form) and evaluation (using the information in the logical form to obtain an answer, given the table). Note that it may be interesting to run `parse/2` and `eval/3` in parallel, so that feedback from `eval/3` can be used to reject a partial parse (fail early).

_Q: What is the analog of [Arvind's](https://arxiv.org/abs/1611.08945) "soft selection" during training for PCC?_

## Complications

Note that the answer may be a number that occurs directly in the table. We will represent this by permitting the evaluator to be non-deterministic -- it chooses either the number in the table or the computed answer -- letting the choice be conditioned on the logical form, and then letting training determine the probability distribution.

## Other Approaches
In this approach, `parse/2` is a probabilistic CC parser, hence `Form` is a symbolic expression and training is performed by variants of EM.

A completely different approach is to develop a differentiable architecture, as in [1], and train end to end using SGD. A key question here is the representation of the utterance. In [1] this is done with a "Question RNN" whose weights are updated during training, presumably leading to an application-specific abstraction of the utterance being learnt.
Jianpeng has developed an end-to-end differentiable system [6] (based on RNN grammars) which produces a semantic parse in two steps: first generating an "ungrounded" semantic representation, and second learning the grounded lexicon (a mapping from natural language predicates to a fixed vocabulary of grounded predicates). It may be worth considering replacing the Question RNN in the architecture of [1] with the RNN-grammar-based component of [6], modifying the rest of the system to accept the question representation as a symbolic expression, but continuing to train end-to-end with SGD. The key conceptual difference from [1] is that the problem is decomposed into translating the utterance into a (symbolic) semantic form and then evaluating that form against the table. The symbolic representation should be much easier to understand -- useful for debugging and explanation.

Yet another alternative is to use [4] as a semantic parser. But here the gap between the form output by the semantic parser and the form needed for execution will be significant, and will need to be bridged by a separate learner, akin to phase II of [6].

This is also a good (but advanced) example for the "Differentiable Logic" project.

# Initial cut at logical form
The language of logical forms is the set of first-order expressions obtained by using the operations given below. It is not dissimilar to the lambda-DCS language presented in [7].

## Type system
We are given a table with some number of rows, and columns with column names. Each row has an index.
1. `Value` -- an integer, boolean, date, ...
2. `Values` -- a sequence of values (integers, booleans, dates, ...)
3. `Rows` -- a subsequence of the rows of the given table
4. `ColName` -- the name of a column in the given table

No operations are available on the type `ColName`; therefore the only terms of this type are constants.
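Before the operations are listed, here is a minimal Python sketch of one way the types and a deterministic evaluator could be realized (the project itself targets Prolog; the representation and all names here are illustrative assumptions). A `Rows` value is represented as a list of row indices in table order, which preserves subsequence structure; the sample table contains only the rows shown explicitly in the Olympic host-cities example below.

```python
# Illustrative table, following the host-cities example in this README.
TABLE = [
    {"year": 1896, "city": "athens", "country": "greece", "nations": 14},
    {"year": 1900, "city": "paris", "country": "france", "nations": 24},
    {"year": 1904, "city": "st louis", "country": "usa", "nations": 12},
    {"year": 2004, "city": "athens", "country": "greece", "nations": 201},
    {"year": 2008, "city": "beijing", "country": "china", "nations": 204},
    {"year": 2012, "city": "london", "country": "uk", "nations": 204},
]

def all_rows():                       # all : Rows
    return list(range(len(TABLE)))

def eq(col, val, rows):               # comparison-based selection (==)
    return [i for i in rows if TABLE[i][col] == val]

def ge(col, val, rows):               # comparison-based selection (>=)
    return [i for i in rows if TABLE[i][col] >= val]

def proj(col, rows):                  # proj : ColName -> Rows -> Values
    return [TABLE[i][col] for i in rows]

def max_rows(col, rows):              # superlative: rows with the highest col value
    top = max(TABLE[i][col] for i in rows)
    return [i for i in rows if TABLE[i][col] == top]

def min_rows(col, rows):              # superlative: rows with the lowest col value
    bot = min(TABLE[i][col] for i in rows)
    return [i for i in rows if TABLE[i][col] == bot]

def either(rs, qs):                   # union, kept in original table order
    keep = set(rs) | set(qs)
    return [i for i in all_rows() if i in keep]

def card(rows):                       # card : Rows -> Value
    return len(rows)
```

For example, "How many events were in Athens?" becomes `card(eq("city", "athens", all_rows()))`, which yields 2 on this sample table, and "Which years have the most participating countries?" becomes `proj("year", max_rows("nations", all_rows()))`, yielding `[2008, 2012]` (both had 204 nations).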
## Operations

### Comparison-based selection
```
op: ColName -> Value -> Rows -> Rows
ge(colname, val, r): subsequence of rows in r whose colname cells have value >= val
gt(colname, val, r)
le(colname, val, r)
lt(colname, val, r)
eq(colname, val, r)
```
Each of these returns the subsequence of rows from `r` in which the value of the cell at `colname` has the given relationship to `val`. `colname`, `val` and `r` are evaluated in turn. `val` must evaluate to a singleton value; for this, operations such as `first` or `last` may be used, perhaps together with `proj`. (See the examples in the table below.)

`eq(colname, val, r)` may be abbreviated as `colname(val, r)`.

### Superlatives

```
max, min: ColName -> Rows -> Rows
max(colname, rows): subsequence of rows with the highest cell value for colname
min(colname, rows): subsequence of rows with the lowest cell value for colname
```

### Navigation

```
prev, next: Rows -> Rows
first, last: Rows -> Rows
```

`prev(r)` (resp. `next(r)`) returns the subsequence of rows obtained by taking the preceding (resp. succeeding) row in the original table (if it exists) for each row in `r`. `first(r)` (resp. `last(r)`) returns the singleton sequence containing the first (resp. last) row of `r`.

### Projection
```
proj: ColName -> Rows -> Values
```

`proj(colname, r)` may be abbreviated as `colname(r)`. It returns the sequence of _values_ obtained by selecting the value at `colname` for each row in `r`.

### Numeric operations
```
+, -, *, /: Values -> Values -> Values
+, -, *, /: Value -> Value -> Value
```

When applied to `Values`, the operators require that the two arguments have the same number of elements; the operator is then applied to corresponding elements.

### Set operations
```
all: Rows
either: Rows -> Rows -> Rows
```
`all` is the sequence of all rows of the table.
`either(rs, qs)` contains the rows of `rs` and `qs`, in the sequence in which they occur in the original table. (Example: `either(country(china, all), country(france, all))` is the collection of all rows in the table whose `country` column contains `china` or `france`.)

Note that `both(rs, qs)` should not be needed. (It contains the rows that are in both `rs` and `qs`. Example: `both(country(china, all), city(beijing, all))` is the collection of all rows in the table whose `country` column contains `china` and whose `city` column contains `beijing`. It can also be expressed as `country(china, city(beijing, all))`.)

### Miscellaneous
```
card: Rows -> Value
card(r) is the number of rows in r.
```

# Examples
Consider [a table](https://en.wikipedia.org/wiki/List_of_Olympic_Games_host_cities) given by:

| Year | City | Country | Nations |
| --- | --- | --- | --- |
| 1896 | athens | greece | 14 |
| 1900 | paris | france | 24 |
| 1904 | 'st louis' | usa | 12 |
| ... | ... | ... | ... |
| 2004 | athens | greece | 201 |
| 2008 | beijing | china | 204 |
| 2012 | london | uk | 204 |

Here are some example questions and their translations.
| Utterance | Form |
| --------- | ---- |
| _Events in Athens_ | `city(athens, all)` |
| _Events in Athens or Beijing_ | `either(city(athens, all), city(beijing, all))` |
| _Events in Athens before 1990_ | `lt(year, 1990, city(athens, all))` |
| _How many events were in Athens, Greece?_ | `card(city(athens, all))` |
| _Events in the same country as Athens_ | `country(country(first(city(athens, all))), all)` |
| _Greece held its last Summer Olympics in which year?_ | `year(max(year, country(greece, all)))` |
| _In which city's the first time with at least 20 nations?_ | `city(min(year, ge(nations, 20, all)))` |
| _Which years have the most participating countries?_ | `year(max(nations, all))` |
| _How many more participants were there in 1990 than in the first year?_ | `nations(year(1990, all)) - nations(min(year, all))` |

# References
1. Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, Dario Amodei. [Learning a Natural Language Interface with Neural Programmer.](https://arxiv.org/abs/1611.08945) ICLR 2017.
2. Taisuke Sato. [PRISM Manual.](http://rjida.meijo-u.ac.jp/prism/download/prism21.pdf)
3. James Cussens. [Parameter Estimation in Stochastic Logic Programs.](http://link.springer.com/article/10.1023/A:1010924021315) _Machine Learning_, September 2001, Volume 44, Issue 3, pp 245–271.
4. Siva Reddy, Oscar Täckström, Slav Petrov, Mark Steedman, Mirella Lapata. [Universal Semantic Parsing.](https://arxiv.org/abs/1702.03196)
5. Taisuke Sato and Keiichi Kubota. [Viterbi training in PRISM.](https://www.semanticscholar.org/paper/Viterbi-training-in-PRISM-Sato-Kubota/92756666eff7dbac73ceb4b8b398e4ae61f33d7f) TPLP, 2015, pp 147--168.
6. Jianpeng Cheng, Siva Reddy, Vijay Saraswat and Mirella Lapata. Learning Structured Natural Language Representations for Semantic Parsing. ACL 2017.
7. Panupong Pasupat and Percy Liang.
[Compositional Semantic Parsing on Semi-Structured Tables.](https://cs.stanford.edu/~pliang/papers/compositional-acl2015.pdf) ACL 2015.
8. Vijay Saraswat. [Probabilistic CCP (logic programming subset).](https://github.com/saraswat/pcc) In progress.
--------------------------------------------------------------------------------

/doc/arvind/Neelakantan-2015-comp-vector-space-models-1504.06662v2.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/arvind/Neelakantan-2015-comp-vector-space-models-1504.06662v2.pdf
--------------------------------------------------------------------------------

/doc/arvind/Neelkantan-2016-ICLR.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/arvind/Neelkantan-2016-ICLR.pdf
--------------------------------------------------------------------------------

/doc/arvind/Neelkantan-2017-ICLR.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/arvind/Neelkantan-2017-ICLR.pdf
--------------------------------------------------------------------------------

/doc/arvind/neelakantan-thesis-proposal-talk.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/arvind/neelakantan-thesis-proposal-talk.pdf
--------------------------------------------------------------------------------

/doc/liang-2015-wikitablequestions.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/liang-2015-wikitablequestions.pdf
--------------------------------------------------------------------------------

/doc/notes/#meta.txt#:
--------------------------------------------------------------------------------
Sat Apr 25 16:01:22 EDT 2015
--------------------------------------------------------------------------------

/doc/notes/README.md:
--------------------------------------------------------------------------------

# logic-nn
## Doing logic with neural nets.

We want to rebuild logic on a machine learning foundation.

Consider the [inductive logic programming](https://en.wikipedia.org/wiki/Inductive_logic_programming) setting. We are given some logical vocabulary `L`, a background theory `S` in `L`, and a collection of "observations", `O`, that have somehow been made in the real world. It is expected that the observations are noisy. For now, let us say the observations are simply sets of positive and negative literals `(O+, O-)` for a chosen predicate `p/k`.

`S` is such that it alone cannot "explain" the given `O+` and `O-`. We desire to learn a theory `T` such that `S` and `T` are jointly consistent, together entail `O+`, and are not inconsistent with `O-`.

The standard development of ILP goes through certain symbolic and search techniques in which different (definite clause) theories `T` are constructed and tried.

We want to try something different, using techniques such as stochastic gradient descent (SGD) that have proved very powerful for neural networks.

SGD: The basic idea is that you define a function `f(theta)`, parametrized by `theta`, and a loss function `L`. The setting is supervised, i.e. you are given a possibly large set of observations `O` of the form `(x, y)`, where each of `x` and `y` can be a tuple of a given fixed size.
You are looking for a setting of the parameters `theta` which minimizes the loss function. Gradient descent proceeds by assuming a current value `u` of the parameters `theta`, using it to compute the loss `L(y, f(u)(x))`, and then propagating the loss back through the function `f`, using the partial derivatives `∂f/∂j` for each parameter `j` in `theta`, to update the parameters. Repeat across all observations. For various choices of the function `f`, e.g. feed-forward neural nets and recurrent neural nets, this technique has proved remarkably robust in practice across a wide variety of data-sets, e.g. in vision, in speech, and in some natural language tasks such as machine translation and text generation.

So we are looking for some kind of continuous embedding of logical formulas. (For now assume the underlying logic is untyped; in practice we will want to use types.)
1. Some space `B` for the interpretation of Boolean formulas, with interpretations for the logical connectives (conjunction, disjunction, negation, implication -- we should experiment with both intuitionistic and classical interpretations).
2. An embedding `I(c)` of individual constant symbols `c` into some space `U` of interpretations. (Similarly for function symbols.)
3. An embedding of the _all_ and _some_ quantifiers, `all: (U->B)->B`, `some: (U->B)->B`.
4. An embedding `I(p/k)` of individual predicate symbols `p/k` into some space `P{k}` that is roughly `(U^k -> B)`.
5. An embedding of predication: for every element `p` of `P{k}`, an application function `A(p)` that takes a tuple `u` in `U^k` to a value in `B`.

The parameters (which we wish to learn) are the values `I(c)` for some of the constant and function symbols, and `I(p/k)` for some of the predicate symbols. The loss function can be taken to be a "margin loss": measure the difference between a positive and a negative tuple for the given predicate `p/k`, and seek to maximize it.
(Other loss functions are possible.)

The work on LSTMs for textual entailment establishes that they can be used for single-step entailment.
Can they be used for multi-step entailments, i.e. to implement reasoning, in a first-order setting?
Can we build theories into NN learners?

## References
1. [Bishan Yang's papers](http://arxiv.org/abs/1412.6575), [more](http://arxiv.org/abs/1412.6575)
2. [Tim's injecting paper](http://rockt.github.io/pdf/rocktaschel2015injecting.pdf)
3. [Tim's Neural Theorem Prover paper](http://www.akbc.ws/2016/papers/14_Paper.pdf)
4. [William's Tensorlog paper](https://arxiv.org/abs/1605.06523)

(See the reference lists of these papers for other papers.)
--------------------------------------------------------------------------------

/doc/notes/aopmath.sty:
--------------------------------------------------------------------------------
\DeclareMathVersion{zed}
\SetMathAlphabet{\mathrm}{zed}{\encodingdefault}{\rmdefault}{m}{n}%
\SetMathAlphabet{\mathbf}{zed}{\encodingdefault}{\rmdefault}{bx}{n}%
\SetMathAlphabet{\mathsf}{zed}{\encodingdefault}{\sfdefault}{m}{n}%
\SetMathAlphabet{\mathtt}{zed}{\encodingdefault}{\ttdefault}{m}{n}%
\DeclareSymbolFont{italics}{\encodingdefault}{\rmdefault}{m}{it}%
\DeclareSymbolFontAlphabet{\mathrm}{operators}
\DeclareSymbolFontAlphabet{\mathit}{letters}
\DeclareSymbolFontAlphabet{\mathcal}{symbols}
\DeclareSymbolFont{lasy}{U}{lasy}{m}{n}
\DeclareSymbolFont{AMSa}{U}{msa}{m}{n}
\DeclareSymbolFont{AMSb}{U}{msb}{m}{n}
\let\mho\undefined
\let\Join\undefined
\let\Box\undefined
\let\Diamond\undefined
\let\leadsto\undefined
\let\sqsubset\undefined
\let\sqsupset\undefined
\let\lhd\undefined
\let\unlhd\undefined
\let\rhd\undefined
\let\unrhd\undefined
\DeclareMathSymbol{\mho}{\mathord}{lasy}{"30}
\DeclareMathSymbol{\Join}{\mathrel}{lasy}{"31} 26 | \DeclareMathSymbol{\Box}{\mathord}{lasy}{"32} 27 | \DeclareMathSymbol{\Diamond}{\mathord}{lasy}{"33} 28 | \DeclareMathSymbol{\leadsto}{\mathrel}{lasy}{"3B} 29 | \DeclareMathSymbol{\sqsubset}{\mathrel}{lasy}{"3C} 30 | \DeclareMathSymbol{\sqsupset}{\mathrel}{lasy}{"3D} 31 | \DeclareMathSymbol{\lhd}{\mathrel}{lasy}{"01} 32 | \DeclareMathSymbol{\unlhd}{\mathrel}{lasy}{"02} 33 | \DeclareMathSymbol{\rhd}{\mathrel}{lasy}{"03} 34 | \DeclareMathSymbol{\unrhd}{\mathrel}{lasy}{"04} 35 | \DeclareSymbolFontAlphabet{\bbold}{AMSb} 36 | \mathversion{zed} 37 | 38 | 39 | \def\@setmcodes#1#2#3{{\count0=#1 \count1=#3 40 | \loop \global\mathcode\count0=\count1 \ifnum \count0<#2 41 | \advance\count0 by1 \advance\count1 by1 \repeat}} 42 | \@setmcodes{`A}{`Z}{"7\hexnumber@\symitalics41}% 43 | \@setmcodes{`a}{`z}{"7\hexnumber@\symitalics61}% 44 | 45 | \DeclareRobustCommand\em 46 | {\@nomath\em \ifdim \fontdimen\@ne\font >\z@ 47 | \upshape \else \slshape \fi} 48 | 49 | % Now to give sensible names for Squiggol and category theory symbols 50 | 51 | \def\implies{\Rightarrow} 52 | \def\iff{\Leftrightarrow} 53 | \def\concat{\mathbin{\plus\!\!\!\plus}} 54 | \def\suchthat{\mathrel{|}} 55 | \def\compose{\mathbin{\cdot}} 56 | \def\set#1{\mathord{\{#1\}}} 57 | \def\seq#1{\mathord{[#1]}} 58 | \def\emptyseq{\seq{\,}} 59 | \def\emptyrel{\emptyset} 60 | \def\nilset{\set{\,}} 61 | 62 | \newcommand{\split}[2]{\langle #1,#2\rangle} 63 | \newcommand{\converse}[1]{#1^{\circ}} 64 | \newcommand{\lbana}{\mbox{$(\![$}} 65 | \newcommand{\rbana}{\mbox{$]\!)$}} 66 | \newcommand{\lbcat}{\mbox{$[\!($}} 67 | \newcommand{\rbcat}{\mbox{$)\!]$}} 68 | \newcommand{\lccat}{\mbox{$[\!\langle$}} 69 | \newcommand{\rccat}{\mbox{$\rangle\!]$}} 70 | \newcommand{\cata}[1]{\mbox{$\lbana #1 \rbana$}} 71 | \newcommand{\ana}[1]{\mbox{$\lbcat #1 \rbcat$}} 72 | \newcommand{\cockett}[1]{\mbox{$\lccat #1 \rccat$}} 73 | \newcommand{\spe}{\mbox{$\ni$}} 74 | \newcommand{\eps}{\mbox{$\in$}} 
75 | \newcommand{\snoc}{\mathbin{+\!\!\!+\!\!\!<}} 76 | \newcommand{\bnoc}{\mathbin{+\!\!\!<}} 77 | \newcommand{\func}[1]{{\sf #1}} 78 | \newcommand{\Pow}{{\sf P}} 79 | \newcommand{\Ext}{{\sf E}} 80 | \newcommand{\toprel}{\Pi} 81 | \newcommand{\botrel}{0} 82 | \newcommand{\id}{id} 83 | \newcommand{\union}{\;\cup\;} 84 | \newcommand{\bigunion}{\mbox{$\bigcup$}} 85 | \newcommand{\bigmeet}{\mbox{$\bigcap$}} 86 | \newcommand{\from}{\mathbin{\leftarrow}} 87 | \newcommand{\terminator}{{1}} 88 | \newcommand{\ffrom}{\Leftarrow} 89 | \newcommand{\wfrom}{\hookleftarrow} 90 | \newcommand{\wto}{\hookrightarrow} 91 | \newcommand{\acc}[1]{\mbox{$\lfloor #1 \rfloor$}} 92 | \newcommand{\bang}{\mbox{$!$}} 93 | \newcommand{\bong}{\mbox{!`}} 94 | \newcommand{\graph}{\mbox{\sf J}} 95 | \newcommand{\sse}{\subseteq} 96 | \newcommand{\spse}{\supseteq} 97 | \newcommand{\wok}[1]{\mbox{${#1}^{\circ}$}} 98 | \newcommand{\rtc}[1]{\mbox{${#1}^{\star}$}} 99 | \newcommand{\conj}{\;\wedge\;} 100 | \newcommand{\bs}{\mbox{$\backslash$}} 101 | \newcommand{\dom}[1]{\mbox{${#1}\Box$}} 102 | \newcommand{\ran}[1]{\mbox{$\Box{#1}$}} 103 | \newcommand{\tgt}{\triangleleft} 104 | \newcommand{\src}{\triangleright} 105 | \newcommand{\floor}[1]{\lfloor #1 \rfloor} 106 | \newcommand{\bscor}{\mathbin{\backslash\!\!\!\!-}} 107 | 108 | % general utilities 109 | 110 | \def\emark{$\Box$} 111 | \def\pause{\vspace{2ex}} 112 | \def\text#1{\mbox{\rm #1}} 113 | \newcommand{\idisp}[1]{\makebox[0 in][l]{$#1$}} 114 | 115 | % Make ~ do a thin space in maths mode. We need to insert a space between a 116 | % function and its argument, because spaces aren't significant in maths mode. 117 | % Thus, you say "f~x". 118 | 119 | \let\twiddle=~ 120 | \def\applicationspace{\makebox[.15ex]{}} 121 | \def~{\ifmmode \applicationspace \else \twiddle \fi} 122 | 123 | 124 | % Make . active in maths mode, as a "compose". Save old . as \period 125 | % Thus, you can say "map~tails . 
inits", and "3\period 14" 126 | 127 | \mathchardef\period=\mathcode`. % ordinary symbol 128 | {\catcode`.=\active \gdef.{\compose}} 129 | \mathcode`.="8000 130 | 131 | % Now define the abbreviations we will use. We have 132 | % => is an abbreviation for \implies 133 | % <= " \le 134 | % >= " \ge 135 | % ++ " \concat 136 | % <=> " \iff 137 | % -> " \rightarrow 138 | 139 | \mathchardef\equals=\mathcode`= 140 | \mathcode`=="8000 141 | 142 | {\catcode`==\active \gdef={\futurelet\@next\@newequals}} 143 | \def\@newequals{\ifx>\@next \def\@next##1{\implies} 144 | \else \def\@next{\equals} 145 | \fi 146 | \@next} 147 | 148 | \def\Relbar{\mathrel{=}} % \Relbar (in plain TeX) screwed up by making = active 149 | 150 | 151 | 152 | \mathchardef\lessthan=\mathcode`< 153 | \mathcode`<="8000 154 | 155 | {\catcode`<=\active \gdef<{\futurelet\@next\@newlessthan}} 156 | \def\@newlessthan{\ifx=\@next \def\@next##1{\futurelet\@next\@newle} 157 | \else \ifx<\@next \def\@next##1{\ll} 158 | \else \def\@next{\lessthan} 159 | \fi 160 | \fi 161 | \@next} 162 | 163 | \def\@newle{\ifx>\@next \def\@next##1{\iff} 164 | \else \def\@next{\le} 165 | \fi 166 | \@next} 167 | 168 | 169 | 170 | 171 | \mathchardef\greaterthan=\mathcode`> 172 | \mathcode`>="8000 173 | 174 | {\catcode`>=\active \gdef>{\futurelet\@next\@newgreaterthan}} 175 | \def\@newgreaterthan{\ifx=\@next \def\@next##1{\ge} 176 | \else \ifx>\@next \def\@next##1{\gg} 177 | \else \def\@next{\greaterthan} 178 | \fi 179 | \fi 180 | \@next} 181 | 182 | 183 | \mathchardef\plus=\mathcode`+ 184 | \mathcode`+="8000 185 | 186 | {\catcode`+=\active \gdef+{\futurelet\@next\@newplus}} 187 | \def\@newplus{\ifx+\@next \def\@next##1{\concat} 188 | \else \def\@next{\plus} 189 | \fi 190 | \@next} 191 | 192 | 193 | 194 | \mathchardef\minus=\mathcode`- 195 | \mathcode`-="8000 196 | 197 | {\catcode`-=\active \gdef-{\futurelet\@next\@newminus}} 198 | \def\@newminus{\ifx>\@next \def\@next##1{\rightarrow} 199 | \else \def\@next{\minus} 200 | \fi 201 | \@next} 
202 | 203 | % The derivation environment 204 | 205 | \newenvironment{derivation}{\begin{eqnarray*} 206 | }{\end{eqnarray*} 207 | \global\@ignoretrue} 208 | \def\given#1{&&\begin{array}[t]{@{}l@{}}% 209 | #1 210 | \end{array} \\ \nopagebreak} 211 | \def\step{\@ifnextchar({\@step}{\@step(=)}} 212 | \def\@step(#1)[#2]#3{& \begin{oldtabular}[t]{@{}l@{}}% 213 | \quad \{#2\} 214 | \end{oldtabular} \\ 215 | & & \begin{array}[t]{@{}l@{}}% 216 | #3 217 | \end{array} \\ } 218 | \def\step{\@ifnextchar({\@step}{\@step(=)}} 219 | \def\result{\@ifnextchar({\@result}{\@result(=)}} 220 | \def\@result(#1)[#2]#3{& \begin{oldtabular}[t]{@{}l@{}}% 221 | \quad\{#2\} 222 | \end{oldtabular} \\ 223 | & & \begin{array}[t]{@{}l@{}}% 224 | #3 225 | \end{array} } 226 | 227 | % Set up Theorems, Lemmas and Corollaries and Propositions: 228 | 229 | 230 | \newtheorem{theorem}{Theorem} 231 | \newtheorem{fact}{Fact} 232 | \newtheorem{lemma}[fact]{Lemma} 233 | \newtheorem{corollary}{Corollary} 234 | \newtheorem{proposition}{Proposition} 235 | 236 | %\newcounter{facts} 237 | %\newenvironment{fact}{\begin{quote}\normalsize\sl 238 | % \noindent{\bf Fact \refstepcounter{facts}\thefacts}} 239 | % {\end{quote}} 240 | %\newenvironment{lemma}{\begin{quote}\normalsize \sl 241 | % \noindent{\bf Lemma \refstepcounter{facts}\thefacts}} 242 | % {\end{quote}} 243 | \newenvironment{refact}[1]{\begin{quote}\normalsize 244 | \noindent{\it Restatement of Fact #1.}}{\end{quote}} 245 | 246 | 247 | \newenvironment{program}{\[\begin{array}{rcll} 248 | }{\end{array}\]} 249 | 250 | 251 | % common hyphenation exceptions: 252 | 253 | \hyphenation{de-ter-mi-nism de-ter-mi-nistic non-de-ter-mi-nism} 254 | \hyphenation{analy-sis} 255 | \hyphenation{defi-ni-tion} 256 | \hyphenation{lit-era-ture} 257 | \hyphenation{cat-egory cat-egories cat-egorical} 258 | \hyphenation{tabu-lation} 259 | \hyphenation{equiv-al-ent equiv-al-ence} 260 | \hyphenation{necess-ary} 261 | \hyphenation{monic monics} 262 | \hyphenation{op-er-ation} 263 | 
\hyphenation{typi-cal} 264 | \hyphenation{sum-mar-izing sum-mary} 265 | 266 | 267 | \newdimen\mathindent 268 | \AtEndOfClass{\mathindent\leftmargini} 269 | \renewcommand{\[}{\relax 270 | \ifmmode\@badmath 271 | \else 272 | \begin{trivlist}% 273 | \@beginparpenalty\predisplaypenalty 274 | \@endparpenalty\postdisplaypenalty 275 | \item[]\leavevmode 276 | \hbox to\linewidth\bgroup $\m@th\displaystyle %$ 277 | \hskip\mathindent\bgroup 278 | \fi} 279 | \renewcommand{\]}{\relax 280 | \ifmmode 281 | \egroup $\hfil% $ 282 | \egroup 283 | \end{trivlist}% 284 | \else \@badmath 285 | \fi} 286 | \renewenvironment{equation}% 287 | {\@beginparpenalty\predisplaypenalty 288 | \@endparpenalty\postdisplaypenalty 289 | \refstepcounter{equation}% 290 | \trivlist \item[]\leavevmode 291 | \hbox to\linewidth\bgroup $\m@th% $ 292 | \displaystyle 293 | \hskip\mathindent}% 294 | {$\hfil % $ 295 | \displaywidth\linewidth\hbox{\@eqnnum}% 296 | \egroup 297 | \endtrivlist} 298 | \renewenvironment{eqnarray}{% 299 | \stepcounter{equation}% 300 | \def\@currentlabel{\p@equation\theequation}% 301 | \global\@eqnswtrue\m@th 302 | \global\@eqcnt\z@ 303 | \tabskip\mathindent 304 | \let\\=\@eqncr 305 | \setlength{\abovedisplayskip}{\topsep}% 306 | \ifvmode 307 | \addtolength{\abovedisplayskip}{\partopsep}% 308 | \fi 309 | % \addtolength{\abovedisplayskip}{\parskip}% 310 | \setlength{\belowdisplayskip}{\abovedisplayskip}% 311 | \setlength{\belowdisplayshortskip}{\abovedisplayskip}% 312 | \setlength{\abovedisplayshortskip}{\abovedisplayskip}% 313 | $$\everycr{}\halign to\linewidth% $$ 314 | \bgroup 315 | \hskip\@centering 316 | $\displaystyle\tabskip\z@skip{##}$\@eqnsel&% 317 | \global\@eqcnt\@ne \hskip \tw@\arraycolsep \hfil${##}$\hfil&% 318 | \global\@eqcnt\tw@ \hskip \tw@\arraycolsep 319 | $\displaystyle{##}$\hfil \tabskip\@centering&% 320 | \global\@eqcnt\thr@@ 321 | \hbox to \z@\bgroup\hss##\egroup\tabskip\z@skip\cr}% 322 | {\@@eqncr 323 | \egroup 324 | \global\advance\c@equation\m@ne$$% $$ 325 | 
\global\@ignoretrue 326 | } 327 | -------------------------------------------------------------------------------- /doc/notes/commands.tex: -------------------------------------------------------------------------------- 1 | \usepackage{relsize} 2 | \usepackage{turnstile} 3 | \usepackage{amsmath} 4 | \usepackage{url} 5 | 6 | \usepackage{listings} 7 | 8 | \usepackage{color} 9 | % 10 | \lstdefinelanguage{X10}% 11 | {morekeywords={abstract,break,case,catch,class,% 12 | const,continue,default,do,else,extends,false,final,% 13 | finally,for,goto,if,implements,import,instanceof,% 14 | interface,label,native,new,null,package,private,protected,% 15 | public,return,static,super,switch,synchronized,this,throw,% 16 | throws,transient,true,try,volatile,while,% 17 | async,atomic,when,foreach,ateach,finish,clocked,% 18 | type,here,% 19 | self,property,% 20 | proto,assert,% 21 | future,to,has,as,var,val,def,where,in,% 22 | value,or,await,current,any},% 23 | basicstyle=\normalfont\ttfamily,%\color{Red},% 24 | keywordstyle=\bf\ttfamily,%\color{OliveGreen},% 25 | commentstyle=\normalfont\ttfamily,%\color{Gray},% 26 | identifierstyle=\normalfont\ttfamily,%\color{Red},% 27 | stringstyle=\normalfont\ttfamily,% 28 | tabsize=4,% 29 | showstringspaces=false,% 30 | sensitive,% 31 | morecomment=[l]//,% 32 | morecomment=[s]{/*}{*/},% 33 | morestring=[b]",% 34 | morestring=[b]',% 35 | columns=fullflexible,% 36 | mathescape=false,% 37 | keepspaces=true,% 38 | showlines=false,% 39 | breaklines=true,% 40 | breakatwhitespace=true,% 41 | postbreak={},% 42 | %breakautoindent=true,% 43 | %breakindent=0pt,% 44 | %prebreak={},% 45 | } 46 | 47 | \lstnewenvironment{xtenmath} 48 | {\lstset{language=X10,breaklines=false,captionpos=b,xleftmargin=2em,mathescape=true}} 49 | {} 50 | 51 | \lstnewenvironment{xten} 52 | {\lstset{language=X10,breaklines=false,captionpos=b}} %,xleftmargin=2em 53 | {} 54 | 55 | %{numbers=left, numberstyle=\tiny, stepnumber=2, numbersep=5pt 56 | % numbers=left, 57 | 
\lstset{language=x10,basicstyle=\ttfamily\small} 58 | %\lstset{language=java,basicstyle=\ttfamily\small} 59 | 60 | 61 | \usepackage[ruled]{algorithm} % [plain] 62 | \usepackage[noend]{algorithmic} % [noend] 63 | \renewcommand\algorithmiccomment[1]{// \textit{#1}} % 64 | 65 | 66 | % For fancy end of line formatting. 67 | \usepackage{microtype} 68 | 69 | 70 | % For smaller font. 71 | \usepackage{pslatex} 72 | 73 | 74 | %\usepackage{xspace} 75 | %\usepackage{yglabels} 76 | %\usepackage{yglang} 77 | %\usepackage{ygequation} 78 | %\usepackage{graphicx} 79 | %\usepackage{epstopdf} 80 | 81 | \newcommand{\formalrule}[1]{\mbox{\textsc{\scriptsize #1}}} 82 | \newcommand{\myrule}[2]{\textbf{Rule #1:} #2.} 83 | %\newcommand{\myrule}[1]{\textsc{\codesmaller #1 Rule}} 84 | \newcommand{\umyrule}[1]{\textbf{\underline{\textsc{\codesmaller #1 Rule}}}} 85 | 86 | 87 | % \small \footnotesize \scriptsize \tiny 88 | % \codesize and \scriptsize seem to do the same thing. 89 | % \newcommand{\code}[1]{\texttt{\textup{\footnotesize #1}}} 90 | % \newcommand{\code}[1]{\texttt{\textup{\codesize #1}}} 91 | \newcommand{\normalcode}[1]{\texttt{\textup{#1}}} 92 | \def\codesmaller{\small} 93 | \newcommand{\myCOMMENT}[1]{\COMMENT{\small #1}} 94 | \newcommand{\code}[1]{\texttt{\textup{\codesmaller #1}}} 95 | % \newcommand{\code}[1]{\ifmmode{\mbox{\smaller\ttfamily{#1}}} 96 | % \else{\smaller\ttfamily #1}\fi} 97 | \newcommand{\smallcode}[1]{\texttt{\textup{\scriptsize #1}}} 98 | %\newcommand{\myparagraph}[1]{\noindent\textit{\textbf{#1}}~} %\vspace{-1mm}\paragraph{#1}} 99 | 100 | % See: \usepackage{bold-extra} if you want to do \textbf 101 | \newcommand{\keyword}[1]{\code{#1}} 102 | 103 | % For new, method invocation, and cast: 104 | 105 | \newcommand{\Own}{{\it O}} 106 | \newcommand{\Ifn}[1]{\ensuremath{I(#1)}} 107 | \newcommand{\Ofn}[1]{\ensuremath{O(#1)}} 108 | \newcommand{\Cooker}[1]{\ensuremath{{\kappa}(#1)}} 109 | \newcommand{\Owner}[1]{\ensuremath{{\theta}(#1)}} 110 | 
\newcommand{\Oprec}[0]{\ensuremath{\preceq_{\theta}}} 111 | \newcommand{\Tprec}[0]{\ensuremath{\preceq^T}} 112 | \newcommand{\TprecNotEqual}[0]{\ensuremath{\prec^T}} 113 | \newcommand{\OprecNotEqual}[0]{\ensuremath{\prec_{\theta}}} 114 | \newcommand{\IfnDelta}[1]{\ensuremath{I_\Delta(#1)}} 115 | \newcommand\Abs[1]{\ensuremath{\left\lvert#1\right\rvert}} 116 | \newcommand{\erase}[1]{\ensuremath{\Abs{#1}}} 117 | 118 | \newcommand{\Gdash}[0]{\ensuremath{\Gamma \vdash }} 119 | \newcommand{\reducesto}[0]{\rightsquigarrow} 120 | \newcommand{\reduce}[0]{\rightsquigarrow} 121 | \newcommand{\preduce}[0]{\rightsquigarrow_{\pi}} 122 | \newcommand{\plreduce}[0]{\rightsquigarrow_{\pi[\ol{\hl}]}} 123 | \newcommand{\ureduce}[1]{\stackrel{{#1}}{\dashrightarrow}} %\rightsquigarrow\stackrel{a} 124 | \newcommand{\ptreduce}[0]{\rightsquigarrow_{\pi'}} 125 | 126 | 127 | %% For typesetting theorems and some math symbols (\rightsquigarrow). 128 | \usepackage{amssymb} 129 | %\usepackage{amsthm} 130 | 131 | \usepackage{color} 132 | \definecolor{light}{gray}{.75} 133 | 134 | 135 | \newcommand{\todo}[1]{\textbf{[[#1]]}} 136 | %% To disable, just uncomment this line: 137 | %\renewcommand{\todo}[1]{\relax} 138 | 139 | %% Additional todo commands: 140 | \newcommand{\TODO}[1]{\todo{TODO: #1}} 141 | 142 | \newcommand\xX[1]{$\textsuperscript{\textit{\text{#1}}}$} 143 | 144 | 145 | \newcommand{\ol}[1]{\overline{#1}} 146 | \newcommand{\nounderline}[1]{{#1}} 147 | 148 | 149 | %% Commands used to typeset the FIGJ type system. 150 | \newcommand{\typerule}[2]{ 151 | \begin{array}{c} 152 | #1 \\ 153 | \hline 154 | #2 155 | \end{array}} 156 | \newcommand{\typeax}[1]{ 157 | \begin{array}{c} 158 | \\ 159 | \hline 160 | #1 161 | \end{array}} 162 | 163 | 164 | %% Commands used to typeset the FOIGJ type system. 
165 | \newcommand{\inside}{\prec} 166 | \newcommand{\visible}{{\it visible}} 167 | \newcommand{\placeholderowners}{{\it placeholderowners}} 168 | \newcommand{\nullexpression}{{\tt null}} 169 | \newcommand{\errorexpression}{{\tt error}} 170 | \newcommand{\locations}{{\it locations}} %% \mathop{\mathit{locations}}} 171 | \newcommand{\xo}{{\tt X^O}} 172 | \newcommand{\no}{{\tt N^O}} 173 | \newcommand{\co}{{\tt C^O}} 174 | \newcommand{\I}{\it I} 175 | 176 | 177 | \newcommand\mynewcommand[2]{\newcommand{#1}{#2\xspace}} 178 | 179 | 180 | \mynewcommand{\unknown}{\code{?}} 181 | \newcommand{\initsep}[0]{;} % tried \| and \dagger 182 | \newcommand{\initsets}[2]{\lb#1\initsep#2\rb} 183 | \newcommand{\myinit}[2]{\S{}\lb#1\initsep#2\rb} 184 | \mynewcommand{\mycooked}{\S{}\lb\initsep\rb} 185 | \newcommand{\valt}[2]{\code{val}~#1~=~#2;} 186 | \newcommand{\acct}[2]{\code{acc}~#1~=~#2;} 187 | 188 | \newcommand{\lt}{\code{<}}%{\mathop{\textrm{\tt <}}} 189 | \newcommand{\gt}{\code{>}}%{\mathop{\textrm{\tt >}}} 190 | 191 | \mynewcommand{\this}{\keyword{this}} 192 | \mynewcommand{\Object}{\code{Object}} 193 | \mynewcommand{\const}{\keyword{const}} %C++ keyword 194 | \mynewcommand{\mutable}{\keyword{mutable}} %C++ keyword 195 | \mynewcommand{\romaybe}{\keyword{romaybe}} %Javari keyword 196 | 197 | %% Define the behaviour of the theorem package. 198 | %% Use http://math.ucsd.edu/~jeggers/latex/amsthdoc.pdf for reference. 
199 | 200 | \newtheorem{theorem}{Theorem}[section] 201 | \newtheorem{definition}[theorem]{Definition} 202 | \newtheorem{lemma}[theorem]{Lemma} 203 | %\newtheorem{lemma}[theorem]{Lemma} 204 | %\newtheorem{corollary}[theorem]{Corollary} 205 | %\newtheorem{fact}[theorem]{Fact} 206 | \newtheorem{proposition}[theorem]{Proposition} 207 | %\newtheorem{convention}[theorem]{Convention} 208 | \newtheorem{example}[theorem]{Example} 209 | %\newtheorem{remark}[theorem]{Remark} 210 | \def\withmmode#1{\relax\ifmmode#1\else{$#1$}\fi} 211 | \def\alt{\withmmode{\;{\tt\char`\|}\;}} 212 | 213 | \mynewcommand{\IP}{\code{I}} % formal type parameter 214 | \mynewcommand{\JP}{\code{J}} % formal type parameter (for soundness proofs) 215 | 216 | 217 | \mynewcommand{\Iparam}{Immutability parameter} 218 | \mynewcommand{\iparam}{immutability parameter} 219 | \mynewcommand{\iparams}{immutability parameters} 220 | \mynewcommand{\Iparams}{Immutability parameters} 221 | \mynewcommand{\Iarg}{Immutability argument} 222 | \mynewcommand{\iarg}{immutability argument} 223 | \mynewcommand{\iargs}{immutability arguments} 224 | \mynewcommand{\Iargs}{Immutability arguments} 225 | \mynewcommand{\ReadOnly}{\code{ReadOnly}} 226 | \mynewcommand{\WriteOnly}{\code{WriteOnly}} 227 | \mynewcommand{\None}{\code{None}} 228 | \mynewcommand{\Mutable}{\code{Mutable}} 229 | \mynewcommand{\Immut}{\code{Immut}} 230 | \mynewcommand{\Raw}{\code{Raw}} 231 | 232 | 233 | \mynewcommand{\This}{\code{This}} 234 | \mynewcommand{\World}{\code{World}} 235 | 236 | 237 | % Our annotations 238 | \mynewcommand{\OMutable}{\code{@OMutable}} 239 | \mynewcommand{\OI}{\code{@OI}} 240 | 241 | 242 | \mynewcommand{\InVariantAnnot}{\code{@InVariant}} 243 | 244 | 245 | \newcommand{\func}[1]{\text{\textnormal{\textit{\codesmaller #1}}}} 246 | 247 | 248 | \mynewcommand{\st}{\ensuremath{\mathrel{{\leq}}}} %{\mathop{\textrm{\tt <:}}} 249 | \mynewcommand{\notst}{\mathrel{\st\hspace{-1.5ex}\rule[-.25em]{.4pt}{1em}~}} 250 | 
\mynewcommand{\tl}{\ensuremath{\triangleleft}} 251 | \mynewcommand{\gap}{~ ~ ~ ~ ~ ~} 252 | 253 | 254 | \newcommand{\RULE}[1]{\textsc{\scriptsize{}#1}} %\RULEhape\scriptsize} 255 | 256 | 257 | \mynewcommand{\DA}{\texttt{DA}} 258 | \mynewcommand{\ok}{\texttt{OK}} 259 | \mynewcommand{\OK}{\texttt{OK}} 260 | \mynewcommand{\IN}{\texttt{IN}} 261 | \mynewcommand{\subterm}{\func{subterm}} 262 | \mynewcommand{\TP}{\func{TP}} % function that returns type parameters in a type 263 | \mynewcommand{\CT}{\func{CT}} % class table 264 | \mynewcommand{\mtype}{\func{mtype}} 265 | \mynewcommand{\mbody}{\func{mbody}} 266 | \mynewcommand{\mmodifier}{\func{mmodifier}} 267 | \mynewcommand{\fmodifier}{\func{fmodifier}} 268 | \mynewcommand{\fields}{\func{fields}} 269 | \mynewcommand{\cooked}{\func{cooked}} 270 | 271 | 272 | \mynewcommand{\facc}{\func{acc}} 273 | \mynewcommand{\fclocked}{\func{clocked}} 274 | 275 | \mynewcommand{\bound}{\func{bound}_\Delta} 276 | \mynewcommand{\substitute}{\func{substitute}} 277 | \mynewcommand{\ftype}{\func{ftype}} 278 | \mynewcommand{\mguard}{\func{mguard}} 279 | \mynewcommand{\isTransitive}{\func{isTransitive}} 280 | \DeclareMathOperator{\dom}{dom} 281 | 282 | 283 | \newcommand{\async}[1]{\code{async}~#1} 284 | \newcommand{\asynct}[2]{\code{async(}{\tt #1})\lb#2\rb} 285 | \newcommand{\finisht}[2]{\code{finish(}{\tt #1})\lb#2\rb} 286 | \newcommand{\blockt}[2]{\code{(}{\tt #1}\code{)}\lb#2\rb} 287 | \newcommand{\finishasynct}[2]{\code{[finish | async]}({\tt #1})\lb#2\rb} 288 | \newcommand{\clockedfinishasync}[1]{\code{(async | [clocked] finish)}~{\tt #1}} 289 | \newcommand{\finishasync}[1]{\code{[finish | async]}~#1} 290 | \newcommand{\finish}[1]{\code{finish~#1}} 291 | \newcommand{\clocked}[0]{\code{clocked}} 292 | \newcommand{\pclocked}[0]{\code{[clocked]}} 293 | \newcommand{\xadvance}[0]{\code{advance;}} 294 | \newcommand{\acc}[2]{\code{acc}~ #1=#2;} 295 | 296 | % In the syntax: \hI or ReadOnly or Mutable or Immut 297 | 298 | 299 | 300 | 
\mynewcommand{\hnull}{\code{null}} 301 | \mynewcommand{\htrue}{\code{true}} 302 | \mynewcommand{\hfalse}{\code{false}} 303 | 304 | \mynewcommand{\hUnused}{\code{\_}} 305 | \mynewcommand{\hA}{\code{A}} % inVariant definition 306 | \mynewcommand{\hB}{\code{B}} % inVariant definition 307 | \mynewcommand{\hC}{\code{C}} % class 308 | \mynewcommand{\hD}{\code{D}} % class 309 | \mynewcommand{\hF}{\code{F}} % field declaration 310 | \mynewcommand{\hG}{\code{G}} 311 | \mynewcommand{\hI}{\code{I}} % iparam 312 | \mynewcommand{\hJ}{\code{J}} 313 | \mynewcommand{\hL}{\code{L}} % class decl 314 | \mynewcommand{\hM}{\code{M}} % Method decl 315 | \mynewcommand{\hN}{\code{N}} % Non-variable type 316 | \mynewcommand{\hO}{\code{O}} 317 | \mynewcommand{\hP}{\code{P}} 318 | \mynewcommand{\hR}{\code{R}} 319 | \mynewcommand{\hS}{\code{S}} 320 | \mynewcommand{\hT}{\code{T}} % types (vars or non vars) 321 | \mynewcommand{\hU}{\code{U}} % types (vars or non vars) 322 | \mynewcommand{\hV}{\code{V}} % closed types 323 | \mynewcommand{\hX}{\code{X}} % vars 324 | \mynewcommand{\hY}{\code{Y}} % vars 325 | \mynewcommand{\hZ}{\code{Z}} % inVariant definition 326 | 327 | 328 | \mynewcommand{\ha}{\code{a}} 329 | \mynewcommand{\hc}{\code{c}} % cooker 330 | \mynewcommand{\hd}{\code{d}} 331 | \mynewcommand{\hm}{\code{m}} % method 332 | \mynewcommand{\he}{\code{e}} % expression 333 | \mynewcommand{\hf}{\code{f}} % field 334 | \mynewcommand{\hg}{\code{g}} 335 | \mynewcommand{\hl}{\code{l}} % location in the store 336 | \mynewcommand{\ho}{\code{o}} 337 | \mynewcommand{\hp}{\code{p}} % cooker 338 | \mynewcommand{\hq}{\code{q}} 339 | \mynewcommand{\hr}{\code{r}} 340 | \mynewcommand{\hv}{\code{v}} % value 341 | \mynewcommand{\hw}{\code{w}} % value 342 | \mynewcommand{\hx}{\code{x}} % method parameter 343 | \mynewcommand{\hy}{\code{y}} % field 344 | \mynewcommand{\hz}{\code{z}} % method parameter 345 | 346 | \mynewcommand{\lroot}{\code{l}_\top} % root 347 | \mynewcommand{\lthis}{\code{l}_\smallcode{this}} % 
this 348 | \newcommand{\hparen}[1]{\code{(}#1\code{)}} 349 | \newcommand{\hgn}[1]{\lt#1\gt} % type parameters and generic method parameters 350 | \newcommand{\hadvance}[0]{\code{advance}} 351 | \mynewcommand{\hasync}{\code{async}} 352 | \mynewcommand{\hswitch}{\code{switch}} 353 | \mynewcommand{\hAcc}{\code{Acc}} 354 | \mynewcommand{\hfinish}{\code{finish}} 355 | \mynewcommand{\hreceiver}{\code{receiver}} % receiver for new 356 | \mynewcommand{\hSW}{\code{SW}} 357 | \mynewcommand{\hAW}{\code{AW}} 358 | \mynewcommand{\hObject}{\code{Object}} 359 | 360 | \mynewcommand{\hdef}{\code{def}} 361 | \mynewcommand{\hfor}{\code{for}} 362 | \mynewcommand{\hvar}{\code{var}} 363 | \mynewcommand{\hin}{\code{in}} 364 | \mynewcommand{\hPoint}{\code{Point}} 365 | \mynewcommand{\hand}{\code{~and~}} 366 | \mynewcommand{\hor}{\code{~or~}} 367 | \mynewcommand{\hthis}{\code{this}} % this 368 | \mynewcommand{\hclass}{\code{class}} 369 | \mynewcommand{\hreturn}{\code{return}} 370 | \mynewcommand{\hhnew}{\code{new}} 371 | \mynewcommand{\hsub}{\code{/}} % substitute (reduction rules) 372 | %\newcommand{\Ofn}[1]{\ensuremath{O(#1)}} 373 | \mynewcommand{\nonescaping}{\code{nonescaping}} 374 | \mynewcommand{\hescaping}{\code{escaping}} 375 | \mynewcommand{\hextends}{\code{extends}} 376 | \newcommand{\hnew}[1]{\code{new}~#1} 377 | \newcommand{\hval}[3]{\code{val}~#1~=~#2;#3} 378 | \newcommand{\hclocked}[0]{\code{clocked}} 379 | \newcommand{\hClockedAcc}[0]{\code{ClockedAcc}} 380 | \newcommand{\hvalues}[0]{\code{values}} 381 | 382 | \newcommand{\PREV}[1]{\withmmode{\mathtt{prev}~{#1}}} 383 | \newcommand{\ift}[2]{\withmmode{\mathtt{if}~{#1}~\mathtt{then}~{#2}}} 384 | \newcommand{\ife}[2]{\withmmode{\mathtt{if}~{#1}~\mathtt{else}~{#2}}} 385 | \newcommand{\some}[2]{\withmmode{\mathtt{some}~{#1}\,\mathtt{in}\,~{#2}}} 386 | \newcommand{\all}[2]{\withmmode{\mathtt{all}~{#1}\,\mathtt{in}\,{#2}}} 387 | \newcommand{\hence}[1]{\withmmode{\mathtt{hence}~{#1}}} 388 | 
\newcommand{\hitherto}[1]{\withmmode{\mathtt{hitherto}~{#1}}} 389 | \newcommand{\AND}[2]{\withmmode{{#1}~\mathtt{and}~{#2}}} 390 | \newcommand{\OR}[2]{\withmmode{{#1}~\mathtt{or}~{#2}}} 391 | \newcommand{\MU}[2]{\withmmode{\mathtt{mu}~{#1}\ \mathtt{in}\ {#2}}} 392 | \newcommand{\past}[2]{\withmmode{\mathtt{past}^{#1}\,{#2}}} 393 | \newcommand{\TIME}[2]{\withmmode{\mathtt{time}^{#1}\,{#2}}} 394 | 395 | \newcommand{\proves}[3]{\withmmode{\dststile{#3}{#1,#2}}} 396 | \newcommand{\evolves}[3]{\withmmode{\longrightarrow^{#1,#2}_{#3}}} 397 | \newcommand{\starevolves}[3]{\withmmode{\stackrel{\star}{\longrightarrow}^{#1,#2}_{#3}}} 398 | \newcommand{\steps}[2]{\withmmode{\leadsto^{#1,#2}}} 399 | \mynewcommand\xth{\xX{th}} 400 | \mynewcommand\xrd{\xX{rd}} 401 | \mynewcommand\xnd{\xX{nd}} 402 | \mynewcommand\xst{\xX{st}} 403 | \mynewcommand\ith{$i$\xth} 404 | \mynewcommand\jth{$j$\xth} 405 | 406 | 407 | %\mynewcommand{\emptyline}{\vspace{\baselineskip}} 408 | \mynewcommand{\myindent}{~~} 409 | 410 | 411 | % Add line between figure and text 412 | \makeatletter 413 | \def\topfigrule{\kern3\p@ \hrule \kern -3.4\p@} % the \hrule is .4pt high 414 | \def\botfigrule{\kern-3\p@ \hrule \kern 2.6\p@} % the \hrule is .4pt high 415 | \def\dblfigrule{\kern3\p@ \hrule \kern -3.4\p@} % the \hrule is .4pt high 416 | \makeatother 417 | 418 | \setlength{\textfloatsep}{.75\textfloatsep} 419 | 420 | 421 | % Remove line between figure and its caption. (The line is prettier, and 422 | % it also saves a couple column-inches.) 
423 | \makeatletter 424 | %\@setflag \@caprule = \@false 425 | \makeatother 426 | 427 | 428 | % http://www.tex.ac.uk/cgi-bin/texfaq2html?label=bold-extras 429 | \usepackage{bold-extra} 430 | 431 | 432 | % Left and right curly braces in tt font 433 | \newcommand{\ttlcb}{\texttt{\char "7B}} 434 | \newcommand{\ttrcb}{\texttt{\char "7D}} 435 | \newcommand{\lb}{\ttlcb} 436 | \newcommand{\rb}{\ttrcb} 437 | \newcommand{\cc}{{\sf CC}} 438 | \newcommand{\TDCC}{{\sf TDCC}} 439 | 440 | 441 | \setlength{\leftmargini}{.75\leftmargini} 442 | \def\comment#1{\typeout{#1}} % braces needed: \typeout takes one mandatory argument 443 | 444 | \def\from#1\infer#2{{{\textstyle #1}\over{\textstyle #2}}} 445 | \def\rname#1\from#2\infer#3{{{\textstyle #2}\over{\textstyle #3}}{\ \textstyle(#1)}} 446 | -------------------------------------------------------------------------------- /doc/notes/history.txt: -------------------------------------------------------------------------------- 1 | Version history for the LLNCS LaTeX2e class 2 | 3 | date filename version action/reason/acknowledgements 4 | ---------------------------------------------------------------------------- 5 | 29.5.96 letter.txt beta naming problems (subject index file) 6 | thanks to Dr. 
Martin Held, Salzburg, AT 7 | 8 | subjindx.ind renamed to subjidx.ind as required 9 | by llncs.dem 10 | 11 | history.txt introducing this file 12 | 13 | 30.5.96 llncs.cls incompatibility with new article.cls of 14 | 1995/12/20 v1.3q Standard LaTeX document class, 15 | \if@openbib is no longer defined, 16 | reported by Ralf Heckmann and Graham Gough 17 | solution by David Carlisle 18 | 19 | 10.6.96 llncs.cls problems with fragile commands in \author field 20 | reported by Michael Gschwind, TU Wien 21 | 22 | 25.7.96 llncs.cls revision a corrects: 23 | wrong size of text area, floats not \small, 24 | some LaTeX generated texts 25 | reported by Michael Sperber, Uni Tuebingen 26 | 27 | 16.4.97 all files 2.1 leaving beta state, 28 | raising version counter to 2.1 29 | 30 | 8.6.97 llncs.cls 2.1a revision a corrects: 31 | unbreakable citation lists, reported by 32 | Sergio Antoy of Portland State University 33 | 34 | 11.12.97 llncs.cls 2.2 "general" headings centered; two new elements 35 | for the article header: \email and \homedir; 36 | complete revision of special environments: 37 | \newtheorem replaced with \spnewtheorem, 38 | introduced the theopargself environment; 39 | two column parts made with multicol package; 40 | add ons to work with the hyperref package 41 | 42 | 07.01.98 llncs.cls 2.2 changed \email to simply switch to \tt 43 | 44 | 25.03.98 llncs.cls 2.3 new class option "oribibl" to suppress 45 | changes to the thebibliography environment 46 | and retain pure LaTeX codes - useful 47 | for most BibTeX applications 48 | 49 | 16.04.98 llncs.cls 2.3 if option "oribibl" is given, extend the 50 | thebibliography hook with "\small", suggested 51 | by Clemens Ballarin, University of Cambridge 52 | 53 | 20.11.98 llncs.cls 2.4 pagestyle "titlepage" - useful for 54 | compilation of whole LNCS volumes 55 | 56 | 12.01.99 llncs.cls 2.5 counters of orthogonal numbered special 57 | environments are reset each new contribution 58 | 59 | 27.04.99 llncs.cls 2.6 new command 
\thisbottomragged for the 60 | actual page; indention of the footnote 61 | made variable with \fnindent (default 1em); 62 | new command \url that copies its argument 63 | 64 | 2.03.00 llncs.cls 2.7 \figurename and \tablename made compatible 65 | to babel, suggested by Jo Hereth, TU Darmstadt; 66 | definition of \url moved \AtBeginDocument 67 | (allows for url package of Donald Arseneau), 68 | suggested by Manfred Hauswirth, TU of Vienna; 69 | \large for part entries in the TOC 70 | 71 | 16.04.00 llncs.cls 2.8 new option "orivec" to preserve the original 72 | vector definition, read "arrow" accent 73 | 74 | 17.01.01 llncs.cls 2.9 hardwired texts made polyglot, 75 | available languages: english (default), 76 | french, german - all are "babel-proof" 77 | 78 | 20.06.01 splncs.bst public release of a BibTeX style for LNCS, 79 | nobly provided by Jason Noble 80 | 81 | 14.08.01 llncs.cls 2.10 TOC: authors flushleft, 82 | entries without hyphenation; suggested 83 | by Wiro Niessen, Imaging Center - Utrecht 84 | 85 | 23.01.02 llncs.cls 2.11 fixed footnote number confusion with 86 | \thanks, numbered institutes, and normal 87 | footnote entries; error reported by 88 | Saverio Cittadini, Istituto Tecnico 89 | Industriale "Tito Sarrocchi" - Siena 90 | 91 | 28.01.02 llncs.cls 2.12 fixed footnote fix; error reported by 92 | Chris Mesterharm, CS Dept. 
Rutgers - NJ 93 | 94 | 95 | 28.01.02 llncs.cls 2.13 fixed the fix (programmer needs vacation) 96 | -------------------------------------------------------------------------------- /doc/notes/llncs.dvi: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/notes/llncs.dvi -------------------------------------------------------------------------------- /doc/notes/llncs.ind: -------------------------------------------------------------------------------- 1 | % This is LLNCS.IND the handmade demonstration 2 | % file for an author index from Springer-Verlag 3 | % for Lecture Notes in Computer Science, 4 | % version 2.2 for LaTeX2e 5 | % 6 | \begin{theindex} 7 | \item Abt~I. \idxquad{7} 8 | \item Ahmed~T. \idxquad{3} 9 | \item Andreev~V. \idxquad{24} 10 | \item Andrieu~B. \idxquad{27} 11 | \item Arpagaus~M. \idxquad{34} 12 | \indexspace 13 | \item Babaev~A. \idxquad{25} 14 | \item B\"arwolff~A. \idxquad{33} 15 | \item B\'an~J. \idxquad{17} 16 | \item Baranov~P. \idxquad{24} 17 | \item Barrelet~E. \idxquad{28} 18 | \item Bartel~W. \idxquad{11} 19 | \item Bassler~U. \idxquad{28} 20 | \item Beck~H.P. \idxquad{35} 21 | \item Behrend~H.-J. \idxquad{11} 22 | \item Berger~Ch. \idxquad{1} 23 | \item Bergstein~H. \idxquad{1} 24 | \item Bernardi~G. \idxquad{28} 25 | \item Bernet~R. \idxquad{34} 26 | \item Besan\c con~M. \idxquad{9} 27 | \item Biddulph~P. \idxquad{22} 28 | \item Binder~E. \idxquad{11} 29 | \item Bischoff~A. \idxquad{33} 30 | \item Blobel~V. \idxquad{13} 31 | \item Borras~K. \idxquad{8} 32 | \item Bosetti~P.C. \idxquad{2} 33 | \item Boudry~V. \idxquad{27} 34 | \item Brasse~F. \idxquad{11} 35 | \item Braun~U. \idxquad{2} 36 | \item Braunschweig~A. \idxquad{1} 37 | \item Brisson~V. \idxquad{26} 38 | \item B\"ungener~L. \idxquad{13} 39 | \item B\"urger~J. \idxquad{11} 40 | \item B\"usser~F.W. \idxquad{13} 41 | \item Buniatian~A. 
\idxquad{11,37} 42 | \item Buschhorn~G. \idxquad{25} 43 | \indexspace 44 | \item Campbell~A.J. \idxquad{1} 45 | \item Carli~T. \idxquad{25} 46 | \item Charles~F. \idxquad{28} 47 | \item Clarke~D. \idxquad{5} 48 | \item Clegg~A.B. \idxquad{18} 49 | \item Colombo~M. \idxquad{8} 50 | \item Courau~A. \idxquad{26} 51 | \item Coutures~Ch. \idxquad{9} 52 | \item Cozzika~G. \idxquad{9} 53 | \item Criegee~L. \idxquad{11} 54 | \item Cvach~J. \idxquad{27} 55 | \indexspace 56 | \item Dagoret~S. \idxquad{28} 57 | \item Dainton~J.B. \idxquad{19} 58 | \item Dann~A.W.E. \idxquad{22} 59 | \item Dau~W.D. \idxquad{16} 60 | \item Deffur~E. \idxquad{11} 61 | \item Delcourt~B. \idxquad{26} 62 | \item Buono~Del~A. \idxquad{28} 63 | \item Devel~M. \idxquad{26} 64 | \item De Roeck~A. \idxquad{11} 65 | \item Dingus~P. \idxquad{27} 66 | \item Dollfus~C. \idxquad{35} 67 | \item Dreis~H.B. \idxquad{2} 68 | \item Drescher~A. \idxquad{8} 69 | \item D\"ullmann~D. \idxquad{13} 70 | \item D\"unger~O. \idxquad{13} 71 | \item Duhm~H. \idxquad{12} 72 | \indexspace 73 | \item Ebbinghaus~R. \idxquad{8} 74 | \item Eberle~M. \idxquad{12} 75 | \item Ebert~J. \idxquad{32} 76 | \item Ebert~T.R. \idxquad{19} 77 | \item Efremenko~V. \idxquad{23} 78 | \item Egli~S. \idxquad{35} 79 | \item Eichenberger~S. \idxquad{35} 80 | \item Eichler~R. \idxquad{34} 81 | \item Eisenhandler~E. \idxquad{20} 82 | \item Ellis~N.N. \idxquad{3} 83 | \item Ellison~R.J. \idxquad{22} 84 | \item Elsen~E. \idxquad{11} 85 | \item Evrard~E. \idxquad{4} 86 | \indexspace 87 | \item Favart~L. \idxquad{4} 88 | \item Feeken~D. \idxquad{13} 89 | \item Felst~R. \idxquad{11} 90 | \item Feltesse~A. \idxquad{9} 91 | \item Fensome~I.F. \idxquad{3} 92 | \item Ferrarotto~F. \idxquad{31} 93 | \item Flamm~K. \idxquad{11} 94 | \item Flauger~W. \idxquad{11} 95 | \item Flieser~M. \idxquad{25} 96 | \item Fl\"ugge~G. \idxquad{2} 97 | \item Fomenko~A. \idxquad{24} 98 | \item Fominykh~B. \idxquad{23} 99 | \item Form\'anek~J. 
\idxquad{30} 100 | \item Foster~J.M. \idxquad{22} 101 | \item Franke~G. \idxquad{11} 102 | \item Fretwurst~E. \idxquad{12} 103 | \indexspace 104 | \item Gabathuler~E. \idxquad{19} 105 | \item Gamerdinger~K. \idxquad{25} 106 | \item Garvey~J. \idxquad{3} 107 | \item Gayler~J. \idxquad{11} 108 | \item Gellrich~A. \idxquad{13} 109 | \item Gennis~M. \idxquad{11} 110 | \item Genzel~H. \idxquad{1} 111 | \item Godfrey~L. \idxquad{7} 112 | \item Goerlach~U. \idxquad{11} 113 | \item Goerlich~L. \idxquad{6} 114 | \item Gogitidze~N. \idxquad{24} 115 | \item Goodall~A.M. \idxquad{19} 116 | \item Gorelov~I. \idxquad{23} 117 | \item Goritchev~P. \idxquad{23} 118 | \item Grab~C. \idxquad{34} 119 | \item Gr\"assler~R. \idxquad{2} 120 | \item Greenshaw~T. \idxquad{19} 121 | \item Greif~H. \idxquad{25} 122 | \item Grindhammer~G. \idxquad{25} 123 | \indexspace 124 | \item Haack~J. \idxquad{33} 125 | \item Haidt~D. \idxquad{11} 126 | \item Hamon~O. \idxquad{28} 127 | \item Handschuh~D. \idxquad{11} 128 | \item Hanlon~E.M. \idxquad{18} 129 | \item Hapke~M. \idxquad{11} 130 | \item Harjes~J. \idxquad{11} 131 | \item Haydar~R. \idxquad{26} 132 | \item Haynes~W.J. \idxquad{5} 133 | \item Hedberg~V. \idxquad{21} 134 | \item Heinzelmann~G. \idxquad{13} 135 | \item Henderson~R.C.W. \idxquad{18} 136 | \item Henschel~H. \idxquad{33} 137 | \item Herynek~I. \idxquad{29} 138 | \item Hildesheim~W. \idxquad{11} 139 | \item Hill~P. \idxquad{11} 140 | \item Hilton~C.D. \idxquad{22} 141 | \item Hoeger~K.C. \idxquad{22} 142 | \item Huet~Ph. \idxquad{4} 143 | \item Hufnagel~H. \idxquad{14} 144 | \item Huot~N. \idxquad{28} 145 | \indexspace 146 | \item Itterbeck~H. \idxquad{1} 147 | \indexspace 148 | \item Jabiol~M.-A. \idxquad{9} 149 | \item Jacholkowska~A. \idxquad{26} 150 | \item Jacobsson~C. \idxquad{21} 151 | \item Jansen~T. \idxquad{11} 152 | \item J\"onsson~L. \idxquad{21} 153 | \item Johannsen~A. \idxquad{13} 154 | \item Johnson~D.P. \idxquad{4} 155 | \item Jung~H. 
\idxquad{2} 156 | \indexspace 157 | \item Kalmus~P.I.P. \idxquad{20} 158 | \item Kasarian~S. \idxquad{11} 159 | \item Kaschowitz~R. \idxquad{2} 160 | \item Kathage~U. \idxquad{16} 161 | \item Kaufmann~H. \idxquad{33} 162 | \item Kenyon~I.R. \idxquad{3} 163 | \item Kermiche~S. \idxquad{26} 164 | \item Kiesling~C. \idxquad{25} 165 | \item Klein~M. \idxquad{33} 166 | \item Kleinwort~C. \idxquad{13} 167 | \item Knies~G. \idxquad{11} 168 | \item Ko~W. \idxquad{7} 169 | \item K\"ohler~T. \idxquad{1} 170 | \item Kolanoski~H. \idxquad{8} 171 | \item Kole~F. \idxquad{7} 172 | \item Kolya~S.D. \idxquad{22} 173 | \item Korbel~V. \idxquad{11} 174 | \item Korn~M. \idxquad{8} 175 | \item Kostka~P. \idxquad{33} 176 | \item Kotelnikov~S.K. \idxquad{24} 177 | \item Krehbiel~H. \idxquad{11} 178 | \item Kr\"ucker~D. \idxquad{2} 179 | \item Kr\"uger~U. \idxquad{11} 180 | \item Kubenka~J.P. \idxquad{25} 181 | \item Kuhlen~M. \idxquad{25} 182 | \item Kur\v{c}a~T. \idxquad{17} 183 | \item Kurzh\"ofer~J. \idxquad{8} 184 | \item Kuznik~B. \idxquad{32} 185 | \indexspace 186 | \item Lamarche~F. \idxquad{27} 187 | \item Lander~R. \idxquad{7} 188 | \item Landon~M.P.J. \idxquad{20} 189 | \item Lange~W. \idxquad{33} 190 | \item Lanius~P. \idxquad{25} 191 | \item Laporte~J.F. \idxquad{9} 192 | \item Lebedev~A. \idxquad{24} 193 | \item Leuschner~A. \idxquad{11} 194 | \item Levonian~S. \idxquad{11,24} 195 | \item Lewin~D. \idxquad{11} 196 | \item Ley~Ch. \idxquad{2} 197 | \item Lindner~A. \idxquad{8} 198 | \item Lindstr\"om~G. \idxquad{12} 199 | \item Linsel~F. \idxquad{11} 200 | \item Lipinski~J. \idxquad{13} 201 | \item Loch~P. \idxquad{11} 202 | \item Lohmander~H. \idxquad{21} 203 | \item Lopez~G.C. \idxquad{20} 204 | \indexspace 205 | \item Magnussen~N. \idxquad{32} 206 | \item Mani~S. \idxquad{7} 207 | \item Marage~P. \idxquad{4} 208 | \item Marshall~R. \idxquad{22} 209 | \item Martens~J. \idxquad{32} 210 | \item Martin~A.@ \idxquad{19} 211 | \item Martyn~H.-U. 
\idxquad{1} 212 | \item Martyniak~J. \idxquad{6} 213 | \item Masson~S. \idxquad{2} 214 | \item Mavroidis~A. \idxquad{20} 215 | \item McMahon~S.J. \idxquad{19} 216 | \item Mehta~A. \idxquad{22} 217 | \item Meier~K. \idxquad{15} 218 | \item Mercer~D. \idxquad{22} 219 | \item Merz~T. \idxquad{11} 220 | \item Meyer~C.A. \idxquad{35} 221 | \item Meyer~H. \idxquad{32} 222 | \item Meyer~J. \idxquad{11} 223 | \item Mikocki~S. \idxquad{6,26} 224 | \item Milone~V. \idxquad{31} 225 | \item Moreau~F. \idxquad{27} 226 | \item Moreels~J. \idxquad{4} 227 | \item Morris~J.V. \idxquad{5} 228 | \item M\"uller~K. \idxquad{35} 229 | \item Murray~S.A. \idxquad{22} 230 | \indexspace 231 | \item Nagovizin~V. \idxquad{23} 232 | \item Naroska~B. \idxquad{13} 233 | \item Naumann~Th. \idxquad{33} 234 | \item Newton~D. \idxquad{18} 235 | \item Neyret~D. \idxquad{28} 236 | \item Nguyen~A. \idxquad{28} 237 | \item Niebergall~F. \idxquad{13} 238 | \item Nisius~R. \idxquad{1} 239 | \item Nowak~G. \idxquad{6} 240 | \item Nyberg~M. \idxquad{21} 241 | \indexspace 242 | \item Oberlack~H. \idxquad{25} 243 | \item Obrock~U. \idxquad{8} 244 | \item Olsson~J.E. \idxquad{11} 245 | \item Ould-Saada~F. \idxquad{13} 246 | \indexspace 247 | \item Pascaud~C. \idxquad{26} 248 | \item Patel~G.D. \idxquad{19} 249 | \item Peppel~E. \idxquad{11} 250 | \item Phillips~H.T. \idxquad{3} 251 | \item Phillips~J.P. \idxquad{22} 252 | \item Pichler~Ch. \idxquad{12} 253 | \item Pilgram~W. \idxquad{2} 254 | \item Pitzl~D. \idxquad{34} 255 | \item Prell~S. \idxquad{11} 256 | \item Prosi~R. \idxquad{11} 257 | \indexspace 258 | \item R\"adel~G. \idxquad{11} 259 | \item Raupach~F. \idxquad{1} 260 | \item Rauschnabel~K. \idxquad{8} 261 | \item Reinshagen~S. \idxquad{11} 262 | \item Ribarics~P. \idxquad{25} 263 | \item Riech~V. \idxquad{12} 264 | \item Riedlberger~J. \idxquad{34} 265 | \item Rietz~M. \idxquad{2} 266 | \item Robertson~S.M. \idxquad{3} 267 | \item Robmann~P. \idxquad{35} 268 | \item Roosen~R. 
\idxquad{4} 269 | \item Royon~C. \idxquad{9} 270 | \item Rudowicz~M. \idxquad{25} 271 | \item Rusakov~S. \idxquad{24} 272 | \item Rybicki~K. \idxquad{6} 273 | \indexspace 274 | \item Sahlmann~N. \idxquad{2} 275 | \item Sanchez~E. \idxquad{25} 276 | \item Savitsky~M. \idxquad{11} 277 | \item Schacht~P. \idxquad{25} 278 | \item Schleper~P. \idxquad{14} 279 | \item von Schlippe~W. \idxquad{20} 280 | \item Schmidt~D. \idxquad{32} 281 | \item Schmitz~W. \idxquad{2} 282 | \item Sch\"oning~A. \idxquad{11} 283 | \item Schr\"oder~V. \idxquad{11} 284 | \item Schulz~M. \idxquad{11} 285 | \item Schwab~B. \idxquad{14} 286 | \item Schwind~A. \idxquad{33} 287 | \item Seehausen~U. \idxquad{13} 288 | \item Sell~R. \idxquad{11} 289 | \item Semenov~A. \idxquad{23} 290 | \item Shekelyan~V. \idxquad{23} 291 | \item Shooshtari~H. \idxquad{25} 292 | \item Shtarkov~L.N. \idxquad{24} 293 | \item Siegmon~G. \idxquad{16} 294 | \item Siewert~U. \idxquad{16} 295 | \item Skillicorn~I.O. \idxquad{10} 296 | \item Smirnov~P. \idxquad{24} 297 | \item Smith~J.R. \idxquad{7} 298 | \item Smolik~L. \idxquad{11} 299 | \item Spitzer~H. \idxquad{13} 300 | \item Staroba~P. \idxquad{29} 301 | \item Steenbock~M. \idxquad{13} 302 | \item Steffen~P. \idxquad{11} 303 | \item Stella~B. \idxquad{31} 304 | \item Stephens~K. \idxquad{22} 305 | \item St\"osslein~U. \idxquad{33} 306 | \item Strachota~J. \idxquad{11} 307 | \item Straumann~U. \idxquad{35} 308 | \item Struczinski~W. \idxquad{2} 309 | \indexspace 310 | \item Taylor~R.E. \idxquad{36,26} 311 | \item Tchernyshov~V. \idxquad{23} 312 | \item Thiebaux~C. \idxquad{27} 313 | \item Thompson~G. \idxquad{20} 314 | \item Tru\"ol~P. \idxquad{35} 315 | \item Turnau~J. \idxquad{6} 316 | \indexspace 317 | \item Urban~L. \idxquad{25} 318 | \item Usik~A. \idxquad{24} 319 | \indexspace 320 | \item Valkarova~A. \idxquad{30} 321 | \item Vall\'ee~C. \idxquad{28} 322 | \item Van Esch~P. \idxquad{4} 323 | \item Vartapetian~A. \idxquad{11} 324 | \item Vazdik~Y. 
\idxquad{24} 325 | \item Verrecchia~P. \idxquad{9} 326 | \item Vick~R. \idxquad{13} 327 | \item Vogel~E. \idxquad{1} 328 | \indexspace 329 | \item Wacker~K. \idxquad{8} 330 | \item Walther~A. \idxquad{8} 331 | \item Weber~G. \idxquad{13} 332 | \item Wegner~A. \idxquad{11} 333 | \item Wellisch~H.P. \idxquad{25} 334 | \item West~L.R. \idxquad{3} 335 | \item Willard~S. \idxquad{7} 336 | \item Winde~M. \idxquad{33} 337 | \item Winter~G.-G. \idxquad{11} 338 | \item Wolff~Th. \idxquad{34} 339 | \item Wright~A.E. \idxquad{22} 340 | \item Wulff~N. \idxquad{11} 341 | \indexspace 342 | \item Yiou~T.P. \idxquad{28} 343 | \indexspace 344 | \item \v{Z}\'a\v{c}ek~J. \idxquad{30} 345 | \item Zeitnitz~C. \idxquad{12} 346 | \item Ziaeepour~H. \idxquad{26} 347 | \item Zimmer~M. \idxquad{11} 348 | \item Zimmermann~W. \idxquad{11} 349 | \end{theindex} 350 | -------------------------------------------------------------------------------- /doc/notes/llncs2e.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/notes/llncs2e.zip -------------------------------------------------------------------------------- /doc/notes/llncsdoc.sty: -------------------------------------------------------------------------------- 1 | % This is LLNCSDOC.STY the modification of the 2 | % LLNCS class file for the documentation of 3 | % the class itself. 
4 | % 5 | \def\AmS{{\protect\usefont{OMS}{cmsy}{m}{n}% 6 | A\kern-.1667em\lower.5ex\hbox{M}\kern-.125emS}} 7 | \def\AmSTeX{{\protect\AmS-\protect\TeX}} 8 | % 9 | \def\ps@myheadings{\let\@mkboth\@gobbletwo 10 | \def\@oddhead{\hbox{}\hfil\small\rm\rightmark 11 | \qquad\thepage}% 12 | \def\@oddfoot{}\def\@evenhead{\small\rm\thepage\qquad 13 | \leftmark\hfil}% 14 | \def\@evenfoot{}\def\sectionmark##1{}\def\subsectionmark##1{}} 15 | \ps@myheadings 16 | % 17 | \setcounter{tocdepth}{2} 18 | % 19 | \renewcommand{\labelitemi}{--} 20 | \newenvironment{alpherate}% 21 | {\renewcommand{\labelenumi}{\alph{enumi})}\begin{enumerate}}% 22 | {\end{enumerate}\renewcommand{\labelenumi}{enumi}} 23 | % 24 | \def\bibauthoryear{\begingroup 25 | \def\thebibliography##1{\section*{References}% 26 | \small\list{}{\settowidth\labelwidth{}\leftmargin\parindent 27 | \itemindent=-\parindent 28 | \labelsep=\z@ 29 | \usecounter{enumi}}% 30 | \def\newblock{\hskip .11em plus .33em minus -.07em}% 31 | \sloppy 32 | \sfcode`\.=1000\relax}% 33 | \def\@cite##1{##1}% 34 | \def\@lbibitem[##1]##2{\item[]\if@filesw 35 | {\def\protect####1{\string ####1\space}\immediate 36 | \write\@auxout{\string\bibcite{##2}{##1}}}\fi\ignorespaces}% 37 | \begin{thebibliography}{} 38 | \bibitem[1982]{clar:eke3} Clarke, F., Ekeland, I.: Nonlinear 39 | oscillations and boundary-value problems for Hamiltonian systems. 40 | Arch. Rat. Mech. Anal. 
{\bf 78} (1982) 315--333 41 | \end{thebibliography} 42 | \endgroup} 43 | -------------------------------------------------------------------------------- /doc/notes/master.bib: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/saraswat/NeuralProgrammerAsProbProg/d2ec5dd7b8aeeb084937f110b1daf859a8f3fe02/doc/notes/master.bib -------------------------------------------------------------------------------- /doc/notes/nips15submit_e.sty: -------------------------------------------------------------------------------- 1 | %%%% NIPS Macros (LaTex) 2 | %%%% Style File 3 | %%%% Dec 12, 1990 Rev Aug 14, 1991; Sept, 1995; April, 1997; April, 1999 4 | 5 | % This file can be used with Latex2e whether running in main mode, or 6 | % 2.09 compatibility mode. 7 | % 8 | % If using main mode, you need to include the commands 9 | % \documentclass{article} 10 | % \usepackage{nips10submit_e,times} 11 | % as the first lines in your document. Or, if you do not have Times 12 | % Roman font available, you can just use 13 | % \documentclass{article} 14 | % \usepackage{nips10submit_e} 15 | % instead. 16 | % 17 | % If using 2.09 compatibility mode, you need to include the command 18 | % \documentstyle[nips10submit_09,times]{article} 19 | % as the first line in your document. Or, if you do not have Times 20 | % Roman font available, you can include the command 21 | % \documentstyle[nips10submit_09]{article} 22 | % instead. 23 | 24 | % Change the overall width of the page. If these parameters are 25 | % changed, they will require corresponding changes in the 26 | % maketitle section. 
27 | % 28 | \usepackage{eso-pic} % used by \AddToShipoutPicture 29 | 30 | \renewcommand{\topfraction}{0.95} % let figure take up nearly whole page 31 | \renewcommand{\textfraction}{0.05} % let figure take up nearly whole page 32 | 33 | % Define nipsfinal, set to true if nipsfinalcopy is defined 34 | \newif\ifnipsfinal 35 | \nipsfinalfalse 36 | \def\nipsfinalcopy{\nipsfinaltrue} 37 | \font\nipstenhv = phvb at 8pt % *** IF THIS FAILS, SEE nips10submit_e.sty *** 38 | 39 | % Specify the dimensions of each page 40 | 41 | \setlength{\paperheight}{11in} 42 | \setlength{\paperwidth}{8.5in} 43 | 44 | \oddsidemargin .5in % Note \oddsidemargin = \evensidemargin 45 | \evensidemargin .5in 46 | \marginparwidth 0.07 true in 47 | %\marginparwidth 0.75 true in 48 | %\topmargin 0 true pt % Nominal distance from top of page to top of 49 | %\topmargin 0.125in 50 | \topmargin -0.625in 51 | \addtolength{\headsep}{0.25in} 52 | \textheight 9.0 true in % Height of text (including footnotes & figures) 53 | \textwidth 5.5 true in % Width of text line. 54 | \widowpenalty=10000 55 | \clubpenalty=10000 56 | 57 | % \thispagestyle{empty} \pagestyle{empty} 58 | \flushbottom \sloppy 59 | 60 | % We're never going to need a table of contents, so just flush it to 61 | % save space --- suggested by drstrip@sandia-2 62 | \def\addcontentsline#1#2#3{} 63 | 64 | % Title stuff, taken from deproc. 
65 | \def\maketitle{\par 66 | \begingroup 67 | \def\thefootnote{\fnsymbol{footnote}} 68 | \def\@makefnmark{\hbox to 0pt{$^{\@thefnmark}$\hss}} % for perfect author 69 | % name centering 70 | % The footnote-mark was overlapping the footnote-text, 71 | % added the following to fix this problem (MK) 72 | \long\def\@makefntext##1{\parindent 1em\noindent 73 | \hbox to1.8em{\hss $\m@th ^{\@thefnmark}$}##1} 74 | \@maketitle \@thanks 75 | \endgroup 76 | \setcounter{footnote}{0} 77 | \let\maketitle\relax \let\@maketitle\relax 78 | \gdef\@thanks{}\gdef\@author{}\gdef\@title{}\let\thanks\relax} 79 | 80 | % The toptitlebar has been raised to top-justify the first page 81 | 82 | % Title (includes both anonimized and non-anonimized versions) 83 | \def\@maketitle{\vbox{\hsize\textwidth 84 | \linewidth\hsize \vskip 0.1in \toptitlebar \centering 85 | {\LARGE\bf \@title\par} \bottomtitlebar % \vskip 0.1in % minus 86 | \ifnipsfinal 87 | \def\And{\end{tabular}\hfil\linebreak[0]\hfil 88 | \begin{tabular}[t]{c}\bf\rule{\z@}{24pt}\ignorespaces}% 89 | \def\AND{\end{tabular}\hfil\linebreak[4]\hfil 90 | \begin{tabular}[t]{c}\bf\rule{\z@}{24pt}\ignorespaces}% 91 | \begin{tabular}[t]{c}\bf\rule{\z@}{24pt}\@author\end{tabular}% 92 | \else 93 | \begin{tabular}[t]{c}\bf\rule{\z@}{24pt} 94 | Anonymous Author(s) \\ 95 | Affiliation \\ 96 | Address \\ 97 | \texttt{email} \\ 98 | \end{tabular}% 99 | \fi 100 | \vskip 0.3in minus 0.1in}} 101 | 102 | \renewenvironment{abstract}{\vskip.075in\centerline{\large\bf 103 | Abstract}\vspace{0.5ex}\begin{quote}}{\par\end{quote}\vskip 1ex} 104 | 105 | % sections with less space 106 | \def\section{\@startsection {section}{1}{\z@}{-2.0ex plus 107 | -0.5ex minus -.2ex}{1.5ex plus 0.3ex 108 | minus0.2ex}{\large\bf\raggedright}} 109 | 110 | \def\subsection{\@startsection{subsection}{2}{\z@}{-1.8ex plus 111 | -0.5ex minus -.2ex}{0.8ex plus .2ex}{\normalsize\bf\raggedright}} 112 | \def\subsubsection{\@startsection{subsubsection}{3}{\z@}{-1.5ex 113 | plus -0.5ex minus 
-.2ex}{0.5ex plus 114 | .2ex}{\normalsize\bf\raggedright}} 115 | \def\paragraph{\@startsection{paragraph}{4}{\z@}{1.5ex plus 116 | 0.5ex minus .2ex}{-1em}{\normalsize\bf}} 117 | \def\subparagraph{\@startsection{subparagraph}{5}{\z@}{1.5ex plus 118 | 0.5ex minus .2ex}{-1em}{\normalsize\bf}} 119 | \def\subsubsubsection{\vskip 120 | 5pt{\noindent\normalsize\rm\raggedright}} 121 | 122 | 123 | % Footnotes 124 | \footnotesep 6.65pt % 125 | \skip\footins 9pt plus 4pt minus 2pt 126 | \def\footnoterule{\kern-3pt \hrule width 12pc \kern 2.6pt } 127 | \setcounter{footnote}{0} 128 | 129 | % Lists and paragraphs 130 | \parindent 0pt 131 | \topsep 4pt plus 1pt minus 2pt 132 | \partopsep 1pt plus 0.5pt minus 0.5pt 133 | \itemsep 2pt plus 1pt minus 0.5pt 134 | \parsep 2pt plus 1pt minus 0.5pt 135 | \parskip .5pc 136 | 137 | 138 | %\leftmargin2em 139 | \leftmargin3pc 140 | \leftmargini\leftmargin \leftmarginii 2em 141 | \leftmarginiii 1.5em \leftmarginiv 1.0em \leftmarginv .5em 142 | 143 | %\labelsep \labelsep 5pt 144 | 145 | \def\@listi{\leftmargin\leftmargini} 146 | \def\@listii{\leftmargin\leftmarginii 147 | \labelwidth\leftmarginii\advance\labelwidth-\labelsep 148 | \topsep 2pt plus 1pt minus 0.5pt 149 | \parsep 1pt plus 0.5pt minus 0.5pt 150 | \itemsep \parsep} 151 | \def\@listiii{\leftmargin\leftmarginiii 152 | \labelwidth\leftmarginiii\advance\labelwidth-\labelsep 153 | \topsep 1pt plus 0.5pt minus 0.5pt 154 | \parsep \z@ \partopsep 0.5pt plus 0pt minus 0.5pt 155 | \itemsep \topsep} 156 | \def\@listiv{\leftmargin\leftmarginiv 157 | \labelwidth\leftmarginiv\advance\labelwidth-\labelsep} 158 | \def\@listv{\leftmargin\leftmarginv 159 | \labelwidth\leftmarginv\advance\labelwidth-\labelsep} 160 | \def\@listvi{\leftmargin\leftmarginvi 161 | \labelwidth\leftmarginvi\advance\labelwidth-\labelsep} 162 | 163 | \abovedisplayskip 7pt plus2pt minus5pt% 164 | \belowdisplayskip \abovedisplayskip 165 | \abovedisplayshortskip 0pt plus3pt% 166 | \belowdisplayshortskip 4pt plus3pt minus3pt% 
167 | 168 | % Less leading in most fonts (due to the narrow columns) 169 | % The choices were between 1-pt and 1.5-pt leading 170 | %\def\@normalsize{\@setsize\normalsize{11pt}\xpt\@xpt} % got rid of @ (MK) 171 | \def\normalsize{\@setsize\normalsize{11pt}\xpt\@xpt} 172 | \def\small{\@setsize\small{10pt}\ixpt\@ixpt} 173 | \def\footnotesize{\@setsize\footnotesize{10pt}\ixpt\@ixpt} 174 | \def\scriptsize{\@setsize\scriptsize{8pt}\viipt\@viipt} 175 | \def\tiny{\@setsize\tiny{7pt}\vipt\@vipt} 176 | \def\large{\@setsize\large{14pt}\xiipt\@xiipt} 177 | \def\Large{\@setsize\Large{16pt}\xivpt\@xivpt} 178 | \def\LARGE{\@setsize\LARGE{20pt}\xviipt\@xviipt} 179 | \def\huge{\@setsize\huge{23pt}\xxpt\@xxpt} 180 | \def\Huge{\@setsize\Huge{28pt}\xxvpt\@xxvpt} 181 | 182 | \def\toptitlebar{\hrule height4pt\vskip .25in\vskip-\parskip} 183 | 184 | \def\bottomtitlebar{\vskip .29in\vskip-\parskip\hrule height1pt\vskip 185 | .09in} % 186 | %Reduced second vskip to compensate for adding the strut in \@author 187 | 188 | % Vertical Ruler 189 | % This code is, largely, from the CVPR 2010 conference style file 190 | % ----- define vruler 191 | \makeatletter 192 | \newbox\nipsrulerbox 193 | \newcount\nipsrulercount 194 | \newdimen\nipsruleroffset 195 | \newdimen\cv@lineheight 196 | \newdimen\cv@boxheight 197 | \newbox\cv@tmpbox 198 | \newcount\cv@refno 199 | \newcount\cv@tot 200 | % NUMBER with left flushed zeros \fillzeros[] 201 | \newcount\cv@tmpc@ \newcount\cv@tmpc 202 | \def\fillzeros[#1]#2{\cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi 203 | \cv@tmpc=1 % 204 | \loop\ifnum\cv@tmpc@<10 \else \divide\cv@tmpc@ by 10 \advance\cv@tmpc by 1 \fi 205 | \ifnum\cv@tmpc@=10\relax\cv@tmpc@=11\relax\fi \ifnum\cv@tmpc@>10 \repeat 206 | \ifnum#2<0\advance\cv@tmpc1\relax-\fi 207 | \loop\ifnum\cv@tmpc<#1\relax0\advance\cv@tmpc1\relax\fi \ifnum\cv@tmpc<#1 \repeat 208 | \cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi \relax\the\cv@tmpc@}% 209 | % \makevruler[][][][][] 210 | 
\def\makevruler[#1][#2][#3][#4][#5]{\begingroup\offinterlineskip 211 | \textheight=#5\vbadness=10000\vfuzz=120ex\overfullrule=0pt% 212 | \global\setbox\nipsrulerbox=\vbox to \textheight{% 213 | {\parskip=0pt\hfuzz=150em\cv@boxheight=\textheight 214 | \cv@lineheight=#1\global\nipsrulercount=#2% 215 | \cv@tot\cv@boxheight\divide\cv@tot\cv@lineheight\advance\cv@tot2% 216 | \cv@refno1\vskip-\cv@lineheight\vskip1ex% 217 | \loop\setbox\cv@tmpbox=\hbox to0cm{{\nipstenhv\hfil\fillzeros[#4]\nipsrulercount}}% 218 | \ht\cv@tmpbox\cv@lineheight\dp\cv@tmpbox0pt\box\cv@tmpbox\break 219 | \advance\cv@refno1\global\advance\nipsrulercount#3\relax 220 | \ifnum\cv@refno<\cv@tot\repeat}}\endgroup}% 221 | \makeatother 222 | % ----- end of vruler 223 | 224 | % \makevruler[][][][][] 225 | \def\nipsruler#1{\makevruler[12pt][#1][1][3][0.993\textheight]\usebox{\nipsrulerbox}} 226 | \AddToShipoutPicture{% 227 | \ifnipsfinal\else 228 | \nipsruleroffset=\textheight 229 | \advance\nipsruleroffset by -3.7pt 230 | \color[rgb]{.7,.7,.7} 231 | \AtTextUpperLeft{% 232 | \put(\LenToUnit{-35pt},\LenToUnit{-\nipsruleroffset}){%left ruler 233 | \nipsruler{\nipsrulercount}} 234 | } 235 | \fi 236 | } 237 | -------------------------------------------------------------------------------- /doc/notes/notes-2.tex: -------------------------------------------------------------------------------- 1 | % This is LLNCS.DOC the documentation file of 2 | % the LaTeX2e class from Springer-Verlag 3 | % for Lecture Notes in Computer Science, version 2.4 4 | \documentclass{article} % For LaTeX2e 5 | \usepackage{nips15submit_e,times} 6 | \usepackage{hyperref} 7 | \usepackage{url} 8 | \usepackage[dvips]{graphicx} 9 | \usepackage{xcolor} 10 | %\usepackage{url} 11 | \usepackage{colortbl} 12 | \usepackage{multirow} 13 | 14 | 15 | %\usepackage{amssymb} 16 | %\newtheorem{definition}{Definition} % [section] 17 | %\newtheorem{example}{Example} % [section] 18 | \newcommand{\pivot}[1]{\mathbin{\, {#1} \,}} 19 |
\newcommand{\Pivot}[1]{\mathbin{\; {#1} \;}} 20 | \newcommand{\Var}[0]{\mbox{\texttt{Var}}} 21 | \let\from=\leftarrow 22 | 23 | \newcommand{\keywords}[1]{\par\addvspace\baselineskip 24 | \noindent\keywordname\enspace\ignorespaces#1} 25 | \input{commands.tex} 26 | 27 | \begin{document} 28 | %\bibliographystyle{acmtrans} 29 | 30 | \long\def\comment#1{} 31 | \def\mtimes{} 32 | \def\LL#1{#1} 33 | \title{Notes on ProPPR \\ 34 | {\small (DRAFT v0.02)}} 35 | \author{ 36 | Vijay Saraswat \\ 37 | IBM T.J. Watson Research Center\\ 38 | 1101 Kitchawan Road\\ 39 | Yorktown Heights, NY 10598 \\ 40 | \texttt{vijay@saraswat.org} \\ 41 | (March 2017) 42 | %\And 43 | %Radha Jagadeesan \\ 44 | %De Paul University \\ 45 | %243 S. Wabash Avenue \\ 46 | %Chicago, IL 60604 \\ 47 | %\texttt{rjagadeesan@cs.depaul.edu} 48 | } 49 | 50 | \newcommand{\fix}{\marginpar{FIX}} 51 | \newcommand{\new}{\marginpar{NEW}} 52 | 53 | \nipsfinalcopy % Uncomment for camera-ready version 54 | 55 | \maketitle 56 | 57 | \begin{abstract} 58 | Probabilistic logics offer a powerful framework for approximate representation and reasoning, key to working with domain-specific information, once one moves beyond surface-level extraction of meaning from natural language texts. 59 | We review a recently proposed framework for probabilistic logic programming, ProPPR \cite{Cohen-2015}, which is intended to leverage ideas from approximate Personalized PageRank algorithms.
60 | %\keywords{machine learning,logic programming,PageRank,CCP} 61 | \end{abstract} 62 | 63 | \def\Or{\vee} 64 | \def\And{\wedge} 65 | \def\Arrow{\rightarrow} 66 | \def\Xor{\;\mbox{xor}\;} 67 | \def\Ind{\;\mbox{Ind}} 68 | \def\pr{\mbox{\em pr}} 69 | \def\apr{\mbox{apr}} 70 | \def\APR{\mbox{ApproxPageRank}} 71 | \def\push{\mbox{\em push}} 72 | \def\vol{\mbox{\em vol}} 73 | \def\Supp{\mbox{\em Supp}} 74 | \def\True{\mbox{\tt true}} 75 | \def\Fail{\mbox{\tt false}} 76 | \def\var{\mbox{\em var}} 77 | \def\tuple#1{\langle#1\rangle} 78 | 79 | \section{Introduction} 80 | \cite{Cohen-2015} is an innovative attempt to combine machine learning ideas drawn from the literature on PageRank with the proof theoretic underpinnings of definite clause constraint logic programming (constrained SLD-resolution). This note is my attempt at understanding the ideas and laying out the underlying mathematical background clearly. 81 | 82 | The basic idea is to view probabilistic SLD-resolution as a kind of graph traversal, and to use ideas behind PageRank \cite{PageRank} to speed up traversal. 83 | 84 | \section{PageRank} 85 | First we cover some basics about PageRank, following \cite{Andersen-2006,Andersen-2008}. We are interested in applying these ideas to constraint-SLD graphs, which we will develop in the next section. Crucially, transitions in such graphs are associated with probabilities, hence we wish to consider the setting of directed, weighted graphs, unlike \cite{Andersen-2006}. 86 | 87 | Let $G=(V,E)$ be a directed graph with vertex set $V$ (of size $n$) and edge set $E$ (of size $m$). Let $d(v)$ be the out-degree of vertex $v\in V$. A {\em distribution} over $V$ is a non-negative vector over $V$. By $\vec{k}$ we will mean the $n$-vector that takes on the value $k$ everywhere, and by ${\bf 1}_v$ we will mean the vector that is $0$ everywhere except at $v$ where it is $1$.
The {\em 1-norm} of a distribution $d$ is written $|d|_1$. The {\em support} of a distribution $p$, $\Supp(p)$, is $\{v \alt p(v) \not= 0\}$. The {\em volume} of a subset $S \subseteq V$, $\vol(S)$, is the sum of the degrees of the vertices in $S$. 88 | 89 | A {\em Markov chain} $M$ over $G$ associates with each vertex a probability distribution over its outgoing edges. Specifically, $M$ is an $n \times n$ matrix with non-negative elements, whose rows sum to $1$ and whose non-zero elements $M_{ij}$ (for $i,j\in 1\ldots n$) are exactly (the transition probabilities of) the edges $(i,j)\in E$. Markov chains over $G$ are our subject of interest. 90 | 91 | \begin{definition} For a Markov chain $M$, the PageRank vector $\pr_M(\alpha, s)$ is the unique solution of the linear system 92 | \begin{equation} 93 | \pr_M(\alpha, s) = \alpha s + (1 - \alpha)\pr_M(\alpha, s)M 94 | \end{equation} 95 | \end{definition} 96 | (Note that this definition is in line with \cite{Andersen-2008}, but generalizes \cite{Andersen-2006} where it is given in an unweighted, undirected setting for the specific matrix $M=W=1/2(I+ D^{-1}A)$, where $A$ is the adjacency matrix and $D$ is the diagonal degree matrix.) 97 | 98 | \begin{proposition}[Linearity of $\pr_M$]\label{Prop:R} 99 | For any $\alpha \in [0, 1)$ there is a linear transformation $R_{\alpha}$ s.t.
$\pr_M(\alpha, s) = s R_{\alpha}$, where 100 | 101 | $$ R_{\alpha} = \alpha \Sigma_{t=0}^{\infty} (1-\alpha)^t M^t = \alpha I + (1-\alpha)M R_{\alpha}$$ 102 | \end{proposition} 103 | 104 | \begin{proposition} 105 | \begin{equation}\label{Equation-key} 106 | \pr_M (\alpha,s)=\alpha s + (1 -\alpha)\pr_M(\alpha, sM) 107 | \end{equation} 108 | \end{proposition} 109 | The proof is based on Proposition~\ref{Prop:R}: 110 | $$ 111 | \begin{array}{llll} 112 | \pr_M(\alpha,s) &=& sR_{\alpha} \\ 113 | &=& \alpha s + (1-\alpha)s M R_{\alpha} & \mbox{(Proposition~\ref{Prop:R})}\\ 114 | &=& \alpha s + (1-\alpha)\pr_M(\alpha,s M) & \mbox{(Proposition~\ref{Prop:R})} 115 | \end{array} 116 | $$ 117 | 118 | \label{sec:pr-nibble} 119 | The following (``PageRank Nibble'') algorithm is taken from \cite{Andersen-2006} with a slight variation (working with a Markov chain rather than an adjacency matrix). The key insight is to work with a pair of distributions $(p,r)$ satisfying: 120 | \begin{equation}\label{Equation-apr} 121 | p + \pr_M(\alpha,r) = \pr_M(\alpha,s) 122 | \end{equation} 123 | We shall think of $p$ as an {\em approximate} PageRank vector, approximating $\pr_M(\alpha,s)$ (from below) with {\em residual} vector $r$. Below we will use the notation $\apr_M(\alpha,s,r)$ to stand for a vector $p$ in the relationship given by Equation~\ref{Equation-apr}. We are looking for an iterative algorithm that will let us approximate $\pr_M(\alpha,s)$ as closely as we want. 
Specifically, we would like to get the probability mass in $p$, $|p|_1$, as close to $1$ as we want.\footnote{In this, our interest is slightly different from \cite{Andersen-2006}, which focuses on application to low conductance partitions.} 124 | 125 | We now introduce the operation $\push_u(p,r)$, generalizing \cite[Section 3]{Andersen-2006} to the setting of Markov chains (and correcting some typos in \cite[Table 2]{Cohen-2015}): 126 | \begin{enumerate} 127 | \item Let $p'=p$ and $r'=r$ except for the following changes: 128 | \begin{enumerate} 129 | \item $p'(u)=p(u) + \alpha r(u)$ 130 | \item $r'(u) = (1-\alpha)r(u) M_{uu}$ 131 | \item For each $v\not=u$ s.t. $(u,v)\in E$: $r'(v)=r(v)+ (1-\alpha)r(u)M_{uv}$ 132 | \end{enumerate} 133 | \item Return $(p',r')$. 134 | \end{enumerate} 135 | \noindent It simulates one step of a random walk, at $u$, irrevocably moving some probability mass to $u$.\footnote{The definition given in \cite[Table 2]{Cohen-2015} differs in the update to $r$. However, we are not able to establish its correctness; indeed key lemmas below do not hold for that definition. We cannot relate the comment ``$\alpha'$ is a lower-bound on $\mbox{Pr}(v_0|u)$ for any node $u$ to be added to the graph $\hat{G}$'' to the code -- given the term $(\mbox{Pr}(v|u)-\alpha'){\bf r}[u]$ in the update to ${\bf r}[v]$ perhaps ``$\alpha'$ is a lower-bound on $\mbox{Pr}(v|u)$'' was intended. But, in any case, we cannot see the need for this factor; in our code the corresponding factor is $(1-\alpha)r(u)M_{uv}$. We also cannot establish the subsequent assertion ``it can be shown that after each push ${\bf p}+{\bf r} = \mbox{\bf ppr}(v_0)$''; indeed we believe it is incorrect; the correct assertion is ${\bf p}+\mbox{\bf ppr}(\alpha,r) = \mbox{\bf ppr}(\alpha,v_0)$.} 136 | 137 | The key lemma satisfied by this definition is: 138 | \begin{lemma} 139 | Let $p',r'$ be the result of the operation $\push_u(p,r)$. Then $p'+\pr_M(\alpha,r')=p+\pr_M(\alpha,r) (=\pr_M(\alpha,s))$.
140 | \end{lemma} 141 | To prove this, following \cite[Appendix]{Andersen-2006}, note that after a $\push_u(p,r)$ operation the following is true: 142 | \[ 143 | \begin{array}{l} 144 | p' = p + \alpha r(u) {\bf 1}_u\\ 145 | r' = r - r(u) {\bf 1}_u + (1-\alpha)r(u){\bf 1}_uM 146 | \end{array} 147 | \] 148 | Now: 149 | $$ 150 | \begin{array}{llll} 151 | p+\pr_M(\alpha,r) &=& p+\pr_M(\alpha, r- r(u){\bf 1}_u) + \pr_M(\alpha, r(u){\bf 1}_u) & \mbox{(Linearity)}\\ 152 | &=& p+\pr_M(\alpha, r- r(u){\bf 1}_u) + \alpha r(u){\bf 1}_u + (1-\alpha)\pr_M(\alpha, r(u){\bf 1}_u M) & \mbox{(\ref{Equation-key})}\\ 153 | &=& (p+ \alpha r(u){\bf 1}_u) + \pr_M(\alpha, r- r(u){\bf 1}_u + (1-\alpha)r(u){\bf 1}_u M) & \mbox{(Linearity)}\\ 154 | &=& p'+\pr_M(\alpha,r') 155 | \end{array} 156 | $$ 157 | 158 | Some simple calculations establish: 159 | \begin{lemma}\label{Lemma:ProbMassConserve} 160 | Let $p',r'$ be the result of the operation $\push_u(p,r)$. Then $|p'|_1 + |r'|_1 = |p|_1 + |r|_1$. 161 | \end{lemma} 162 | 163 | Now define the $\APR(v,\alpha,\epsilon)$ algorithm as follows: 164 | \begin{enumerate} 165 | \item Let $p=\vec{0}$ and $r={\bf 1}_v$. 166 | \item While there exists a vertex $u\in V: r(u) \geq \epsilon d(u)$, apply $\push_u(p,r)$. 167 | \item Return $p$. 168 | \end{enumerate} 169 | The value returned is an $\apr_M(\alpha,{\bf 1}_v, r)$ s.t. for all $u \in V, r(u)< \epsilon d(u)$. Hence we get an upper bound of $\epsilon m$ on $|r|_1$ (by summing over all vertices) at the end of the program. 170 | 171 | The main results are as follows, with proofs as in \cite[Appendix]{Andersen-2006}. 172 | 173 | \begin{lemma}(\cite[Lemma 2]{Andersen-2006}) Let $T$ be the total number of push operations performed by \APR, and let $d_i$ be the degree of the vertex $u$ used in the $i$'th push. Then $\Sigma_{i=1}^T d_i \leq 1/\epsilon \alpha$.
174 | \end{lemma} 175 | 176 | \begin{theorem}\label{theorem:main} 177 | $\APR(v,\alpha,\epsilon)$ runs in time $O(1/\epsilon \alpha)$, and computes an approximate PageRank vector $p=\apr_M(\alpha, {\bf 1}_v,r)$ s.t. $\max(1 - \epsilon m, \alpha \epsilon \Sigma_{i=1}^T d_i) < |p|_1 \leq 1 $. 178 | \end{theorem} 179 | 180 | We note in passing that the termination condition for the $\APR(v,\alpha,\epsilon)$ algorithm could be changed (e.g.{} to $r(u) \geq d(u)^c\epsilon$, for some constant $c$) without affecting the correctness of the algorithm. 181 | 182 | 183 | 184 | \section{Constraint-SLD resolution}\label{sec:SLD} 185 | Though \cite{Cohen-2015} is presented for just definite clause programs, we shall follow the tradition of logic programming research and consider constraint logic programming, after \cite{Jaffar-1987}. This gives us significant generality and lets us avoid speaking of syntactic notions such as most general unifiers. Hence we assume an underlying constraint system $\cal C$ \cite{Saraswat-1992}, defined over a logical vocabulary. Atomic formulas in this vocabulary are called {\em constraints}. $\cal C$ specifies the notions of {\em consistency} of constraints and {\em entailment} between constraints. We assume for simplicity the existence of a vacuous constraint $\True$. 186 | 187 | We assume a fixed program $P$, consisting of a (finite) collection of (implicitly universally quantified) clauses $h \leftarrow c, b_1, \ldots, b_k$ (where $h, b_i$ are atomic formulas and $c$ is a constraint). Below, for a formula $\phi$ we will use the notation $\var(\phi)$ to refer to the set of variables in $\phi$. Given a set of variables $Z$ and a formula $\phi$, by $\delta Z\, \phi$ we will mean the formula $\exists V\,\phi$ where $V=\var(\phi)\setminus Z$. 188 | 189 | We assume given an initial {\em goal} $g$ (an atomic formula), with $Z=\var(g)$.
A {\em configuration} (or {\em state}) $s$ is a pair $\tuple{a_1, \ldots, a_n; c}$, with $n\geq 0$, $c$ a constraint, and goals $a_i$. The {\em variables} of the state are $\var(a_1\wedge\ldots\wedge a_n\wedge c)$. $s$ is said to be {\em successful} if $n=0$, {\em consistent} if $c$ is consistent and {\em failed} (or {\em inconsistent}) if $c$ is inconsistent. 190 | 191 | Two states $\tuple{a_1, \ldots, a_n; c}$ and $\tuple{b_1, \ldots, b_k; d}$ are equivalent if $\vdash \delta Z (a_1 \wedge \ldots \wedge a_n \wedge c) \Leftrightarrow \delta Z (b_1 \wedge \ldots \wedge b_k \wedge d)$ (where the $\vdash$ represents the entailment relation of the underlying logic, including the constraint entailment relation). Note that any two inconsistent states are equivalent, per this definition. 192 | 193 | We now consider the transitions between states. For simplicity, we shall confine ourselves to a fixed {\em selection rule}. Given the sequence of goals in a state, a selection rule chooses one of those goals for execution. Logically, any goal can be chosen (e.g.{} Prolog chooses the first goal). 194 | 195 | A clause is said to be {\em renamed apart} from a state if it has no variables in common with the state. If $g=p(s_1,\ldots,s_k)$ and $h=p(t_1,\ldots, t_k)$ are two atomic formulas with the same predicate $p$ and arity $k$, then $g=h$ stands for the collection of equalities $s_1=t_1, \ldots, s_k=t_k$. 196 | We say that a state 197 | $s=\tuple{a_1, \ldots, a_n; c}$ {\em can transition to} 198 | a state $\tuple{a_1, \ldots, a_{i-1}, b_1, \ldots, b_k, a_{i+1}, \ldots a_n; c, d, (a_i=h)}$ provided that (a)~$s$ is consistent, (b)~$a_i$ is chosen by the selection rule, (c)~there is a clause $C=h \leftarrow d, b_1,\ldots, b_k$ in $P$, renamed apart from $s$ s.t.{} $h$ and $a_i$ are atomic formulas with the same predicate name and arity. We say that $a_i$ is the {\em selected goal} (for the transition) and $C$ the {\em selected clause}. 
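For concreteness, the transition relation can be sketched for a purely propositional fragment. This is a sketch under simplifying assumptions of ours (the name successors and the clause representation are illustrative): constraints, consistency checks and renaming apart are elided, and the selection rule is fixed to leftmost, as in Prolog.

```python
def successors(state, program):
    """Enumerate the states a resolvent can transition to.

    Propositional sketch of the transition relation: `state` is a tuple of
    goal names, `program` a list of (head, body) clauses with `body` a tuple
    of goal names. The constraint store, condition (a), and renaming apart
    are elided; we keep (b) leftmost selection and (c) head match.
    """
    if not state:        # successful state: no goals left, no transitions
        return []
    selected, rest = state[0], state[1:]
    return [tuple(body) + rest
            for head, body in program if head == selected]
```

With the toy program `[("p", ("q", "r")), ("p", ()), ("q", ())]`, the state `("p", "s")` has exactly the two successors `("q", "r", "s")` and `("s",)`, one per clause whose head matches the selected goal.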
199 | 200 | Note that the current state will have at most $k$ states it can transition to, if the program has $k$ clauses with the predicate and arity of the selected goal.\footnote{Note that in theory a clause has an infinite number of variants that are renamed apart from a given state. It can be shown that only one of them needs to be considered for selection; the results for all other choices can be obtained by merely renaming the results for this choice.} It will have fewer than $k$ if resulting states are equivalent. Of course, not all resulting configurations may be consistent. Finally, note that a state may transition to itself.\footnote{Consider for instance a program with a clause $p(X)\leftarrow p(X)$, and configuration $\tuple{p(U);\True}$, with $Z=\emptyset$. This configuration is equivalent to the one obtained after transition, $\tuple{p(X);\True}$.} 201 | 202 | Constraint-SLD resolution starts with a state $\tuple{g;\True}$ and transitions to successive states, until a state is reached which is successful or failed. 203 | 204 | We are interested in {\em stochastic logic programs}. For the purposes of this note, these are programs that supply with each transition a {\em probability} for the transition (a non-negative number bounded by $1$) in such a way that the sum of the probabilities across all transitions from a given state is $1$. The probabilities may depend on the current state. In stochastic logic programs as described in \cite{Muggleton-1996} the probability is a number directly associated with the clause (and independent of the state). \cite{Cohen-2015} describes a more elaborate setting: a clause specifies ``features'' (dependent on the current state) which are combined with a (learnt) matrix of weights to produce the probability. For the purposes of this note we shall not be concerned with the specific mechanism. 
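One generic way such a feature-based mechanism can work is sketched below. The function name, the feature encoding, and the softmax normalisation are our assumptions for illustration (not ProPPR's exact squashing function): score each applicable clause by the dot product of its state-dependent features with learnt weights, then normalise so that the transition probabilities sum to $1$.

```python
import math

def transition_probs(clause_features, weights):
    """Map per-clause feature vectors to transition probabilities.

    `clause_features` is one dict {feature_name: value} per applicable
    clause; `weights` is the learnt {feature_name: weight} map. Scores are
    dot products, normalised with a softmax so they sum to 1.
    """
    scores = [sum(weights.get(f, 0.0) * v for f, v in feats.items())
              for feats in clause_features]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```

A clause with higher-weighted features then receives proportionally more of the transition probability mass, and the distribution varies with the state because the features do.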
205 | 206 | Note that multiple transitions from a state $s$ (each using a different clause) may lead to the same (equivalent) state $t$. In such cases we consider that there is only one transition $s \rightarrow t$, and its associated probability is the sum of the probabilities across all clauses contributing to the transition. 207 | 208 | In general, we will only be concerned with goals that have a finite derivation graph. This can be guaranteed by placing restrictions on the expressiveness of programs (e.g. by requiring that programs satisfy the Datalog condition), but we shall not make such further requirements. 209 | 210 | \subsection{Learning weights via training, using probabilistic constraint-SLD resolution} 211 | Top-down proof procedures for definite clauses, such as SLD, are readily adaptable to probabilistic proof calculations and are not susceptible to the ``grounding'' problem that plagues Markov Logic Networks \cite{domingos:srl07}. See e.g. \cite{Saraswat-2016-pcc} for an implementation. One simply implements a meta-interpreter which carries the probability mass generated on the current branch, multiplying the current value with the probability of the clause used to extend the proof by one step, discarding failed derivations, and summing up the results for different derivations of the same result. 212 | 213 | Training can be performed in a routine fashion. Given a proof tree for a particular query, the features used in each clause in the proof (and hence the weights used) are uniquely determined. A loss function similar to the one in \cite[Sec 3.3]{Cohen-2015} can be used to determine the gradient, and this can be propagated back to each weight used in the proof. Training for each query can be performed independently (the proof trees constructed in parallel); though, as usual for SGD, weights must be updated using the gradients from all examples (e.g.{} in a mini-batch).
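The meta-interpreter idea can be sketched in miniature for a ground, acyclic program. The representation and the function name are ours, and treating subproofs as independent is a simplifying assumption; real SLD with variables needs the bookkeeping described above.

```python
def success_prob(goal, rules):
    """Sum, over all derivations of `goal`, of the product of clause
    probabilities along each derivation.

    `rules` maps a goal to a list of (clause_probability, body) pairs, with
    `body` a tuple of subgoals. Deterministic ("impure") clauses simply carry
    probability 1.0 and so do not change the branch's mass. A goal with no
    clauses fails: its branches contribute 0, i.e. failed derivations are
    discarded. Assumes a ground, acyclic program so the recursion terminates.
    """
    total = 0.0
    for clause_prob, body in rules.get(goal, []):
        branch = clause_prob
        for subgoal in body:
            branch *= success_prob(subgoal, rules)
        total += branch
    return total
```

For instance, with clauses $s \leftarrow a$ (probability $0.6$) and $s \leftarrow b, c$ (probability $0.4$), where $a$ succeeds with mass $0.5$, $b$ is a deterministic fact, and $c$ succeeds with mass $0.3$, the two derivations contribute $0.6\cdot 0.5 + 0.4\cdot 1.0\cdot 0.3 = 0.42$.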
214 | 215 | Note that the procedure described here naturally takes care of ``impure'' programs, that is, programs some of whose predicates have non-probabilistic clauses. This is the case in most real-world programs -- certain parts of the program (e.g.{} those that deal with operations on data-structures) are usually deterministic. While finding a probabilistic SLD-refutation, steps that do not involve a probabilistic goal do not contribute to updating the probability associated with the branch. 216 | 217 | In passing, we note that it is advisable to adopt an ``Andorra''-style execution strategy which favors the selection of non-probabilistic predicates for execution, as long as there are any in the current resolvent. Among probabilistic goals, those with the least degree may be preferred. Most (constraint) logic programming implementations do not canonicalize and record generated states because of the book-keeping expense. Whether that is useful in the current setting should be determined empirically. 218 | \section{Applying PageRank to constraint-SLD graphs} 219 | We consider now the application of PageRank to constraint-SLD graphs, as described in \cite{Cohen-2015} (but modified per Sections~\ref{sec:pr-nibble} and~\ref{sec:SLD}) and discuss its features. 220 | 221 | \cite[Sec 3.2]{Cohen-2015} proposes to use the PageRank-Nibble algorithm (Section~\ref{sec:pr-nibble}) to generate a constraint-SLD graph, per the following procedure (\cite[Table 4]{Cohen-2015} with some typos corrected): 222 | \begin{enumerate} 223 | \item Let $p = \APR(\tuple{Q;\True},\alpha,\epsilon)$. 224 | \item Let $S=\{ u : p(u) > \epsilon, u=\tuple{;c}\}$. 225 | \item Let $Z=\Sigma_{u\in S} p(u)$. 226 | \item For every solution $u=\tuple{;c}$ define $\Pr(u)=(1/Z) p(u)$. 227 | \end{enumerate} 228 | Running \APR{} will produce an SLD-graph with $O(1/\alpha\epsilon)$ nodes; this is independent of the size of the program (and its included database).
The algorithm will also work with loops in the graph (as might be generated, for example, by recursive programs). 229 | 230 | Unfortunately, the procedure is oblivious to the logical interpretation of the graph. In particular there is no guarantee that on termination the graph will contain {\em any} proof of the query described by the initial node $v$. Worse, even if the graph contains a successful terminal node (corresponding to a proof), if it does not contain nodes corresponding to {\em all} the proofs, the estimate of probability (computed in the last line of the algorithm above) could be arbitrarily off, as we now analyze. 231 | 232 | 233 | Consider a solution $u\in S$ computed by the PageRank-Nibble algorithm, with probability estimate $p(u)$. We consider now numerous factors that bear on the relationship between $p(u)$ and $\Pr(u)$, the true probability of $u$. 234 | 235 | First, note that (unlike the claim in \cite{Cohen-2015}\footnote{\cite[Sec 2.2]{Cohen-2015} ``Specifically, following their proof technique, it can be shown that after each push, ${\bf p} + {\bf r}={\bf ppr}(v_0)$. It is also clear that when PageRank-Nibble terminates, then for any $u$, the error ${\bf ppr}(v_0)[u]-{\bf p}[u]$ is bounded by $\epsilon |N(u)|$ \ldots''}) the error on $p(u)$ is {\em not} bounded by $\epsilon d(u)$. While $r(u) < \epsilon d(u)$, the error $\pr_M(\alpha,{\bf 1}_v)(u)- p(u)$ is $\pr_M(\alpha,r)(u)$, not $r(u)$. The only other conclusion that can be made is on the lower bound for $|p|_1$ (of which $p(u)$ is a summand). 236 | 237 | Second, note that the probability of a final solution is the sum of the probabilities of all paths to that solution, divided by the sum of probabilities of all solutions. In particular the probability mass $|p|_1 - Z$ is allocated to either failed nodes or intermediate nodes. Ultimately, all of this must be divided up among successful nodes.
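The size of the possible gap can be seen with a few lines of arithmetic. The numbers below are the ones this note uses as an example ($p(u)=0.001$, $Z=0.1$, $p(\Fail)=0$); the two extremes correspond to all of the residual mass eventually reaching $u$, versus all of it reaching the other solutions.

```python
# Numeric sketch of how far the reported estimate p(u)/Z can be from the true
# probability Pr(u), using the example values from the text.
p_u, Z, p_fail = 0.001, 0.1, 0.0

estimate = p_u / Z                                    # what PageRank-Nibble reports
# Extreme case: all residual mass eventually flows to u itself.
true_best = (p_u + (1 - (Z + p_fail))) / (1 - p_fail)
# Extreme case: all residual mass flows to the other solutions.
true_worst = p_u / (1 - p_fail)
```

Here `estimate` is 0.01 while the true value can be anywhere between 0.001 and 0.901, a spread of almost three orders of magnitude around the reported number.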
238 | 239 | Let $\Fail$ stand for the node corresponding to the inconsistent state, with probability estimate $p(\Fail)$. The residual unallocated probability mass is then $1-(Z+p(\Fail))$. Some of it may go to failed nodes; the rest must go to successful nodes. Consider several possibilities: 240 | \begin{enumerate} 241 | \item All the probability mass goes to $\Fail$. In this case, $\Pr(u)=p(u)/Z$, as estimated above. 242 | \item All the probability mass goes to other solutions. In this case, $\Pr(u)=p(u)/(1-p(\Fail))$. 243 | \item All the probability mass goes to $u$. In this case, 244 | $\Pr(u)=(p(u) + (1-(Z+p(\Fail))))/(1-p(\Fail)) = 1-(Z-p(u))/(1-p(\Fail))$. 245 | \end{enumerate} 246 | As we can see, the actual value may be a factor of $Z$ off (second case, $p(\Fail)=0$, $\Pr(u)=p(u)$). Numerically, $\Pr(u)$ may be close to $1$, even though $p(u)/Z$ is extremely small (e.g.{} $p(u)=0.001, Z=0.1, p(\Fail)=0$ gives an estimate of $0.01$ for a true value of $0.901$). 247 | 248 | Thus $p(u)/Z$ cannot be used as a reliable predictor for $\Pr(u)$ unless $Z$ is close to $1$. But how much work has to be expended to get to a proof-graph for which $Z$ is close to $1$ ultimately depends on the logical structure of the program and may not be very strongly dependent on the particular control strategy used to develop the proof-graph. 249 | 250 | \paragraph{Conclusions.} 251 | We believe that a practical probabilistic logical system can be built utilizing the idea of per-clause features, and trainable weights to learn probability distributions. However, PageRank-based ideas may not be as useful as we thought. 252 | 253 | \paragraph{Acknowledgements.} 254 | Thanks to Radha Jagadeesan, Kyle Gao and Cristina Cornelio for discussions, and to William Cohen and Kathryn Mazaitis for responding to questions. The responsibility for conclusions drawn above remains mine...
255 | \bibliographystyle{alpha} 256 | \bibliography{master} 257 | 258 | 259 | 260 | \end{document} 261 | -------------------------------------------------------------------------------- /doc/notes/notes.tex: -------------------------------------------------------------------------------- 1 | % This is LLNCS.DOC the documentation file of 2 | % the LaTeX2e class from Springer-Verlag 3 | % for Lecture Notes in Computer Science, version 2.4 4 | \documentclass{article} % For LaTeX2e 5 | \usepackage{nips15submit_e,times} 6 | \usepackage{hyperref} 7 | \usepackage{url} 8 | \usepackage[dvips]{graphicx} 9 | \usepackage{xcolor} 10 | %\usepackage{url} 11 | \usepackage{colortbl} 12 | \usepackage{multirow} 13 | 14 | 15 | %\usepackage{amssymb} 16 | %\newtheorem{definition}{Definition} % [section] 17 | %\newtheorem{example}{Example} % [section] 18 | \newcommand{\pivot}[1]{\mathbin{\, {#1} \,}} 19 | \newcommand{\Pivot}[1]{\mathbin{\; {#1} \;}} 20 | \newcommand{\Var}[0]{\mbox{\texttt{Var}}} 21 | \let\from=\leftarrow 22 | 23 | \newcommand{\keywords}[1]{\par\addvspace\baselineskip 24 | \noindent\keywordname\enspace\ignorespaces#1} 25 | \input{commands.tex} 26 | 27 | \begin{document} 28 | %\bibliographystyle{acmtrans} 29 | 30 | \long\def\comment#1{} 31 | \def\mtimes{} 32 | \def\LL#1{#1} 33 | \title{Notes on ProPPR \\ 34 | {\small ***DRAFT v0.01 ***}} 35 | 36 | \author{ 37 | %Vijay Saraswat \\ 38 | %IBM T.J. Watson Research Center\\ 39 | %1101 Kitchawan Road\\ 40 | %Yorktown Heights, NY 10598 \\ 41 | %\texttt{vijay@saraswat.org} \\ 42 | %\And 43 | %Radha Jagadeesan \\ 44 | %De Paul University \\ 45 | %243 S.
Wabash Avenue \\ 46 | %Chicago, IL 60604 \\ 47 | %\texttt{rjagadeesan@cs.depaul.edu} 48 | } 49 | 50 | \newcommand{\fix}{\marginpar{FIX}} 51 | \newcommand{\new}{\marginpar{NEW}} 52 | 53 | \nipsfinalcopy % Uncomment for camera-ready version 54 | 55 | \maketitle 56 | 57 | \begin{abstract} 58 | 59 | %\keywords{machine learning, concurrent constraint programming} 60 | \end{abstract} 61 | 62 | \def\Or{\vee} 63 | \def\And{\wedge} 64 | \def\Arrow{\rightarrow} 65 | \def\Xor{\;\mbox{xor}\;} 66 | \def\Ind{\;\mbox{Ind}} 67 | \def\pr{\mbox{\em pr}} 68 | \def\apr{\mbox{apr}} 69 | \def\APR{\mbox{ApproxPageRank}} 70 | \def\push{\mbox{\em push}} 71 | \def\vol{\mbox{\em vol}} 72 | \def\Supp{\mbox{\em Supp}} 73 | \def\True{\mbox{\tt true}} 74 | \def\var{\mbox{\em var}} 75 | \def\tuple#1{\langle#1\rangle} 76 | 77 | \section{Introduction} 78 | \cite{Cohen-2015} is an innovative attempt to combine machine learning ideas drawn from the literature on PageRank with the proof theoretic underpinnings of definite clause constraint logic programming (constrained SLD-resolution). This note is my attempt at understanding the ideas and laying out the underlying mathematical background clearly. 79 | 80 | The basic idea is to view probabilistic SLD-resolution as a kind of graph traversal, and use ideas behind PageRank \cite{PageRank} to speed up traversal. 81 | 82 | \section{PageRank} 83 | First we cover some basics about PageRank, following \cite{Andersen-2006,Andersen-2008}. Our interest is to develop the basics in such a way that they can directly apply to constraint-SLD graphs, which we will develop in the next section. Crucially transitions in such graphs are associated with probabilities, hence we wish to consider the setting of directed, weighted graphs, unlike \cite{Andersen-2006}. 84 | 85 | Let $G=(V,E)$ be a directed graph with vertex set $V$ (of size $n$) and edge set $E$ (of size $m$). Let ${\bf 1}_v$ be the $n$-vector which takes on value $1$ at $v$ and is $0$ elsewhere. 
Let $d(v)$ be the out-degree of vertex $v\in V$. A {\em distribution} over $V$ is a non-negative vector over $V$. The {\em support} of a distribution $p$, $\Supp(p)$ is $\{v \alt p(v) \not= 0\}$. The {\em volume} of a subset $S \subseteq V$, $\vol(S)$ is the sum of the degrees of the vertices in $S$. 86 | 87 | Let $M$ be a Markov chain over $G$, i.e.{} an $n \times n$ matrix with non-negative elements, whose rows sum to $1$ and whose non-zero elements $M_{ij}$ (for $i,j\in 1\ldots n$) are exactly (the transition probabilities of) the edges $(i,j)\in E$. Markov chains over $G$ are our subject of interest. 88 | 89 | \begin{definition} For a Markov chain $M$, the PageRank vector $\pr_M(\alpha, s)$ is the unique solution of the linear system 90 | \begin{equation} 91 | \pr_M(\alpha, s) = \alpha s + (1 - \alpha)\pr_M(\alpha, s)M 92 | \end{equation} 93 | \end{definition} 94 | (Note that this definition is in line with \cite{Andersen-2008}, but generalizes \cite{Andersen-2006} where it is given in an unweighted, undirected setting for the specific matrix $M=W=1/2(I+ D^{-1}A)$, where $A$ is the adjacency matrix and $D$ is the diagonal degree matrix.) 95 | 96 | \begin{proposition}\label{Prop:R} 97 | For any $\alpha \in [0, 1)$ there is a linear transformation $R_{\alpha}$ s.t.
$\pr_M(\alpha, s) = s R_{\alpha}$, where 98 | 99 | $$ R_{\alpha} = \alpha \Sigma_{t=0}^{\infty} (1-\alpha)^t M^t = \alpha I + (1-\alpha)M R_{\alpha}$$ 100 | \end{proposition} 101 | 102 | \begin{proposition} 103 | \begin{equation}\label{Equation-key} 104 | \pr_M (\alpha,s)=\alpha s + (1 -\alpha)\pr_M(\alpha, sM) 105 | \end{equation} 106 | \end{proposition} 107 | The proof is elementary: 108 | $$ 109 | \begin{array}{llll} 110 | \pr_M(\alpha,s) &=& sR_{\alpha} \\ 111 | &=& \alpha s + (1-\alpha)s M R_{\alpha} & \mbox{(Proposition~\ref{Prop:R})}\\ 112 | &=& \alpha s + (1-\alpha)\pr_M(\alpha,s M) & \mbox{(Proposition~\ref{Prop:R})} 113 | \end{array} 114 | $$ 115 | 116 | The following (``PageRank Nibble'') algorithm is taken from \cite{Andersen-2006} with a slight variation (working with a Markov chain rather than an adjacency matrix). The key insight is to work with a pair of distributions $(p,r)$ satisfying: 117 | \begin{equation}\label{Equation-apr} 118 | p + \pr_M(\alpha,r) = \pr_M(\alpha,s) 119 | \end{equation} 120 | We shall think of $p$ as an {\em approximate} PageRank vector, approximating $\pr_M(\alpha,s)$ (from below) with {\em residual} vector $r$. Below we will use the notation $\apr_M(\alpha,s,r)$ to stand for a vector $p$ in the relationship given by Equation~\ref{Equation-apr}. 121 | 122 | We now introduce the operation $\push_u(p,r)$, generalizing \cite[Section 3]{Andersen-2006} to the setting of Markov chains (and correcting some typos in \cite[Table 2]{Cohen-2015}): 123 | \begin{enumerate} 124 | \item Let $p'=p$ and $r'=r$ except for the following changes: 125 | \begin{enumerate} 126 | \item $p'(u)=p(u) + \alpha r(u)$ 127 | \item $r'(u) = (1-\alpha)r(u) M_{uu}$ 128 | \item For each $v$ s.t. $(u,v)\in E$: $r'(v)=r(v)+ (1-\alpha)r(u)M_{uv}$ 129 | \end{enumerate} 130 | \item Return $(p',r')$. 131 | \end{enumerate} 132 | \noindent It simulates one step of a random walk, at $u$. 
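The $\push_u$ operation just defined (together with the \APR{} loop that follows in the text) can be prototyped directly. The sketch below is illustrative only: the 3-node row-stochastic chain `M` is made up, and $\pr_M(\alpha,s)$ is computed from its power-series characterization $\alpha \Sigma_t (1-\alpha)^t s M^t$ rather than by any clever method. It lets us check the invariant $p + \pr_M(\alpha,r) = \pr_M(\alpha,{\bf 1}_v)$ and the termination guarantee $r(u) < \epsilon d(u)$.

```python
# Pure-Python sketch of push_u and the ApproxPageRank loop, on a made-up chain.
ALPHA, EPS = 0.2, 1e-4
M = [[0.0, 0.5, 0.5],          # row-stochastic transition matrix; nonzero
     [1.0, 0.0, 0.0],          # entries are the edges of the graph
     [0.0, 1.0, 0.0]]
N = len(M)
d = [sum(1 for x in row if x > 0) for row in M]   # out-degrees

def mat_step(s):
    """Row vector s times M."""
    return [sum(s[i] * M[i][j] for i in range(N)) for j in range(N)]

def pr(s, iters=2000):
    """pr_M(ALPHA, s) via the series alpha * sum_t (1-ALPHA)^t s M^t."""
    out, cur, w = [0.0] * N, list(s), ALPHA
    for _ in range(iters):
        for j in range(N):
            out[j] += w * cur[j]
        cur, w = mat_step(cur), w * (1 - ALPHA)
    return out

def push(u, p, r):
    """One push_u(p, r): move alpha*r(u) into p, spread the rest along M."""
    p2, r2 = list(p), list(r)
    p2[u] += ALPHA * r[u]
    moved = (1 - ALPHA) * r[u]
    r2[u] = moved * M[u][u]                       # self-loop share (0 here)
    for v in range(N):
        if v != u and M[u][v] > 0:
            r2[v] += moved * M[u][v]
    return p2, r2

def apr(v):
    """ApproxPageRank(v, ALPHA, EPS): push while some r(u) >= EPS * d(u)."""
    p, r = [0.0] * N, [0.0] * N
    r[v] = 1.0
    while True:
        heavy = [u for u in range(N) if r[u] >= EPS * d[u]]
        if not heavy:
            return p, r
        p, r = push(heavy[0], p, r)
```

Each push removes $\alpha r(u)$ from $|r|_1$, so the loop terminates after at most $1/(\epsilon\alpha)$ pushes, matching the bound quoted below.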
133 | 134 | The key lemma satisfied by this definition is: 135 | \begin{lemma} 136 | Let $p',r'$ be the result of the operation $\push_u(p,r)$. Then $p'+\pr_M(\alpha,r')=p+\pr_M(\alpha,r) (=\pr_M(\alpha,s))$. 137 | \end{lemma} 138 | In proof, following \cite[Appendix]{Andersen-2006}, note that after a $\push_u(p,r)$ operation, the following is true: 139 | \[ 140 | \begin{array}{l} 141 | p' = p + \alpha r(u) {\bf 1}_u\\ 142 | r' = r - r(u) {\bf 1}_u + (1-\alpha)r(u){\bf 1}_uM 143 | \end{array} 144 | \] 145 | Now: 146 | $$ 147 | \begin{array}{llll} 148 | p+\pr_M(\alpha,r) &=& p+\pr_M(\alpha, r- r(u){\bf 1}_u) + \pr_M(\alpha, r(u){\bf 1}_u) & \mbox{(Linearity)}\\ 149 | &=& p+\pr_M(\alpha, r- r(u){\bf 1}_u) + \alpha r(u){\bf 1}_u + (1-\alpha)\pr_M(\alpha, r(u){\bf 1}_u M) & \mbox{(\ref{Equation-key})}\\ 150 | &=& (p+ \alpha r(u){\bf 1}_u) + \pr_M(\alpha, r- r(u){\bf 1}_u + (1-\alpha)r(u){\bf 1}_u M) & \mbox{(Linearity)}\\ 151 | &=& p'+\pr_M(\alpha,r') 152 | \end{array} 153 | $$ 154 | 155 | Now define the $\APR(v,\alpha,\epsilon)$ algorithm as follows: 156 | \begin{enumerate} 157 | \item Let $p=\vec{0}$ and $r={\bf 1}_v$. 158 | \item While there exists a vertex $u\in V: (r(u)/d(u)) \geq \epsilon$, apply $\push_u(p,r)$. 159 | \item Return $p$. 160 | \end{enumerate} 161 | The value returned is an $\apr(\alpha,{\bf 1}_v, r)$ s.t. for all $u \in V, r(u)< \epsilon d(u)$. 162 | 163 | The following are the main results, with proofs as in \cite[Appendix]{Andersen-2006}. 164 | 165 | \begin{lemma}(\cite[Lemma 2]{Andersen-2006}) Let $T$ be the total number of push operations performed by \APR, and let $d_i$ be the degree of the vertex $u$ used in the $i$'th push. Then $\Sigma_{i=1}^T d_i \leq (1/\epsilon \alpha)$. 166 | \end{lemma} 167 | 168 | \begin{theorem}(\cite[Theorem 1]{Andersen-2006}) 169 | $\APR(v,\alpha,\epsilon)$ runs in time $O(1/(\epsilon \alpha))$, and computes an approximate PageRank vector $p=\apr_M(\alpha, {\bf 1}_v,r)$ s.t.
$\vol(\Supp(p))\leq 1/(\epsilon\alpha)$ and for all $u\in V$, $r(u) < \epsilon d(u)$. 170 | \end{theorem} 171 | 172 | \section{Constrained SLD-resolution} 173 | Though \cite{Cohen-2015} is presented for just definite clause programs, we shall follow the tradition of logic programming research and consider constraint logic programming, after \cite{Jaffar-1987}. This gives us significant generality and lets us avoid speaking of syntactic notions such as most general unifiers. Hence we assume an underlying constraint system $\cal C$ \cite{Saraswat-1992}, defined over a logical vocabulary. Atomic formulas in this vocabulary are called {\em constraints}. $\cal C$ specifies the notions of {\em consistency} of constraints and {\em entailment} between constraints. We assume for simplicity the existence of a vacuous constraint $\True$. 174 | 175 | We assume a fixed program $P$, consisting of a (finite) collection of (implicitly universally quantified) clauses $h \leftarrow c, b_1, \ldots, b_k$ (where $h, b_i$ are atomic formulas and $c$ is a constraint). Below, for a formula $\phi$ we will use the notation $\var(\phi)$ to refer to the set of variables in $\phi$. Given a set of variables $Z$ and a formula $\phi$ by $\delta Z\, \phi$ we will mean the formula $\exists V\,\phi$ where $V=\var(\phi)\setminus Z$. 176 | 177 | We assume given an initial {\em goal} $g$ (an atomic formula), with $Z=\var(g)$. A {\em configuration} (or {\em state}) $s$ is a pair $\tuple{a_1, \ldots, a_n; c}$, with $n\geq 0$, $c$ a constraint, and goals $a_i$. The {\em variables} of the state are $\var(a_1\wedge\ldots\wedge a_n\wedge c)$. $s$ is said to be {\em successful} if $n=0$, {\em consistent} if $c$ is consistent and {\em failed} (or {\em inconsistent}) if $c$ is inconsistent. 
178 | 179 | Two states $\tuple{a_1, \ldots, a_n; c}$ and $\tuple{b_1, \ldots, b_k; d}$ are equivalent if $\vdash \delta Z (a_1 \wedge \ldots \wedge a_n \wedge c) \Leftrightarrow \delta Z (b_1 \wedge \ldots \wedge b_k \wedge d)$ (where the $\vdash$ represents the entailment relation of the underlying logic, including the constraint entailment relation). Note that any two inconsistent states are equivalent, per this definition. 180 | 181 | We now consider the transitions between states. For simplicity, we shall confine ourselves to a fixed {\em selection rule}. Given the sequence of goals in a state, a selection rule chooses one of those goals for execution. Logically, any goal can be chosen (e.g.{} Prolog chooses the first goal). 182 | 183 | A clause is said to be {\em renamed apart} from a state if it has no variables in common with the state. If $g=p(s_1,\ldots,s_k)$ and $h=p(t_1,\ldots, t_k)$ are two atomic formulas with the same predicate $p$ and arity $k$, then $g=h$ stands for the collection of equalities $s_1=t_1, \ldots, s_k=t_k$. 184 | We say that a state 185 | $s=\tuple{a_1, \ldots, a_n; c}$ {\em can transition to} 186 | a state $\tuple{a_1, \ldots, a_{i-1}, b_1, \ldots, b_k, a_{i+1}, \ldots a_n; c, d, (a_i=h)}$ provided that (a)~$s$ is consistent, (b)~$a_i$ is chosen by the selection rule, (c)~there is a clause $C=h \leftarrow d, b_1,\ldots, b_k$ in $P$, renamed apart from $s$ s.t.{} $h$ and $a_i$ are atomic formulas with the same predicate name and arity. We say that $a_i$ is the {\em selected goal} (for the transition) and $C$ the {\em selected clause}. 187 | 188 | Note that the current state will have at most $k$ states it can transition to, if the program has $k$ clauses with the predicate and arity of the selected goal.\footnote{Note that in theory a clause has an infinite number of variants that are renamed apart from a given state. 
It can be shown that only one of them needs to be considered for selection; the results for all other choices can be obtained by merely renaming the results for this choice.} It will have fewer than $k$ if resulting states are equivalent. Of course, not all resulting configurations may be consistent. Finally, note that a state may transition to itself.\footnote{Consider for instance a program with a clause $p(X)\leftarrow p(X)$, and configuration $\tuple{p(U),\True}$, with $Z=\emptyset$. This configuration is equivalent to the one obtained after transition, $\tuple{p(X),\True}$.} 189 | 190 | Constrained SLD-resolution starts with a state $\tuple{g;\True}$ and transitions to successive states, until a state is reached which is successful or failed. 191 | 192 | We are interested in {\em stochastic logic programs}. For the purposes of this note, these are programs that supply with each transition a {\em probability} for the transition (a non-negative number bounded by $1$) in such a way that the sum of the probabilities across all transitions from this state is $1$. The probabilities may depend on the current state. In stochastic logic programs as described in \cite{Muggleton-1996} the probability is a number directly associated with the clause (and independent of the state). \cite{Cohen-2015} describes a more elaborate setting: a clause specifies ``features'' (dependent on the current state) which are combined with a (learnt) matrix of weights to produce the probability. For the purposes of this note we shall not be concerned with the specific mechanism. 193 | 194 | Note that multiple transitions from a state $s$ (each using a different clause) may lead to the same (equivalent) state $t$. In such cases we consider that there is only one transition $s \rightarrow t$, and its associated probability is the sum of the probabilities across all clauses contributing to the transition.
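Operationally, the merging convention in the last paragraph is just a sum over parallel edges. A tiny sketch, with made-up state names and clause probabilities:

```python
# Merge transitions s -> t produced by different clauses into a single edge
# whose probability is the sum. States and numbers are invented for illustration.
from collections import defaultdict

raw_transitions = [   # (source state, clause, target state, probability)
    ("s", "c1", "t", 0.3),
    ("s", "c2", "t", 0.2),   # different clause, same (equivalent) target
    ("s", "c3", "u", 0.5),
]

edges = defaultdict(float)
for src, _clause, tgt, prob in raw_transitions:
    edges[(src, tgt)] += prob
```

After merging, the outgoing probabilities from `s` still sum to 1, as required of the Markov chain over the derivation graph.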
195 | 196 | In general, we will only be concerned with goals that have a finite derivation graph. This can be guaranteed by placing restrictions on the expressiveness of programs (e.g. by requiring that programs satisfy the Datalog condition), but we shall not make such further requirements. 197 | 198 | \section{Applying PageRank to constraint-SLD graphs} 199 | 200 | \paragraph{Acknowledgements.} 201 | 202 | \bibliographystyle{alpha} 203 | \bibliography{master} 204 | 205 | 206 | 207 | \end{document} 208 | -------------------------------------------------------------------------------- /doc/notes/notes.txt: 1 | Sat May 28 16:12:38 EDT 2016 2 | 3 | We consider a variant of the presentation of Boolean formulas as real-valued multi-linear polynomials, where the underlying domain is taken to be {0,1} rather than {-1,1}. 4 | 5 | Domain: {0,1} 6 | true=1 7 | false=0 8 | not a = 1-a 9 | a and b = ab 10 | 11 | The definition of the other operators (or, xor etc) follows. 12 | Polynomials can be simplified with x.x=x. 13 | 14 | In particular: 15 | a or b = 1-(1-a)(1-b) 16 | a -> b = 1-a(1-b) 17 | a xor b = a(1-b) + (1-a)b = a+b-2ab 18 | 19 | Specifically we can write the indicator function, Ind_a(x) which, for vectors a and x returns 1 if a equals x, else 0. 20 | 21 | Ind_a(x) = 1 if a=x, 0 ow 22 | = ax + (1-a)(1-x) 23 | = ax + 1-a-x+ax 24 | = 1-(a+x-2ax) 25 | = 1-(a-x)^2 26 | 27 | Therefore Fourier expansion still holds. Any function f:{0,1}^n-> R can be represented as a multi-linear polynomial sum(a in {0,1}^n) f(a)*Ind_a(x). 28 | 29 | Let's work out and2: 30 | 31 | and2(x) = 32 | = 1.(1*x1+ 0.(1-x1))(x2*1+0*(1-x2)) 33 | + 0.Ind_(1,0) 34 | + 0.Ind_(0,1) 35 | + 0.Ind_(0,0) 36 | = x1.x2 37 | 38 | General representation of a clause: 39 | a1,..., an -> b1; ...; bk = 1-a1*...*an*(1-b1)*...*(1-bk) 40 | 41 | (Think: a1,..., an -> b1; ...; bk is the same as not((a1 and... an)and (not b1) and ... and (not bk)).)
42 | 43 | 44 | Specifically: 45 | a1, ..., an -> b = 1-a1*...*an*(1-b) -- definite clause 46 | a1, ..., an -> = 1-a1*...*an -- negative clause 47 | -> b = b -- unit clause 48 | 49 | 50 | Now we can see how unit resolution works. 51 | 52 | Given a=1, we can reduce 53 | 54 | 1-a*phi 55 | 56 | to 1-phi 57 | 58 | Given a=0, we can reduce 59 | 60 | 1 - (1-a)*phi 61 | 62 | to 1-phi 63 | 64 | More generally, one can think of definite clause programming thus. We are given a theory, corresponding to a collection of clauses, (1-phi1), ..., (1-phik), where each phii is a product of positive literals (of the form a, for some variable a) and exactly one negative literal (of the form 1-b for some variable b). We are only interested in valuations that are zeros of the phii (such valuations assign 1 to each of the clauses). 65 | 66 | Now we are given a monomial a1*...*ak (the "current resolvent") and we are trying to establish that for every such valuation it must evaluate to 1. We establish this by looking for a formula psi, such that if every such valuation assigns 1 to psi, then it must assign 1 to a1*...*ak, and keep repeating until we arrive at the formula true. This gives us a proof. 67 | 68 | How do we look for this formula? We look for a clause 1 - b1*...*bj*(1-ai), and replace a1*...*ak with a1*...*a(i-1)*b1*...*bj*a(i+1)*...*ak. 69 | 70 | Why is this sound? Because a valuation v that assigns 0 to b1*...*bj*(1-ai) and 1 to a1*...*a(i-1)*b1*...*bj*a(i+1)*...*ak can only do so by assigning 1 to b1,..., bj, 1 to ai, and 1 to each of a1,..., a(i-1), a(i+1),..., ak. 71 | 72 | 73 | -- Use in optimization. e.g. go back to constrained clustering, machine learning. 74 | 75 | -- convex optimization 76 | 77 | 78 | Applications in constraint programming, through the first order formulation from Radha.
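The {0,1}-encodings in these notes are easy to machine-check by truth table. The sketch below verifies not/and/or/implication, the indicator, and the general clause representation; the helper `clause` is introduced here for illustration and is not part of the notes.

```python
# Truth-table check of the {0,1}-valued multilinear encodings of the notes.
from itertools import product

for a, b in product((0, 1), repeat=2):
    assert (1 - a) == (not a)                     # not a      = 1-a
    assert a * b == (a and b)                     # a and b    = ab
    assert 1 - (1 - a) * (1 - b) == (a or b)      # a or b     = 1-(1-a)(1-b)
    assert 1 - a * (1 - b) == ((not a) or b)      # a -> b     = 1-a(1-b)
    assert a * b + (1 - a) * (1 - b) == (a == b)  # Ind (equality)

def clause(avals, bvals):
    """General clause a1,...,an -> b1;...;bk as 1 - a1*...*an*(1-b1)*...*(1-bk)."""
    body = 1
    for a in avals:
        body *= a
    for b in bvals:
        body *= (1 - b)
    return 1 - body

# A clause evaluates to 0 (false) exactly when every a is 1 and every b is 0.
assert clause([1, 1], [0, 0]) == 0
assert clause([1, 0], [0, 0]) == 1
assert clause([1, 1], [1, 0]) == 1
```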
79 | 80 | -------------------------------------------------------------------------------- /doc/notes/paper.tex: -------------------------------------------------------------------------------- 1 | % This is LLNCS.DOC the documentation file of 2 | % the LaTeX2e class from Springer-Verlag 3 | % for Lecture Notes in Computer Science, version 2.4 4 | \documentclass{llncs} 5 | \usepackage{llncsdoc} 6 | % 7 | %\usepackage{aopmath} 8 | \usepackage{graphicx} 9 | 10 | \usepackage{xcolor} 11 | \usepackage{amsmath} 12 | \usepackage{url} 13 | \usepackage{colortbl} 14 | \usepackage{multirow} 15 | 16 | 17 | \usepackage{amssymb} 18 | %\newtheorem{definition}{Definition} % [section] 19 | %\newtheorem{example}{Example} % [section] 20 | \newcommand{\pivot}[1]{\mathbin{\, {#1} \,}} 21 | \newcommand{\Pivot}[1]{\mathbin{\; {#1} \;}} 22 | \newcommand{\Var}[0]{\mbox{\texttt{Var}}} 23 | \let\from=\leftarrow 24 | 25 | \newcommand{\keywords}[1]{\par\addvspace\baselineskip 26 | \noindent\keywordname\enspace\ignorespaces#1} 27 | \def\code#1{\texttt{#1}} 28 | 29 | \begin{document} 30 | %\bibliographystyle{acmtrans} 31 | 32 | \long\def\comment#1{} 33 | 34 | \title{Differentiable Concurrent Constraint Programs} 35 | 36 | \author{{\sc Incomplete Draft}} 37 | \institute{} 38 | \maketitle 39 | 40 | \begin{abstract} 41 | 42 | \keywords{machine learning, concurrent constraint programming} 43 | \end{abstract} 44 | \section{Basic idea} 45 | 46 | The {\em supervised machine learning} approach is applicable in 47 | settings where the programmer is concerned with specifying a function 48 | that must work accurately even in the presence of significant amounts 49 | of noise in the input. For instance, suppose the programmer must write 50 | code that labels an arbitrary input image with a tag (``dog with 51 | hat'') that best characterizes the image.
The space of variations in 52 | the input is generally so vast and potentially so ill-understood 53 | mathematically that it may simply not be possible for the user to 54 | programmatically specify all the logic for the function. 55 | 56 | Instead we desire an approach whereby (portions of the) logic may be 57 | {\em learnt} by automatic techniques, given a (potentially very) large 58 | collection of observations (input/output pairs) of the given 59 | function. 60 | 61 | Concretely, we may describe the problem as follows. Instead of 62 | providing a fully realized, executable function $f$, the programmer 63 | specifies a space $\cal F$ of possible 64 | functions, obtained by varying the parameters $\theta$ of 65 | some parametric function $f_{\theta}$. Also 66 | available is the set of {\em observations} 67 | $(\bar{x}_i, \bar{y}_i)$ (for $i\in I$) specifying pairs of input values with their 68 | associated output values, and a {\em loss function} $L$ which can take 69 | two values $\bar{y},\bar{y}'$ and return a real value 70 | $L(\bar{y},\bar{y}')$ which measures how far one value is from the 71 | other. The machine learning problem is now to determine a specific 72 | value $\hat{\theta}$ for the parameters which minimizes 73 | $$\Sigma_{i \in I} L(f_\theta(\bar{x}_i), \bar{y}_i)$$ 74 | 75 | Many algorithmic techniques are available, under various conditions, 76 | to address this problem. For instance, in situations 77 | where $f_{\theta}$ represents a differentiable function of its 78 | parameters $\theta$, one may use stochastic gradient descent to find a 79 | minimum. Under other conditions, the Expectation Maximization algorithm 80 | may be used. 81 | 82 | A very powerful instance of this general picture is obtained with 83 | (feed forward) {\em 84 | deep neural networks}.
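As a concrete (toy) instance of the minimization problem just described, the sketch below fits the one-parameter family $f_\theta(x) = \theta x$ to three observations by gradient descent on the squared loss. The data and learning rate are invented for illustration.

```python
# Minimize sum_i L(f_theta(x_i), y_i) with f_theta(x) = theta * x and squared
# loss, by plain gradient descent. Observations and hyperparameters are made up.
obs = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # roughly y = 2x, with noise
theta, lr = 0.0, 0.01

def loss(theta):
    return sum((theta * x - y) ** 2 for x, y in obs)

for _ in range(500):
    grad = sum(2 * (theta * x - y) * x for x, y in obs)
    theta -= lr * grad
```

For this quadratic loss, gradient descent converges to the least-squares value $\theta^* = \Sigma x_i y_i / \Sigma x_i^2$.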
In this setting, the parametric function 85 | $f_{\theta}$ is given by $k$ layers of compute elements (called 86 | ``neurons''), an element in layer $i$ connected to some (possibly all) 87 | elements in the preceding layer. Each neuron is associated with a set 88 | of weights $W$ (its parameters) used to linearly sum its inputs. 89 | (There is also usually a bias value.) Some non-linear 90 | (but differentiable) thresholding function (e.g. sigmoid, $\lambda 91 | t.\,(1/(1+e^{-t}))$) is used to determine the output of this 92 | element. One can think of the entire network as specified by a 93 | $k$-nested functional term, involving summations and pointwise 94 | applications of the thresholding function. It has been shown that any 95 | arbitrary non-linear (real-valued) function can be approximated 96 | arbitrarily closely with sufficiently many parameters, and sufficient 97 | training input. 98 | 99 | What has made the machine learning problem of great interest in recent 100 | years is the availability of large amounts of labeled training data, 101 | and vast amounts of (CPU, GPU) computation that can be brought to bear 102 | to train the given function. A better understanding of how training 103 | algorithms such as asynchronous stochastic gradient descent can be 104 | made to work for deep networks has brought about a startling increase 105 | in accuracy in many applications, and a significant increase in the 106 | range of applications amenable to these techniques. 107 | 108 | \subsection{Learning concurrent constraint programs} 109 | A significant drawback of deep neural networks is their {\em 110 | opacity}. In general, it is not possible to give a meaningful answer 111 | to the question of {\em why} the learnt program does what it 112 | does. At hand is simply the value of the (potentially billions of) 113 | parameters, poor material from which to construct causally coherent 114 | explanations accessible to a human observer.
115 | 116 | Our intuition is that it will be easier to develop explanations in a 117 | context in which the machine learning paradigm is applied to a 118 | logic-based programming framework, leading to a natural combination of 119 | symbolic computation and numeric approximation. 120 | 121 | In this paper we show how the machine learning problem can be 122 | instantiated in the context of concurrent constraint programming. 123 | (Our main interest is going to be to develop this theory for timed 124 | CCP.) The major novelty of our approach is that, since it is based on 125 | CCP, it naturally permits the input to be only partially specified 126 | (rather than a fully formed image, for instance). 127 | 128 | Recall that all concurrent constraint programs $f$ describe 129 | a function (a closure operator) that takes an input constraint $c$ to 130 | an output $f(c)$. Our basic idea is that instead of specifying a 131 | fully executable cc program $f$, the programmer will supply a parametric version 132 | $f_{\theta}$. How do the parameters show up in the code? Clearly, we 133 | need to permit ask and tell constraints to contain parameters, 134 | e.g. $X + \alpha \geq Y$ ($\alpha$ is the parameter). 135 | 136 | Additionally, we need to extend the notion of a constraint system to 137 | admit a metric $d: {\cal C} \rightarrow {\cal C} \rightarrow {\cal 138 | R}$ which takes pairs of constraints to a real value. Intuitively, 139 | the metric measures how ``close'' the two constraints are, e.g.{} how 140 | many valuations do the constraints differ on. 141 | The metric must satisfy certain obvious properties, such as symmetry, 142 | equivalent constraints are $0$ apart from each other, and respect for 143 | constraint equivalence. 144 | \begin{enumerate} 145 | \item $d(c,c')=0$ if $c \vdash c', c' \vdash c$. 146 | \item $d(b,c)=d(c,b)$ 147 | \item $d(b,c)=d(b,c')$ if $c \vdash c', c' \vdash c$. 
148 | \item {\em triangle inequality?} 149 | \end{enumerate} 150 | 151 | {\em Now we need to develop the algorithms that will take as input (a) 152 | the parametrized program, (b) the observations, (c) the loss 153 | function and output the value of the parameters that minimizes the 154 | loss over the observation set.} 155 | 156 | Note that deep neural networks can be described as parametric CCP 157 | programs. 158 | 159 | \newpage 160 | 161 | \bibliographystyle{alpha} 162 | %\bibliography{../../biblio} 163 | 164 | \end{document} 165 | -------------------------------------------------------------------------------- /doc/notes/splncs.bst: -------------------------------------------------------------------------------- 1 | % BibTeX bibliography style `splncs' 2 | 3 | % An attempt to match the bibliography style required for use with 4 | % numbered references in Springer Verlag's "Lecture Notes in Computer 5 | % Science" series. (See Springer's documentation for llncs.sty for 6 | % more details of the suggested reference format.) Note that this 7 | % file will not work for author-year style citations. 8 | 9 | % Use \documentclass{llncs} and \bibliographystyle{splncs}, and cite 10 | % a reference with (e.g.) \cite{smith77} to get a "[1]" in the text. 11 | 12 | % Copyright (C) 1999 Jason Noble.
13 | % Last updated: Thursday 20 May 1999, 13:22:19 14 | % 15 | % Based on the BibTeX standard bibliography style `unsrt' 16 | 17 | ENTRY 18 | { address 19 | author 20 | booktitle 21 | chapter 22 | edition 23 | editor 24 | howpublished 25 | institution 26 | journal 27 | key 28 | month 29 | note 30 | number 31 | organization 32 | pages 33 | publisher 34 | school 35 | series 36 | title 37 | type 38 | volume 39 | year 40 | } 41 | {} 42 | { label } 43 | 44 | INTEGERS { output.state before.all mid.sentence after.sentence 45 | after.block after.authors between.elements} 46 | 47 | FUNCTION {init.state.consts} 48 | { #0 'before.all := 49 | #1 'mid.sentence := 50 | #2 'after.sentence := 51 | #3 'after.block := 52 | #4 'after.authors := 53 | #5 'between.elements := 54 | } 55 | 56 | STRINGS { s t } 57 | 58 | FUNCTION {output.nonnull} 59 | { 's := 60 | output.state mid.sentence = 61 | { " " * write$ } 62 | { output.state after.block = 63 | { add.period$ write$ 64 | newline$ 65 | "\newblock " write$ 66 | } 67 | { 68 | output.state after.authors = 69 | { ": " * write$ 70 | newline$ 71 | "\newblock " write$ 72 | } 73 | { output.state between.elements = 74 | { ", " * write$ } 75 | { output.state before.all = 76 | 'write$ 77 | { add.period$ " " * write$ } 78 | if$ 79 | } 80 | if$ 81 | } 82 | if$ 83 | } 84 | if$ 85 | mid.sentence 'output.state := 86 | } 87 | if$ 88 | s 89 | } 90 | 91 | FUNCTION {output} 92 | { duplicate$ empty$ 93 | 'pop$ 94 | 'output.nonnull 95 | if$ 96 | } 97 | 98 | FUNCTION {output.check} 99 | { 't := 100 | duplicate$ empty$ 101 | { pop$ "empty " t * " in " * cite$ * warning$ } 102 | 'output.nonnull 103 | if$ 104 | } 105 | 106 | FUNCTION {output.bibitem} 107 | { newline$ 108 | "\bibitem{" write$ 109 | cite$ write$ 110 | "}" write$ 111 | newline$ 112 | "" 113 | before.all 'output.state := 114 | } 115 | 116 | FUNCTION {fin.entry} 117 | { write$ 118 | newline$ 119 | } 120 | 121 | FUNCTION {new.block} 122 | { output.state before.all = 123 | 'skip$ 124 | { after.block 
'output.state := } 125 | if$ 126 | } 127 | 128 | FUNCTION {stupid.colon} 129 | { after.authors 'output.state := } 130 | 131 | FUNCTION {insert.comma} 132 | { output.state before.all = 133 | 'skip$ 134 | { between.elements 'output.state := } 135 | if$ 136 | } 137 | 138 | FUNCTION {new.sentence} 139 | { output.state after.block = 140 | 'skip$ 141 | { output.state before.all = 142 | 'skip$ 143 | { after.sentence 'output.state := } 144 | if$ 145 | } 146 | if$ 147 | } 148 | 149 | FUNCTION {not} 150 | { { #0 } 151 | { #1 } 152 | if$ 153 | } 154 | 155 | FUNCTION {and} 156 | { 'skip$ 157 | { pop$ #0 } 158 | if$ 159 | } 160 | 161 | FUNCTION {or} 162 | { { pop$ #1 } 163 | 'skip$ 164 | if$ 165 | } 166 | 167 | FUNCTION {new.block.checka} 168 | { empty$ 169 | 'skip$ 170 | 'new.block 171 | if$ 172 | } 173 | 174 | FUNCTION {new.block.checkb} 175 | { empty$ 176 | swap$ empty$ 177 | and 178 | 'skip$ 179 | 'new.block 180 | if$ 181 | } 182 | 183 | FUNCTION {new.sentence.checka} 184 | { empty$ 185 | 'skip$ 186 | 'new.sentence 187 | if$ 188 | } 189 | 190 | FUNCTION {new.sentence.checkb} 191 | { empty$ 192 | swap$ empty$ 193 | and 194 | 'skip$ 195 | 'new.sentence 196 | if$ 197 | } 198 | 199 | FUNCTION {field.or.null} 200 | { duplicate$ empty$ 201 | { pop$ "" } 202 | 'skip$ 203 | if$ 204 | } 205 | 206 | FUNCTION {emphasize} 207 | { duplicate$ empty$ 208 | { pop$ "" } 209 | { "" swap$ * "" * } 210 | if$ 211 | } 212 | 213 | FUNCTION {bold} 214 | { duplicate$ empty$ 215 | { pop$ "" } 216 | { "\textbf{" swap$ * "}" * } 217 | if$ 218 | } 219 | 220 | FUNCTION {parens} 221 | { duplicate$ empty$ 222 | { pop$ "" } 223 | { "(" swap$ * ")" * } 224 | if$ 225 | } 226 | 227 | INTEGERS { nameptr namesleft numnames } 228 | 229 | FUNCTION {format.springer.names} 230 | { 's := 231 | #1 'nameptr := 232 | s num.names$ 'numnames := 233 | numnames 'namesleft := 234 | { namesleft #0 > } 235 | { s nameptr "{vv~}{ll}{, jj}{, f{.}.}" format.name$ 't := 236 | nameptr #1 > 237 | { namesleft #1 > 238 | { ", " * t * 
} 239 | { numnames #1 > 240 | { ", " * } 241 | 'skip$ 242 | if$ 243 | t "others" = 244 | { " et~al." * } 245 | { "" * t * } 246 | if$ 247 | } 248 | if$ 249 | } 250 | 't 251 | if$ 252 | nameptr #1 + 'nameptr := 253 | namesleft #1 - 'namesleft := 254 | } 255 | while$ 256 | } 257 | 258 | FUNCTION {format.names} 259 | { 's := 260 | #1 'nameptr := 261 | s num.names$ 'numnames := 262 | numnames 'namesleft := 263 | { namesleft #0 > } 264 | { s nameptr "{vv~}{ll}{, jj}{, f.}" format.name$ 't := 265 | nameptr #1 > 266 | { namesleft #1 > 267 | { ", " * t * } 268 | { numnames #2 > 269 | { "," * } 270 | 'skip$ 271 | if$ 272 | t "others" = 273 | { " et~al." * } 274 | { " \& " * t * } 275 | if$ 276 | } 277 | if$ 278 | } 279 | 't 280 | if$ 281 | nameptr #1 + 'nameptr := 282 | namesleft #1 - 'namesleft := 283 | } 284 | while$ 285 | } 286 | 287 | FUNCTION {format.authors} 288 | { author empty$ 289 | { "" } 290 | { author format.springer.names } 291 | if$ 292 | } 293 | 294 | FUNCTION {format.editors} 295 | { editor empty$ 296 | { "" } 297 | { editor format.springer.names 298 | editor num.names$ #1 > 299 | { ", eds." * } 300 | { ", ed." 
* } 301 | if$ 302 | } 303 | if$ 304 | } 305 | 306 | FUNCTION {format.title} 307 | { title empty$ 308 | { "" } 309 | { title "t" change.case$ } 310 | if$ 311 | } 312 | 313 | FUNCTION {n.dashify} 314 | { 't := 315 | "" 316 | { t empty$ not } 317 | { t #1 #1 substring$ "-" = 318 | { t #1 #2 substring$ "--" = not 319 | { "--" * 320 | t #2 global.max$ substring$ 't := 321 | } 322 | { { t #1 #1 substring$ "-" = } 323 | { "-" * 324 | t #2 global.max$ substring$ 't := 325 | } 326 | while$ 327 | } 328 | if$ 329 | } 330 | { t #1 #1 substring$ * 331 | t #2 global.max$ substring$ 't := 332 | } 333 | if$ 334 | } 335 | while$ 336 | } 337 | 338 | FUNCTION {format.date} 339 | { year empty$ 340 | { "there's no year in " cite$ * warning$ } 341 | 'year 342 | if$ 343 | } 344 | 345 | FUNCTION {format.btitle} 346 | { title emphasize 347 | } 348 | 349 | FUNCTION {tie.or.space.connect} 350 | { duplicate$ text.length$ #3 < 351 | { "~" } 352 | { " " } 353 | if$ 354 | swap$ * * 355 | } 356 | 357 | FUNCTION {either.or.check} 358 | { empty$ 359 | 'pop$ 360 | { "can't use both " swap$ * " fields in " * cite$ * warning$ } 361 | if$ 362 | } 363 | 364 | FUNCTION {format.bvolume} 365 | { volume empty$ 366 | { "" } 367 | { "Volume" volume tie.or.space.connect 368 | series empty$ 369 | 'skip$ 370 | { " of " * series emphasize * } 371 | if$ 372 | add.period$ 373 | "volume and number" number either.or.check 374 | } 375 | if$ 376 | } 377 | 378 | FUNCTION {format.number.series} 379 | { volume empty$ 380 | { number empty$ 381 | { series field.or.null } 382 | { output.state mid.sentence = 383 | { "number" } 384 | { "Number" } 385 | if$ 386 | number tie.or.space.connect 387 | series empty$ 388 | { "there's a number but no series in " cite$ * warning$ } 389 | { " in " * series * } 390 | if$ 391 | } 392 | if$ 393 | } 394 | { "" } 395 | if$ 396 | } 397 | 398 | FUNCTION {format.edition} 399 | { edition empty$ 400 | { "" } 401 | { output.state mid.sentence = 402 | { edition "l" change.case$ " edn." 
* } 403 | { edition "t" change.case$ " edn." * } 404 | if$ 405 | } 406 | if$ 407 | } 408 | 409 | INTEGERS { multiresult } 410 | 411 | FUNCTION {multi.page.check} 412 | { 't := 413 | #0 'multiresult := 414 | { multiresult not 415 | t empty$ not 416 | and 417 | } 418 | { t #1 #1 substring$ 419 | duplicate$ "-" = 420 | swap$ duplicate$ "," = 421 | swap$ "+" = 422 | or or 423 | { #1 'multiresult := } 424 | { t #2 global.max$ substring$ 't := } 425 | if$ 426 | } 427 | while$ 428 | multiresult 429 | } 430 | 431 | FUNCTION {format.pages} 432 | { pages empty$ 433 | { "" } 434 | { pages multi.page.check 435 | { "" pages n.dashify tie.or.space.connect } 436 | { "" pages tie.or.space.connect } 437 | if$ 438 | } 439 | if$ 440 | } 441 | 442 | FUNCTION {format.vol} 443 | { volume bold 444 | } 445 | 446 | FUNCTION {pre.format.pages} 447 | { pages empty$ 448 | 'skip$ 449 | { duplicate$ empty$ 450 | { pop$ format.pages } 451 | { " " * pages n.dashify * } 452 | if$ 453 | } 454 | if$ 455 | } 456 | 457 | FUNCTION {format.chapter.pages} 458 | { chapter empty$ 459 | 'format.pages 460 | { type empty$ 461 | { "chapter" } 462 | { type "l" change.case$ } 463 | if$ 464 | chapter tie.or.space.connect 465 | pages empty$ 466 | 'skip$ 467 | { " " * format.pages * } 468 | if$ 469 | } 470 | if$ 471 | } 472 | 473 | FUNCTION {format.in.ed.booktitle} 474 | { booktitle empty$ 475 | { "" } 476 | { editor empty$ 477 | { "In: " booktitle emphasize * } 478 | { "In " format.editors * ": " * booktitle emphasize * } 479 | if$ 480 | } 481 | if$ 482 | } 483 | 484 | FUNCTION {empty.misc.check} 485 | { author empty$ title empty$ howpublished empty$ 486 | month empty$ year empty$ note empty$ 487 | and and and and and 488 | { "all relevant fields are empty in " cite$ * warning$ } 489 | 'skip$ 490 | if$ 491 | } 492 | 493 | FUNCTION {format.thesis.type} 494 | { type empty$ 495 | 'skip$ 496 | { pop$ 497 | type "t" change.case$ 498 | } 499 | if$ 500 | } 501 | 502 | FUNCTION {format.tr.number} 503 | { type empty$ 504 | 
{ "Technical Report" } 505 | 'type 506 | if$ 507 | number empty$ 508 | { "t" change.case$ } 509 | { number tie.or.space.connect } 510 | if$ 511 | } 512 | 513 | FUNCTION {format.article.crossref} 514 | { key empty$ 515 | { journal empty$ 516 | { "need key or journal for " cite$ * " to crossref " * crossref * 517 | warning$ 518 | "" 519 | } 520 | { "In {\em " journal * "\/}" * } 521 | if$ 522 | } 523 | { "In " key * } 524 | if$ 525 | " \cite{" * crossref * "}" * 526 | } 527 | 528 | FUNCTION {format.crossref.editor} 529 | { editor #1 "{vv~}{ll}" format.name$ 530 | editor num.names$ duplicate$ 531 | #2 > 532 | { pop$ " et~al." * } 533 | { #2 < 534 | 'skip$ 535 | { editor #2 "{ff }{vv }{ll}{ jj}" format.name$ "others" = 536 | { " et~al." * } 537 | { " and " * editor #2 "{vv~}{ll}" format.name$ * } 538 | if$ 539 | } 540 | if$ 541 | } 542 | if$ 543 | } 544 | 545 | FUNCTION {format.book.crossref} 546 | { volume empty$ 547 | { "empty volume in " cite$ * "'s crossref of " * crossref * warning$ 548 | "In " 549 | } 550 | { "Volume" volume tie.or.space.connect 551 | " of " * 552 | } 553 | if$ 554 | " \cite{" * crossref * "}" * 555 | } 556 | 557 | FUNCTION {format.incoll.inproc.crossref} 558 | { editor empty$ 559 | editor field.or.null author field.or.null = 560 | or 561 | { key empty$ 562 | { booktitle empty$ 563 | { "need editor, key, or booktitle for " cite$ * " to crossref " * 564 | crossref * warning$ 565 | "" 566 | } 567 | { "" } 568 | if$ 569 | } 570 | { "" } 571 | if$ 572 | } 573 | { "" } 574 | if$ 575 | " \cite{" * crossref * "}" * 576 | } 577 | 578 | FUNCTION {and.the.note} 579 | { note output 580 | note empty$ 581 | 'skip$ 582 | { add.period$ } 583 | if$ 584 | } 585 | 586 | FUNCTION {article} 587 | { output.bibitem 588 | format.authors "author" output.check 589 | stupid.colon 590 | format.title "title" output.check 591 | new.block 592 | crossref missing$ 593 | { journal emphasize "journal" output.check 594 | format.vol output 595 | format.date parens output 596 | 
format.pages output 597 | } 598 | { format.article.crossref output.nonnull 599 | format.pages output 600 | } 601 | if$ 602 | and.the.note 603 | fin.entry 604 | } 605 | 606 | FUNCTION {book} 607 | { output.bibitem 608 | author empty$ 609 | { format.editors "author and editor" output.check } 610 | { format.authors output.nonnull 611 | crossref missing$ 612 | { "author and editor" editor either.or.check } 613 | 'skip$ 614 | if$ 615 | } 616 | if$ 617 | stupid.colon 618 | format.btitle "title" output.check 619 | new.sentence 620 | crossref missing$ 621 | { format.edition output 622 | format.bvolume output 623 | new.block 624 | format.number.series output 625 | new.sentence 626 | publisher "publisher" output.check 627 | address empty$ 628 | 'skip$ 629 | { insert.comma } 630 | if$ 631 | address output 632 | format.date parens output 633 | } 634 | { format.book.crossref output.nonnull 635 | } 636 | if$ 637 | and.the.note 638 | fin.entry 639 | } 640 | 641 | FUNCTION {booklet} 642 | { output.bibitem 643 | format.authors output 644 | stupid.colon 645 | format.title "title" output.check 646 | howpublished address new.block.checkb 647 | howpublished output 648 | address empty$ 649 | 'skip$ 650 | { insert.comma } 651 | if$ 652 | address output 653 | format.date parens output 654 | and.the.note 655 | fin.entry 656 | } 657 | 658 | FUNCTION {inbook} 659 | { output.bibitem 660 | author empty$ 661 | { format.editors "author and editor" output.check } 662 | { format.authors output.nonnull 663 | crossref missing$ 664 | { "author and editor" editor either.or.check } 665 | 'skip$ 666 | if$ 667 | } 668 | if$ 669 | stupid.colon 670 | crossref missing$ 671 | { chapter output 672 | new.block 673 | format.number.series output 674 | new.sentence 675 | "In:" output 676 | format.btitle "title" output.check 677 | new.sentence 678 | format.edition output 679 | format.bvolume output 680 | publisher "publisher" output.check 681 | address empty$ 682 | 'skip$ 683 | { insert.comma } 684 | if$ 685 | 
address output 686 | format.date parens output 687 | } 688 | { chapter output 689 | new.block 690 | format.incoll.inproc.crossref output.nonnull 691 | } 692 | if$ 693 | format.pages output 694 | and.the.note 695 | fin.entry 696 | } 697 | 698 | FUNCTION {incollection} 699 | { output.bibitem 700 | format.authors "author" output.check 701 | stupid.colon 702 | format.title "title" output.check 703 | new.block 704 | crossref missing$ 705 | { format.in.ed.booktitle "booktitle" output.check 706 | new.sentence 707 | format.bvolume output 708 | format.number.series output 709 | new.block 710 | format.edition output 711 | publisher "publisher" output.check 712 | address empty$ 713 | 'skip$ 714 | { insert.comma } 715 | if$ 716 | address output 717 | format.date parens output 718 | format.pages output 719 | } 720 | { format.incoll.inproc.crossref output.nonnull 721 | format.chapter.pages output 722 | } 723 | if$ 724 | and.the.note 725 | fin.entry 726 | } 727 | 728 | FUNCTION {inproceedings} 729 | { output.bibitem 730 | format.authors "author" output.check 731 | stupid.colon 732 | format.title "title" output.check 733 | new.block 734 | crossref missing$ 735 | { format.in.ed.booktitle "booktitle" output.check 736 | new.sentence 737 | format.bvolume output 738 | format.number.series output 739 | address empty$ 740 | { organization publisher new.sentence.checkb 741 | organization empty$ 742 | 'skip$ 743 | { insert.comma } 744 | if$ 745 | organization output 746 | publisher empty$ 747 | 'skip$ 748 | { insert.comma } 749 | if$ 750 | publisher output 751 | format.date parens output 752 | } 753 | { insert.comma 754 | address output.nonnull 755 | organization empty$ 756 | 'skip$ 757 | { insert.comma } 758 | if$ 759 | organization output 760 | publisher empty$ 761 | 'skip$ 762 | { insert.comma } 763 | if$ 764 | publisher output 765 | format.date parens output 766 | } 767 | if$ 768 | } 769 | { format.incoll.inproc.crossref output.nonnull 770 | } 771 | if$ 772 | format.pages output 773 | 
and.the.note 774 | fin.entry 775 | } 776 | 777 | FUNCTION {conference} { inproceedings } 778 | 779 | FUNCTION {manual} 780 | { output.bibitem 781 | author empty$ 782 | { organization empty$ 783 | 'skip$ 784 | { organization output.nonnull 785 | address output 786 | } 787 | if$ 788 | } 789 | { format.authors output.nonnull } 790 | if$ 791 | stupid.colon 792 | format.btitle "title" output.check 793 | author empty$ 794 | { organization empty$ 795 | { address new.block.checka 796 | address output 797 | } 798 | 'skip$ 799 | if$ 800 | } 801 | { organization address new.block.checkb 802 | organization output 803 | address empty$ 804 | 'skip$ 805 | { insert.comma } 806 | if$ 807 | address output 808 | } 809 | if$ 810 | new.sentence 811 | format.edition output 812 | format.date parens output 813 | and.the.note 814 | fin.entry 815 | } 816 | 817 | FUNCTION {mastersthesis} 818 | { output.bibitem 819 | format.authors "author" output.check 820 | stupid.colon 821 | format.title "title" output.check 822 | new.block 823 | "Master's thesis" format.thesis.type output.nonnull 824 | school empty$ 825 | 'skip$ 826 | { insert.comma } 827 | if$ 828 | school "school" output.check 829 | address empty$ 830 | 'skip$ 831 | { insert.comma } 832 | if$ 833 | address output 834 | format.date parens output 835 | and.the.note 836 | fin.entry 837 | } 838 | 839 | FUNCTION {misc} 840 | { output.bibitem 841 | format.authors "author" output.check 842 | stupid.colon 843 | format.title "title" output.check 844 | howpublished new.block.checka 845 | howpublished output 846 | format.date parens output 847 | and.the.note 848 | fin.entry 849 | empty.misc.check 850 | } 851 | 852 | FUNCTION {phdthesis} 853 | { output.bibitem 854 | format.authors "author" output.check 855 | stupid.colon 856 | format.btitle "title" output.check 857 | new.block 858 | "PhD thesis" format.thesis.type output.nonnull 859 | school empty$ 860 | 'skip$ 861 | { insert.comma } 862 | if$ 863 | school "school" output.check 864 | address empty$ 
865 | 'skip$ 866 | { insert.comma } 867 | if$ 868 | address output 869 | format.date parens output 870 | and.the.note 871 | fin.entry 872 | } 873 | 874 | FUNCTION {proceedings} 875 | { output.bibitem 876 | editor empty$ 877 | { organization empty$ 878 | { "" } 879 | { organization output 880 | stupid.colon } 881 | if$ 882 | } 883 | { format.editors output.nonnull 884 | stupid.colon 885 | } 886 | if$ 887 | format.btitle "title" output.check 888 | new.block 889 | crossref missing$ 890 | { format.in.ed.booktitle "booktitle" output.check 891 | new.sentence 892 | format.bvolume output 893 | format.number.series output 894 | address empty$ 895 | { organization publisher new.sentence.checkb 896 | organization empty$ 897 | 'skip$ 898 | { insert.comma } 899 | if$ 900 | organization output 901 | publisher empty$ 902 | 'skip$ 903 | { insert.comma } 904 | if$ 905 | publisher output 906 | format.date parens output 907 | } 908 | { insert.comma 909 | address output.nonnull 910 | organization empty$ 911 | 'skip$ 912 | { insert.comma } 913 | if$ 914 | organization output 915 | publisher empty$ 916 | 'skip$ 917 | { insert.comma } 918 | if$ 919 | publisher output 920 | format.date parens output 921 | } 922 | if$ 923 | } 924 | { format.incoll.inproc.crossref output.nonnull 925 | } 926 | if$ 927 | and.the.note 928 | fin.entry 929 | } 930 | 931 | FUNCTION {techreport} 932 | { output.bibitem 933 | format.authors "author" output.check 934 | stupid.colon 935 | format.title "title" output.check 936 | new.block 937 | format.tr.number output.nonnull 938 | institution empty$ 939 | 'skip$ 940 | { insert.comma } 941 | if$ 942 | institution "institution" output.check 943 | address empty$ 944 | 'skip$ 945 | { insert.comma } 946 | if$ 947 | address output 948 | format.date parens output 949 | and.the.note 950 | fin.entry 951 | } 952 | 953 | FUNCTION {unpublished} 954 | { output.bibitem 955 | format.authors "author" output.check 956 | stupid.colon 957 | format.title "title" output.check 958 | 
new.block 959 | note "note" output.check 960 | format.date parens output 961 | fin.entry 962 | } 963 | 964 | FUNCTION {default.type} { misc } 965 | 966 | MACRO {jan} {"January"} 967 | 968 | MACRO {feb} {"February"} 969 | 970 | MACRO {mar} {"March"} 971 | 972 | MACRO {apr} {"April"} 973 | 974 | MACRO {may} {"May"} 975 | 976 | MACRO {jun} {"June"} 977 | 978 | MACRO {jul} {"July"} 979 | 980 | MACRO {aug} {"August"} 981 | 982 | MACRO {sep} {"September"} 983 | 984 | MACRO {oct} {"October"} 985 | 986 | MACRO {nov} {"November"} 987 | 988 | MACRO {dec} {"December"} 989 | 990 | MACRO {acmcs} {"ACM Computing Surveys"} 991 | 992 | MACRO {acta} {"Acta Informatica"} 993 | 994 | MACRO {cacm} {"Communications of the ACM"} 995 | 996 | MACRO {ibmjrd} {"IBM Journal of Research and Development"} 997 | 998 | MACRO {ibmsj} {"IBM Systems Journal"} 999 | 1000 | MACRO {ieeese} {"IEEE Transactions on Software Engineering"} 1001 | 1002 | MACRO {ieeetc} {"IEEE Transactions on Computers"} 1003 | 1004 | MACRO {ieeetcad} 1005 | {"IEEE Transactions on Computer-Aided Design of Integrated Circuits"} 1006 | 1007 | MACRO {ipl} {"Information Processing Letters"} 1008 | 1009 | MACRO {jacm} {"Journal of the ACM"} 1010 | 1011 | MACRO {jcss} {"Journal of Computer and System Sciences"} 1012 | 1013 | MACRO {scp} {"Science of Computer Programming"} 1014 | 1015 | MACRO {sicomp} {"SIAM Journal on Computing"} 1016 | 1017 | MACRO {tocs} {"ACM Transactions on Computer Systems"} 1018 | 1019 | MACRO {tods} {"ACM Transactions on Database Systems"} 1020 | 1021 | MACRO {tog} {"ACM Transactions on Graphics"} 1022 | 1023 | MACRO {toms} {"ACM Transactions on Mathematical Software"} 1024 | 1025 | MACRO {toois} {"ACM Transactions on Office Information Systems"} 1026 | 1027 | MACRO {toplas} {"ACM Transactions on Programming Languages and Systems"} 1028 | 1029 | MACRO {tcs} {"Theoretical Computer Science"} 1030 | 1031 | READ 1032 | 1033 | STRINGS { longest.label } 1034 | 1035 | INTEGERS { number.label longest.label.width } 
1036 | 1037 | FUNCTION {initialize.longest.label} 1038 | { "" 'longest.label := 1039 | #1 'number.label := 1040 | #0 'longest.label.width := 1041 | } 1042 | 1043 | FUNCTION {longest.label.pass} 1044 | { number.label int.to.str$ 'label := 1045 | number.label #1 + 'number.label := 1046 | label width$ longest.label.width > 1047 | { label 'longest.label := 1048 | label width$ 'longest.label.width := 1049 | } 1050 | 'skip$ 1051 | if$ 1052 | } 1053 | 1054 | EXECUTE {initialize.longest.label} 1055 | 1056 | ITERATE {longest.label.pass} 1057 | 1058 | FUNCTION {begin.bib} 1059 | { preamble$ empty$ 1060 | 'skip$ 1061 | { preamble$ write$ newline$ } 1062 | if$ 1063 | "\begin{thebibliography}{" longest.label * "}" * write$ newline$ 1064 | } 1065 | 1066 | EXECUTE {begin.bib} 1067 | 1068 | EXECUTE {init.state.consts} 1069 | 1070 | ITERATE {call.type$} 1071 | 1072 | FUNCTION {end.bib} 1073 | { newline$ 1074 | "\end{thebibliography}" write$ newline$ 1075 | } 1076 | 1077 | EXECUTE {end.bib} 1078 | 1079 | 1080 | 1081 | -------------------------------------------------------------------------------- /doc/notes/sprmindx.sty: -------------------------------------------------------------------------------- 1 | delim_0 "\\idxquad " 2 | delim_1 "\\idxquad " 3 | delim_2 "\\idxquad " 4 | delim_n ",\\," 5 | -------------------------------------------------------------------------------- /src/ProPPR-paper-comments.txt: -------------------------------------------------------------------------------- 1 | Sun Feb 26 15:20:11 EST 2017 2 | 3 | Overall, the writing could be significantly improved. Algorithmic and logical consequences are not made clear. 4 | 5 | (a) What is alpha' for reasonable programs? Looks like it should be 0! alpha' (for Page-Rank-Nibble) is suppo 6 | 7 | P3: provably-correct and proveably-correct in the same line! 8 | 9 | P3: Do not say what "linearized version of the proof space" means. 10 | P3: epsilon is "the worst case approximation error" -- not defined.
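Aside on the push(u, alpha') remarks in these notes: the undefined-`r[v]` issue raised below goes away if the residual vector defaults to 0, as the notes suggest. A minimal Python sketch of that reading (not ProPPR's actual code; `alpha` stands in for the paper's alpha', and edge weights are assumed uniform rather than the paper's phi(u -> v)):

```python
# Sketch of an approximate-personalized-PageRank push step. defaultdict
# makes r[v] "default" to 0, so r[v] = r[v] + ... is always well defined.
from collections import defaultdict

def push(u, alpha, p, r, neighbors):
    """Move alpha*r[u] of u's residual mass into p; spread the rest to u's neighbors."""
    mass = r[u]
    p[u] += alpha * mass
    for v in neighbors[u]:
        r[v] += (1 - alpha) * mass / len(neighbors[u])
    r[u] = 0.0

# Tiny example: all residual starts on v0 (node 0).
neighbors = {0: [1, 2], 1: [0], 2: [0]}
p, r = defaultdict(float), defaultdict(float)
r[0] = 1.0
push(0, 0.15, p, r, neighbors)
# each push conserves total mass: sum(p) + sum(r) stays 1
```

Each push moves an alpha fraction of the residual into the answer vector `p` and redistributes the remainder, which is why total mass is invariant.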
11 | 12 | P3: "inference .. in size independent of the size of the underlying database". What does this mean, i.e. how could it be so? 13 | 14 | P5 Table 2: 15 | |N(u)| used but not defined. Is this just the outdegree of u? 16 | phi(u -> v) used but not defined. 17 | 18 | P6 ppr(v0) not defined. 19 | 20 | Table 2: What does "p=r=0" mean? is (bold) 0 a vector? Note earlier message about defining a vector properly. 21 | 22 | r is clearly not "0", it is defined for v0, and maps it to 1. 23 | p is emptyset, not "0". 24 | 25 | In the definition of push(u, alpha'), r[v] on the RHS in r[v] = r[v] + ... doesn't make sense because it may be undefined. 26 | 27 | Presumably it should "default" to 0. 28 | 29 | -------------------------------------------------------------------------------- /src/example.pl: -------------------------------------------------------------------------------- 1 | /*Yeah let's try with a simple example like: 2 | S -> N:select_left VP:delete_left 3 | VP -> V:identity 4 | VP -> V:delete_right N:select_right 5 | V -> "swims", "plays", "eat" 6 | N -> "williams", "phelps", "tennis", "horses", "hay" 7 | 8 | Some example derivations would be 9 | (S athleteplayssport(phelps,swimming) 10 | (N phelps "phelps") 11 | (VP athleteplayssport(null,swimming) 12 | (V athleteplayssport(null,swimming) "swims"))) 13 | 14 | (S animaleatsfood(horses,hay) 15 | (N horses "horses") 16 | (VP animaleatsfood(null,hay) 17 | (V animaleatsfood(null,null) "eat") 18 | (N hay "hay"))) 19 | */ 20 | 21 | s(H-T, VP):- 22 | n(H-M, N), % N:select_left. 23 | vp(M-T, VP), % VP:delete_left -- why is delete_left needed? 24 | arg(1, VP, N). 25 | 26 | vp(H-T, V):- 27 | v(H-M, V), % V:delete_right -- why is delete_right needed? 28 | n(M-T, N), % N:select_right 29 | arg(2, V, N). 30 | vp(H-T, V):- v(H-T, V). 31 | 32 | n([tennis |X]-X, tennis). 33 | n([horses |X]-X, horses). 34 | n([hay |X]-X, hay). 35 | n([piano |X]-X, piano). 36 | n([mandolin|X]-X, mandolin). 37 | n([golf |X]-X, golf).
38 | n([cricket |X]-X, cricket). 39 | 40 | n(['Andre', 'Agassi'|X]-X, agassi). 41 | n(['Williams' |X]-X, williams). 42 | n(['Phelps' |X]-X, phelps). 43 | n(['Chopin' |X]-X, chopin). 44 | 45 | v([swims|X]-X, 'athlete plays sport'(_, swimming)). 46 | v([eats|X]-X, 'animal eats food'(_,_)). 47 | v([eat|X]-X, 'animal eats food'(_,_)). 48 | v([plays|X]-X, 'athlete plays _sport'(_,S)) :- sport(S). 49 | v([plays|X]-X, 'musician plays instrument'(_,I)):- instrument(I). 50 | 51 | instrument(piano). 52 | instrument(mandolin). 53 | 54 | sport(tennis). 55 | sport(golf). 56 | sport(cricket). -------------------------------------------------------------------------------- /src/my_eval.pl: -------------------------------------------------------------------------------- 1 | /* 2 | ColNames is of the form cols(C1, ..., CJ). 3 | Rows is of the form rows(row(V11,..., V1J), ..., row(VI1,...VIJ)). 4 | 5 | The subsequence of rows is represented as an increasing sequence of indices in 1..I, 6 | indices([i1, ..., ik]). Hence, they qualify as sets, and operations such as intersection 7 | can be used with them. (There is no particular need to keep them sorted 8 | now, but it seems this will be useful later.) 9 | 10 | */ 11 | % /*Comment for SWIPL 12 | /* for XSB 13 | :- auto_table. 14 | */ 15 | /* For XSB 16 | :- table form_rows/3 as intern. 17 | :- table form_values/3 as intern. 18 | :- table form_value/3 as intern. 19 | */ 20 | %/* For BP 21 | :- table form_values/3, form_rows/3, form_value/3. 22 | %*/ 23 | 24 | 25 | % Top-level entry points: 26 | my_top_s(D, S, N) :- 27 | time((table_1(T), gen_solutions(S, D, T), length(S, N))). 28 | 29 | gen_solutions(S, Depth, T):- 30 | setof(X-Form, (form_values(Form, Depth, T), eval(Form, T, val(X))), S). 31 | 32 | my_top_p(D, S, N) :- 33 | time((table_1(T), gen_programs(S, D, T), length(S, N))). 34 | 35 | gen_programs(S, Depth, T):- 36 | bagof(Form, (form_values(Form, Depth, T)), S).
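For readers less fluent in Prolog, the generate-and-evaluate loop in `gen_solutions/3` above can be sketched in Python. This is a hypothetical miniature: only two form constructors (`card`, `proj` over the `all` row selector) are modelled, and the table layout here is illustrative, not the `table/5` term used by the real code.

```python
# Miniature of gen_solutions/3: enumerate small logical forms, evaluate
# each against the table, and collect (Answer, Form) pairs.
table = {"cols": ["name", "points"], "rows": [["ada", 3], ["bob", 5]]}

def enumerate_forms(table):
    yield ("card", "all")               # card(all)
    for col in table["cols"]:
        yield ("proj", col, "all")      # proj(Col, all)

def eval_form(form, table):
    rows = range(len(table["rows"]))    # `all` selects every row index
    if form[0] == "card":
        return len(list(rows))
    if form[0] == "proj":
        j = table["cols"].index(form[1])
        return [table["rows"][i][j] for i in rows]

solutions = [(eval_form(f, table), f) for f in enumerate_forms(table)]
```

The real enumerator is additionally depth-bounded (`Depth`) and type-checked (`form_rows`/`form_values`/`form_value`), which prunes the search space before evaluation.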
37 | 38 | 39 | /* 40 | Type checking formulas 41 | formula(F, T) :- F is a legal formula, given table T 42 | */ 43 | form(Form, Table):- form(Form, _Depth, Table). 44 | form_values(Form, Table):- form_values(Form, _Depth, Table). 45 | 46 | /* 47 | form(Form, Depth, T):- 48 | Form is a logical form of depth at most Depth, with colnames and constants 49 | specified in Table T. 50 | 51 | Can be used generatively. 52 | */ 53 | 54 | form(Form, Depth, T):- form_values(Form, Depth, T). 55 | form(Form, Depth, T):- form_value(Form, Depth, T). 56 | form(Form, Depth, T):- form_rows(Form, Depth, T). 57 | 58 | /* 59 | Given input itable(ColNames, RowList), Types is the types of the columns, 60 | inferred by examining the data in Data, Numbers is the set of all numbers 61 | in Data, and Dates is the set of all dates in Data. 62 | */ 63 | processed_table(itable(ColNames, Data), table(ColNames, Types, Data, Numbers, Dates)):- 64 | get_types(Data, Types), 65 | get_dates(Data, Dates), 66 | get_nums(Data, Nums-[]), 67 | sort(Nums, Numbers). 68 | 69 | /* 70 | eval_top(Form, Table, Res):- evaluate Form with Table to produce Res, type-checking first. 71 | eval(Form, Table, Res):- evaluate Form with Table to produce Res. 72 | */ 73 | eval_top(Form, Table, Res):- once(form(Form, Table)), eval(Form, Table, Res). 74 | 75 | 76 | % Section Table. Table is of the form table(ColNames, ColTypes, RowList, Numbers, Dates). 77 | 78 | colNames(table(X, _, _, _, _), X). 79 | colTypes(table(_, X, _, _, _), X). 80 | rowList(table(_, _, X, _, _), X). 81 | numbers(table(_, _, _, X, _), X). 82 | dates(table(_, _, _, _, X), X). 83 | 84 | get_types([Row | Rows], Type):- get_type(Row, T), get_types(Rows, T, Type). 85 | 86 | get_types([], T, T). 87 | get_types([Row | Rows], T, Types):- get_type(Row, T1), merge_types(T1, T, T2), get_types(Rows, T2, Types). 88 | 89 | get_type(Row, Types):- Row=..[row|Args], get_type_1(Args, Types). 90 | get_type_1([], []). 
91 | get_type_1([A|As], [T|Ts]):- type_d(A,T), get_type_1(As, Ts). 92 | 93 | merge_types([T|Tr], [S|Sr], [U|Ur]):- merge_type(T, S, U), merge_types(Tr, Sr, Ur). 94 | merge_types([],[],[]). 95 | 96 | merge_type(X, Y, Z):- X == Y -> Z=X; Z=unk. 97 | 98 | get_dates(_L, []). % TODO: extract dates from tables. 99 | get_nums([], X-X). 100 | get_nums([Row | Rows], X-Z):- 101 | Row =..[_Functor | Args], 102 | nums_rows(Args, X-Y), 103 | get_nums(Rows, Y-Z). 104 | 105 | nums_rows([], X-X). 106 | nums_rows([A|R], T-Y):- (number(A) -> T=[A|X];T=X), nums_rows(R, X-Y). 107 | 108 | % Section logical forms 109 | form_col(Col, Type, Table):- colNames(Table, Cs), colTypes(Table, Ct), member(Col, Type, Cs, Ct). 110 | 111 | form_rows(all, D, _T):- D >= 1. 112 | form_rows(X, D, T):- dec(D, E), form_rows_1(X, E, T). 113 | %form_rows(both(R, S), D, T) :- dec(D, E), form_rows(R, E, T), form_rows(S, E, T). 114 | form_rows_1(either(R,S), E, T):- form_rows(R, E, T), form_rows(S, E, T). 115 | form_rows_1(ge(Col, V, Rows), E, T):- form_rows(Rows, E, T), form_value(V, E, T), are_nums(Col, V, T). 116 | form_rows_1(gt(Col, V, Rows), E, T):- form_rows(Rows, E, T), form_value(V, E, T), are_nums(Col, V, T). 117 | form_rows_1(le(Col, V, Rows), E, T):- form_rows(Rows, E, T), form_value(V, E, T), are_nums(Col, V, T). 118 | form_rows_1(lt(Col, V, Rows), E, T):- form_rows(Rows, E, T), form_value(V, E, T), are_nums(Col, V, T). 119 | form_rows_1(eq(Col, V, Rows), E, T):- form_rows(Rows, E, T), form_value(V, E, T), are_same(Col, V, T). 120 | form_rows_1(max(Col, Rows), E, T):- is_num(Col, T), form_rows(Rows, E, T). 121 | form_rows_1(min(Col, Rows), E, T):- is_num(Col, T), form_rows(Rows, E, T). 122 | form_rows_1(prev(Rows), E, T):- form_rows(Rows, E, T). 123 | form_rows_1(next(Rows), E, T):- form_rows(Rows, E, T). 124 | form_rows_1(first(Rows), E, T):- form_rows(Rows, E, T). 125 | form_rows_1(last(Rows), E, T):- form_rows(Rows, E, T). 126 | 127 | form_values(X, D, T) :- dec(D, E), form_values1(X, E, T). 
128 | form_values1(proj(Col, Rows), E, T):- form_col(Col, _, T), form_rows(Rows,E, T). 129 | form_values1(L+R, E, T):- form_values(L, E, T), form_values(R, E, T), are_nums_v(L, R, T). 130 | form_values1(L-R, E, T):- form_values(L, E, T), form_values(R, E, T), are_nums_v(L, R, T). 131 | form_values1(L/R, E, T):- form_values(L, E, T), form_values(R, E, T), is_num_v(L, T), is_num_v(R, T). 132 | form_values1(L*R, E, T):- form_values(L, E, T), form_values(R, E, T), are_nums_v(L, R, T). 133 | form_values1(card(Rows), E, T):- form_rows(Rows, E, T). 134 | 135 | form_value(X, ignore, _T) :- base_value(X). 136 | form_value(X, ignore, T) :- form_values(X, T). % maybe at runtime we will get a singleton value. 137 | form_value(X, D, T) :- D >= 1, numbers(T, Numbers), member(X, Numbers). 138 | 139 | base_value(X) :- atomic(X). 140 | base_value(Date) :- (functor(Date, date, N), (N==9; N==3)); functor(Date, time, 3). 141 | base_value([X | Xs]):- base_value(X), base_value(Xs). 142 | 143 | dec(D, E):- D > 1, E is D-1. 144 | 145 | %% type-checking 146 | num_type(int). 147 | num_type(float). 148 | are_same(Col, Val, Table):- are_same(Col, Val, Table, _Type). 149 | are_nums(Col, Val, Table):- are_same(Col, Val, Table, Type), num_type(Type). 150 | 151 | is_num(Col, Table):- form_col(Col, Type, Table), num_type(Type). 152 | is_num_v(Col, Table):- is_num_v(Col, Table, _Type). 153 | are_nums_v(L, R, Table):- is_num_v(L, Table, Type), is_num_v(R, Table, Type). 154 | 155 | are_same(Col, Val, Table, Type):- form_col(Col, Type, Table), type_v(Val, Type). 156 | is_num_v(Exp, Table, Type):- type_v(Exp, Table, Type), num_type(Type). 157 | 158 | % type of value. it can be a constant or a +, -, *, /, card or proj. 159 | type_v(L+_, Table, Type):- type_v(L, Table, Type). 160 | type_v(L-_, Table, Type):- type_v(L, Table, Type). 161 | type_v(L*_, Table, Type):- type_v(L, Table, Type). 162 | type_v(_/_, _Table, float). 163 | type_v(card(_), _Table, int). 
164 | type_v(proj(Col, _), Table, Type):- form_col(Col, Type, Table). 165 | type_v(Exp, Type):- type_d(Exp, Type). 166 | 167 | % type of datum 168 | type_d(A, int):- integer(A). 169 | type_d(A, float):- float(A). 170 | type_d(A, atom):- atom(A). 171 | %% evaluating forms: 172 | 173 | % Base case -- normalized indices, values. 174 | eval(indices(L), _Table, indices(L)). 175 | eval(val(X), _Table, val(X)). 176 | eval(X, _Table, val(X)):- X \== all, base_value(X). 177 | 178 | % 179 | eval(all, Table, indices(Res)):- all_rows(Table, Res). 180 | eval(either(R,S), Table, indices(Res)):- 181 | eval(R, Table, indices(IndR)), 182 | eval(S, Table, indices(IndS)), 183 | append(IndR, IndS, Ind), 184 | sort(Ind, Res). 185 | 186 | % probably both is not needed. One can do a CPS embedding of R in S. 187 | %eval(both(R,S), Table, indices(Res)):- 188 | % eval(R, Table, indices(IndR)), 189 | % eval(S, Table, indices(IndS)), 190 | % intersection(IndR, IndS, Res). 191 | 192 | eval(card(Rows), Table, val(R)) :- eval(Rows, Table, indices(Inds)), length(Inds, R). 193 | eval(first(Rows), Table, indices([Row])):- eval(Rows, Table, indices([Row|_])). 194 | eval(last(Rows), Table, indices([Row])):- eval(Rows, Table, indices(Ind)), last(Ind, Row). 195 | eval(next(Rows), Table, indices(Res)) :- eval(Rows, Table, indices(Inds)), next(Inds, Table, Res). 196 | eval(prev(Rows), Table, indices(Res)) :- eval(Rows, Table, indices(Inds)), prev(Inds, Res). 197 | 198 | eval(max(Col, Rows), Table, indices(Res)):- e_extrema(max, Col, Rows, Table, Res). 199 | eval(min(Col, Rows), Table, indices(Res)):- e_extrema(min, Col, Rows, Table, Res). 200 | 201 | eval(ge(ColName, Cell, Rows), Table, indices(Res)):- e_comp(ge, ColName, Cell, Rows, Table, Res). 202 | eval(gt(ColName, Cell, Rows), Table, indices(Res)):- e_comp(gt, ColName, Cell, Rows, Table, Res). 203 | eval(le(ColName, Cell, Rows), Table, indices(Res)):- e_comp(le, ColName, Cell, Rows, Table, Res). 
204 | eval(lt(ColName, Cell, Rows), Table, indices(Res)):- e_comp(lt, ColName, Cell, Rows, Table, Res).
205 | eval(eq(ColName, Cell, Rows), Table, indices(Res)):- e_comp(eq, ColName, Cell, Rows, Table, Res).
206 | 
207 | eval(proj(ColName, Rows), Table, val(Res)):-
208 |    eval(Rows, Table, indices(Inds)),
209 |    col(ColName, Table, J),
210 |    proj(J, Inds, Table, Res).
211 | 
212 | eval(V+W, Table, val(Res)):- e_valop(plus, V, W, Table, Res).
213 | eval(V*W, Table, val(Res)):- e_valop(mult, V, W, Table, Res).
214 | eval(V/W, Table, val(Res)):- e_valop(divide, V, W, Table, Res).
215 | eval(V-W, Table, val(Res)):- e_valop(minus, V, W, Table, Res).
216 | 
217 | e_extrema(Op, ColName, Rows, Table, Res):-
218 |    eval(Rows, Table, indices(Inds)),
219 |    col(ColName, Table, J),
220 |    extrema(Op, J, Inds, Table, Res).
221 | 
222 | e_valop(Op, L, R, Table, Res):-
223 |    eval(L, Table, val(L1)),
224 |    eval(R, Table, val(R1)),
225 |    valop(Op, L1, R1, Res).
226 | 
227 | e_comp(Op, ColName, CellForm, Rows, Table, Res):-
228 |    eval(CellForm, Table, Vals), single_value(Vals, Cell),
229 |    eval(Rows, Table, indices(Inds)),
230 |    col(ColName, Table, J),
231 |    rowList(Table, TRows),
232 |    comp_1(Op, J, Cell, Inds, TRows, Res).
233 | 
234 | % Support definitions
235 | single_value(val([X]), X).
236 | single_value(val(X), X):- atomic(X).
237 | 
238 | %row(I, table(_ColNames, Rows), Row):- arg(I, Rows, Row).
239 | col(ColName, Table, J) :- colNames(Table, ColNames), col(ColName, 1, ColNames, J).
240 | col(ColName, I, [ColName | _Rest], I).
241 | col(ColName, I, [C | Rest], J) :- ColName \== C, I1 is I+1, col(ColName, I1, Rest, J).
242 | 
243 | cell(I, J, TRows, Cell):- nth1(I, TRows, Row), arg(J, Row, Cell).
244 | 
245 | range(K, K, [K]).
246 | range(I, K, [I | Res]):- I < K, I1 is I+1, range(I1, K, Res).
247 | 
248 | all_rows(Table, ToK):- rowList(Table, TRows), length(TRows, K), once(range(1, K, ToK)).
249 | 
250 | next(Inds, Table, Res):- rowList(Table, TRows), length(TRows, K), next_1(Inds, K, Res).
251 | next_1([], _K, []).
252 | next_1([I|Inds], K, [I1|Res]):- I < K, I1 is I+1, next_1(Inds, K, Res).
253 | next_1([I|Inds], K, Res) :- I >= K, next_1(Inds, K, Res).
254 | 
255 | prev([], []).
256 | prev([I|Inds], [I1|Res]):- I > 1, I1 is I-1, prev(Inds, Res).
257 | prev([I|Inds], Res) :- I =< 1, prev(Inds, Res).
258 | 
259 | 
260 | comp_1(_Op, _J, _Cell, [], _TRows, []).
261 | comp_1(Op, J, Cell, [I | Inds], TRows, Res):-
262 |    cell(I, J, TRows, Cell1),
263 |    comp_cell(Op, Cell1, Cell, I, Res, Res1),
264 |    comp_1(Op, J, Cell, Inds, TRows, Res1).
265 | 
266 | comp_cell(ge, Cell1, Cell, I, [I|Res1], Res1):- number(Cell1), number(Cell), Cell1 >= Cell.
267 | comp_cell(gt, Cell1, Cell, I, [I|Res1], Res1):- number(Cell1), number(Cell), Cell1 > Cell.
268 | comp_cell(le, Cell1, Cell, I, [I|Res1], Res1):- number(Cell1), number(Cell), Cell1 =< Cell.
269 | comp_cell(lt, Cell1, Cell, I, [I|Res1], Res1):- number(Cell1), number(Cell), Cell1 < Cell.
270 | comp_cell(eq, Cell1, Cell, I, [I|Res1], Res1):- Cell1 == Cell.
271 | comp_cell(ge, Cell1, Cell, _I, Res, Res) :- number(Cell1), number(Cell), Cell1 < Cell.
272 | comp_cell(gt, Cell1, Cell, _I, Res, Res) :- number(Cell1), number(Cell), Cell1 =< Cell.
273 | comp_cell(le, Cell1, Cell, _I, Res, Res) :- number(Cell1), number(Cell), Cell1 > Cell.
274 | comp_cell(lt, Cell1, Cell, _I, Res, Res) :- number(Cell1), number(Cell), Cell1 >= Cell.
275 | comp_cell(eq, Cell1, Cell, _I, Res, Res) :- Cell1 \== Cell.
276 | 
277 | 
278 | extrema(Op, J, [I | Inds], Table, Res):-
279 |    rowList(Table, TRows),
280 |    cell(I, J, TRows, Cell),
281 |    extrema(Op, J, Inds, Cell-[I|Tail]-Tail, TRows, Res).
282 | 
283 | extrema_cell(max, Cell1, I, Cell-_Is-_Tail, Cell1-[I|NewTail]-NewTail) :-
284 |    number(Cell), number(Cell1), Cell1 > Cell.
285 | extrema_cell(max, Cell1, I, Cell-Is-[I|NewTail], Cell-Is-NewTail) :-
286 |    number(Cell), number(Cell1), Cell1 == Cell.
287 | extrema_cell(max, Cell1, _I, Cell-Is-Tail, Cell-Is-Tail) :-
288 |    number(Cell), number(Cell1), Cell1 < Cell.
289 | extrema_cell(min, Cell1, _I, Cell-Is-Tail, Cell-Is-Tail) :-
290 |    number(Cell), number(Cell1), Cell1 > Cell.
291 | extrema_cell(min, Cell1, I, Cell-Is-[I|NewTail], Cell-Is-NewTail) :-
292 |    number(Cell), number(Cell1), Cell1 == Cell.
293 | extrema_cell(min, Cell1, I, Cell-_Is-_Tail, Cell1-[I|NewTail]-NewTail) :-
294 |    number(Cell), number(Cell1), Cell1 < Cell.
295 | 
296 | 
297 | extrema(_Op, _J, [], _Cell-Maxes-[], _TRows, Maxes).
298 | extrema(Op, J, [I|Inds], Old, TRows, Res):-
299 |    cell(I, J, TRows, Cell1),
300 |    extrema_cell(Op, Cell1, I, Old, New),
301 |    extrema(Op, J, Inds, New, TRows, Res).
302 | 
303 | 
304 | proj(J, Inds, Table, Res):- rowList(Table, TRows), proj_1(J, Inds, TRows, Res).
305 | proj_1(_J, [], _TRows, []).
306 | proj_1(J, [I|Inds], TRows, [Cell|Res]):- cell(I, J, TRows, Cell), proj_1(J, Inds, TRows, Res).
307 | 
308 | 
309 | valop(Op, L, R, Res):- is_list(L), is_list(R), valop_list(Op, L, R, Res).
310 | valop(Op, L, R, Res):- number(L), number(R), perform_op(Op, L, R, Res).
311 | 
312 | valop_list(_Op, [], [], []).
313 | valop_list(Op, [L|Ls], [R|Rs], [X|Xs]):-
314 |    perform_op(Op, L, R, X),
315 |    valop_list(Op, Ls, Rs, Xs).
316 | 
317 | perform_op(plus, L, R, X):- number(L), number(R), X is L+R.
318 | perform_op(minus, L, R, X):- number(L), number(R), X is L-R.
319 | perform_op(mult, L, R, X):- number(L), number(R), X is L*R.
320 | perform_op(divide, L, R, X):- number(L), number(R), nonzero(R), X is L/R.
321 | 
322 | nonzero(R):- integer(R), (R > 0; R < 0).
323 | nonzero(R):- float(R), (R > 0.0001; R < -0.0001).
324 | 
325 | % utilities.
326 | member(X, Y, [X|_], [Y|_]).
327 | member(X, Y, [_|Xs], [_|Ys]):- member(X, Y, Xs, Ys).
328 | 
329 | /* Comment for bp.
330 | nth1(I, TRows, Row):- nth1(I, 1, TRows, Row).
331 | nth1(I, I, [Row|_], Row).
332 | nth1(I, J, [_|Rows], Row):- J < I, J1 is J + 1, nth1(I, J1, Rows, Row).
333 | 
334 | append([], X, X).
335 | append([A|R], S, [A|T]):- append(R, S, T).
336 | 
337 | last(L, S):- append(_, [S], L).
338 | 
339 | member(X, S):- append(_, [X|_], S).
340 | 
341 | member_chk(X, S):- once(member(X,S)).
342 | 
343 | intersection([], _, []).
344 | intersection([H|T], L2, Out) :-
345 |    (member_chk(H, L2) -> Out=[H|L3]; Out=L3),
346 |    intersection(T, L2, L3).
347 | 
348 | %/* Comment for SWIPL
349 | length(L, X):- length(L, 0, X).
350 | length([], A, A).
351 | length([_|X], A, B):- A1 is A+1, length(X, A1, B).
352 | %*/
--------------------------------------------------------------------------------
/src/my_tables.pl:
--------------------------------------------------------------------------------
1 | table_1(Table):-
2 |    table_1_data(OT),
3 |    once(processed_table(OT, Table)).
4 | 
5 | table_1_data(itable([year, city, country, nations],
6 |    [row(1896, athens, greece, 14),
7 |     row(1900, paris, france, 24),
8 |     row(1904, 'st louis', usa, 12),
9 |     row(1908, london, uk, 22),
10 |     row(1912, stockholm, sweden, 28),
11 |     row(1916, berlin, germany, 0),
12 |     row(1920, antwerp, belgium, 29),
13 |     row(1924, paris, france, 44),
14 |     row(1928, amsterdam, netherlands, 46),
15 |     row(1932, 'los angeles', usa, 37),
16 |     row(1936, berlin, germany, 49),
17 |     row(1948, london, uk, 59),
18 |     row(1952, helsinki, finland, 69),
19 |     row(1956, melbourne, australia, 72),
20 |     row(1960, rome, italy, 83),
21 |     row(1964, tokyo, japan, 93),
22 |     row(1968, 'mexico city', mexico, 112),
23 |     row(1972, munich, germany, 121),
24 |     row(1976, montreal, canada, 92),
25 |     row(1980, moscow, 'soviet union', 80),
26 |     row(1984, 'los angeles', usa, 140),
27 |     row(1988, seoul, korea, 159),
28 |     row(1992, barcelona, spain, 169),
29 |     row(1996, atlanta, usa, 197),
30 |     row(2000, sydney, australia, 199),
31 |     row(2004, athens, greece, 201),
32 |     row(2008, beijing, china, 204),
33 |     row(2012, london, uk, 204)])).
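
% A usage sketch (not part of the original file): querying table_1 with the
% eval/3 interpreter from my_eval.pl. This assumes processed_table/2 -- not
% defined in this excerpt -- yields a term for which colNames/2 and rowList/2
% (the helpers eval/3 relies on) hold, with rows kept in the order above.
%
% ?- table_1(T), eval(proj(city, eq(year, 1912, all)), T, Ans).
% % Under those assumptions: Ans = val([stockholm]).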
34 | 
--------------------------------------------------------------------------------
/src/np_proppr.pl:
--------------------------------------------------------------------------------
1 | % TODO: Figure out how the Input utterance is to be represented.
2 | % cf [Neelakantan 2017]'s Question RNN.
3 | program(_Input, Table, Ans) :-
4 |    compute_state(Table, InitState),
5 |    make_array(M, C, Table),
6 |    make_array(M, RS0),
7 |    assign(RS0, 1, 1, M),
8 | 
9 |    op(Op1),
10 |    col(Col1, C),
11 |    step(RS0, M, C, Table, Op1, Col1, InitState, RS1, _Ans1),
12 | 
13 |    op(Op2),
14 |    col(Col2, C),
15 |    step(RS1, M, C, Table, Op2, Col2, InitState, RS2, _Ans2),
16 | 
17 |    op(Op3),
18 |    col(Col3, C),
19 |    step(RS2, M, C, Table, Op3, Col3, InitState, RS3, _Ans3),
20 | 
21 |    op(Op4),
22 |    col(Col4, C),
23 |    step(RS3, M, C, Table, Op4, Col4, InitState, _RS4, Ans).
24 | 
25 | % A probabilistic predicate -- op selector.
26 | % We have to learn the probabilities associated with each clause.
27 | % TODO: Figure out what syntax ProPPR is using, a bit worried their use of
28 | % braces is not Prolog syntax.
29 | % TODO: Figure out whether ProPPR supports non-probabilistic predicates.
30 | 
31 | op(gt)      . %:- {op(gt)}.
32 | op(lt)      . %:- {op(lt)}.
33 | op(geq)     . %:- {op(geq)}.
34 | op(leq)     . %:- {op(leq)}.
35 | op(argmax)  . %:- {op(argmax)}.
36 | op(argmin)  . %:- {op(argmin)}.
37 | op(select)  . %:- {op(select)}.
38 | op(mfe)     . %:- {op(mfe)}.
39 | op(previous). %:- {op(previous)}.
40 | op(next)    . %:- {op(next)}.
41 | op(first)   . %:- {op(first)}.
42 | op(last)    . %:- {op(last)}.
43 | op(reset)   . %:- {op(reset)}.
44 | op(print)   . %:- {op(print)}.
45 | op(count)   . %:- {op(count)}.
46 | 
47 | % A probabilistic predicate -- column selector.
48 | col(C, Bound):- C < Bound. %, {col(C)}.
49 | 
50 | % The paper also deals with "pivots" (numbers that occur in the question and
51 | % that can be/should be used in the synthesized program) using soft selection.
52 | % We will deal with these numbers as well through a learnable probability
53 | % distribution.
54 | 
55 | % TODO: figure out what features should pivot depend on.
56 | % pivot/2 chooses the first element from the list that is the second.
57 | 
58 | pivot(M, [M | _]).
59 | pivot(M, [_ | R]):- pivot(M, R).
60 | 
61 | % One step of the computation -- figure out the Ans and the output
62 | % row selector, based on the input row selector, information about the
63 | % Table, and the selected operation and column.
64 | % Note: We are using hard select here.
65 | % TODO: Figure out what a soft-select version of ProPPR might mean.
66 | 
67 | step(RS, M, C, Table, Op, Col, S, RSOut, Ans):-
68 |    make_array(M, RSOut),
69 |    jump(RS, M, C, Table, Op, Col, S, RSOut, Ans).
70 | 
71 | jump(RS, M, _C, Table, gt, Col, s(_, _, Pivot, _, _, _), RSOut, _):-
72 |    gt(Col, Pivot, Table, M, RS, 1, RSOut).
73 | jump(RS, M, _C, Table, lt, Col, s(_, _, _, Pivot, _, _), RSOut, _):-
74 |    lt(Col, Pivot, Table, M, RS, 1, RSOut).
75 | jump(RS, M, _C, Table, geq, Col, s(_, _, _, _, Pivot, _), RSOut, _):-
76 |    geq(Col, Pivot, Table, M, RS, 1, RSOut).
77 | jump(RS, M, _C, Table, leq, Col, s(_, _, _, _, _,Pivot), RSOut, _):-
78 |    leq(Col, Pivot, Table, M, RS, 1, RSOut).
79 | 
80 | jump(RS, M, _C, Table, argmax, Col, _, RSOut, _):-
81 |    argmax(Col, Table, M, RS, RSOut).
82 | jump(RS, M, _C, Table, argmin, Col, _, RSOut, _):-
83 |    argmin(Col, Table, M, RS, RSOut).
84 | jump(RS, M, _C, _Table, first, _Col, _, RSOut, _):-
85 |    array(RS, 1, RS1), first(RS, M, 1, RS1, RSOut).
86 | jump(RS, M, _C, _Table, last, _Col, _, RSOut, _):-
87 |    array(RS, M, RSM), last(RS, M, M, RSM, RSOut).
88 | jump(RS, M, _C, _Table, previous, _Col, _, RSOut, _):-
89 |    array(RS, 2, RSM), previous(RS, M, 1, RSM, RSOut).
90 | jump(RS, M, _C, _Table, next, _Col, _, RSOut, _):-
91 |    next(RS, M, 1, 0, RSOut).
92 | 
93 | jump(_RS, M, _C, _Table, select, Col, s(Select,_,_,_,_,_), RSOut, _):-
94 |    select(Select, Col, M, 1, RSOut).
95 | 
96 | jump(_RS, M, _C, _Table, select, Col, s(_,MFE,_,_,_,_), RSOut, _):-
97 |    select(MFE, Col, M, 1, RSOut).
98 | 
99 | jump(RS, _M, _C, _Table, count, _Col, _, _, ans(ScalarA, _)):- sum(RS, ScalarA).
100 | jump(RS, M, C, _Table, count, Col, _, _, ans(_,LookupA)):- print(RS, M, C, Col, LookupA).
101 | jump(_RS, M, _C, _Table, reset, _Col, _, RSOut, _):- assign(RSOut, 1, 1, M).
102 | 
103 | 
104 | compute_state(_Table, s(_Select, _MFE)):-
105 |    true. % TODO
106 | 
107 | 
108 | % ------------------------ Support code ---------------------------
109 | 
110 | 
111 | make_array(I, A):- functor(A, row, I).
112 | 
113 | % An M x N array is represented as an instance of rows/M, with each entry an
114 | % instance of row/N.
115 | % TODO: consider col-major representation. cols(...) with col entries, at least
116 | % for the main Table.
117 | make_array(I, J, A):- functor(A, rows, I), cols(A, 1, I, J).
118 | cols(A, K, I, J):-
119 |    arg(K, A, AK),
120 |    functor(AK, row, J),
121 |    (K==I ; (K < I, K1 is K+1, cols(A, K1, I, J))).
122 | 
123 | % Accessor -- X = A(I,J), X=A(I)
124 | array(A, I, X) :- arg(I, A, X).
125 | array(A, I, J, X) :- arg(I, A, Row), arg(J, Row, X).
126 | 
127 | assign(RS, IndexI, IndexJ, Val, Def, M, C, I, J):-
128 |    ((I==IndexI, J==IndexJ) -> Value = Val; Value = Def),
129 |    array(RS, I, J, Value),
130 |    (J==C,
131 |     (I==M;
132 |      (I < M, I1 is I+1, assign(RS, IndexI, IndexJ, Val, Def, M, C, I1, 1)));
133 |     (J < C, J1 is J+1, assign(RS, IndexI, IndexJ, Val, Def, M, C, I, J1))).
134 | 
135 | % RS[Index] = Val; every other entry = Def.
136 | assign(RS, Index, Val, Def, I, M):-
137 |    (I==Index -> Value = Val; Value = Def),
138 |    array(RS, I, Value),
139 |    (I==M; (I < M, I1 is I+1, assign(RS, Index, Val, Def, I1, M))).
140 | 
141 | assign(RS, Val, I, M):-   % set entries I..M of RS to Val
142 |    array(RS, I, Val),
143 |    (I==M; (I < M, I1 is I+1, assign(RS, Val, I1, M))).
144 | 
145 | % sum of the entries of Array.
146 | sum(Array, Ans):-
147 |    functor(Array, _, M),
148 |    sum(1, M, Array, 0, Ans).
149 | 
150 | sum(I, J, Array, X, Ans):-
151 |    array(Array, I, V),
152 |    X1 is X+V,
153 |    (I==J -> Ans=X1 ; (I1 is I+1, sum(I1, J, Array, X1, Ans))).
154 | 
155 | % ------------------------- Implementation of operations --------------
156 | 
157 | gt(Col, Pivot, Table, IMax, RS, I, RSOut) :-
158 |    array(Table, I, Col, AIJ),
159 |    (AIJ > Pivot -> Val=1; Val=0),
160 |    array(RSOut, I, Val),
161 |    (I==IMax ; (I < IMax, I1 is I+1, gt(Col, Pivot, Table, IMax, RS, I1, RSOut))).
162 | 
163 | geq(Col, Pivot, Table, IMax, RS, I, RSOut) :-
164 |    array(Table, I, Col, AIJ),
165 |    (AIJ >= Pivot -> Val=1; Val=0),
166 |    array(RSOut, I, Val),
167 |    (I == IMax; (I < IMax, I1 is I+1, geq(Col, Pivot, Table, IMax, RS, I1, RSOut))).
168 | 
169 | lt(Col, Pivot, Table, IMax, RS, I, RSOut) :-
170 |    array(Table, I, Col, AIJ),
171 |    (AIJ < Pivot -> Val=1; Val=0),
172 |    array(RSOut, I, Val),
173 |    (I==IMax; (I < IMax, I1 is I+1, lt(Col, Pivot, Table, IMax, RS, I1, RSOut))).
174 | 
175 | leq(Col, Pivot, Table, IMax, RS, I, RSOut) :-
176 |    array(Table, I, Col, AIJ),
177 |    (AIJ =< Pivot -> Val=1; Val=0),
178 |    array(RSOut, I, Val),
179 |    (I==IMax; (I < IMax, I1 is I+1, leq(Col, Pivot, Table, IMax, RS, I1, RSOut))).
180 | 
181 | argmax(Col, Table, M, RS, RSOut) :-
182 |    array(Table, 1, Col, Val),
183 |    argmax(Col, RS, Table, M, 2, 1, Val, Index),
184 |    assign(RSOut, Index, 1, 0, 1, M).
185 | 
186 | argmax(Col, RS, Table, M, I, CI, CVal, Index) :-
187 |    array(RS, I, RSI), array(Table, I, Col, TIJ),
188 |    ((RSI==1, TIJ > CVal)-> (NCI=I, NCVal=TIJ); (NCI=CI, NCVal=CVal)),
189 |    (I = M
190 |     -> Index=NCI
191 |     ;  (I1 is I+1, argmax(Col, RS, Table, M, I1, NCI, NCVal, Index))).
192 | 
193 | argmin(Col, Table, M, RS, RSOut) :-
194 |    array(Table, 1, Col, Val),
195 |    argmin(Col, RS, Table, M, 2, 1, Val, Index),
196 |    assign(RSOut, Index, 1, 0, 1, M).
197 | 
198 | argmin(Col, RS, Table, M, I, CI, CVal, Index) :-
199 |    array(RS, I, RSI), array(Table, I, Col, TIJ),
200 |    ((RSI==1, TIJ < CVal)-> (NCI=I, NCVal=TIJ); (NCI=CI, NCVal=CVal)),
201 |    (I = M
202 |     -> Index=NCI
203 |     ;  (I1 is I+1, argmin(Col, RS, Table, M, I1, NCI, NCVal, Index))).
204 | 
205 | first(_RS, M, I, 1, RSOut) :- assign(RSOut, I, 1, 0, 1, M).
206 | first(RS, M, I, 0, RSOut):-
207 |    I < M,
208 |    array(RS, I, RSI),
209 |    I1 is I+1, first(RS, M, I1, RSI, RSOut).
210 | first(_RS, M, M, 0, RSOut):- assign(RSOut, 0, 1, M).
211 | 
212 | last(_RS, M, I, 1, RSOut) :- assign(RSOut, I, 1, 0, 1, M).
213 | last(RS, M, I, 0, RSOut):-
214 |    I > 1,
215 |    array(RS, I, RSI),
216 |    I1 is I-1, last(RS, M, I1, RSI, RSOut).
217 | last(_RS, M, 1, 0, RSOut):- assign(RSOut, 0, 1, M).
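
% Illustration (not in the original source): the "arrays" the operations here
% traverse are just flat terms built by make_array/2 and make_array/3 above,
% so array/3 and array/4 are arg/3 lookups. E.g.
%
% ?- make_array(3, RS).
% % RS = row(_, _, _)
% ?- make_array(2, 3, A).
% % A = rows(row(_, _, _), row(_, _, _))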
218 | 
219 | previous(RS, M, I, V, RSOut) :-
220 |    array(RSOut, I, V),
221 |    I1 is I+1,
222 |    (I1=M -> array(RSOut, M, 0)
223 |     ; (array(RS, I1, V1), previous(RS, M, I1, V1, RSOut))).
224 | 
225 | next(RS, M, I, V, RSOut) :-
226 |    array(RSOut, I, V), array(RS, I, V1),
227 |    I1 is I+1,
228 |    (I1=M; (I1 < M, next(RS, M, I1, V1, RSOut))).
229 | 
230 | print(RS, M, C, Col, LookupA) :-
231 |    make_array(M, C, LookupA),
232 |    print_ij(RS, M, C, Col, 1, 1, LookupA).
233 | 
234 | print_ij(RS, M, C, Col, I, J, LookupA) :-
235 |    (J==Col -> array(RS, I, Val) ; Val=0),
236 |    array(LookupA, I, J, Val),
237 |    (J==C
238 |     -> (I == M ; (I < M, I1 is I+1, print_ij(RS, M, C, Col, I1, 1, LookupA)))
239 |     ;  (J1 is J+1, print_ij(RS, M, C, Col, I, J1, LookupA))).
240 | 
241 | select(Select, Col, M, I, RSOut) :-
242 |    array(Select, I, Col, V),
243 |    array(RSOut, I, V),
244 |    (I==M; (I < M, I1 is I+1, select(Select, Col, M, I1, RSOut))).
245 | 
246 | 
--------------------------------------------------------------------------------
/src/np_proppr_new.pl:
--------------------------------------------------------------------------------
1 | /*
2 | TODO: Figure out how the Input utterance is to be represented.
3 | cf [Neelakantan 2017]'s Question RNN.
4 | 
5 | 
6 | The SLP/PCC model does not really work.
7 | 
8 | We introduce a new construct -- a conditional probability distribution
9 | 
10 |    X | V1, ..., Vk ~ pdspec
11 | 
12 | One form of pdspec is X\Goal, for Goal a term. The intent is: execute bag(X, Goal, Xs), yielding in Xs the set of X's for which Goal is true. Assume this terminates. It is now required that the probability distribution from which X is to be sampled (and that will be learnt via training) will yield only values in Xs.
13 | 
14 | (Note: Goal may have other free variables. TODO: Figure out if there needs to be any connection between the free variables and V1, ..., Vk.)
15 | 
16 | Further, the user may wish to specify some information about how the pd is to be learnt.
One technique is to
17 | 
18 | 
19 | */
20 | program(Input, Table, Ans) :-
21 |    compute_state(Table, InitState),
22 |    make_array(M, C, Table),
23 |    make_array(M, RS0),
24 |    assign(RS0, 1, 1, M),
25 | 
26 |    Op1 | Input ~ X\op(X),
27 | 
28 |    op(Op1),
29 |    col(Col1, C),
30 |    step(RS0, M, C, Table, Op1, Col1, InitState, RS1, _Ans1),
31 | 
32 |    op(Op2),
33 |    col(Col2, C),
34 |    step(RS1, M, C, Table, Op2, Col2, InitState, RS2, _Ans2),
35 | 
36 |    op(Op3),
37 |    col(Col3, C),
38 |    step(RS2, M, C, Table, Op3, Col3, InitState, RS3, _Ans3),
39 | 
40 |    op(Op4),
41 |    col(Col4, C),
42 |    step(RS3, M, C, Table, Op4, Col4, InitState, _RS4, Ans).
43 | 
44 | % A probabilistic predicate -- op selector.
45 | % We have to learn the probabilities associated with each clause.
46 | % TODO: Figure out what syntax ProPPR is using, a bit worried their use of
47 | % braces is not Prolog syntax.
48 | % TODO: Figure out whether ProPPR supports non-probabilistic predicates.
49 | 
50 | op(gt)      . %:- {op(gt)}.
51 | op(lt)      . %:- {op(lt)}.
52 | op(geq)     . %:- {op(geq)}.
53 | op(leq)     . %:- {op(leq)}.
54 | op(argmax)  . %:- {op(argmax)}.
55 | op(argmin)  . %:- {op(argmin)}.
56 | op(select)  . %:- {op(select)}.
57 | op(mfe)     . %:- {op(mfe)}.
58 | op(previous). %:- {op(previous)}.
59 | op(next)    . %:- {op(next)}.
60 | op(first)   . %:- {op(first)}.
61 | op(last)    . %:- {op(last)}.
62 | op(reset)   . %:- {op(reset)}.
63 | op(print)   . %:- {op(print)}.
64 | op(count)   . %:- {op(count)}.
65 | 
66 | % A probabilistic predicate -- column selector.
67 | col(C, Bound):- C < Bound. %, {col(C)}.
68 | 
69 | % The paper also deals with "pivots" (numbers that occur in the question and
70 | % that can be/should be used in the synthesized program) using soft selection.
71 | % We will deal with these numbers as well through a learnable probability
72 | % distribution.
73 | 
74 | % TODO: figure out what features should pivot depend on.
75 | % pivot/2 chooses the first element from the list that is the second.
76 | 
77 | pivot(M, [M | _]).
78 | pivot(M, [_ | R]):- pivot(M, R).
79 | 
80 | % One step of the computation -- figure out the Ans and the output
81 | % row selector, based on the input row selector, information about the
82 | % Table, and the selected operation and column.
83 | % Note: We are using hard select here.
84 | % TODO: Figure out what a soft-select version of ProPPR might mean.
85 | 
86 | step(RS, M, C, Table, Op, Col, S, RSOut, Ans):-
87 |    make_array(M, RSOut),
88 |    jump(RS, M, C, Table, Op, Col, S, RSOut, Ans).
89 | 
90 | jump(RS, M, _C, Table, gt, Col, s(_, _, Pivot, _, _, _), RSOut, _):-
91 |    gt(Col, Pivot, Table, M, RS, 1, RSOut).
92 | jump(RS, M, _C, Table, lt, Col, s(_, _, _, Pivot, _, _), RSOut, _):-
93 |    lt(Col, Pivot, Table, M, RS, 1, RSOut).
94 | jump(RS, M, _C, Table, geq, Col, s(_, _, _, _, Pivot, _), RSOut, _):-
95 |    geq(Col, Pivot, Table, M, RS, 1, RSOut).
96 | jump(RS, M, _C, Table, leq, Col, s(_, _, _, _, _,Pivot), RSOut, _):-
97 |    leq(Col, Pivot, Table, M, RS, 1, RSOut).
98 | 
99 | jump(RS, M, _C, Table, argmax, Col, _, RSOut, _):-
100 |    argmax(Col, Table, M, RS, RSOut).
101 | jump(RS, M, _C, Table, argmin, Col, _, RSOut, _):-
102 |    argmin(Col, Table, M, RS, RSOut).
103 | jump(RS, M, _C, _Table, first, _Col, _, RSOut, _):-
104 |    array(RS, 1, RS1), first(RS, M, 1, RS1, RSOut).
105 | jump(RS, M, _C, _Table, last, _Col, _, RSOut, _):-
106 |    array(RS, M, RSM), last(RS, M, M, RSM, RSOut).
107 | jump(RS, M, _C, _Table, previous, _Col, _, RSOut, _):-
108 |    array(RS, 2, RSM), previous(RS, M, 1, RSM, RSOut).
109 | jump(RS, M, _C, _Table, next, _Col, _, RSOut, _):-
110 |    next(RS, M, 1, 0, RSOut).
111 | 
112 | jump(_RS, M, _C, _Table, select, Col, s(Select,_,_,_,_,_), RSOut, _):-
113 |    select(Select, Col, M, 1, RSOut).
114 | 
115 | jump(_RS, M, _C, _Table, select, Col, s(_,MFE,_,_,_,_), RSOut, _):-
116 |    select(MFE, Col, M, 1, RSOut).
117 | 
118 | jump(RS, _M, _C, _Table, count, _Col, _, _, ans(ScalarA, _)):- sum(RS, ScalarA).
119 | jump(RS, M, C, _Table, count, Col, _, _, ans(_,LookupA)):- print(RS, M, C, Col, LookupA).
120 | jump(_RS, M, _C, _Table, reset, _Col, _, RSOut, _):- assign(RSOut, 1, 1, M).
121 | 
122 | 
123 | compute_state(_Table, s(_Select, _MFE)):-
124 |    true. % TODO
125 | 
126 | 
127 | % ------------------------ Support code ---------------------------
128 | 
129 | 
130 | make_array(I, A):- functor(A, row, I).
131 | 
132 | % An M x N array is represented as an instance of rows/M, with each entry an
133 | % instance of row/N.
134 | % TODO: consider col-major representation. cols(...) with col entries, at least
135 | % for the main Table.
136 | make_array(I, J, A):- functor(A, rows, I), cols(A, 1, I, J).
137 | cols(A, K, I, J):-
138 |    arg(K, A, AK),
139 |    functor(AK, row, J),
140 |    (K==I ; (K < I, K1 is K+1, cols(A, K1, I, J))).
141 | 
142 | % Accessor -- X = A(I,J), X=A(I)
143 | array(A, I, X) :- arg(I, A, X).
144 | array(A, I, J, X) :- arg(I, A, Row), arg(J, Row, X).
145 | 
146 | assign(RS, IndexI, IndexJ, Val, Def, M, C, I, J):-
147 |    ((I==IndexI, J==IndexJ) -> Value = Val; Value = Def),
148 |    array(RS, I, J, Value),
149 |    (J==C,
150 |     (I==M;
151 |      (I < M, I1 is I+1, assign(RS, IndexI, IndexJ, Val, Def, M, C, I1, 1)));
152 |     (J < C, J1 is J+1, assign(RS, IndexI, IndexJ, Val, Def, M, C, I, J1))).
153 | 
154 | % RS[Index] = Val; every other entry = Def.
155 | assign(RS, Index, Val, Def, I, M):-
156 |    (I==Index -> Value = Val; Value = Def),
157 |    array(RS, I, Value),
158 |    (I==M; (I < M, I1 is I+1, assign(RS, Index, Val, Def, I1, M))).
159 | 
160 | assign(RS, Val, I, M):-   % set entries I..M of RS to Val
161 |    array(RS, I, Val),
162 |    (I==M; (I < M, I1 is I+1, assign(RS, Val, I1, M))).
163 | 
164 | % sum of the entries of Array.
165 | sum(Array, Ans):-
166 |    functor(Array, _, M),
167 |    sum(1, M, Array, 0, Ans).
168 | 
169 | sum(I, J, Array, X, Ans):-
170 |    array(Array, I, V),
171 |    X1 is X+V,
172 |    (I==J -> Ans=X1 ; (I1 is I+1, sum(I1, J, Array, X1, Ans))).
173 | 
174 | % ------------------------- Implementation of operations --------------
175 | 
176 | gt(Col, Pivot, Table, IMax, RS, I, RSOut) :-
177 |    array(Table, I, Col, AIJ),
178 |    (AIJ > Pivot -> Val=1; Val=0),
179 |    array(RSOut, I, Val),
180 |    (I==IMax ; (I < IMax, I1 is I+1, gt(Col, Pivot, Table, IMax, RS, I1, RSOut))).
181 | 
182 | geq(Col, Pivot, Table, IMax, RS, I, RSOut) :-
183 |    array(Table, I, Col, AIJ),
184 |    (AIJ >= Pivot -> Val=1; Val=0),
185 |    array(RSOut, I, Val),
186 |    (I == IMax; (I < IMax, I1 is I+1, geq(Col, Pivot, Table, IMax, RS, I1, RSOut))).
187 | 
188 | lt(Col, Pivot, Table, IMax, RS, I, RSOut) :-
189 |    array(Table, I, Col, AIJ),
190 |    (AIJ < Pivot -> Val=1; Val=0),
191 |    array(RSOut, I, Val),
192 |    (I==IMax; (I < IMax, I1 is I+1, lt(Col, Pivot, Table, IMax, RS, I1, RSOut))).
193 | 
194 | leq(Col, Pivot, Table, IMax, RS, I, RSOut) :-
195 |    array(Table, I, Col, AIJ),
196 |    (AIJ =< Pivot -> Val=1; Val=0),
197 |    array(RSOut, I, Val),
198 |    (I==IMax; (I < IMax, I1 is I+1, leq(Col, Pivot, Table, IMax, RS, I1, RSOut))).
199 | 
200 | argmax(Col, Table, M, RS, RSOut) :-
201 |    array(Table, 1, Col, Val),
202 |    argmax(Col, RS, Table, M, 2, 1, Val, Index),
203 |    assign(RSOut, Index, 1, 0, 1, M).
204 | 
205 | argmax(Col, RS, Table, M, I, CI, CVal, Index) :-
206 |    array(RS, I, RSI), array(Table, I, Col, TIJ),
207 |    ((RSI==1, TIJ > CVal)-> (NCI=I, NCVal=TIJ); (NCI=CI, NCVal=CVal)),
208 |    (I = M
209 |     -> Index=NCI
210 |     ;  (I1 is I+1, argmax(Col, RS, Table, M, I1, NCI, NCVal, Index))).
211 | 
212 | argmin(Col, Table, M, RS, RSOut) :-
213 |    array(Table, 1, Col, Val),
214 |    argmin(Col, RS, Table, M, 2, 1, Val, Index),
215 |    assign(RSOut, Index, 1, 0, 1, M).
216 | 
217 | argmin(Col, RS, Table, M, I, CI, CVal, Index) :-
218 |    array(RS, I, RSI), array(Table, I, Col, TIJ),
219 |    ((RSI==1, TIJ < CVal)-> (NCI=I, NCVal=TIJ); (NCI=CI, NCVal=CVal)),
220 |    (I = M
221 |     -> Index=NCI
222 |     ;  (I1 is I+1, argmin(Col, RS, Table, M, I1, NCI, NCVal, Index))).
223 | 
224 | first(_RS, M, I, 1, RSOut) :- assign(RSOut, I, 1, 0, 1, M).
225 | first(RS, M, I, 0, RSOut):-
226 |    I < M,
227 |    array(RS, I, RSI),
228 |    I1 is I+1, first(RS, M, I1, RSI, RSOut).
229 | first(_RS, M, M, 0, RSOut):- assign(RSOut, 0, 1, M).
230 | 
231 | last(_RS, M, I, 1, RSOut) :- assign(RSOut, I, 1, 0, 1, M).
232 | last(RS, M, I, 0, RSOut):-
233 |    I > 1,
234 |    array(RS, I, RSI),
235 |    I1 is I-1, last(RS, M, I1, RSI, RSOut).
236 | last(_RS, M, 1, 0, RSOut):- assign(RSOut, 0, 1, M).
237 | 
238 | previous(RS, M, I, V, RSOut) :-
239 |    array(RSOut, I, V),
240 |    I1 is I+1,
241 |    (I1=M -> array(RSOut, M, 0)
242 |     ; (array(RS, I1, V1), previous(RS, M, I1, V1, RSOut))).
243 | 
244 | next(RS, M, I, V, RSOut) :-
245 |    array(RSOut, I, V), array(RS, I, V1),
246 |    I1 is I+1,
247 |    (I1=M; (I1 < M, next(RS, M, I1, V1, RSOut))).
248 | 
249 | print(RS, M, C, Col, LookupA) :-
250 |    make_array(M, C, LookupA),
251 |    print_ij(RS, M, C, Col, 1, 1, LookupA).
252 | 
253 | print_ij(RS, M, C, Col, I, J, LookupA) :-
254 |    (J==Col -> array(RS, I, Val) ; Val=0),
255 |    array(LookupA, I, J, Val),
256 |    (J==C
257 |     -> (I == M ; (I < M, I1 is I+1, print_ij(RS, M, C, Col, I1, 1, LookupA)))
258 |     ;  (J1 is J+1, print_ij(RS, M, C, Col, I, J1, LookupA))).
259 | 
260 | select(Select, Col, M, I, RSOut) :-
261 |    array(Select, I, Col, V),
262 |    array(RSOut, I, V),
263 |    (I==M; (I < M, I1 is I+1, select(Select, Col, M, I1, RSOut))).
264 | 
265 | 
--------------------------------------------------------------------------------
/src/what_we_want.txt:
--------------------------------------------------------------------------------
1 | Sat Feb 25 08:18:07 EST 2017
2 | 
3 | Unfortunately, the PCC / SLP model doesn't quite work -- we want to learn probability
4 | distributions that are *dependent* on certain arguments.
5 | 
6 | Consider the program np_proppr.pl. Here we specify a predicate program/3 that takes as arguments Utterance, Table and Ans. At training time we are given a large number of examples (~ 10K or so), which contain (as it turns out, ground) tuples that are in the given relation.
7 | 
8 | At test time, we would like to supply only the values of Utterance and Table, and generate Ans.
9 | 
10 | That is, we want to learn a function that takes as input an (Utterance, M x C Table) pair and produces a sequence of T probability distributions, each identifying one of 15 operations, one of the numbers between 1..C, and, depending on the operation, one of the numbers in the utterance.
11 | 
12 | p(T1 | utterance, table)
13 | p(T2 | T1, utterance, table)
14 | p(Tk | Tk-1, ..., T1, utterance, table)
15 | 
16 | One way to address this is described in [Neelakantan 2017]. One creates a sentence vector, q, for the utterance. Then one uses an RNN.
At each stage, the hidden state of the RNN and q are used to generate the state for the Op and column choice (using appropriate-sized matrices of weights). Using softmax, this is converted into a probability distribution over the O operations and C columns, respectively. These distributions are used (this is the soft-select strategy) to update the row_select vector for the next iteration (this is how information is transmitted across program evaluation steps). The pds are then fed back as input to the RNN to generate the next state, and so on.
17 | 
18 | The SLP model does not distinguish input/output polarity. It learns probabilities for the specified clauses so that the expectation of the given goals is maximized.
19 | 
20 | I do not believe this will give us good answers for the Neural Programmer problem. All we will get is a fixed probability distribution for the op/1 and col/2 predicates, independent of the actual content of the utterance. So the learning is going to end up tapping a very weak signal.
21 | 
22 | Ideally, we want to set up a system that learns the right program from the incoming sentence and table headers.
23 | 
24 | (Hmm, the learner should not be shown the actual contents to learn from, just do the calculations from...?)
25 | 
26 | So we should progress as follows.
27 | 
28 | We need to make two changes:
29 | 
30 | (a) Somehow we have to define not just probability distributions, but also specify what the probability distributions depend on. Here, they depend on the input utterance and table.
31 | 
32 | We do this by generalizing the X ~ pd "sampling" construct in probabilistic CCP to the "conditional sampling" construct:
33 | 
34 | X | V1, ..., Vk ~ pd
35 | 
36 | This is to be understood as saying that, given the values of the variables V1, ..., Vk, X is to be drawn from the given distribution.
37 | 
38 | The general idea, abstracting from ProPPR, is that of course they should depend, generically, on the state.
For ProPPR, these are the features (terms) in each clause; for us, they could be constraints that are entailed by the store. Groundness should not be required -- instead, we can think of the term as being appropriately existentially quantified.
39 | 
40 | (TODO: How does ProPPR ensure that the number of features is finite?)
41 | 
42 | (b) (ProPPR idea) Change the focus from a random variable and its probability distribution to feature vectors -- in essence, each "choice" is associated with multiple numbers, a whole vector full of numbers. These contribute to the desired pd through learnable weights.
43 | 
44 | That is really all there is to it!
45 | 
46 | And then training is via SGD.
47 | 
48 | So this is a general recipe for SLP / PCC programs?
49 | 
50 | 
51 | 
52 | 
--------------------------------------------------------------------------------
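
A sketch of how the conditional-sampling construct above might be prototyped in PRISM, the system the README proposes compiling PCC to. This is not from the source: `msw/2` and `values/2` are PRISM's actual switch primitives, but indexing the switch name by the conditioning values -- so that each ground condition gets its own learnable distribution -- is our assumption about how `X | V1, ..., Vk ~ pd` could be approximated.

```prolog
% Hypothetical PRISM encoding of  Op | Cond ~ pd  over the op vocabulary.
% Each distinct ground Cond (e.g. features extracted from the utterance)
% names its own switch, hence its own learnable distribution.
values(op_given(_Cond), [gt, lt, geq, leq, argmax, argmin, select, mfe,
                         previous, next, first, last, reset, print, count]).

% Op | Cond ~ op_given: draw Op from the distribution indexed by Cond.
cond_op(Op, Cond) :- msw(op_given(Cond), Op).
```

Training with PRISM's learn/1 over observed result/3 goals would then fit one distribution per condition; whether the condition should be the raw utterance or a small set of extracted features is exactly the open design question (a) raises.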