├── (un)folding-programs.org
├── how-to-learn.org
├── img
│   ├── birkhoff-universal-algebra.png
│   ├── christopher-strachey.png
│   ├── dana-scott.png
│   ├── eugenio-moggi.png
│   ├── garrett-birkhoff.png
│   ├── john-reynolds.png
│   ├── nn-types-fp.png
│   ├── peter-landin.png
│   ├── robin-milner.png
│   ├── roger-godement.png
│   ├── samuel-eilenberg.png
│   ├── saunders-maclane.png
│   ├── tony-hoare.png
│   ├── universal-co-algebra-chart.png
│   └── øysten-ore.png
├── journal.org
├── notes-on-history-of-interactive-theorem-proving.org
├── papers-read.org
└── readme.org
/(un)folding-programs.org:
--------------------------------------------------------------------------------
1 | * (Un)folding Programs
2 |
3 | ** Boom Hierarchy
4 |
5 | One framework to begin with, which pulls together a lot of interesting ideas here, is the Boom hierarchy.
6 |
7 | It is a classification by the presence/absence of 4 properties, giving us 2⁴ = 16 structures.
8 |
9 | Write about how they can be classified as { monoids, semigroups, trees, non-empty trees }
10 |
11 | TODO: Find out what the best structures to characterize the trees and their variants (idempotent, non-empty) are.
12 |
13 | TODO: Insert a diagram on this here.
14 |
15 | Also, show how some are forgetful and some are free.
16 |
17 | Talk about the idea of adjunctions.
18 |
19 | Probably not the right place to expand on Galois theory, but a very broad-strokes overview of it might help here.
20 | Just keep the salient bits for computation here.
21 | I can then point to resources which elaborate on this.
22 |
23 | Abstract interpretation; the work of Noson Yanofsky / Rohit Parikh.
24 |
25 | Discuss the free vs. forgetful distinction in category theory.
26 |
27 | Talk about why ++ is a free structure while Last or +, * are not.
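A quick Haskell sketch of the "free" claim (the helper names `hom` and `asSum` are mine): the list monoid remembers everything, so any choice of where singletons go extends to a unique monoid homomorphism out of lists; interpreting under + or Last is where the collapsing happens.

```haskell
import Data.Monoid (Sum (..))

-- ([a], ++, []) is free: choosing where singletons go determines
-- a unique monoid homomorphism out of lists.
hom :: Monoid m => (a -> m) -> [a] -> m
hom f = foldr (\a m -> f a <> m) mempty

-- ++ loses no information; interpreting under + collapses it,
-- which is why + (or Last) is not free in the same sense.
asSum :: [Int] -> Int
asSum = getSum . hom Sum
```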
28 |
29 | Discuss the practical implications of having a free structure in programming.
30 |
31 | Discuss the properties: associativity, commutativity, distributivity, idempotency, reflexivity, symmetry, transitivity
32 |
33 | TODO: Describe how logic can be used to capture the various enumerative combinatorics of the structures.
34 |
35 | Quantifying structures using logical operators and another approach using generating functions.
36 |
37 | TODO: Tie in groupoidification, combinatorial species, generatingfunctionology, orthogonal polynomials, analytic functors, and the 12-fold way once you investigate them deeper. Keep only the contextually relevant details when describing: we are using them to understand computation and how they are related to program derivation.
38 |
39 | ** Lattices
40 |
41 | These are nice structures. They can be thought of as giving a principled way to track the evolution of an entity along multiple paths. By the nature of their axiomatization, each stage in this evolution is tracked so that the commonalities between the branches of evolution converge to common points and then diverge out again from there. This is the process interpretation of lattices. Under the structure interpretation, the best way to understand them is as closures and kernels of the problem domain.
42 |
43 | TODO: I have to understand the idea of closures/kernels, implication/conjunction, and their counterparts in algebraic geometry and how it connects up with the composition of the lattice.
44 |
45 | There are at least two dualities in play here
46 |
47 | Intersection / Union duality
48 |
49 | Chain / Antichain duality
50 |
51 | TODO: Describe how they are related to partial orders.
52 |
53 | How you can do closures and kernels on these structures to generate transitive reduction and transitive closure. What does this mean in terms of logic?
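Transitive closure can be grounded as a least fixed point on the lattice of relations ordered by inclusion — a closure operator (inflationary, monotone, idempotent). A minimal Haskell sketch, with my own names and a relation represented as a sorted list of pairs:

```haskell
import Data.List (nub, sort)

type Rel a = [(a, a)]

-- Compose the relation with itself and union it back in: one
-- monotone step on the lattice of relations under inclusion.
step :: Ord a => Rel a -> Rel a
step r = nub (sort (r ++ [ (a, c) | (a, b) <- r, (b', c) <- r, b == b' ]))

-- Transitive closure as the least fixed point of `step`.
transClosure :: Ord a => Rel a -> Rel a
transClosure r = let r' = step r in if r' == r then r else transClosure r'
```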
54 |
55 | What role does the lattice play?
56 |
57 | How did the idea of lattices emerge?
58 |
59 | Dualgruppen idea
60 |
61 | Link with how splitting fields and group extensions idea gave birth to Galois theory
62 |
63 | TODO: Investigate why Peirce was after grounding some kind of duality using lattice-like structures.
64 | TODO: Investigate how it connects with Schröder’s work
65 | TODO: Link with the history papers here
66 | TODO: Link with the work of Dedekind and how he leveraged the idea of ideals/filters and identified the concept of PID
67 |
68 | Ideals/Filters
69 |
70 | Link to articles like that of Sowa and ISKO
71 |
72 | TODO: Think about how lattices are connected to quantifiers and quantifier elimination.
73 | TODO: Talk about how they are related to sheaves/fibres/sections/stack etc.
74 |
75 | Talk about how adjunctions between lattices give you closure/kernel operators and this is mirrored by monads and comonads.
76 |
77 | How lattices connect up with the idea of eigenvalue decomposition and extended Euclidean algorithm
78 |
79 | Briefly mention about the connection between Fourier Transforms, Bezout theorem, Euclidean GCD, Singular Value Decomposition, Krohn-Rhodes theory, Eigenvalue Decomposition etc.
80 |
81 | https://twitter.com/LdaFCosta/status/1454520031132459008
82 | https://sridharramesh.github.io/HowSridharThinks/math/BezoutEtc.html
83 |
84 | ** Formal Concept Analysis
85 |
86 | With formal concept analysis, you get a way to partition the problem space so that the components are linearly independent. You attempt to coordinatize the space so that your axes, like eigenvectors, span the whole space: when you decompose it into subcomponents, the intertwining is reduced, and what you get out are orthogonal components which you can linearly recombine to recover the original space. They are in some sense the necessary and sufficient concepts to describe the space in a parsimonious manner: an equivalence class with maximal information, i.e. the least amount of redundancy (entropy?) needed to describe the original space. It also prevents a skewed coordinate system, which might cause two things to overlap on one idea or, conversely, one thing to appear as two ideas.
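The "necessary and sufficient concepts" above are the formal concepts of a context: pairs (A, B) where the derivation operators close up on each other. A toy Haskell sketch (all names and the representation are mine):

```haskell
import Data.List (nub)

-- A formal context as an incidence list of (object, attribute) pairs.
type Context o a = [(o, a)]

-- Derivation operators: attributes shared by all given objects,
-- and objects possessing all given attributes.
intentOf :: (Eq o, Eq a) => Context o a -> [o] -> [a]
intentOf ctx os = [ a | a <- nub (map snd ctx), all (\o -> (o, a) `elem` ctx) os ]

extentOf :: (Eq o, Eq a) => Context o a -> [a] -> [o]
extentOf ctx as = [ o | o <- nub (map fst ctx), all (\a -> (o, a) `elem` ctx) as ]

-- (A, B) is a formal concept when each derives exactly the other.
isConcept :: (Eq o, Eq a) => Context o a -> [o] -> [a] -> Bool
isConcept ctx os as = intentOf ctx os == as && extentOf ctx as == os
```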
87 |
88 | Mention the Duality of Structure and Process from Anna Sfard’s paper and how it connects with the idea of right and left folds having time/space tradeoffs.
89 |
90 | Eigenvalues describing the variance in a distribution
91 |
92 | Covariance/correlation of distributions
93 |
94 | Lattice homomorphisms
95 |
96 | Sets ~ Distributive? Lattices
97 | Ordered Sets ~ Matrices
98 | Multisets ~ Modular? Lattices
99 | Lists ~ Hypermatrices
100 |
101 | One of my realizations lately has been that matrices model functions: https://twitter.com/prathyvsh/status/1459936336182452233 So I think matrices represent ordered sets in the Boom hierarchy, because you have a certain sense of direction with the domain mapping to the codomain. With lattices, what you are representing are the multisets/sets, which can have different overlaps/elisions within them. I will gladly elaborate on this if you have any doubts.
102 |
103 | Monotonicity/continuity
104 |
105 | ** Program calculation
106 |
107 | Discuss the two schools: Eindhoven / Oxford
108 |
109 | Talk about their differences
110 |
111 | Talk about the general approach in a schematic manner once you investigate it.
112 |
113 | TODO: Read that Backhouse paper and investigate how Eindhoven style relates to Oxford style
114 |
115 | Stalactite vs. Stalagmite distinction in quotients vs. subgroups
116 |
117 | Injectivity vs. Surjectivity
118 |
119 | Two aspects of Homomorphism
120 |
121 | How this is also reflected in the canonical decomposition of a function and in formal concept analysis style disjunction
122 |
123 | * [[http://podcasts.ox.ac.uk/introduction-theory-lists][An Introduction to a Theory of Lists]]
124 |
125 | Richard Bird
126 | Oxford, 16th and 17th December 1986
127 |
128 | A calculus used for deriving efficient solutions to certain kinds of problems in computation.
129 | A set of laws, lemmas, and theorems in the same sense as integral calculus.
130 | It does not stand for a formal system with axioms and inference rules.
131 | Semantic and foundational issues are not touched upon.
132 |
133 | A theory of expression trees is waiting to be organized.
134 |
135 | Program specification / transformation
136 |
137 | General laws which enable programs to be calculated.
138 |
139 | Richard Bird has been dissatisfied with the lack of penetrating results, such as those in integral calculus that scientists and engineers use daily.
140 |
141 | Only two players: Richard Bird and Lambert Meertens
142 |
143 | Dijkstra, Gries, Backhouse: program calculation with invariant assertions
144 |
145 | The difference from the Eindhoven school is one of style, not of objective.
146 |
147 | Eindhoven uses imperative notation and predicate calculus as the main tool, and
148 | many of the results are presented using arrays.
149 |
150 | Bird-Meertens: functional notation, a specialized functional calculus, and lists considered as a more basic data structure than arrays.
151 |
152 | ** Summary:
153 |
154 | First Lecture
155 |
156 | Notation suggested by David Turner
157 |
158 | Second Lecture
159 |
160 | First example of a calculational proof and some further notation
161 |
162 | Third Lecture
163 |
164 | Problems about segments of lists
165 |
166 | Fourth Lecture
167 |
168 | Problems about partitions of lists
169 |
170 | Fifth Lecture
171 |
172 | More material and further examples
173 |
174 | ** 1 List Notation
175 |
176 | Lists: ordered and homogeneous
177 |
178 | []
179 | ['a']
180 | ['a','p','p','l','e']
181 |
182 | ** 2 Convention
183 |
184 | a, b, c … elements of lists
185 | x, y, z … lists
186 | xs, ys, zs … lists of lists
187 |
188 | ** 3 Length
189 |
190 | # :: [ α ] -> Num
191 | #[ a₁, a₂, … , aₙ ] = n
192 |
193 | ** 4 Concatenation
194 |
195 | ++ :: [ α ] × [ α ] → [ α ]
196 | Associativity: (x ++ y) ++ z = x ++ ( y ++ z )
197 | Identity: x ++ [] = [] ++ x = x
198 |
199 | # (x ++ y) = # x + # y
200 | (#, ++ distribution)
201 |
202 | ** 5 Map
203 | * :: ( α → β ) × [ α ] → [ β ]
204 |
205 | f * [ a₁, a₂, … , aₙ ] = [ f a₁, f a₂, … , f aₙ ]
206 |
207 | f * ( x ++ y ) = ( f * x ) ++ ( f * y )
208 | { *, ++ distributivity }
209 |
210 | (f • g) * = (f * ) • ( g * )
211 | { *, • dist }
212 |
213 | ( f * )⁻¹ = ( f⁻¹ * )
214 | { *, ⁻¹ comm }
215 |
216 | ** 6 Notational interlude
217 |
218 | Let ⊕ :: ( ⍺ ✕ β ) → 𝛾
219 |
220 | (a ⊕ ) :: β → 𝛾 | ( a ⊕ ) b = a ⊕ b
221 |
222 | ( ⊕ b ) :: ⍺ → 𝛾 | ( ⊕ b ) a = a ⊕ b
223 |
224 | ( f * ) f-map function
225 | ( + 1 ) successor function
226 | ( ++ [a] ) append a function
227 |
228 | Function application is left-associative and has highest precedence
229 |
230 | f x y + 3 = ((f x) y) + 3
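The sectioning notation above corresponds directly to Haskell operator sections (a small illustration; the function names are mine):

```haskell
-- (a ⊕) fixes the left argument; (⊕ b) fixes the right one.
successor :: Int -> Int
successor = (+ 1)          -- the ( + 1 ) of the notes

appendA :: String -> String
appendA = (++ "a")         -- the ( ++ [a] ) append function

-- Application is left-associative with highest precedence,
-- so f x y + 3 parses as ((f x) y) + 3.
apply :: (Int -> Int -> Int) -> Int -> Int -> Int
apply f x y = f x y + 3
```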
231 |
232 | ** 7 Filter
233 |
234 | ◁ :: ( α → Bool ) × [ ⍺ ] → [ ⍺ ]
235 | p ◁ x the sublists of elements of x satisfying p
236 |
237 | even ◁ [1 .. 10] = [2, 4, 6, 8, 10]
238 |
239 | p ◁ ( x ++ y) = (p ◁ x) ++ (p ◁ y)
240 | {◁, ++ dist }
241 |
242 | (p ◁ ) • (p ◁) = (p ◁)
243 | { ◁ idem }
244 | (p ◁) • (q ◁) = (q ◁) • (p ◁)
245 | { ◁ comm } (For total functions)
246 |
247 | (p ◁) • (f *) = (f *) • ((p • f) ◁)
248 | { ◁, • comm }
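These laws can be spot-checked in Haskell with ◁ read as `filter` (a sketch; the particular p, f, and inputs are my choices):

```haskell
-- { ◁, ++ dist }
distributes :: Bool
distributes = filter even ([1, 2] ++ [3, 4]) == filter even [1, 2] ++ filter even [3, 4]

-- { ◁ idem }
idempotent :: Bool
idempotent = (filter even . filter even) [1 .. 10] == filter even [1 .. 10 :: Int]

-- { ◁, • comm }: (p ◁) • (f *) = (f *) • ((p • f) ◁)
commutes :: Bool
commutes = (filter even . map (* 3)) [1 .. 10]
        == (map (* 3) . filter (even . (* 3))) [1 .. 10 :: Int]
```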
249 |
250 | ** 8 Reduce
251 |
252 | Borrowed from APL
253 |
254 | Operator which takes an operator
255 | / :: ( ⍺ × ⍺ → ⍺) × [ ⍺ ] → ⍺
256 | ⊕ / [ a₁, a₂, … , aₙ ] = a₁ ⊕ a₂ ⊕ … ⊕ aₙ
257 | Only defined if ⊕ is associative
258 |
259 | Laws
260 |
261 | ⊕ / [ a ] = a { / singletons }
262 | ⊕ / (x ++ y) = (⊕ / x) ⊕ (⊕ / y) { / dist }
263 |
264 | If ⊕ has an identity element e, then
265 | ⊕ / [] = e
266 | otherwise,
267 | ⊕ / [] is not defined
268 |
269 | ⊕ / y = ⊕ / ( [] ++ y ) = (⊕ / [] ) ⊕ (⊕ / y) = e ⊕ (⊕/y)
270 |
271 | ** 9 Examples
272 |
273 | sum = + /
274 | product = × /
275 |
276 | n! = × / [1 .. n]
277 |
278 | flatten = ++ /
279 |
280 | flatten [[1, 2], [], [2, 3]] = [1, 2, 2, 3]
281 |
282 | min = ↓ /
283 | max = ↑ /
284 |
285 | head = << /
286 | last = >> /
287 |
288 | all p = (˄ / ) • (p *)
289 | some p = (˅ / ) • (p *)
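The examples of §9 in Haskell, reading ⊕/ as `foldr1 ⊕` (defined for non-empty lists; with an identity e, `foldr ⊕ e` extends it to []). The primed names are mine to avoid clashing with the Prelude:

```haskell
sum', product' :: [Int] -> Int
sum'     = foldr1 (+)
product' = foldr1 (*)

factorial :: Int -> Int
factorial n = product' [1 .. n]

flatten :: [[a]] -> [a]
flatten = foldr (++) []    -- ++ has identity [], so [] is allowed

-- Partial on [], like ⊕/ without an identity element.
allP, someP :: (a -> Bool) -> [a] -> Bool
allP p  = foldr1 (&&) . map p
someP p = foldr1 (||) . map p
```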
290 |
291 | ** 10 Promotion Lemmas
292 |
293 | Generalize the distribution laws of map, filter, reduce
294 |
295 | (f *) • (++ /) = (++ /) • ((f *) *)
296 | { * promotion }
297 |
298 | Mapping f over a flattened list is the same as flattening the result of mapping (f *) over each component list.
299 |
300 | f * (++ / [ x₁, x₂, … , xₙ ]) = f * (x₁ ++ x₂ ++ … ++ xₙ)
301 | = (f * x₁) ++ (f * x₂) ++ … ++ (f * xₙ)
302 | = ++ / [ f * x₁, f * x₂, … , f * xₙ ]
303 | = ++ / ( (f *) * ) [ x₁, x₂, … , xₙ ]
304 |
305 | Rather than flattening
306 | Promote the map into each component list and then flatten the result
307 |
308 | ( p ◁ ) • (++ /) = (++ /) • ( (p ◁) * )
309 | { ◁ promotion }
310 |
311 | ( ⊕ / ) • (++ /) = (⊕ /) • ( (⊕ /) * )
312 | { / promotion }
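The three promotion laws, spot-checked pointwise in Haskell with ++/ as `concat` and * as `map` (f, p, ⊕, and the sample data are my choices):

```haskell
xss :: [[Int]]
xss = [[1, 2], [], [3, 4, 5]]

-- { * promotion }: (f *) • (++/) = (++/) • ((f *) *)
mapPromotion :: Bool
mapPromotion = (map (+ 1) . concat) xss == (concat . map (map (+ 1))) xss

-- { ◁ promotion }: (p ◁) • (++/) = (++/) • ((p ◁) *)
filterPromotion :: Bool
filterPromotion = (filter even . concat) xss == (concat . map (filter even)) xss

-- { / promotion }: (⊕/) • (++/) = (⊕/) • ((⊕/) *), here with ⊕ = +
reducePromotion :: Bool
reducePromotion = (foldr (+) 0 . concat) xss == (foldr (+) 0 . map (foldr (+) 0)) xss
```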
313 |
314 | ** 11 Homomorphisms
315 |
316 | A function h from lists to a domain with an associative operator ⊕ and identity e, such that:
317 |
318 | h [] = e
319 | h (x ++ y) = h x ⊕ h y
320 |
321 | Equivalently: h • ( ++ / ) = (⊕ / ) • (h *)
322 |
323 | ** 12 Homomorphism Lemma
324 |
325 | h is a homomorphism iff
326 | h = (⊕ / ) • (f *) for some ⊕ and f.
327 |
328 | Proof
329 |
330 | Suppose h = (⊕ / ) • (f *)
331 | Then h • (++ /) = (⊕ / ) • (f *) • ( ++ / )
332 | { hypothesis }
333 | = (⊕ / ) • (++ /) • ( (f *) * )
334 | { *-promotion }
335 | = (⊕ / ) • ( (⊕ / ) * ) • ( (f *) * )
336 | { /-promotion }
337 | = (⊕ / ) • ( ( (⊕ / ) • (f *) ) * )
338 | { *, • dist }
339 | = (⊕ / ) • (h *)
340 | { hypothesis }
341 |
342 | Second, define □ a = [a]
343 |
344 | so (++ /) • (□ *) = id
345 |
346 | Now h = h • (++ /) • ( □ * )
347 | { definition of □ }
348 | = (⊕ /) • (h *) • (□ *)
349 | { h is a homomorphism}
350 | = (⊕ /) • (f *)
351 | { *, • dist }
352 |
353 | where f = h • □
354 | Hence h = (⊕ /) • (f *)
355 | for suitable ⊕ and f.
356 |
357 | ** 13 Examples of homomorphisms
358 |
359 | Filter is a homomorphism
360 | (p ◁) = (++ /) • (f_p *)
361 | where f_p a = [a] if p a
362 | = [] otherwise
363 |
364 | # = (+ /) • (K_1 *) where K_a b = a
365 | K is the K combinator from combinatory logic
366 |
367 | sort = (merge /) • (□ *)
368 |
369 | reverse = (++~ /) • (□ *)
370 | where a ⊕~ b = b ⊕ a
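The §13 examples as executable Haskell, with each homomorphism written in its (⊕ /) • (f *) shape (the primed/renamed identifiers are mine):

```haskell
-- □ a = [a]
box :: a -> [a]
box a = [a]

-- (p ◁) = (++/) • (f_p *)
filterHom :: (a -> Bool) -> [a] -> [a]
filterHom p = concat . map (\a -> if p a then [a] else [])

-- # = (+/) • (K₁ *)
len :: [a] -> Int
len = sum . map (const 1)

-- reverse = (++~ /) • (□ *), with a ⊕~ b = b ⊕ a
rev :: [a] -> [a]
rev = foldr (\a xs -> xs ++ [a]) []
```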
371 |
372 | ** 14 Lemma
373 |
374 | a ⊕ b = h(h⁻¹ a ++ h⁻¹ b)
375 |
376 | Then h (x ++ y) = h x ⊕ h y
377 |
378 | Try and solve a problem by looking for a homomorphism
379 |
380 | ** 15 Text Processing
381 |
382 | Text = [ Char ]
383 | Line = [ Char \ { NL } ]
384 |
385 | unlines :: [Line]⁺ → Text
386 | unlines = ⊕ /
387 | x ⊕ y = x ++ [ NL ] ++ y
388 |
389 | lines is an injective function
390 |
391 | lines :: Text → [ Line ]
392 | lines • unlines = id
393 |
394 | Problem: give a constructive definition of lines
395 |
396 | Since lines is an injective function (intuitively at least)
397 |
398 | lines = (⊗ / ) • (f *)
399 |
400 | Direct calculation yields:
401 |
402 | f a = [[], []] if a = NL
403 | = [[a]], otherwise
404 |
405 | (xs ++ [x]) ⊗ ([y] ++ ys) = xs ++ [x ++ y] ++ ys
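This lines derivation runs directly in Haskell (a sketch; `fChar`, `glue`, and `linesBMF` are my names, and the fold is partial on empty input since ⊗ has no unit):

```haskell
-- f from the notes: NL splits, any other char is a singleton line.
fChar :: Char -> [String]
fChar '\n' = [[], []]
fChar a    = [[a]]

-- (xs ++ [x]) ⊗ ([y] ++ ys) = xs ++ [x ++ y] ++ ys
glue :: [String] -> [String] -> [String]
glue xs ys = init xs ++ [last xs ++ head ys] ++ tail ys

-- lines = (⊗ /) • (f *)
linesBMF :: String -> [String]
linesBMF = foldr1 glue . map fChar
```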
406 |
407 | ** 16 More text-processing
408 |
409 | Word = [ Char \ { SP, NL } ]⁺
410 | Para = [ Line⁺ ]⁺
411 |
412 | unwords :: [ Word ]⁺ → Line
413 | unwords = ⊕SP /
414 | x ⊕sp y = x ++ [ SP ] ++ y
415 |
416 | words :: Line → [ Word ]
417 | words = (( ≠ [] ) ◁ ) • ( ⊗ / ) • (f_SP * )
418 |
419 | unparas :: [ Para ]⁺ → [ Line ]
420 | unparas = ⊕[] /
421 | x ⊕[] y = x ++ [ [] ] ++ y
422 |
423 | paras :: [ Line ] → [ Para ]
424 | paras = (( ≠ [] ) ◁ ) • ( ⊗ / ) • (f_[] * )
425 |
426 | ** 17 Examples of use
427 |
428 | countlines = # • lines
429 |
430 | countwords = # • (++ / ) • (words * ) • lines
431 |
432 | countparas = # • paras • lines
433 |
434 | normalise :: Text → Text
435 | normalise = unparse • parse
436 |
437 | parse = ( ( words * ) * ) • paras • lines
438 | unparse = unlines • unparas • ( ( unwords * ) * )
439 |
440 | Unparse is correct because
441 |
442 | ( f • g )⁻¹ = g⁻¹ • f⁻¹
443 | (f *)⁻¹ = (f⁻¹ *)
444 |
445 | ** 20 Directed Reduction
446 |
447 | ←/- right-reduce
448 | -/→ left-reduce
449 |
450 | (⊕ ←/- e)[ a₁, a₂, … , aₙ ] = a₁ ⊕ ( a₂ ⊕ … (aₙ ⊕ e))
451 | The ⊕ need not be associative and need not have a unit
452 |
453 | ←/- :: (( ⍺ × β → β ) × β ) → [ ⍺ ] → β
454 | -/→ :: (( β × ⍺ → β) × β ) → [ ⍺ ] → β
455 |
456 | (⊕ -/→ e)[ a₁, a₂, … , aₙ ] = (((e ⊕ a₁) ⊕ a₂) ⊕ … ) ⊕ aₙ
457 |
458 | Why do we need more reductions? Because they are implementations of the fold function and are closer to what can be achieved by machines.
459 |
460 | (f * ) = (⊕ ←/- []) where a ⊕ x = [ f a ] ++ x
461 |
462 | ** 21 Duality Lemma
463 |
464 | (⊕ -/→) = (⊕~ ←/-) • reverse
465 |
466 | Example: fact n = × / [ 1 … n ]
467 | fact n = ( × -/→ 1 ) [1 … n] "going up"
468 |
469 | fact n = (× ←/- 1) [ 1 … n ]
470 | = ( ×~ -/→ 1) [n, n-1 … 1]
471 | = (× -/→ 1) [n, n-1, … 1]
472 | "going down"
473 |
474 | ** 22 Specialization Lemma
475 |
476 | Every homomorphism can be written as a left-reduction, or as a right reduction.
477 |
478 | ( ⊙ / ) • ( f * ) = (⊕ ←/- e) = (⊗ -/→ e)
479 | where
480 | a ⊕ b = f a ⊙ b
481 | a ⊗ b = a ⊙ f b
482 |
483 | and e is the identity element of ⊙.
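The specialization lemma in Haskell, where ←/- corresponds to `foldr` and -/→ to `foldl`. A sketch with my own example choices ⊙ = + and f = squaring:

```haskell
-- (⊙ /) • (f *): the homomorphism form.
viaHom :: [Int] -> Int
viaHom = foldr (+) 0 . map (^ 2)

-- (⊕ ←/- e) with a ⊕ b = f a ⊙ b: the right-reduction form.
viaRight :: [Int] -> Int
viaRight = foldr (\a b -> a ^ 2 + b) 0

-- (⊗ -/→ e) with a ⊗ b = a ⊙ f b: the left-reduction form.
viaLeft :: [Int] -> Int
viaLeft = foldl (\a b -> a + b ^ 2) 0
```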
484 |
485 | ** 22.5 Cons
486 |
487 | (:) :: ⍺ × [ ⍺ ] → [ ⍺ ]
488 | a : x = [ a ] ++ x
489 | x ++ y = (: ←/- y ) x
490 |
491 | Either : or ++ can be taken as primitive, but unlike ++, every list can be expressed/constructed in terms of [] and : in exactly one way.
492 |
493 | Cons has a canonical form, ++ has many ways in which the same thing can be expressed.
494 |
495 | ** 23 Recursive characterisation
496 |
497 | (⊕ ←/- e) [] = e
498 |
499 | (⊕ ←/- e) ( [a] ++ x ) = a ⊕ (⊕ ←/- e) x
500 |
501 | We can say f = (⊕ ←/- e) is the solution of:
502 |
503 | f [] = e
504 | f([a] ++ x) = a ⊕ fx
505 |
506 | Progress of computation is recursive.
507 |
508 | Similarly,
509 |
510 | (⊕ -/→ e)[] = e
511 | (⊕ -/→ e)(x ++ [a] ) = (⊕ -/→ e) x ⊕ a
512 |
513 | But also
514 | (⊕ -/→ e) [] = e
515 | (⊕ -/→ e)([a] ++ x) = (⊕ -/→ (e ⊕ a) ) x
516 |
517 | So f = (⊕ -/→ e) is the solution of
518 | f x = g e x
519 | g e [] = e
520 | g e([a] ++ x) = g(e ⊕ a) x
521 |
522 | So left reductions have the advantage that they give an iterative notion of programming. They are immediately expressible in terms of imperative notions:
523 |
524 | y := (⊕ -/→ e) x ⟹ y := e;
525 | for a ← x
526 | do y := y ⊕ a
527 |
528 | ** 24 Efficiency Consideration
529 |
530 | In a functional programming language, ←/- can be more time-efficient than -/→.
531 | -/→ can be more space-efficient than ←/-.
532 |
533 | Recall a << b = a and a >> b = b
534 | (<< ←/- e)[1, 2, 3]
535 |
536 | (⊕ ←/- e)([a] ++ x) = a ⊕ (⊕ ←/- e) x
537 | (⊕ -/→ e)([a] ++ x) = (⊕ -/→ (e ⊕ a)) x
538 |
539 | (<< ←/- e)[1, 2, 3]
540 | = 1 << (<< ←/- e)[2, 3] (←/- .2)
541 | = 1 (<<.1)
542 | This can terminate after one step
543 |
544 | (>> -/→ e)[3, 2, 1]
545 | = (>> -/→ (e >> 3))[2, 1] (-/→ .2)
546 | = (>> -/→ 3)[2, 1] (>>.1)
547 | = (>> -/→ (3 >> 2))[1] (-/→ .2)
548 | = (>> -/→ 2)[1] (>>.1)
549 | = (>> -/→ ( 2 >> 1))[] (-/→ .2)
550 | = (>> -/→ 1)[] (>>.1)
551 | = 1 (-/→ .1)
552 |
553 | Whole of the list must be traversed
554 |
555 |
556 | (+ ←/- 0)[1, 2, 3]
557 | = 1 + (+ ←/- 0)[2, 3]
558 | = 1 + (2 + (+ ←/- 0)[3])
559 | = 1 + (2 + (3 + (+ ←/- 0)[]))
560 | = 1 + (2 + (3 + 0))
561 | = 6
562 |
563 | Linear space. Same size as the list that we started with.
564 |
565 | (+ -/→ 0)[1, 2, 3]
566 | = (+ -/→ (0 + 1))[2, 3]
567 | { We can evaluate the answer now reducing the size of the list }
568 | = (+ -/→ 1)[2, 3]
569 | = (+ -/→ (1 + 2))[3]
570 | = (+ -/→ 3)[3]
571 | = (+ -/→ (3 + 3))[]
572 | = (+ -/→ 6)[]
573 | = 6
574 | Constant space
575 |
576 | Conclusion
577 |
578 | Use (⊕ -/→ e) when ⊕ is strict, in the sense that it requires the evaluation of both arguments to return the result. E.g.: + × ↑ ↓
579 |
580 | Use (⊕ ←/- e) when ⊕ is non-strict, that is, it does not always demand the complete evaluation of both left and right arguments to return the result.
581 |
582 | e.g. and or <<
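This conclusion maps onto Haskell's `foldr` (right-reduce) vs. `foldl'` (strict left-reduce), illustrated with my own examples:

```haskell
import Data.List (foldl')

-- Non-strict ⊕ with foldr: can stop early, even on an infinite
-- list, like << in the notes.
shortCircuit :: Bool
shortCircuit = foldr (&&) True (False : repeat True)

-- Strict ⊕ with foldl': accumulates in constant space.
total :: Int
total = foldl' (+) 0 [1 .. 1000]
```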
583 |
--------------------------------------------------------------------------------
/how-to-learn.org:
--------------------------------------------------------------------------------
1 | * How to learn these ideas
2 |
3 | Learning hard concepts in Computer Science can take a lot of time. Along with reading the literature, grounding these ideas by working with them in an interactive editor would be of great aid to the learning process.
4 |
5 | ** Prerequisites
6 |
7 | *** Learning interactively
8 | One of the steps that helped me was learning to program in a language with recursion. My previous exposure was to languages where a procedural/iterative mode of organization is preferred, like JavaScript and Ruby; switching to Scheme by learning [[https://github.com/prathyvsh/htdp][HtDP]] helped me a lot. Once I had a working knowledge of how computational structures can be built up and processed recursively, I had a foundation to ground the abstract ideas I learnt and to try them out in a development environment.
9 |
10 | *** Dealing with the barebones concepts
11 |
12 | Once this was a strength, the next thing that was of immense help was learning Lambda Calculus. It is a language with just three primitives (variables, abstraction, application) that has deep implications for the mathematical side of programming languages. I have put together an (incomplete) guide to [[https://github.com/prathyvsh/lambda-calculus][learning it here]] and run a [[https://twitter.com/prathyvsh/status/1188787773441888257][Twitter thread that captures the beautiful resources]] I find along the way.
13 |
14 | Once you have understood Lambda Calculus and are fluent in the notation used to talk about it, you will have a good starting point to explore the literature further. With enough practice, you will be able to start understanding the ideas being discussed and what import they have for daily programming practice. I am at this stage, and it is leading me in different directions: logic, algebra, type theory, geometry, analysis, number theory, category theory, and the relevant literature that you see linked here.
15 |
16 | ** On Mathematics required to study the subjects
17 |
18 | Mathematics is a scary thing for most programmers, and I think for the right reasons. The dense symbolic paradigm that current mathematics is enmeshed in is only accessible to experts trained in this runespeak, and even then only after a good deal of time working with the symbols, understanding what they mean and how they can be used together to signify things and make findings tractable. While the symbolic setup has helped us hugely in advancing the sciences, there’s something to be said here about surface structure and deep structure.
19 |
20 | Mathematical symbology can be thought of as the surface structure of what you learn, but it signifies the deep structures of various mathematical constructs. When you algebraically write x = y, interpreted graphically, you have just described a line. In a similar fashion, most such surface descriptions have a semantic counterpart that mathematicians access in their mind’s eye. Getting access to this space is what I consider the most important thing.
21 |
22 | If this is where you are stuck, I would just like to let you know that it’s natural, and that mathematical notation could do a better job of dissemination by being interactive. One great resource here is [[http://immersivemath.com/][Immersive Math]] and another is the terrific videos by [[https://twitter.com/3blue1brown][3Blue1Brown]].
23 |
24 | What I think we need is something similar to that but for computing where the abstract algebraic and categorical structures are grounded in an interactive environment that will enable us to explore these structures. But that is something I’m working towards and one of the motives behind starting this study.
25 |
26 | So what I can do at this point is encourage you to go through those resources while keeping the semantic meaning of the ideas you learn in the back of your mind, and to slowly acquire a big picture of how these things fit together. Mathematical symbology can be thought of as a minimal programming syntax in which the ideas under study are couched. The ideas are thus stripped of distracting details, and you can focus on inspecting their different properties by working with the notation.
27 |
28 | ** Motivation and deep focus
29 |
30 | Motivation is central to learning this much material. One thing that drives me is the morphism of how one thing changes into another, which I think points at a universality: these concepts share an identity beneath the manifold vagaries of their appearances. Going beneath the surface, you witness parallels as you go deeper. To really understand this, you need to focus deeply on tough mathematical subjects for long periods of time. I have come to realize that mathematics is not just about learning ideas or propositions and applying them to solve problems; it is also, at the same time, working on yourself to improve your focus, get into flow, explore deeper ideas, and form conceptions that make the complex ideas tractable.
31 |
32 | There is something to be said about setting up the environment around you to make this easy to achieve, but I need to reflect on this part more to articulate how one might set up an environment conducive to learning tough concepts.
33 |
34 | ** Getting stuck
35 |
36 | A common observation during these studies is that you’ll often become confused, stuck, or even at a loss as to how to conceptualize a novel idea you have just come across. It is these points of inquiry that I think are very valuable in the learning process, and what you do when you get stuck determines the learning outcome.
37 |
38 | When I come across a concept I can’t understand, I constantly have recourse to other resources on the same material that give me a different perspective on the subject. This gives multiple viewpoints on the object of study, and it really helps in connecting ideas together, giving a richer conception of the import of the ideas being studied. In many cases this has given me the motivation to explore related material. If you then go back to the stumbling point, it makes much more sense, and you are able to see it within the richer network of concepts that gives it meaning.
39 |
40 | I hope that helps in giving a working base for people who are intrigued by these connections. Feel free to touch base with me on [[https://twitter.com/prathyvsh][Twitter]] if you come across much difficulty when learning.
41 |
--------------------------------------------------------------------------------
/img/birkhoff-universal-algebra.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/birkhoff-universal-algebra.png
--------------------------------------------------------------------------------
/img/christopher-strachey.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/christopher-strachey.png
--------------------------------------------------------------------------------
/img/dana-scott.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/dana-scott.png
--------------------------------------------------------------------------------
/img/eugenio-moggi.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/eugenio-moggi.png
--------------------------------------------------------------------------------
/img/garrett-birkhoff.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/garrett-birkhoff.png
--------------------------------------------------------------------------------
/img/john-reynolds.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/john-reynolds.png
--------------------------------------------------------------------------------
/img/nn-types-fp.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/nn-types-fp.png
--------------------------------------------------------------------------------
/img/peter-landin.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/peter-landin.png
--------------------------------------------------------------------------------
/img/robin-milner.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/robin-milner.png
--------------------------------------------------------------------------------
/img/roger-godement.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/roger-godement.png
--------------------------------------------------------------------------------
/img/samuel-eilenberg.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/samuel-eilenberg.png
--------------------------------------------------------------------------------
/img/saunders-maclane.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/saunders-maclane.png
--------------------------------------------------------------------------------
/img/tony-hoare.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/tony-hoare.png
--------------------------------------------------------------------------------
/img/universal-co-algebra-chart.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/universal-co-algebra-chart.png
--------------------------------------------------------------------------------
/img/øysten-ore.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/prathyvsh/morphisms-of-computational-structures/028bd57d59895afaaf4e61d3f9feafb58a8843a1/img/øysten-ore.png
--------------------------------------------------------------------------------
/journal.org:
--------------------------------------------------------------------------------
1 | ** Early pointers
2 |
3 | Some of these connections were what drew my attention to the fact that there are morphisms across different control constructs.
4 |
5 | *** Continuations vs. Lazy Evaluation
6 | Chris Okasaki has found the link between [[https://link.springer.com/article/10.1007/BF01019945][Lazy Evaluation / Call by Need and Continuations]].
7 |
8 | *** Closures vs. Actors
9 | Closures — Actor isomorphism was demonstrated by Guy Steele/Dan Friedman but was [[https://arxiv.org/vc/arxiv/papers/1008/1008.1459v8.pdf][rejected by Hewitt]].
10 |
11 | *** Montague Quantification vs. Continuations
12 | Natural language exhibiting continuations is described by Barker [[https://www.cs.bham.ac.uk/~hxt/cw04/barker.pdf][here]] and in [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.454.8690&rep=rep1&type=pdf][this book]].
13 |
14 | *** Initial/Final (Co)Algebras
15 |
16 | *** September 25
17 |
18 | Learnt that combinational logic is one level lower than regular languages in that it doesn’t need memory. There is a finite set of combinations rather than a set of rules that has to recall what came before. I think this difference is apparent only at the hardware level. I need to poke deeper to understand if there’s a mathematically tractable divide between these two levels: combinational logic and regular languages. The similarity I feel is that both have a set of preset rules which they carry in memory.
19 |
20 | *** September 26
21 |
22 | There seems to be a link between sheaves, presheaves, and computation. I think I need to ground my understanding of how topology and algebra are related, to see exactly how this relationship is structured. I need to start collecting resources towards this end here. Some details can be found here: https://en.wikipedia.org/wiki/Computable_topology
23 |
24 | Learnt about [[https://en.wikipedia.org/wiki/B%C3%B6hm_tree][Böhm trees]]. Need to understand their relationship to topology of Lambda Calculus.
25 |
26 | Noson Yanofsky seems to be treading similar space: http://www.sci.brooklyn.cuny.edu/~noson/MCbeginning.pdf
27 | There are a lot of neat diagrams here that show how different ideas are related in a Pascal-esque esprit de géométrie.
28 |
--------------------------------------------------------------------------------
/notes-on-history-of-interactive-theorem-proving.org:
--------------------------------------------------------------------------------
1 | * History of Interactive Theorem Proving
2 | John Harrison, Josef Urban, Freek Wiedijk (2014)
3 |
4 |
5 | Interactive Theorem Proving
6 | An arrangement where the machine and a human user work together interactively to produce a formal proof.
7 |
8 | Possibilities:
9 |
10 | Computer as a checker of a formal proof produced by a human
11 | Prover may be highly automated and powerful
12 |
13 | Most of the earliest work on computer-assisted proof in the 1950s
14 |
15 | A computer program for Presburger’s algorithm
16 | Davis 1957
17 |
18 | A proof method for quantification theory: Its justification and realization
19 | Gilmore 1960
20 |
21 | A computer procedure for quantification theory
22 | Davis and Putnam 1960
23 |
24 | Toward Mechanical Mathematics
25 | Wang 1960
26 |
27 | A mechanical proof procedure and its realization in an electronic computer
28 | Dag Prawitz, Håkan Prawitz, Neri Voghera 1960
29 |
30 | and 1960s
31 |
32 | A machine-oriented logic based on the resolution principle
33 | Robinson 1965
34 |
35 | An inverse method of establishing deducibility in classical predicate calculus
36 | Maslov 1964
37 |
38 | Mechanical theorem-proving by model elimination
39 | Loveland 1968
40 |
41 | was dedicated to automated theorem proving.
42 |
43 | AI-style approaches:
44 |
45 | The logic theory machine
46 | Newell and Simon 1956
47 |
48 | Realization of a geometry-theorem proving machine
49 | Gelernter 1959
50 |
51 | Some automatic proofs in analysis
52 | Bledsoe 1984
53 |
54 | The work of proof in the age of human-machine collaboration
55 | Dick 2011
56 |
57 |
58 | Proofchecker
59 | Paul Abrahams
60 | Machine Verification of Mathematical Proof (1963)
61 |
62 | Paul Abrahams introduced in embryonic form many ideas that became significant later:
63 |
64 | - a kind of macro facility for derived inference rules
65 | - the integration of calculational derivations as well as natural deduction rules
66 |
67 | Automatic theorem proof-checking in set theory: A preliminary report — Bledsoe and Gilbert 1967 was inspired by Bledsoe’s interest in formalizing the already unusually formal proofs in his PhD adviser A. P. Morse’s ‘Set Theory’.
--------------------------------------------------------------------------------
/papers-read.org:
--------------------------------------------------------------------------------
1 | * [[https://mdpi-res.com/d_attachment/philosophies/philosophies-03-00015/article_deploy/philosophies-03-00015.pdf][The Algebraic View of Computation: Implementation, Interpretation and Time]]
2 | Attila Egri-Nagy
3 | 2018
4 | 14 Pages
5 |
6 | Computational implementations are special relations between what is computed and what computes it.
7 |
8 | A chain of emulation is ultimately grounded in an algebraic object, a full transformation semigroup.
9 |
10 | Mathematically, emulation is defined by structure preserving maps (morphisms) between semigroups.
11 |
12 | In contrast, interpretations are general functions with no morphic properties. They can be used to derive semantic content from computations.
13 |
14 | Hierarchical structure imposed on a computational structure plays a similar semantic role.
15 |
16 | Beyond bringing precision into the investigation, the algebraic approach also sheds light on the interplay between time and computation.
17 |
18 | It is argued that for doing a philosophical investigation of computation, semigroup theory provides the right framework.
19 |
20 | ** 1. Semigroup — Composition Table of Computations
21 |
22 | Roughly speaking, computation is a dynamical process governed by some rules.
23 |
24 | *** 1.1 Generalizing Traditional Models of Computation
25 |
26 | When we limit the tape or the number of calculation steps to a finite number, the halting problem ceases to be undecidable.
27 |
28 | We restrict the Turing machine to a finite memory capacity; such a restricted Turing machine is a finite state automaton.
29 |
30 | Thought: Why isn’t it mapped to PDA or a 2-stack machine?
31 |
32 | *** Definition of Semigroups
33 |
34 | xy = z
35 |
36 | We can say that x combined with y results in z.
37 |
38 | One interpretation is that two sequential events x and y happen, resulting in the event z.
39 |
40 | Alternatively, x can be an input and y a function, with xy denoting function application, usually written y(x).
41 |
42 | Or, the same idea with different terminology, x is a state and y is a state-transition operator. This explains how to compute.
43 |
44 | Semigroup: a set with an associative binary operation; the composition is often called multiplication.
45 |
46 | Principle 1: State-Event Abstraction
47 |
48 | We can identify an event with its resulting state: state x is where we end up when event x happens, relative to a ground state. The ground state, in turn, corresponds to a neutral event that does not change any state.
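The composition-table view can be checked mechanically. A minimal sketch, assuming a hypothetical three-event system with the right-zero rule xy = y (my example, not the paper's):

#+BEGIN_SRC python
# A toy composition table: the "right-zero" rule xy = y on three events.
# The event names are illustrative, not from the paper.
events = ["a", "b", "c"]
table = {(x, y): y for x in events for y in events}

def compose(x, y):
    return table[(x, y)]

# Semigroup check: (xy)z == x(yz) must hold for every triple.
assert all(
    compose(compose(x, y), z) == compose(x, compose(y, z))
    for x in events for y in events for z in events
)
#+END_SRC

Any finite table passing this triple-nested check is a semigroup.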
49 |
50 | #+BEGIN_QUOTE
51 | Numbers measure size, groups measure symmetry.
52 |
53 | M. Armstrong, Groups and Symmetry (1988)
54 | #+END_QUOTE
55 |
56 | The general idea of measuring is that we assign a mathematical object to the thing to be measured. In the everyday sense, the assigned object is a number, but for measuring more complicated properties, we need more structured objects. In this sense, we say that semigroups measure computation. Turing completeness is just a binary measure, hence the need for a more fine-grained scale. Thus, we can ask questions like “What is the smallest semigroup that can represent a particular computation?”
57 |
58 | *** 1.3 Computation as Accretion of Structure
59 |
60 | The idea of accretion has a clear algebraic description: the set of generators. These are elements of a semigroup whose combinations can generate the whole table.
61 |
62 | ** 2. Computation and Time
63 |
64 | *** 2.1 Different Times
65 |
66 | An extreme difference between logical time and actual time is that computation can be suspended.
67 |
68 | *** 2.2 Not Enough Time
69 |
70 | If there were infinite time available for computation, or equivalently, infinitely fast computers, we would not need to do any programming at all, brute-force would suffice (time travel even changes computability). A possible shortcut to the future would render the P vs. NP problem meaningless with enormous consequences.
71 |
72 | *** 2.3 Timeless Computation?
73 |
74 | Memory space can often be exchanged for time, and the limit of this process is the lookup table, the precomputed result. Taking the abstraction process to its extreme, we can replace the 2D composition table with a 1D lookup table with keys as pairs (x,y) and values xy.
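The 2D-to-1D move can be sketched directly; the semigroup here (addition mod 3) is an illustrative choice, not one from the paper:

#+BEGIN_SRC python
# Flattening a composition table into a lookup table:
# keys are pairs (x, y), values are the composites xy.
elements = [0, 1, 2]
lookup = {(x, y): (x + y) % 3 for x in elements for y in elements}

# "Computing" is now nothing but association of keys to values.
assert lookup[(1, 2)] == 0
assert lookup[(2, 2)] == 1
#+END_SRC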
75 |
76 | Computation, completely stripped of any locality the process may have, is just an association of keys to values.
77 |
78 | We can talk about static computational structures, composition tables, and we can also talk about computational processes, sequences of events tracing a path in the composition table.
79 |
80 | If computation is information processing, then information is frozen computation.
81 |
82 | For a Rubik’s Cube, one can specify a particular state either by describing the positions and orientations of the cubelets or by giving a sequence of moves from the initial ordered configuration. This is another example of the state-event abstraction (Principle 1).
83 |
84 | ** 3. Homomorphism — The Algebraic Notion of Implementation
85 |
86 | “A physical system implements a given computation when the causal structure of the physical system mirrors the formal structure of the computation.”
87 |
88 | *** 3.1 Homomorphisms
89 | Homomorphism is a knowledge extension tool: we can apply knowledge about one system to another. It is a way to predict outcomes of events in one dynamical system based on what we know about what happens in another one, given that a homomorphic relationship has been established. This is also a general trick for problem solving, widely used in mathematics. When obtaining a solution is not feasible in one problem domain, we can use easier operations by transferring the problem to another domain, assuming that we can move between the domains with structure preserving maps.
90 |
91 | What does it mean for computational structures to be in a homomorphic relationship? Using the composition table definition, we can now define their structure preserving maps. If in a system S event x combined with event y yields the event z = xy, then by a homomorphism φ : S → T, in another system T the outcome of φ(x) combined with φ(y) is bound to be φ(z) = φ(xy), so the following equation holds
92 |
93 | φ(xy) = φ(x)φ(y)
94 |
95 | On the left hand side, composition happens in S, while on the right hand side composition is done in T.
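A minimal sketch of the equation, with S, T and φ chosen for illustration (addition mod 4, addition mod 2, and the parity map; none of these specifics are from the paper):

#+BEGIN_SRC python
# φ : Z4 → Z2, the parity map, is a structure-forgetting homomorphism.
def phi(x):
    return x % 2

for x in range(4):
    for y in range(4):
        # composition in S on the left, composition in T on the right
        assert phi((x + y) % 4) == (phi(x) + phi(y)) % 2
#+END_SRC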
96 |
97 | A distinguished class of homomorphisms is isomorphisms, where the correspondence is one-to-one. In other words, isomorphisms are strictly structure preserving, while homomorphisms can be structure forgetting, down to the extreme of mapping everything to a single state and to the identity operation. The technical details can be complicated due to the clustering of states (surjective homomorphisms) and, by the need of turning homomorphisms around, we also consider homomorphic relations.
98 |
99 | By turning around implementations we can define computational models. We can say that a physical system implements an abstract computer, or we can say that the abstract computer is a computational model of the physical system.
100 |
101 | *** 3.2 Computers as Physical systems
102 |
103 | Definition 1 (vague). Computers are physical systems that are homomorphic images of computational structures (semigroups).
104 |
105 | The first definition raises the question: how can a physical system be an image of a homomorphism, i.e., a semigroup itself? How can we cross the boundary between the mathematical realm and external reality? First, there is an easy but hypothetical answer. According to the Mathematical Universe Hypothesis, all physical systems are mathematical structures, so we never actually leave the mathematical realm.
106 |
107 | Secondly, the implementation relation can be turned around. Implementation and modelling are the two directions of the same isomorphic relation. If T implements S, then S is a computational model of T. Again, we stay in the mathematical realm; we just need to study mappings between semigroups.
108 |
109 | Definition 2. Computers are physical systems whose computational models are homomorphic images of semigroups.
110 |
111 | This definition of computers is orthogonal to the problem of whether mathematics is an approximation or a perfect description of a physical reality, and the definition does not depend on how physical systems are characterized.
112 |
113 | Biological systems are also good candidates for hosting computation, since they are already doing some information processing. However, it is radically different from digital computation. The computation in digital computers is like toppling dominoes, a single sequence of chain reactions of bit-flips. Biological computation is done in a massively parallel way (e.g., all over a cell), more in a statistical mode.
114 |
115 | *** 3.3 Difficulty in Programming
116 |
117 | *** 3.4 Interpretations
118 |
119 | Computational implementation is a homomorphism, while an arbitrary function with no homomorphic properties is an interpretation. We can take a computational structure and assign freely some meaning to its elements, which we call the semantic content. Interpretations look more powerful since they can bypass limitations imposed by the morphic nature of implementations. However, since they are not necessarily structure preserving, the knowledge transfer is just one way. Changes in the underlying system may not be meaningful in the target system. If we ask a new question, then we have to devise a new encoding for the possible solutions.
120 |
121 | For instance, a reversible system can carry out irreversible computation by a carefully chosen output encoding. A morphism can only produce reversible systems out of reversible systems. This in turn demonstrates that today's computers are not based on the reversible laws of physics. From the logic gates up, we have proper morphic relations, but the gates themselves are not homomorphic images of the underlying physics. When destroying information, computers dissipate heat. Whether we can implement group computation with reversible transformations and hook on a non-homomorphic function to extract semantic content is an open engineering problem. In essence, the problem of reversible computation implementing programs with memory erasure is similar to trying to explain the arrow of time arising from the symmetrical laws of physics.
122 |
123 | Throwing computing into reverse (2017) — M. P. Frank
124 |
125 | ** 4. High-Level Structure: Hierarchies
126 |
127 | Composition and lookup tables are the “ultimate reality” of computation, but they are not adequate descriptions of practical computing. The low-level process in a digital computer, the systematic bit flips in a vast array of memory, is not very meaningful. The usefulness of a computation is expressed at several hierarchical layers above (e.g., computer architecture, operating system, and end user applications).
128 |
129 | A semigroup is seldom just a flat structure; its elements may have different roles. For example, if xy = z but yx = y (assuming x ≠ y ≠ z), then we say that x has no effect on y (leaves it fixed) while y turns x into z. There is an asymmetric relationship between x and y: y can influence x but not the other way around. This unidirectional influence gives rise to hierarchical structures. It is actually better than that. According to Krohn-Rhodes theory, every automaton can be emulated by a hierarchical combination of simpler automata. This is true even for inherently non-hierarchical automata built with feedback loops between their components. It is a surprising result of algebraic automata theory that recurrent networks can be rolled out into one-way hierarchies. These hierarchies can be thought of as easy-to-use cognitive tools for understanding complex systems. They also give a framework for quantifying biological complexity.
130 |
131 | The simpler components of a hierarchical decomposition are roughly of two kinds. Groups correspond to reversible computation. Groups are also associated with isomorphisms (due to the existence of unique inverses), so their computation can also be viewed as pure data conversion. Semigroups which fail to be groups contain some irreversible computation, i.e., destructive memory storage.
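The irreversible kind can be illustrated by the flip-flop, the one-bit destructive memory of Krohn-Rhodes theory; the tuple encoding of transformations below is my own convention:

#+BEGIN_SRC python
# Transformations of the state set {0, 1} encoded as tuples: t sends state i to t[i].
identity = (0, 1)  # read: leaves the bit alone
set0 = (0, 0)      # destructively store 0
set1 = (1, 1)      # destructively store 1

def is_invertible(t):
    # injective on a finite set, hence bijective, hence a group element
    return len(set(t)) == len(t)

assert is_invertible(identity)
# The resets collapse both states into one: no inverse, irreversible computation.
assert not is_invertible(set0) and not is_invertible(set1)
#+END_SRC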
132 |
133 | ** 5. Wild Considerations
134 |
135 | The question of whether cognition is computational or not, might be the same as the question of whether mathematics is a perfect description of physical reality or is just an approximation of it. If it is just an approximation, then there is a possibility that cognition resides in physical properties that are left out.
136 |
137 | A recurring question in philosophical conversations is the possibility of the same physical system realizing two different minds simultaneously. Let’s say n is the threshold number of states for being a mind: you need at least n states in a computational structure to realize one. Then suppose there is more than one way to produce a mind with n states, so the corresponding full transformation semigroup T_n can have subsemigroups corresponding to several minds. We then need a physical system to implement T_n. Now, it is a possibility to have different embeddings into the same system, so the algebra might allow two minds coexisting in the same physical system. However, it also implies that most subsystems will be shared, or we need a bigger host with at least 2n states. If everything is shared, the embeddings can still be different, but then a symmetry operation could take one mind into the other. This is how far mathematics can go in answering the question. For scientific investigation, these questions are still out of scope. Simpler questions about computation form a research program. For instance: What is the minimum number of states to implement a self-referential system? More generally, what are the minimal implementations of certain functionalities? How many computational solutions are there for the same problem? These are engineering problems, but solutions to them may shed light on the more difficult questions about the possibilities and constraints of embedding cognition and consciousness into computers.
138 |
139 | * 6. Summary
140 |
141 | In the opinion of the author, philosophy should be ahead of mathematics, as it deals with more difficult questions, and it should not be bogged down by the shortcomings of terminology. In philosophy, we want to deal with the inherent complexity of the problem, not the incidental complexity caused by the chosen tool. The algebraic viewpoint provides a solid base for further philosophical investigations of the computational phenomena.
142 |
143 | * Finite Computational Structures and Implementations
144 | Attila Egri-Nagy
145 | 2016
146 |
147 | 12 Pages
148 |
149 | ** Abstract
150 |
151 | What is computable with limited resources?
152 | How can we verify the correctness of computations?
153 | How to measure computational power with precision?
154 |
155 | Despite the immense scientific and engineering progress in computing, we still have only partial answers to these questions. In order to make these problems more precise, we describe an abstract algebraic definition of classical computation, generalizing traditional models to semigroups.
156 |
157 | ** 1. Introduction
158 |
159 | The exponential growth of the computing power of hardware (colloquially known as Moore’s Law) seems to have ended by reaching its physical and economic limits. In order to keep up technological development, producing more mathematical knowledge about digital computation is crucial for improving the efficiency of software.
160 |
161 | ** 2. Computational Structures
162 |
163 | Our computers are physical devices and the theory of computation is abstracted from physical processes. Mathematical models clearly define the notion of computation, but mapping the abstract computation back to the physical realm is often considered problematic. It is argued that structure-preserving maps between computations work from one mathematical model to another just as well as from the abstract to the concrete physical implementation, easily crossing any ontological borderline one might assume between the two.
164 |
165 | Since abstract algebra provides the required tools, a suggestion is made for further abstraction of the models of computation to reach the algebraic level, safe for discussing implementations. It is also suitable for capturing the hierarchical structure of computers. Finiteness and the abstract algebraic approach paint a picture where universal computation becomes relative and the ‘mathematical versus physical’ distinction less important.
166 |
167 | First, an attempt is made to define computations and implementations purely as abstract functions; the need for combining functions then leads to the definition of computational structures.
168 |
169 | *** 2.1 Computation as a function: input-output mappings
170 |
171 | Starting from the human perspective, computation is a tool. We want a solution for some problem: the input is the question, the output is the answer. Formally, the input-output pairs are represented as a function f : X → Y, and computation is function evaluation f(x) = y, x ∈ X, y ∈ Y. As an implementation of f, we need a dynamical system whose behaviour can be modelled by another function g, which is essentially the same as f.
172 |
173 | Computation can be described by a mapping
174 | f : {0, 1}^m → {0, 1}^n,  m, n ∈ ℕ
175 |
176 | According to the fundamental theorem of reversible computing, any finite function can be computed by an invertible function. This apparently contradicts the idea of implementation, that important properties of functions have to be preserved.
177 |
178 | #+BEGIN_SRC
179 |
180 | Embedding XOR:
181 |
182 | [00] ↦ [0]0
183 | [01] ↦ [1]1
184 | [10] ↦ [1]0
185 | [11] ↦ [0]1
186 |
187 | Embedding FAN-OUT:
188 |
189 | [0]0 ↦ [00]
190 | 0[1] ↦ [11]
191 | 10 ↦ 10
192 | 11 ↦ 01
193 |
194 | #+END_SRC
195 |
196 | into the same bijective function. By putting information into the abstract elements, any function can ‘piggyback’ even on the identity function.
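The XOR embedding above can be read as the bijection (a, b) ↦ (a⊕b, b); a quick check (this reading of the bracket notation is my interpretation):

#+BEGIN_SRC python
# (a, b) ↦ (a XOR b, b): the first output bit is the XOR result,
# and keeping b makes the whole map invertible (it is its own inverse).
def emb(a, b):
    return (a ^ b, b)

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
assert len({emb(a, b) for a, b in pairs}) == 4           # bijective
assert all(emb(a, b)[0] == a ^ b for a, b in pairs)      # computes XOR
assert all(emb(*emb(a, b)) == (a, b) for a, b in pairs)  # self-inverse
#+END_SRC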
197 |
198 | These ‘tricks’ work by composing the actual computations with special input and output functions that might have different properties. In reversible computing, the readout operation may not be a reversible function.
199 |
200 | *** 2.2 Computation as a process: state transitions
201 |
202 | Focussing on the process view, what is the most basic unit of computation? A state transition: an event changes the current state of a system. A state is defined by a configuration of a system’s components, or some other distinctive properties the system may have.
203 |
204 | Let’s say the current state is x, and then event s happens, changing the state to y. We might write y = s(x), emphasizing that s is a function, but it is better to write xs = y, meaning that s happens in state x yielding state y. Why? Because combining events as they happen one after the other, e.g. xst = z, is easier to read following our left-to-right convention.
205 |
206 | State-event abstraction: We can identify an event with its resulting state: state x is where we end up when event x happens.
207 |
208 | According to the action interpretation, xs = y can be understood as event s changes the current state x to the next state y. But xs = y can also be read as event x combined with event s yields the composite event y, the event algebra interpretation. We can combine abstract events into longer sequences. These can also be considered as a sequence of instructions, i.e. an algorithm. These sequences of events should have the property of associativity
209 |
210 | (xy)z = x(yz) for all abstract events x, y, z
211 |
212 | since a given sequence xyz should be a well-defined algorithm.
213 |
214 | We can put all event combinations into a table. These are the rules describing how to combine any two events.
215 |
216 | Computational Structure: A finite set X and a rule for combining elements of X that assigns a value x' for each two-element sequence, written as xy = x', is a computational structure if (xy)z = x(yz) for all x, y, z ∈ X.
217 |
218 | In mathematics, a set closed under an associative binary operation is an abstract algebraic structure called a semigroup.
219 |
220 | Computation is a process in time, an obvious assumption, since most of engineering and complexity studies are about doing computation in shorter time. Combining two events yields a third one (which can be the same), and we can continue combining them to form an ordered sequence of events. This ordering may be referred to as time. However, at the abstraction level of the state transition table, time is not essential. The table implicitly describes all possible sequences of events; it defines the rules for combining any two events, but it is a timeless entity. This is similar to some controversial ideas in theoretical physics, such as “The End of Time” by Julian Barbour of Shape Dynamics fame.
221 |
222 | *** 2.3 The computation spectrum
223 |
224 | How are the function and the process view of computation related? They are actually the same. Given a computable function, we can construct a computational structure capable of computing the function. An algorithm (a sequence of state transition events) takes an initial state (encoded input) into a final state (encoded output). The simplest way to achieve this is by a lookup table.
225 |
226 | Lookup table semigroup: Let f : X → Y be a function, where X ∩ Y = ∅. Then the semigroup S = X ∪ Y ∪ {l} consists of resets X ∪ Y and the lookup operation l defined by xl = y if f(x) = y for all x ∈ X and ul = u for all u ∈ S \ X.
227 |
228 | Is it associative? Let v ∈ X ∪ Y be an arbitrary reset element, and s, t ∈ S any elements. Since the rightmost event is a reset, we have (st)v = v and s(tv) = sv = v. Also (sv)l = vl = s(vl), since vl is itself a reset. Finally (vl)l = vl, since l does not change anything in S \ X, and v(ll) = vl, since l is idempotent (ll = l). Separating the domain and the codomain of f is crucial; for a function X → X we can simply take a separate copy of the elements of X. When trying to make the construction more economical, associativity may not be easy to achieve.
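The construction and its associativity can also be verified by brute force; the concrete X, Y and f below are illustrative stand-ins:

#+BEGIN_SRC python
# Lookup table semigroup S = X ∪ Y ∪ {l} for f : X → Y with X, Y disjoint.
f = {"x1": "y1", "x2": "y2"}
X, Y = set(f), set(f.values())
S = X | Y | {"l"}

def mul(u, v):
    if v != "l":
        return v          # v is a reset: uv = v
    return f.get(u, u)    # ul = f(u) for u in X, otherwise ul = u (so ll = l)

# Brute-force associativity check over all triples.
assert all(mul(mul(u, v), w) == mul(u, mul(v, w))
           for u in S for v in S for w in S)
#+END_SRC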
229 |
230 | Turning a computational structure into a function is also easy. Pick any algorithm (a composite event), and that is also a function from states to states.
231 |
232 | Information storage and retrieval are forms of computation. By the same token computation can be considered as a general form of information storage and retrieval, where looking up the required piece of data may need many steps. We can say that if computation is information processing, then information is frozen computation.
233 |
234 | *** 2.4 Traditional mathematical models of computation
235 |
236 | From finite state automata, we abstract away the initial and accepting states. Those special states are needed only for the important application of recognizing languages. Input symbols of a finite state automaton are fully defined transformations (total functions) of its state set.
237 |
238 | A transformation is a function f : X → X from a set to itself, and a transformation semigroup (X, S) of degree n is a collection S of transformations of an n-element set closed under function composition.
239 |
240 | If we focus on the possible state transitions of a finite state automaton only, we get a transformation semigroup with a generator set corresponding to the input symbols. These semigroups are very special representations of abstract semigroups, where state transition is realized by composing functions.
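Generating such a transformation semigroup from a set of input symbols can be sketched by closing a generator set under composition; the generators below are an arbitrary choice, not taken from the paper:

#+BEGIN_SRC python
from itertools import product

# Transformations of {0, 1, 2} as tuples: t sends state i to t[i].
def compose(s, t):
    return tuple(t[s[i]] for i in range(len(s)))  # first s, then t (left to right)

def generate(gens):
    sg = set(gens)
    while True:
        new = {compose(a, b) for a, b in product(sg, sg)} - sg
        if not new:
            return sg
        sg |= new

cycle = (1, 2, 0)     # a permutation (reversible)
collapse = (0, 0, 2)  # merges states 0 and 1 (irreversible)
T = generate({cycle, collapse})
assert all(compose(a, b) in T for a in T for b in T)  # closed under composition
#+END_SRC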
243 |
244 | In general, if we take the models of computation that describe the detailed dynamics of computation, and remove all the model specific decorations, we get a semigroup.
245 |
246 | *** 2.5 Computers: physical realizations of computation
247 |
248 | Intuitively, a computer is a physical system whose dynamics at some level can be described as a computational structure. For any equation xy = z in the computational structure, we should be able to induce in the physical system an event corresponding to x and another one corresponding to y such that their overall effect corresponds to z. Algebraically, this special correspondence is a structure-preserving map, a homomorphism. If we want exact realizations, not just approximations, then we need stricter one-to-one mappings, isomorphisms. However, for computational structures we need to use relations instead of functions.
249 |
250 | Isomorphic relations of computational structures: Let S and T be computational structures (semigroups). A relation ɸ : S → T is an isomorphic relation if it is:
251 |
252 | 1) homomorphic: ɸ(s)ɸ(t) ⊆ ɸ(st)
253 | 2) fully defined: ɸ(s) ≠ ∅ for all s ∈ S
254 | 3) lossless: ɸ(s) ∩ ɸ(t) ≠ ∅ ⇒ s = t
255 | for all s, t ∈ S. We also say that T emulates, or implements S.
256 |
257 | Homomorphic is the key property; it ensures that similar computation is done in T by matching individual state transitions. Here ɸ(s) and ɸ(t) are subsets of T (not just single elements), and ɸ(s)ɸ(t) denotes all possible state transitions induced by these subsets (the element-wise product of two sets). Fully defined means that we assign some state(s) of T to every element of S, so we account for everything that can happen in S. In general, homomorphic maps are structure-forgetting, since we can map several states to a single one. Being lossless excludes losing information about state transitions. In semigroup theory, isomorphic relations are called divisions, a special type of relational morphisms.
258 |
259 | What happens if we turn an implementation around? It becomes a computational model.
260 |
261 | Modelling of computational structures. Let S and T be computational structures (semigroups). A function µ : T → S is a modelling if it is:
262 |
263 | 1) homomorphism µ(u)µ(v) = µ(uv) for all u, v ∈ T,
264 | 2) onto: for all s ∈ S there exists a u ∈ T such that µ(u) = s.
265 |
266 | We also say that S is a computational model of T. In algebra, functions of this kind are called surjective homomorphisms.
267 |
268 | A modelling is a function, so it is fully defined. A modelling µ turned around, µ^-1, is an implementation, and an implementation ɸ turned around is a modelling ɸ^-1. This is an asymmetric relation: naturally, we assume that a model of any system is smaller in some sense than the system itself. Also, to implement a computational structure completely we need another structure at least as big.
269 |
270 | According to the mathematical universe hypothesis, we have nothing more to do, since we covered mappings from one mathematical structure to another. In practice, we do seem to have a distinction between mathematical models of computation and actual computers, since abstract models by definition are abstracted away from reality; they do not have any inherent dynamical force to carry out state transitions. Even pen and paper calculations require a driving force: the human hand and the pattern matching capabilities of the brain. But we can apply a simple strategy: we treat a physical system as a mathematical structure, regardless of its ontological status. Building a computer then becomes the task of constructing an isomorphic relation.
271 |
272 | Computer: An implementation of a computational structure by a physical system is a computer.
273 |
274 | Anything that is capable of state transitions can be used for some computation. The question is how useful that computation is. We can always map the target system’s mathematical model onto itself; in this sense the cosmos can be considered a giant computer computing itself. However, this statement is a bit hollow, since we do not have a complete mathematical model of the universe. Every physical system computes, at least its own future states, but not everything does useful calculation, much as entropy is heat not available for useful work. The same way as steam and combustion engines exploit physical processes to process materials and move things around, computers exploit physical processes to transfer and transform information.
275 |
276 | *** 2.6 Hierarchical structure
277 |
278 | Huge state transition tables are not particularly useful to look at; they are like quark-level descriptions for trying to understand living organisms. Identifying substructures and their interactions is needed. Hierarchical levels of organization provide an important way to understand computers. Information flow is limited to one way only, along a partial order, thus enabling functional abstraction. According to Krohn-Rhodes theory, any computational structure can be built by using destructive memory storage and reversible computational structures in a hierarchical manner. The way the components are put together is the cascade product, which is a substructure of the algebraic wreath product. The distinction between reversible and irreversible is sharp: there is no way to embed state collapsing into a permutation structure. Reversible computing seems to contradict this. The trick there is to put information into states and then selectively read off partial information from the resulting states. This selection of the required information can be done by another computational structure: we can have a reversible computational structure on top, and one at the bottom that implements the readout, with many transitions in the reversible part between readouts. Reversible implementations may prove to be decisive in terms of the power efficiency of computers, but they do not erase the difference between reversible and irreversible computations.
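
The one-way information flow of the cascade product can be sketched in a few lines. This is a minimal illustration, not the theory's formal construction; all names (cascade_step, dependency, the mod-2 counter example) are mine.

```python
# Sketch of a cascade (hierarchical) product of two components.
# Information flows one way only: the top component never sees the
# bottom's state, while the bottom's transition may depend on the
# top's current state. All names here are illustrative.

def cascade_step(state, x, delta_top, delta_bot, dependency):
    """One transition of the cascade on input x.

    state = (a, b): top state a, bottom state b.
    delta_top(a, x): the top transition depends only on its own state and x.
    dependency(a, x): translates (top state, input) into an input for the bottom.
    delta_bot(b, y): the bottom transition.
    """
    a, b = state
    return (delta_top(a, x), delta_bot(b, dependency(a, x)))

# Example: the top counts input bits mod 2 (a reversible, permutation-like
# component); the bottom is a destructive memory cell that each step
# overwrites its content with the top's previous state.
delta_top = lambda a, x: (a + x) % 2
dependency = lambda a, x: a        # the bottom is driven by the top's state
delta_bot = lambda b, y: y         # destructive storage: overwrite

state = (0, 0)
for bit in [1, 1, 1, 0]:
    state = cascade_step(state, bit, delta_top, delta_bot, dependency)
# state is now (1, 1): the parity of all four bits on top,
# the parity before the last bit stored below
```

The point of the sketch is the shape of the dependency: the bottom's update takes the top's state as an extra input, never the other way around.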
279 |
280 | It is important to note that hierarchical decompositions are possible even when the computational structure is not hierarchical itself. One of the research directions is the study of how loopback systems can be understood in a hierarchical manner.
281 |
282 | *** 2.7 Universal computers
283 |
284 | What is the difference between a piece of rock and a silicon chip? They are made of the same material, but they have different computational power. The rock only computes its next state (its temperature, all the wiggling of its atoms), so the only computation we can map to it homomorphically is its own mathematical description, while the silicon chip admits other computational structures. General-purpose processors are homomorphic images of universal computers.
285 |
286 | ** 3. Open problems
287 |
288 | The main topics where further research needs to be done are:
289 |
290 | 1) exploring the space of possible computations
291 | 2) measuring finite computational power
292 | 3) computational correctness
293 |
294 | *** 3.1 What are the possible computational structures and implementations?
295 |
296 | Cataloguing and stocktaking are basic human activities for answering the question: what do we have, exactly? For the classification of computational structures and implementations, we need to explore the space of all computational structures and their implementations, starting from the small examples. Looking at those is the same as asking: what can we compute with limited resources? What is computable with n states? This is a complementary approach to computational complexity, where the emphasis is on the growth rate of resource requirements.
297 |
298 | *** 3.2 How to measure the amount of computational power?
299 |
300 |
301 | Given an abstract or physical computer, what computations can it perform? The algebraic description gives a clear definition of emulation, when one computer can do the job of some other computer. This is a crude form of measuring computational power, in the sense of the ‘at least as much as’ relation. This way computational power can be measured on a partial order (the lattice of all finite semigroups).
302 |
303 | The remaining problem is to bring some computation into the common-denominator semigroup form. For example, if we have a finite piece of a cellular automaton grid, what can we calculate with it? If the cellular automaton is universal and big enough, we might be able to fit in a universal Turing machine that would do the required computation. However, we might be able to run the computation directly on the cellular automaton instead of through a bulky construct.
304 |
305 | Extending the slogan “Numbers measure size, groups measure symmetry”, we can say that semigroups measure computation.
306 |
307 | *** 3.3 How can we trust computers?
308 |
309 | Computations can differ by:
310 |
311 | 1) having different intermediate results
312 | 2) applying different operations
313 | 3) having different modular structure
314 |
315 | ** 4. Conclusion
316 |
317 | * Finite Computational Structures and Implementations: Semigroups and Morphic Relations
318 |
319 | Attila Egri-Nagy
320 | 2017
321 |
322 | 18 pages
323 |
324 | ** 1 Introduction
325 |
326 | Computational complexity studies the asymptotic behaviour of algorithms. Complementing that, it is here suggested to focus on small theoretical computing devices and to study the possibilities of limited finite computations. Instead of asking what resources we need in order to solve bigger instances of a computational problem, we restrict the resources and ask what we can compute within those limitations. The practical benefit of such an approach is that we get to know the lowest number of states required to execute a given algorithm, and having the minimal examples is useful for low-level optimizations. Another example of such a reversed question is asking what the total set of all possible solutions for a problem is, instead of looking for the single right solution. The mathematical formalism turns this into a well-defined combinatorial question, and the payoff could be that we find solutions we had never thought of.
327 |
328 | ** 2 Computational Structures
329 |
330 | Since abstract algebra provides the required tools, we suggest further abstractions of the models of computation to reach an algebraic level suitable for discussing implementations. This level also captures the hierarchical structure of computers. Finiteness and the abstract algebraic approach paint a picture where universal computation becomes relative and the ‘mathematical versus physical’ distinction less important.
331 |
332 | *** 2.1 Computation as function: input-output mappings
333 |
334 | *** 2.2 Computation as a process: state transitions
335 |
336 | The composition table of a group (a semigroup with an identity and a unique inverse for each element) is always a Latin square.
337 |
338 | **** 2.2.1 The computation spectrum
339 |
340 | *** 2.3 Traditional mathematical models of computation
341 |
342 | A finite state automaton can be considered a discrete dynamical system.
343 |
344 | By a finite state automaton, we mean a triple (X, Σ, δ) where
345 |
346 | 1. X is the finite state set
347 | 2. Σ is the finite input alphabet
348 | 3. δ : X × Σ → X is the transition function
349 | How is this a definition of a semigroup? For each state x ∈ X an input symbol σ ∈ Σ gives the resulting state δ(X, σ) = (δ(x_1, σ), δ(x_2, σ), …, δ(x_n, σ)). Therefore, input symbols of a finite state automaton are fully defined transfomaitons (total functions) of its state set.
350 |
351 | A transformation is a function f : X → X from a set to itself, and a transformation semigroup (X, S) of degree n is a collection S of transformations of an n-element set closed under function composition.
352 |
353 | If we focus only on the possible state transitions of a finite state automaton, we get a transformation semigroup with a generator set corresponding to the input symbols. These semigroups are very special representations of abstract semigroups, where state transition is realized by composing functions. It turns out that any semigroup can be represented as a transformation semigroup (Cayley’s Theorem for semigroups).
354 |
355 | Transformation semigroup realization of the flip-flop semigroup. The set being transformed is simply X = { 0, 1 } also called the set of states.
356 |
357 | #+BEGIN_SRC
358 |
359 | s0 = { 0 ↦ 0, 1 ↦ 0 }
360 | s1 = { 0 ↦ 1, 1 ↦ 1 }
361 | r = { 0 ↦ 0, 1 ↦ 1 }
362 |
363 | #+END_SRC
364 |
365 | The events can be denoted algebraically by listing the images s0 = [0, 0], s1 = [1, 1], r = [0, 1].
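
As a sanity check, the flip-flop's three transformations are already closed under composition, so they form a semigroup by themselves. A sketch in the image-list notation used above:

```python
# The flip-flop's transformations on states {0, 1}, as image tuples:
s0 = (0, 0)   # set to 0, regardless of the current state
s1 = (1, 1)   # set to 1
r  = (0, 1)   # read: leave the state unchanged (identity)

def compose(f, g):
    """Apply f first, then g."""
    return tuple(g[f[x]] for x in range(len(f)))

# Close {s0, s1, r} under composition: no new elements appear,
# so the three transformations already form a semigroup.
elements = {s0, s1, r}
while True:
    new = {compose(f, g) for f in elements for g in elements} - elements
    if not new:
        break
    elements |= new
```

For example compose(s0, s1) = s1: writing 0 and then writing 1 is the same event as writing 1, which is exactly the ‘last write wins’ behaviour of a memory cell.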
366 |
367 | *** 2.4 Computers: physical realizations of computation
368 |
369 |
370 | **** 2.4.1 Morphic relations of computational structures
371 |
372 | First, we give an algebraic definition of computational implementations, then we justify the choices in the definitions by going through the alternatives.
373 |
374 | Emulation, the isomorphic relation of computational structures. Let S and T be computational structures (semigroups). A relation ɸ : S → T is an isomorphic relation if it is:
375 |
376 | 1. homomorphic: ɸ(s)ɸ(t) ⊆ ɸ(st)
377 | 2. fully defined: ɸ(s) ≠ ∅ for all s ∈ S
378 | 3. lossless: ɸ(s) ∩ ɸ(t) ≠ ∅ ⇒ s = t
379 |
380 | Homomorphism is a fundamental algebraic concept often described by the equation:
381 |
382 | ɸ(s)ɸ(t) = ɸ(st),
383 |
384 | where the key idea is hidden in the algebraic generality. We have two semigroups S and T, in which the acts of computation are the composition of elements. In both semigroups these are denoted by juxtaposition of their elements. This obscures the fact that computations in S and in T are different. Writing ∘_S for composition in S and ∘_T for composition in T, we can make the homomorphism equation more transparent:
385 |
386 | ɸ(s) ∘_T ɸ(t) = ɸ(s ∘_S t)
387 |
388 | This shows the underlying idea clearly: it does not matter whether we convert the inputs in S to the corresponding inputs in T and do the computation in T (left-hand side), or do the computation in S and then send the output in S to its counterpart in T (right-hand side); we will get the same result. In the above definition, ɸ(s) and ɸ(t) are subsets of T (not just single elements), and ɸ(s)ɸ(t) denotes all possible state transitions induced by these subsets (the element-wise product of the two sets).
389 |
390 | #+BEGIN_SRC
391 |
392 | S : s ----------------------------> st
393 | | t |
394 | | | |
395 | | ɸ | ɸ | ɸ
396 | | | |
397 | | v |
398 | v ɸ(t) v
399 | T : ɸ(s) ------------->ɸ(s)ɸ(t) = ɸ(st)
400 | #+END_SRC
401 |
402 | Fully defined means that we assign some state(s) of T to every element of S, so we account for everything that can happen in S. This is just a technical, not a conceptual, requirement, since we can always restrict a morphism to a substructure of S.
403 |
404 | Being lossless excludes losing information about state transitions. In general, homomorphic maps are structure-forgetting, since we can map several states to a single one. In the case where s_1 ≠ s_2 and both s_1 and s_2 are sent to t in the target, the ability to distinguish them is lost. For a lossless correspondence we need one-to-one maps; for relations this requires the image sets to be disjoint. Varying these conditions we get a classification of structure-preserving correspondences between semigroups.
405 |
406 | |                         | lossy (many-to-one) | lossless (one-to-one) |
407 | |-------------------------+---------------------+-----------------------|
408 | | relation (set-valued)   | relational morphism | division              |
409 | | function (point-valued) | homomorphism        | isomorphism           |
410 |
411 | In semigroup theory, isomorphic relations are called divisions, a special type of relational morphism. This is a bit unfortunate terminology from a computer science perspective; emulation instead of division, and morphic relation instead of relational morphism, would perhaps be slightly better. In semigroup theory, relational morphisms are used for the purpose of simplifying proofs, not for any deep reason. However, for describing computational implementations relations are necessary, since we need to be able to cluster states (e.g. micro versus macro states in a real physical setting).
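
The three defining conditions are directly checkable by machine. A sketch; the concrete relation below (each residue mod 2 related to its two preimages mod 4) is my choice of example, and all function names are mine.

```python
# Check the three conditions of an isomorphic relation (division)
# phi : S -> T for finite semigroups. S and T are iterables of elements,
# opS/opT the two compositions, phi a dict mapping each s to a set of T-elements.

def is_division(S, opS, T, opT, phi):
    homomorphic = all(
        {opT(a, b) for a in phi[s] for b in phi[t]} <= phi[opS(s, t)]
        for s in S for t in S)                        # phi(s)phi(t) ⊆ phi(st)
    fully_defined = all(phi[s] for s in S)            # phi(s) is never empty
    lossless = all(not (phi[s] & phi[t])              # image sets are disjoint
                   for s in S for t in S if s != t)
    return homomorphic and fully_defined and lossless

# Example: relate each residue mod 2 to its two preimages mod 4.
Z2, Z4 = [0, 1], [0, 1, 2, 3]
phi = {0: {0, 2}, 1: {1, 3}}
ok = is_division(Z2, lambda s, t: (s + t) % 2,
                 Z4, lambda u, v: (u + v) % 4, phi)   # True: Z_2 divides Z_4
```

Overlapping image sets (say ɸ(0) = {0, 2}, ɸ(1) = {2, 3}) would fail the lossless condition, demoting the relation to a mere relational morphism.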
412 |
413 | *** 2.4.2 Computational models
414 | Modelling of computational structures: Let S and T be computational structures (semigroups). A function µ : T → S is a modeling if it is:
415 |
416 | 1. homomorphism: µ(u)µ(v) = µ(uv) for all u, v ∈ T
417 | 2. onto: for all s ∈ S there exists a u ∈ T such that µ(u) = s
418 |
419 | We also say that S is a computational model of T. In algebra, functions of this kind are called surjective homomorphisms.
420 |
421 | In the case of ℤ_2 → ℤ_4, there are more divisions than isomorphisms. ℤ_2 is a quotient of ℤ_4, so ℤ_4 has a surjective homomorphism onto ℤ_2. The division here is exactly that surmorphism turned around. This reversal of surjective homomorphisms was the original motivation for defining divisions.
422 |
423 | Computer: An implementation of a computational structure by a physical system is a computer.
424 |
425 | *** 2.5 Hierarchical structure
426 |
427 | Abstract state machines are generalizations of finite state machines. These models can be refined and coarsened forming a hierarchical succession, based on the same abstraction principles as in Krohn-Rhodes theory.
428 |
429 | *** 2.6 Universal Computers
430 |
431 | The full transformation semigroup of degree n (denoted by T_n) consists of all n^n transformations of an n-element set. These can be generated by compositions of three transformations: a cycle [1, 2, 3, …, n - 1, 0], a swap [1, 0, 2, 3, …, n - 1], and a collapsing [0, 0, 2, 3, …, n - 1]. Leaving out the state collapsing generator, we generate the symmetric group S_n, which is the universal structure of all permutations of n points.
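
These generators are easy to verify computationally for small n. A sketch for n = 3, using brute-force closure under composition (viable only for tiny degrees):

```python
def compose(f, g):
    """Apply f first, then g (transformations as tuples of images)."""
    return tuple(g[x] for x in f)

def generate(gens):
    """Close a set of transformations under composition."""
    elements = set(gens)
    while True:
        new = {compose(f, g) for f in elements for g in elements} - elements
        if not new:
            return elements
        elements |= new

# The three generators from the text, for n = 3:
cycle    = (1, 2, 0)   # [1, 2, ..., n-1, 0]
swap     = (1, 0, 2)   # [1, 0, 2, ..., n-1]
collapse = (0, 0, 2)   # [0, 0, 2, ..., n-1]

T_3 = generate([cycle, swap, collapse])  # full transformation semigroup, 3^3 elements
S_3 = generate([cycle, swap])            # symmetric group: permutations only
```

Dropping the collapsing generator leaves only permutations, which illustrates the sharp reversible/irreversible boundary mentioned earlier: no product of the first two generators ever loses a state.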
432 |
433 | ** 3 Finite computation — research questions and results
434 |
435 | Some practical problems of finite computation, the difficult ones that have not really been in the focus of mathematical research:
436 |
437 | 1. What are the possible computational structures and implementations?
438 | 2. How to measure the amount of computational power?
439 | 3. How can we trust computers?
440 |
441 | *** 3.1 Enumeration and classification of computational structures
442 |
443 | The method of enumerating certain types of semigroups by enumerating all subsemigroups of a relatively universal semigroup of that type has been applied to a wider class of semigroups, called diagram semigroups. These generalize function composition to other combinatorial structures (partial functions, binary relations, partitions, etc.), while keeping the possibility of representing the semigroup operation as stacking diagrams. These can be considered ‘unconventional’ mathematical models of computation (e.g. computing with binary relations or partitions instead of functions). The existence of different types of computers leads to the problem of comparing their power.
444 |
445 | *** 3.2 Measuring finite computational power
446 |
447 | For an abstract semigroup, finding the minimal number of states n such that it embeds into the full transformation semigroup T_n is the same as finding the minimal number of states with which the given computation can be realized. This state minimization is an important engineering problem. It is not to be mistaken for the finite state automata minimization problem, where equivalence is defined by recognizing the same regular language, not by isomorphic relation.
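
For a tiny semigroup this minimal degree can even be found by brute force. A sketch, using the abstract flip-flop as the example; the search enumerates injective maps into T_n, so it is exponential and only sensible for toy sizes.

```python
from itertools import permutations, product

def compose(f, g):
    """Apply f first, then g (transformations as tuples of images)."""
    return tuple(g[x] for x in f)

def embeds(elems, mul, n):
    """Is there an injective homomorphism from (elems, mul) into T_n?"""
    T_n = list(product(range(n), repeat=n))
    for images in permutations(T_n, len(elems)):  # injective assignments
        f = dict(zip(elems, images))
        if all(f[mul(a, b)] == compose(f[a], f[b])
               for a in elems for b in elems):
            return True
    return False

# Abstract flip-flop: r is an identity, s0 and s1 are right zeros
# (x * s_i = s_i: the last write wins; x * r = x: reading changes nothing).
elems = ['r', 's0', 's1']
mul = lambda a, b: a if b == 'r' else b

min_degree = next(n for n in range(1, 5) if embeds(elems, mul, n))
```

The flip-flop needs two states: one state cannot carry three distinct transformations, while on two states the realization from Section 2.3 works.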
448 |
449 | *** 3.3 Algorithmic solution spaces and computational correctness
450 |
451 | The simplest definition of a computational task is that we want to produce output from some input data. How many different ways are there for completing a particular task? The answer is infinity, unless we prohibit wasteful computations and give a clear definition of being different. Computational complexity distinguishes between classes of algorithms based on their space and time requirements. This is only one aspect of comparing solutions, since there might be different algorithms with very similar performance characteristics (e.g. bubble sort and insertion sort). Therefore we propose to study the set of all solutions more generally.
452 |
453 | When are two solutions really different? The differences can be on the level of implementation or of the algorithm specification. Informally we can say that computations can differ by their:
454 |
455 | 1. intermediate results
456 | 2. applied operations
457 | 3. modular structure
458 | 4. or by any combination of these
459 |
460 | An algorithmic solution space is a set of computer programs solving a computational problem, i.e. realizing a function described by input-output mappings.
461 |
462 | * Algebraic Models for Understanding: Coordinate Systems and Cognitive Empowerment
463 | C. Lev Nehaniv
464 | 16 pages
465 | 1997
466 |
467 | ** Abstract
468 |
469 | We identify certain formal algebraic models affording understanding (including positional number systems, conservation laws in physics, and spatial coordinate systems) that have empowered humans when we have augmented ourselves using them. We survey how, by explicit mathematical constructions, such algebraic models can be algorithmically derived for all finite-state systems, and give examples illustrating this, including coordinates for the rigid symmetries of a regular polygon, and recovery of the decimal expansion and of coordinates arising from conserved quantities in physics.
470 |
471 | Coordinate systems derived by algebra or computation for affordances of the understanding and manipulation of physical and conceptual worlds are thus a ‘natural’ step in the use of ‘tools’ by biological systems as they/we learn to modify selves and identities appropriately and dynamically.
472 |
473 | ** 1 Introduction
474 |
475 | Throughout the history of science, the greatest advances have come about when human beings came to realizations about how to think about their subject “correctly”. The deepest such advances have resulted from the advent of appropriate coordinate systems for a phenomenon to be understood. Some obvious and familiar examples include the decimal or binary expansions of the real numbers, Cartesian coordinates in analytic geometry, Galilean or Newtonian spatial coordinates, the periodic table in chemistry, conservation laws in physics, locally Euclidean coordinates in Riemannian geometry, and more recently, the idea of object-orientation.
476 |
477 | Our chief claims in this paper are that:
478 |
479 | 1) such appropriate coordinate systems generally share certain crucial properties that explain their usefulness
480 | 2) such coordinate systems for understanding and manipulating a given system, at least in the case of finite-state systems (such as our present-day digital computers), can be generated algorithmically for use by humans or machines.
481 |
482 | Such models for understanding may empower a human being to manipulate real, abstract, or synthetic worlds. It is important to note that to exploit or access formal models constructed with the aid of computers, a human being need not understand the algebraic theory behind their construction and should be free from such intellectual and computational encumbrances; however, mastering manipulation and application of a coordinate system e.g. the decimal number system may require some effort.
483 |
484 | ** 2 Properties of Understanding in Coordinates
485 |
486 | When we reflect on how we understand a system using any of the historically useful coordinate systems mentioned above, certain properties of the coordinate systems are evident:
487 |
488 | - *Globality*: we have some sort of description of essential characteristics of the system and can describe or predict how it will change with occurrences of events important for that system.
489 |
490 | - *Hierarchy*: Except in the simplest cases, the whole is broken down into component parts, which themselves may consist of other parts, and so on, resulting in an ordering that encodes the dependencies among parts. Information from higher levels of the hierarchy gives good approximate knowledge, while “lower order” details are fleshed out at subordinate levels.
491 |
492 | - *Simplicity of Components*: The smallest component parts are by themselves easy to understand.
493 |
494 | - *Covering*: We have implicitly or explicitly a knowledge of how to map the coordinate representation of the system to the system we are trying to understand.
495 |
496 | A non-required property is that of a one-to-one correspondence between possible coordinate states and states of the system. Indeed, there is usually a many-to-one relationship between representations in coordinates and states of the system (as for example, with real numbers and their decimal representations).
497 |
498 | Taking these properties as axiomatic for appropriate coordinate systems affording understanding, we note that such understanding of a system is something quite different from knowledge (though it may be related to such knowledge) of how the original system can be built or efficiently emulated; nor is it the same as knowledge of how the system is really structured. It is the utility afforded by such coordinate systems as tools for understanding which interests humans, but we shall also be concerned with the nature of affordance in cognitive (and physical) environments that are created, affected, or manipulated by these systems, as well as with explicit methods to construct these systems.
499 |
500 | ** 3 Constructing Coordinates on Understanding Algorithmically and Generating Analogies Automatically
501 |
502 | Object-oriented design can be regarded as a special case of wreath product decomposition over a partial order, and global semigroup theory then provides a rigorous algebraic foundation.
503 |
504 | Algebraic engineering of understanding: Global hierarchical coordinates on computation for manipulation of data, knowledge, and process - C. L. Nehaniv, 1994
505 |
506 | We present first a formalization of “system” common in physics and computer science. Then, beginning from the above properties, we present a formalization of “model for understanding” of such a system.
507 |
508 | We contend that generally such useful coordinate systems can be considered as formal models for understanding of the domain via an emulation by the (generally larger) coordinate system, giving global hierarchical coordinates in the sense of emulation by a cascade (or wreath product) of small component parts.
509 |
510 | The idea that wreath product emulations provide models for understanding appears to be due to John L. Rhodes. In a still unpublished book written in the late 1960s, what we call formal models for understanding here are motivated as theories which provide understanding of an experiment, i.e. a system.
511 |
512 | The examples of coordinate systems on the concepts of number, time, physical transformation, and software represent hard-won achievements for humanity, and they share the properties listed above. Could they have been found by some algorithm, rather than brilliant insights?
513 |
514 | Our thesis is that the answer is “yes”, at least in part. Given a finite-state model of any system, mathematical constructions based on the Krohn-Rhodes Theorem in algebra show how to construct a coordinate system bearing the markings of a useful model of understanding as described above. Generalizations of this theorem to the infinite case show that this is even possible without the assumption of finiteness, although algorithmically the construction may be more complicated. In particular, mathematics guarantees the existence of, and yields generally many, formal models of understanding of the given system in global hierarchical coordinates. A hierarchy of simple components is found by a simple recursive procedure, and the dependencies between these components are configured in such a way that an input to the system results in a change of state whose effect at each given component, i.e. at the given coordinate position, depends only on the input and the states above the component in the hierarchy. Each state and input of the original system has one or many representations in the covering coordinate system, and computation of the updated state when working in the covering coordinates proceeds in a hierarchical, feed-forward manner. Hence in global hierarchical coordinates the functioning of the system to be understood is easily globally computable, and recovery of the corresponding state in the original system is given in a determined way.
515 |
516 | Given a finite state system described to a human or to a computer, it is possible then to algebraically engineer an appropriate formal model for understanding that system. For example, an object-oriented software realization of the system could be derived automatically from an unstructured finite-state description of the system to be emulated.
517 |
518 | Moreover, relationships and analogies between formal models for understanding have nice algebraic descriptions which are amenable to automatic manipulation, and the completion of relations to full analogies can be carried out using recently developed “kernel theorems” of algebraic automata theory, which capture mathematically the notion of what needs to be added to a relationship between systems to complete it to a full, covering analogy.
519 |
520 | We shall see below that for finite-state systems, methods already exist for generating a formal model of understanding for a given system, and that one also has definitions of analogies and metaphors as relational mappings (morphisms in an appropriate setting), as well as characterizations of exactly what [extension] is required for an incomplete relation to be transformed into a finished model [emulation].
521 |
522 | ** 4 Mathematical Platonism and Cybernetic Contingency: For cyborgs who have considered suicide when ad hoc coordinates were enough
523 |
524 | The non-dualism of Haraway’s characterization of ourselves as cyborgs is naturally disturbing to notions of separate identity, fixed boundaries, a Platonic realm of forms, ‘The Good’. As a profession that as a whole has implicitly adopted Platonic idealism (the belief in a heaven of pure, perfect forms beyond mere physical being), in which proving existence and uniqueness takes a primary role, mathematics seems remote from considerations of the fragmentation and re-construction of ourselves. The late great combinatorialist Paul Erdös often spoke of ‘The Book’ in which the ‘Supreme Fascist’ (God) keeps secret the most elegant and beautiful proofs of eternal mathematical truths. Only by great effort and inspiration can a mortal hope to find any of its proofs, glimpses of ‘The Book’. It is hardly possible to be a mathematician without taking this metaphysical realm seriously, at least implicitly in practice if not consciously, since at least the abstract objects of study are presumed to exist. Yet we will see that this Platonic world can serve as a substrate for constructing (non-unique) models affording understanding and contingent self-extension. And indeed, whether empirical science and engineering like it or not, mathematical results are neither physical nor subject to Popper’s notion of ‘falsifiability’, viz. the possibility of disproof by experiment. Thus mathematics is a branch of metaphysics providing tools applied by science and engineering, while remaining itself outside their epistemological scope. While mathematical entities may be of dubious, or at least unclear, ontological status, the structure (whatever this may ‘be’) of these entities (whatever they may ‘be’) informs and constrains science and engineering.
525 |
526 | The contingency we see in the cyborg appears also in the mathematically derived hierarchical coordinate systems comprising models for understanding that we shall survey below. Indeed, by extending our minds with coordinate systems such as the decimal expansion, we adopt non-unique, possibly non-ideal, contingent (and thus historical) tools and methods of self-augmentation or interface with the world, which could have been otherwise.
527 |
528 | This contingency of design appears elsewhere in science: it is the essential characteristic of what Herbert Simon has called the sciences of the artificial. That is, unlike the physical and chemical structures of the natural world, artificial systems are those that have a given form and behaviour only because they adapt (or are adapted), in reference to goals or purposes, to their environment. They have interfaces to the world that are contingent on possibly teleological circumstances (choice or design). We shall argue that this concept of ‘artificial’ has a natural extension to the biological and cognitive worlds (where design may not be of the conscious sort).
529 |
530 | Whether we consider our eyes, our human languages, or our number systems for calculation, the arbitrariness of the choices, while constrained by physics or chemistry or evolutionary history or the metaphysics of mathematics, is not determined by necessity but could usually have been otherwise within historical contingencies and design constraints. The non-uniqueness of coordinate systems for understanding physics is well known from the examples of Galilean inertial frames and of the locally Euclidean relativistic space-time coordinate systems. Locally Euclidean coordinate systems agree where they overlap, describing regions of space which patch together to cover all of curved space-time in a manifold which is globally not the naïve Euclidean space.
531 |
532 | In a similar way, the earth is locally like a flat plane, and can be covered by sheets which are just planes topologically, but globally no single plane can cover the earth; consider for example the non-uniqueness of longitude at the north and south poles. In fact, a globally consistent, one-to-one planar coordinate system is impossible for a sphere like the surface of the earth! There must be at least two patches, and where they overlap the coordinates are non-unique, resulting in an expansion of space in the coordinate world but not of physical space.
533 |
534 | We see that we ourselves can be understood as “artificial systems” in the sense of contingent design. Our coordinate systems (that is, our ways of understanding space-time, numbers, language, concepts, etc.) as conceptual tools, as well as our physical tools (stone knives, hammers, eyeglasses, clothes, prosthetic limbs), have arbitrariness in their conscious or non-conscious design. Moreover, the evolution of multicellular life, the incorporation of bacterial endosymbionts into our ancestral eukaryotic cells, our evolution via duplication and divergence of HOX-controlled development in segmented animals, even the correspondence (a code!) between nucleotide triplets (codons) and amino acids in protein biosynthesis could have been otherwise. All this seems to have occurred without a ‘master plan’, but rather via ‘just’ the opportunistic (for self-reproduction) merging of identities and the contingent choice of ad-hoc solutions.
535 |
536 | Tools, whether codes, boundaries, molecules, or even other beings, are just so much raw material for living systems in their ceaseless self-creation, reproduction and evolution under constrained circumstances. These facts point to the fundamental nature of the tool as logically prior to the cyborg or even to the living system. Indeed, living systems especially, whether human, metazoan, bacterial, plant or fungal, have always ‘taken what they could use’. In this way, one has now deconstructed the notions of tool and cyborg by taking them to their extreme, so that even a bacterium may be called a cyborg in the sense of being an ‘artificial’, contingent system making its best use of ‘tools’, while the cyborg arises from a living system merging with its tools, choosing them from its very beginning by means of natural selection.
537 |
538 | Uniqueness of design may be possible sometimes, and existence of a unique necessary solution can be a beautiful thing, but in general it is a degenerate case, not to be expected in our coordinate system for understanding or in the design of other tools as we take responsibility for their construction, and thereby for the augmentation of our minds, ourselves.
539 |
540 | A way to see that decimal expansions are not numbers is to see that the decimal system truly ‘expands’ the numbers. For instance, the representations 1.000… and 0.999… are distinct expansions of the same number, 1. Examination of the decimal expansions also shows that they give ‘historical pathways’ for getting to a number, viz. 3, 3.1, 3.14 are the first steps on a path to the ratio between a circle’s circumference and diameter, π.
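The ‘historical pathway’ reading can be sketched computationally (a small Python sketch; the decimal_path helper is my own illustration, not from the text): successive decimal truncations are coordinates on a path converging to the number.

#+BEGIN_SRC python
import math

def decimal_path(x, depth):
    """Successive decimal truncations of x: the 'historical pathway' to x."""
    return [math.floor(x * 10**k) / 10**k for k in range(depth)]

path = decimal_path(math.pi, 3)
# path == [3.0, 3.1, 3.14]: the first steps on a path to pi
#+END_SRC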
541 |
542 | We contend that the decimal expansion is but one system of an entire class of computational coordinate systems for understanding, and wish to promote the techniques for generating such models, both with mathematical reasoning and, for the widest impact, with computer assistance.
543 |
544 | * 5 Transformation Semigroups Leading to Models for Understanding
545 |
546 | When a transformation semigroup arises from a system (of states and events) to be understood, a covering of the transformation semigroup by a wreath product of simpler transformation semigroups can be considered a formal model for understanding. Such a wreath product decomposition yields a coordinate system appropriate to the original system.
547 |
548 | As shown by Noether and Rhodes, conservation laws and symmetries lead to refined understanding of physical systems, and this understanding may be formalized as global hierarchical coordinatization. Hierarchical object structuring in computer software provides another example.
549 |
550 | Relational morphisms can be considered as metaphors, since they can be interpreted as partially successful attempts at understanding one structuring using another. Kernel theorems for relational morphisms of transformation semigroups can then be employed as algebraic tools for the manipulation and construction of such models of / for understanding.
551 |
552 | We suggest that developing computational tools for the implementation of feasible automatic discovery of these formal models of understanding as well as for their algebraic manipulation will extend the human notions of understanding, metaphor and analogy to a formal automated realm.
553 |
554 | ** 5.1 Mathematical Concepts and Notation
555 |
556 | Semigroups, Automata, Transformation Semigroups
557 |
558 | A semigroup is a set S with a binary multiplication defined on it satisfying the associative law: (xy)z = x(yz). An automaton X = (Q, S) is a set of states Q together with, for each symbol s of an input alphabet S, an associated [state-transition] function from Q to Q (also denoted by s). A special case of this is a transformation semigroup (Q, S), which is a right action of a semigroup S on a set of states Q, i.e. a rule taking q ∈ Q and s ∈ S to q.s ∈ Q, satisfying (q . s) . s' = q . ss' for all q ∈ Q, and s, s' ∈ S. Given any automaton X we can canonically associate to it a transformation semigroup, generated by considering the semigroup of state-transition mappings induced by all possible sequences of inputs to the automaton acting on its states. To avoid trivialities, we shall always assume that state sets, input sets, and semigroups are non-empty. Generally we shall require that two distinct inputs a ≠ a' to an automaton differ at least at one state x: x.a ≠ x.a'. Such an automaton is called faithful. The restriction of faithfulness is really not essential, as one can make X faithful by identifying input symbols which have the same action. A transformation semigroup (X, S) is faithful if x.s = x.s' for all x ∈ X implies s = s'. One can always make a transformation semigroup faithful by identifying any s and s' which act on X in exactly the same way.
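The canonical construction above can be sketched concretely (a Python sketch; the three-state automaton is a made-up example): represent each input by the tuple of its action on states, then close under composition. Maps with identical action are identified automatically, giving the faithful transformation semigroup.

#+BEGIN_SRC python
def compose(f, g):
    """Right action: apply f first, then g (transition maps as tuples)."""
    return tuple(g[f[q]] for q in range(len(f)))

def transformation_semigroup(generators):
    """Close a set of state-transition maps under composition.

    Maps with identical action on states are identified automatically
    (set semantics), so the result is the faithful transformation semigroup."""
    elems, frontier = set(generators), set(generators)
    while frontier:
        new = {compose(f, g) for f in elems for g in frontier} \
            | {compose(f, g) for f in frontier for g in elems}
        frontier = new - elems
        elems |= frontier
    return elems

# Made-up 3-state automaton: input a cycles the states, input b resets to state 0.
a = (1, 2, 0)
b = (0, 0, 0)
S = transformation_semigroup({a, b})
# S has 6 elements: three rotations (including the identity a^3)
# and three constant maps.
#+END_SRC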
559 |
560 | For semigroups, a homomorphism (or, more briefly, a morphism) from S to T is a function ɸ : S → T such that ɸ(ss') = ɸ(s)ɸ(s') for all s, s' ∈ S. A morphism known to be surjective (onto) will be called a surmorphism and be denoted with a double-headed arrow ɸ : S ↠ T: the semigroup T is then said to be a homomorphic image of S. An injective (one-to-one) morphism ɸ is called an embedding of S into T. T is a subsemigroup of S if T is a subset of S closed under multiplication.
561 |
562 | Thought: If something does not have a closure, then it is not a subsemigroup, but rather a partial section of the original space. A complete partition of the space into subsemigroups I think is an equivalence partition, so in that sense, closure is an important property for obtaining the subsemigroups. Verify this claim when exploring these ideas further.
563 |
564 | We write T ≤ S if T is a subsemigroup of S or, more generally, embeds in S. If there is a bijective (one-to-one and onto) morphism from S to T, we say that S and T are isomorphic and write S ≅ T. The notation S ≺ T means S is a homomorphic image of a subsemigroup of T. Intuitively, T can emulate any computation of S. In such a case, we say S divides T.
565 |
566 | If X = (Q, S) and Y = (U, T) are automata or transformation semigroups, a morphism ɸ from X to Y consists of a function ɸ_1 : Q → U and a mapping ɸ_2 : S → T (which is required to be a semigroup morphism in the transformation semigroup case), such that for all q ∈ Q and s ∈ S, one has ɸ_1(q.s) = ɸ_1(q).ɸ_2(s). Surmorphism, embedding, isomorphism, and division of transformation semigroups are defined analogously. If Y divides X, it is also common to say that X covers Y and that X emulates Y.
567 |
568 | New Automata from Old: For transformation semigroups X = (Q, S) and Y = (U, T), their wreath product Y ≀ X is the transformation semigroup with states U × Q and semigroup consisting of all pairs (f, s) where f : Q → T and s ∈ S, with action: (u, q) . (f, s) = (u . f(q), q . s). Thus the action of an input (f, s) on the X component depends only on the input and is independent of the Y component, while the action on the Y component depends on both the input and the state of the X component. We have multiplication (f', s')(f, s) = (h, s's) where h(q) = f'(q)f(q.s'), a product in T.
569 | 
570 | To understand the action and the multiplication in the semigroup, observe that
571 | 
572 | ((u, q) . (f', s')) . (f, s) = (u . f'(q), q . s') . (f, s)
573 | = (u . f'(q)f(q . s'), q . s's)
574 | = (u . h(q), q . s's)
575 | 
576 | and indeed, the element h(q) in T applied to u is a function only of the input and the state q, while the element s's in S applied to q depends only on the input (f', s')(f, s). The wreath product is an associative product on the class of all transformation semigroups.
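The multiplication rule h(q) = f'(q)f(q.s') can be machine-checked on a small example (a Python sketch; taking both levels to be ℤ_2 acting on itself is my arbitrary choice): acting by a product must agree with acting twice.

#+BEGIN_SRC python
from itertools import product

# Both levels: Z2 acting on itself, q.s = (q + s) % 2.
STATES = [0, 1]

def act(state, w):
    """Wreath product action: (u, q) . (f, s) = (u . f(q), q . s)."""
    (u, q), (f, s) = state, w
    return ((u + f[q]) % 2, (q + s) % 2)

def mult(w2, w1):
    """(f', s')(f, s) = (h, s's) with h(q) = f'(q) f(q.s')."""
    (f2, s2), (f1, s1) = w2, w1
    h = tuple((f2[q] + f1[(q + s2) % 2]) % 2 for q in STATES)
    return (h, (s2 + s1) % 2)

# All 8 wreath product elements: 4 functions f : Q -> T times 2 choices of s.
elements = [(f, s) for f in product((0, 1), repeat=2) for s in (0, 1)]
states = [(u, q) for u in STATES for q in STATES]

# Acting by a product agrees with acting twice, for every state and pair:
ok = all(act(act(x, w2), w1) == act(x, mult(w2, w1))
         for x in states for w2 in elements for w1 in elements)
#+END_SRC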
577 |
578 | More generally, transformation semigroups (X_α, S_α) generically combined with dependencies coded by an (irreflexive) partial order μ = (V, <) yield (X, S) = ∫_α ∈ V (X_α, S_α)dμ, with states X = ΠX_α and semigroup elements f : X → X with (xf)_α = x_α . f(x<_α), where f(x<_α) ∈ S_α and x<_α is the projection of x that forgets all components except the x_β with β < α. The transformation semigroup (X, S) is called the cascade integral of the (X_α, S_α) over µ.
579 |
580 | The usual wreath product of n transformation semigroups is just a cascade integral over a finite total order. Since the hierarchies we allow for formal models of understanding permit components to be combined according to a partial ordering, the above generalized form of wreath product is very useful. For computational applications, one would evidently most often restrict to finite partial orders (as for Cartesian coordinates) or finitely many coordinates of an infinite partial order (as for the decimal expansion of the integers).
581 |
582 | To read: Cascade decomposition of arbitrary semigroups - C. L. Nehaniv (1995)
583 |
584 | The wreath product of faithful transformation semigroups is easily seen to be faithful. Notice that there is a hierarchical dependence of Y on X. By iterating this construction, one may take the wreath product of any sequence of components.
585 |
586 | The direct product (U, T) × (Q, S) of automata [resp. transformation semigroups] has states U × Q and inputs [resp. semigroup] T × S with component-wise action: (u, q) . (t, s) = (u . t, q . s). This is the case of no interaction between the components.
587 |
588 | A cascade of automata X and Y as above has states U × Q and any set of inputs (f, s) with f : Q → T and s ∈ S, acting on the states as for the wreath product. The wreath product is the transformation semigroup version of a generic cascade of automata: It is easy to see that every cascade of two automata (including of course their product) embeds in the wreath product of their associated transformation semigroups. Thus, the wreath product provides the formal algebraic framework for the notion of hierarchical combination of automata.
589 |
590 | Given a semigroup S, we define S* to be S if S contains an identity element 1 ∈ S with s1 = 1s = s for all s in S; otherwise take S* to be S with a new identity element 1 adjoined. If X = (Q, S) is an automaton or transformation semigroup, we define X* = (Q, S*), where the identity of S* acts as the identity function on all states in Q. Also X^c = (Q, S^c), where for each state q ∈ Q a constant map taking value q has been adjoined as an element of S^c.
591 |
592 | The disjoint union of automata X ⊔ Y has state set Q × U and inputs S ⊔ T, with
593 |
594 | (q, u) . i = { (q.i, u) if i ∈ S | (q, u.i) if i ∈ T }
595 |
596 | Observe that this automaton generates the direct product at the associated transformation semigroup level if X and Y contain identity transformations. Whence, X × Y embeds in X* ⊔ Y*.
597 |
598 | For a semigroup S, one obtains canonically its so-called right regular representation (S*, S), with states S*, semigroup S, and action s.s' = ss' for all s ∈ S* and s' ∈ S. If S and T are semigroups, we shall write S ≀ T for the wreath product of their right regular representations.
599 |
600 | A group is a semigroup S of a very special (symmetric and reversible) kind: S has an identity e and for each element s in S, there exists an inverse s' in S with ss' = s's = e. Generally, a semigroup may contain many groups (having unrelated structures) as subsemigroups. A group is called simple if its homomorphic images are just itself and the one-element group (up to isomorphism). A finite semigroup each of whose subgroups has only one element is called aperiodic.
601 |
602 | Relational Morphisms. A relational morphism ɸ : (X, S) ◁ (Y, T) is a subtransformation semigroup (Q, R) of (X, S) × (Y, T), thus,
603 |
604 | (x, y) ∈ Q and (s, t) ∈ R implies (x.s, y.t) ∈ Q
605 |
606 | and Q and R project fully onto X and S, respectively. A relational morphism is a morphism if Q and R are each [graphs of] functions, an embedding if Q and R are injective functions, and an approximation [or simulation] of (X, S) by (Y, T) if it is a surjective morphism. It is an emulation or covering of (X, S) by (Y, T) if Q and R are injective relations: ((x, y) ∈ Q and (x', y) ∈ Q) ⇒ x = x'
607 |
608 | and similarly for R. If (Y, T) covers (X, S) we write (X, S) ≺ (Y, T), and often say that (X, S) divides (Y, T). Also in the case of semigroups, we write S ≺ T (S divides T) if S is a homomorphic image of a subsemigroup of T.
609 |
610 | In the case of a covering, one can use (Y, T) in place of (X, S): given x ∈ X and s ∈ S, we choose (x, y) ∈ Q and (s, t) ∈ R. The elements y and t are called lifts of x and of s, respectively. Now the state x.s is uniquely determined from y.t by the injectivity condition. If (Y, T) has a nice form, say as a wreath product of simpler transformation semigroups, then this covering provides a global hierarchical coordinate system on (X, S), which may be more convenient to manipulate and provide insights into the original (X, S).
611 |
612 | ** 5.2 Systems & Formal Models for Understanding
613 |
614 | A system (X, A, λ) consists of a state space X, inputs A, and a transition function λ : X × A → X. Traditional physics considers (or hopes) that physical phenomena are faithfully modelled as such systems: knowing the current state x and the event a that happens, one can determine the resulting state as λ(x, a). Note that for a sequence of events a_1, …, a_n+1 one has a recursive description of the behaviour of the system (as λ induces an action of the free semigroup A^+ on X):
615 |
616 | x.a_1 … a_n a_n+1 = λ(x.a_1 ... a_n, a_n+1)
617 |
618 | Crucially, such a description of the system is not a model of understanding, but rather only a starting point for analysis. However, such a system, if modelling real-world phenomena, abstracts essential features of states and events while ignoring others. This abstraction from the real world is one property of understanding that our formalism does not address. We shall always assume the system (X, A, λ) to be available before beginning any formalization for understanding it, or we shall construct it from other, already available, systems.
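The recursive description above is exactly a left fold of the event sequence through λ (a Python sketch; the two-state toggle system is a made-up example):

#+BEGIN_SRC python
from functools import reduce

def run(step, x0, events):
    """x . a_1 ... a_n, computed by folding each event into the current state:
    the recursion x.a_1...a_(n+1) = step(x.a_1...a_n, a_(n+1))."""
    return reduce(step, events, x0)

# Made-up system: a toggle with states {0, 1}; event 't' flips, 'n' does nothing.
def step(x, a):
    return 1 - x if a == 't' else x

final = run(step, 0, ['t', 'n', 't', 't'])
# final == 1: three toggles starting from 0
#+END_SRC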
619 |
620 | Indeed, A may consist of tiny intervals of time and λ describe the evolution of a physical situation according to a set of differential equations, e.g. determining the position and momentum of a set of point masses according to Newtonian mechanics. Such recursive descriptions, including descriptions in terms of differential equations, are often the beginning of the analysis of a physical system and precede formal understanding.
621 |
622 | From (X, A, λ) one derives a transformation semigroup (X, S) by making the induced action of A^+ faithful:
623 | w ≡ w' ⇔ x.w = x.w' for all x ∈ X.
624 |
625 | Here S = A^+ / ≡.
626 |
627 | A formal model of understanding for a system (X, A, λ) is a covering of the induced transformation semigroup (X, S) by a wreath product of “simpler” transformation semigroups over a partial ordering.
628 |
629 | * 6 Examples of Formal Models Affording Understanding
630 |
631 | Cartesian Coordinates on Euclidean n-Space.
632 |
633 | The n “simple” coordinates used to coordinatize Euclidean space are copies of the real numbers under addition. These copies are partially ordered by the empty partial order (no dependencies between them). In this case, there is a one-to-one correspondence from coordinates for points to points of the space, and points (or vectors) are added component-wise without regard to other components.
634 |
635 | Symmetries of a Hexagon.
636 |
637 | Let us imagine a regular polygon with n vertices in the plane with one vertex topmost. Let ℤ_n denote the cyclic group of order n (corresponding to a modulo n counting circuit). By our convention ℤ_n ≀ ℤ_2 denotes the wreath product of the right regular representations of ℤ_n and ℤ_2. The symmetries of a hexagon comprise the dihedral group D_6 (resp. D_n for the regular n-gon), which is covered by such a wreath product. If we paint one side of the hexagon white and the other black, and number the vertices of the hexagon clockwise on the white side from 0 to 5, then a pair (k, color) determines a configuration of the hexagon, where we understand k ∈ ℤ_6 as the number on the currently topmost vertex and color ∈ ℤ_2 as the color of the face we see, either 0 denoting white or 1 denoting black. This gives a coordinate representation of the state of the hexagon.
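A sketch of this coordinatization in Python (the exact encodings of rotation and flip below are my reading of the description, so treat them as assumptions): the ℤ_6 update depends on the ℤ_2 coordinate while the ℤ_2 update does not, which is precisely the wreath product dependency. Closing the two generators under composition recovers all 12 symmetries of D_6.

#+BEGIN_SRC python
# States are pairs (k, color): topmost vertex number in Z6, visible face in Z2.
STATES = [(k, c) for k in range(6) for c in range(2)]
IDX = {s: i for i, s in enumerate(STATES)}

def rotate(s):
    """One rotation step: the Z6 update depends on the Z2 color (the visible
    face reverses the apparent direction of the numbering); the Z2 coordinate
    is untouched -- the hierarchical (wreath) dependency."""
    k, c = s
    return ((k + 1) % 6 if c == 0 else (k - 1) % 6, c)

def flip(s):
    """Flip about the vertical axis: topmost vertex stays, visible face swaps."""
    k, c = s
    return (k, 1 - c)

def table(f):
    """Encode a symmetry as the tuple of resulting state indices."""
    return tuple(IDX[f(s)] for s in STATES)

def compose(f, g):
    """Apply f, then g."""
    return tuple(g[f[i]] for i in range(len(STATES)))

def closure(gens):
    """Close a set of transformations under composition."""
    elems, frontier = set(gens), set(gens)
    while frontier:
        new = {compose(a, b) for a in elems for b in frontier} \
            | {compose(a, b) for a in frontier for b in elems}
        frontier = new - elems
        elems |= frontier
    return elems

G = closure({table(rotate), table(flip)})
# G has 12 elements: the dihedral group D_6 in (k, color) coordinates.
#+END_SRC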
638 |
639 | * [[http://web.eecs.utk.edu/~bmaclenn/papers/WLIOW.pdf][Words Lie in Our Way]]
640 | Bruce McLennan
641 | June 20, 1994
642 | 19 Pages
643 |
644 | If numbers are not broken down into digits but are represented by physical quantities proportional to them, the computer is called analog; a slide rule is an example. In a simple analog computer the representing variable in the computer model is proportional to the represented variable in the modeled system.
645 |
646 | In traditional applications of analog computing, the represented variables were continuous physical quantities, and so the analog computer usually made use of continuous quantities (states of the analog device, such as voltages or currents) to represent them. In digital computers, in contrast, the discrete states of the computer and the values of the modeled variable are not related by a simple proportion.
647 |
648 | In both the analog and digital cases there is an isomorphism (one-to-one structure-preserving map) between the values of a model variable and the values of the system variable, or a homomorphism (many-to-one structure-preserving map) if we consider that a discrete model is an approximation of a continuous system, and that an analog system too has only a finite amount of precision when it is used for simulating models.
649 |
650 | The complementarity principle states that continuous and discrete models should be complementary, that is, an approximately-discrete continuous model should make the same macroscopic predictions as a discrete model, and conversely an approximately-continuous discrete model should make the same macroscopic predictions as a continuous model.
651 |
652 | In computation, the substance (matter and energy) is merely a carrier for the form (mathematical structure).
653 |
654 | The principal difference between analog and digital computation is that digital computation has a discrete series of successive states, while for analog computation the states form a continuous trajectory in state space. The state space provides the substance for computation, upon which a form is imposed by the path through state space.
655 |
656 | The equation:
657 | S(t') = F[t, S(t), X(t)]
658 |
659 | is used to characterize four kinds of autonomous systems.
660 |
661 | Calling t the time, S(t) the dependent variables, and X(t) the independent variables, four kinds of systems arise depending on whether the next state S(t') depends on them:
662 |
663 | Independent of X: Purely autonomous (computing the square root of 2)
664 | Dependent on X only at the start: Less autonomous (computing the square root of a given number)
665 | Dependent on X at any time: Interactive computing / feedback control system
666 | Dependent on X but independent of S: Reactive system with no memory
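The first two kinds can be illustrated with Newton's iteration for square roots (a Python sketch; the choice of Newton's method and the step count are mine): with a fixed to 2 the trajectory is purely autonomous, while accepting a as input only at the start is the 'less autonomous' case.

#+BEGIN_SRC python
def newton_sqrt(a, s=1.0, steps=30):
    """The input a enters only at the start; thereafter the state evolves
    autonomously: S(t') = F[S(t)] with F(s) = (s + a/s) / 2."""
    for _ in range(steps):
        s = (s + a / s) / 2
    return s

root2 = newton_sqrt(2.0)  # with a = 2 fixed in the program: purely autonomous
root9 = newton_sqrt(9.0)  # a supplied as input at the start: less autonomous
#+END_SRC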
667 |
668 | Axiomatic systems studied in logic do not define a unique trajectory. Rather, they define constraints (the inference rules) on allowable trajectories, but the trajectory over which a given computation proceeds (that is, the proof generated) is determined by an outside agent.
669 |
670 | Formality vs. Transduction
671 |
672 | To the extent that the variables do refer to specific physical quantities, equations are material.
673 |
674 | Formal equations contain no physical variables, while material equations do.
675 |
676 | The formal equations specify an implementation-independent computation, whereas the material equations specify an implementation-specific transduction.
677 |
678 | Intuitively, the formal equations are the program, the material equations are the input/output relations to the real world.
679 |
680 | In its modern sense, the calculus embodies the idea of digital computation and formal logical inference.
681 |
682 | The simulacrum is a possible alternative to the calculus. Simulacra have images as the counterparts of the formulas in calculi.
683 |
684 | One commonly distinguishes between an uninterpreted calculus and an interpreted calculus. Both are formal computational systems, but in the former case we consider syntactic relations that are internal to the system, whereas in the latter we take into account semantic relations, which associate some meaning in an external domain with the states and processes of the calculus, thus making them representations.
685 |
686 | I think of this roughly as being concerned with the internal relations of the system (syntactic) and the mappings made from the system to external domains (semantic).
687 |
688 | Systematicity in both the analog and digital cases can be defined as continuity, under the appropriate topology in each case.
689 |
690 | Computation is the instantiation of a formal process in a physical system to the end that we may exploit or better understand that process.
691 |
692 | A task is computational if its function would be served as well by any system with the same formal structure. Thus, computing the square root and unifying propositions are computational tasks, but digesting starch is not.
693 |
694 | A system is computational if it accomplishes a computational task by means of a computation.
695 |
696 | A computational system comprises a formal part (e.g., a calculus or simulacrum) and, usually, an interpretation.
697 |
698 | * To Read next
699 |
700 | ** [[https://www.journals.uchicago.edu/doi/pdf/10.1086/661623][AfterMath: The Work of Proof in the Age of Human-Machine Collaboration]]
701 | Stephanie Dick
702 |
--------------------------------------------------------------------------------
/readme.org:
--------------------------------------------------------------------------------
1 | * Morphisms of Computational Constructs
2 |
3 |
4 | This is a narrative on how computational structures such as data structures, programming constructs, and the algebraic structures undergirding Computer Science display deep parallels among them. The mathematical term for a link between two objects of study is a morphism. It allows one to study analogies occurring across different categories rigorously by examining their underlying structure. I first came across these as I was exploring features in computer programming languages, and I kept finding that they exhibited parallels to ideas in domains remote from Computer Science. This document is designed to act as a compilation of the rich skein of conceptual coherences and interconnections of computation, as unearthed by research scientists across the globe.
5 |
6 | I was able to understand that the following subjects:
7 |
8 | | Logic | Algebra | Language | Computation | Categories | Topology/Spaces | Quantum Mechanics |
9 |
10 | all have deep links with each other, which is being unravelled by multiple scientists all across the world. It is one of the most exciting things to be learning any one of these subjects and then finding out links about these in the other subjects. There is a popular term called Curry-Howard isomorphism, which has been getting expanded into various forms as new links are unearthed. One can see this elaborated as:
11 |
12 | - Curry-Howard-Lambek
13 |
14 | - Curry-Howard-Tait
15 |
16 | - Curry-Howard-Lambek-Scott-Tarski-Stone
17 |
18 | - Curry-Howard-Brouwer-Heyting-Kolmogorov
19 |
20 | and so on as the articles and papers touching upon these issues try to draw the parallels. It is safe to say that there is much happening in an interdisciplinary manner and this document attempts to give a partial documentation of some of the links the author has been able to patch together. Hope you enjoy finding the interconnections!
21 |
22 | - TODO: Role of Analysis / Synthesis
23 | - TODO: Might have to go into cybernetics for a bit when mentioning synthesis
24 | - TODO: This would tie in with the genesis of the ideas of entropy in the Macy Circles
25 | - TODO: Discuss what ideas the second-generation cyberneticians uncovered (Foerster, Pask, Varela, Maturana etc.)
26 | - TODO: Add in Baez-Stay Rosetta work here
27 | - TODO: Describe about Theory A and Theory B side of computation
28 |
29 | Theory A concentrates on algorithms and complexity theory; Theory B concentrates on programming languages, logic, and semantics.
30 |
31 | One of the prominent ideas in complexity theory is that once you have a description of a computational process, you have a way to compare algorithms with each other.
32 |
33 | If an efficient algorithmic process to solve a problem has been found, the problem comes down to polynomial time.
34 |
35 | The problem that opens up when something is not polynomial time is that you would have to search throughout the computational universe in an enumerative fashion for the solution. This means that it could take a lot of time. Even for creating a simple image, say 32 × 32 with 2 colors, enumerating the possibilities could take an astronomical amount of time. And if we increase the size even a tiny bit, the search could exceed the number of seconds in the predicted lifespan of the Universe, even with the fastest computer.
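To make the thought experiment concrete (a rough Python sketch; the 10^18-images-per-second rate and the 14-billion-year age are assumed round figures):

#+BEGIN_SRC python
# Number of distinct 32 x 32 two-color images:
images = 2 ** (32 * 32)  # 2^1024, roughly 1.8 x 10^308

# Even at an assumed (very generous) 10^18 images checked per second:
seconds_needed = images // 10**18
universe_age_seconds = 14_000_000_000 * 365 * 24 * 3600  # roughly 4.4 x 10^17

# seconds_needed exceeds the age of the universe by hundreds of orders of magnitude.
#+END_SRC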
36 |
37 | - TODO: Mention this thought experiment
38 |
39 | So, polynomial time is associated with the algorithms we have been able to devise for computationally tractable problems, and NP is usually associated (verify if this is a true claim) with the problems for which we don't have a proper algorithm to solve them.
40 |
41 | - TODO: Discuss if the whole problem can be devised in a way so as to arrive at way to falsify it.
42 |
43 | The place where this matters is in the many open problems of mathematics and computer science. P is associated with the problems for which we have found a way to traverse the space of possibilities with neat time metrics. The intractable ones are those whose space we still haven't been able to study/understand except by traversing it one unit at a time, which presents a problem given finite space, time, and computational resources.
44 |
45 | - TODO: Describe about P vs. NP
46 | - TODO: Shannon’s role
47 | - TODO: Diagram showing how Curry to Howard to De Bruijn AUTOMATH, Kolmogorov to Martin-Löf to FOLDS to Voevodsky (Homotopy Type Theory) + Graphic Formalisms + Work in Quantum Mechanics
48 |
49 | - TODO: Note how propositions are now types, but types are not propositions. And types are the concepts used in a synthetic manner
50 |
51 | ** Notes to an enthusiast
52 |
53 | When I was young, I found people talking about ideas like the Curry-Howard isomorphism, or discussing the struggle of learning about monads, pretty intriguing, though I lacked the technical know-how at that time to evaluate what was even being addressed. But as I kept looking into these over a period of time, I slowly became immersed in the terminology enough to understand what was being referred to and what was at stake. If you find yourself puzzled about what all these ideas are about but are at a loss as to how to proceed, let me share some of the things that have helped me gain traction in understanding them.
54 |
55 | *** [[./how-to-learn.org][A short guide about learning these ideas]]
56 |
57 | ** Preface
58 |
59 | Since the early 1900s there have been emergent fields in mathematics, like universal algebra and category theory, that attempt to capture rigorously the parallels between different domains of study. These studies, along with the requirements of engineering complex systems and our drive to understand these ideas deeply, led to the setting up of fields within computer science to examine these ideas closely. Some of these domains of inquiry include automata theory, algorithmic complexity, and different kinds of logical and (axiomatic/operational/denotational/categorical) semantic studies.
60 |
61 | Reading through this literature and paying attention to discoveries happening in Computer Science made me alert to the idea that something is up. There seems to be something strange and deep happening at the intersection of Computer Science and Mathematics. Observing my own work with programming languages made me see the deep congruences that appear when you look closely at the surface structure of programming languages and use it to understand their deeper structures. Computing can be thought of as a medium, and programming languages as a way of interacting with these computational structures. Each such structure constructed and deconstructed in the computer differs in the tractability and compositionality it provides. Bringing together abstractions from mathematics and the sciences helps us see how programming languages differ and unite, by casting them in a setting where their fundamental nature is made visible and can be tinkered with.
62 |
63 | This repository attempts to capture the (hi)story of how these emerged, and the key people who contributed to it. I intend to turn it into a visual catalogue of the kinds of morphisms/structure-preserving maps computational structures display among each other, written in a manner communicable to someone who has sensed a kind of resonance across very different fields of computation, but would like to explore whether there is a meta-structure emerging here.
64 |
65 | * Why study these?
66 |
67 | My motivation towards studying these concepts is that they allow you to figure out the deep unity and distinction among different concepts in programming languages. Apart from programming languages, these studies also shine light on how natural language could be tied to programming languages. These I sense provide a certain setting in which you can understand how language, grammars, mechanism, and mind are related.
68 |
69 | Also, it is of great value in advancing programming methods, and the field is being actively researched. There has been a ton of activity in these domains, and it is intimidating for an entrant to understand the who, what, how and why of it all. This document is my humble attempt to bring structure to this tangled web of development, so that it might help someone make sense of it when undertaking a similar journey. Hope it helps!
70 |
71 | I also keep a rough journal of how I came across the ideas [[./journal.org][here]].
72 |
73 | And if you find any errors or have feedback, please reach out to me on Twitter: [[https://twitter.com/prathyvsh][@prathyvsh]]
74 |
75 | #+BEGIN_HTML
76 |
77 | Concepts under study
78 | #+END_HTML
79 |
80 | - Fixed Point: A fixed point of a function is an input that the function maps back to itself as output.
81 | This is an important idea in computation, as fixed points can be thought of as modelling loops and recursion.
82 |
83 | - Continuations: Continuations can be thought of as a construct that carries with it the rest of the computation as a context that can be evaluated later.
84 |
85 | - Lazy Evaluation / Non-strictness: Lazy evaluation, also known as non-strictness, delays the evaluation of a program and lets a user derive the values on demand.
86 |
87 | - Actors: Actors are a model of concurrency devised by Carl Hewitt, who found the lack of time in the Lambda Calculus a setback and sought to amend it with his model.
88 |
89 | - Closures: Closures are functions bundled with the context of their creation, stored for later execution.
90 |
91 | - Automata Theory
92 |
93 | - Algebraic Effects: Algebraic Effects allow one to build up composable continuations.
94 |
95 | - Monads: Originally derived from abstract algebra, where they are endofunctors equipped with two natural transformations. Monads, when used in the programming context, can be thought of as a way to bring in the infrastructure needed for composing functions together.
96 |
97 | - Montague Quantification: Montague considered programming languages and natural languages as united under a universal grammar. His treatment of quantification is thought to be parallel to continuations in programming languages.
98 |
99 | - Generators/Iterators: Constructs that allow one to control the looping behaviour of a program
100 |
101 | - ACP
102 |
103 | - Pi Calculus / Calculus of Communicating Systems
104 |
105 | - Full Abstraction
106 |
107 | - Bisimulation
108 |
109 | - Communicating Sequential Processes
110 |
111 | - Combinatory Logic
112 |
113 | - Lambda Calculus
114 |
115 | - Homotopy Type Theory
116 |
117 | - Constructive Mathematics
118 |
119 | - Ludics
120 |
121 | - Linear Logic
122 |
123 | - Geometry of Interaction
124 |
125 | - Transcendental Syntax
126 |
127 | - Game Semantics
128 |
129 | - Domain Theory
130 |
131 | - *Algebraic Structures*
132 |
133 | [[./img/birkhoff-universal-algebra.png]]
134 |
135 | Magmas, Semigroup, Quasigroup, Loop, Monoid, Monad, Group, Abelian Groups, Ring, Fields, Lattice, Modules, Filters, Ideals, Groupoid, Setoid, Trees, Lists, Units
136 |
137 | Algebraic structures are studied under universal/abstract algebra, with each species characterized by a different set of structural properties. Each can be thought of as a set equipped with certain operations that give it its particular nature.
138 |
139 | They have deep connections with computation, as most of the structures we deal with in computer science belong to the algebraic species studied by mathematicians.
140 |
141 | - Data and Co-Data
142 |
143 | - Algebras and Co-Algebras
144 |
145 | - Initial and Final Algebras
146 |
147 | - Morphisms
148 |
149 | - Recursion Schemes
150 |
151 | - Covariance and Contravariance
152 |
153 | - Monotonicity
154 |
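To make the "set with operations" view of the algebraic structures listed above concrete, here is a minimal sketch in Python (the encoding and helper names are mine, not from any source): a monoid is a carrier set with an associative binary operation and an identity element, checked here by brute force on sample elements.

```python
# A monoid packaged as (identity, op); laws checked by brute force on samples.
def is_monoid(empty, op, samples):
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a in samples for b in samples for c in samples)
    ident = all(op(empty, a) == a and op(a, empty) == a for a in samples)
    return assoc and ident

list_monoid = ([], lambda a, b: a + b)   # lists under concatenation (free)
add_monoid  = (0,  lambda a, b: a + b)   # integers under addition

assert is_monoid(*list_monoid, samples=[[], [1], [2, 3]])
assert is_monoid(*add_monoid,  samples=[0, 1, 5])
```

Dropping a law moves you down the hierarchy: without identity you only have a semigroup, without associativity only a magma.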
155 | #+BEGIN_HTML
156 |
157 | #+END_HTML
158 |
159 | * History
160 |
161 | ** Early History
162 |
163 | The study of computation has deep roots in antiquity. Keeping in mind that it is anachronistic to use modern concepts to describe what our ancestors did, a proto-form of computation can be seen in the divination devices of ancient Arab culture and the medieval period. In the 17th and 18th centuries many great minds laid the ground for modern algebra to take root, and a significant break with tradition came from the English school of logic, where algebra and logic were combined. Great advances were then made throughout the 19th century, setting the stage for the intellectual achievements of the 20th century, in which the idea of computation took shape.
164 |
165 | ** The intellectual advancements of the 20th century
166 |
167 | Several works contributed to the emergence of computer science, but among the figures with a salient early influence in shaping the idea of computation were Gödel, Frege, Hilbert, Russell, Post, Turing, and Whitehead.
168 |
169 | ** Hilbert program and the birth of Lambda Calculus
170 |
171 | In the late 1920s and early 1930s, a framework called the Lambda Calculus was invented by Alonzo Church, inspired by Principia Mathematica, Whitehead and Russell's undertaking to ground all of mathematics in logic. The Lambda Calculus arose in the context of the Hilbert program's call to formalize effective calculability, and it became one of the standard settings for academic work on computation. This inspired the Scott-Strachey-Landin line of investigation to base programming language studies on it.
172 |
173 | ** Universal Algebra and Category Theory
174 |
175 | #+BEGIN_HTML
176 |
177 |
178 |
179 |
180 |
181 |
182 | #+END_HTML
183 |
184 | In the 1930s, work on Universal Algebra, commenced by Whitehead, was given a clarified form by mathematicians like Øystein Ore and Garrett Birkhoff.
185 |
186 |
187 | #+BEGIN_HTML
188 |
189 |
190 |
191 |
192 | #+END_HTML
193 |
194 | Towards the 1940s came the development of Category Theory. A huge number of intellectual advances were made from this theoretical vantage point, contributing to the study of morphisms between different theoretical models.
195 |
196 | ** Work post 1950s
197 |
198 | #+BEGIN_HTML
199 |
200 | #+END_HTML
201 |
202 | Lattice Theory, Universal Algebra, Algebraic Topology, and Category Theory became fields of intense investigation into mathematical structure. It was during this period of intense activity that Godement invented monads under the name “standard construction” in his work [[https://amzn.to/2ZP167s][Théorie des faisceaux (Theory of Sheaves) (1958)]].
203 |
204 | #+BEGIN_HTML
205 |
206 |
207 |
208 |
209 |
210 |
211 | #+END_HTML
212 |
213 | John McCarthy was one of the first people to attempt to give a mathematical basis for programming. In his paper Towards a Mathematical Science of Computation (1961), he discussed the three then-current directions of numerical analysis, complexity theory, and automata theory as inadequate to give a firm footing to software engineering as practiced in the day, and offered his ideas for building a firmer foundation.
214 |
215 | - TODO: Add in image of John McCarthy
216 |
217 | Three approaches to programming language semantics emerged in the 1960s. Mathematical semantics attempted to act as a metalanguage for talking about programs, their structures, and the data handled by them. This in turn would also act as a precise way to provide specifications for compilers.
218 |
219 | ** Operational Semantics
220 | The operational approach took the compiler itself to constitute a definition of the semantics of the language.
221 |
222 | There are three kinds:
223 |
224 | 1/ Small Step or Structural Operational Semantics
225 |
226 | It was introduced by Gordon Plotkin.
227 |
228 | This method makes use of a transition relation to define the behaviour of programs. SOS specifications use inference rules that derive the valid transitions of a composite piece of syntax from the transitions of its components.
229 |
230 | 2/ Reduction Semantics by Matthias Felleisen and Robert Hieb
231 |
232 | This devised an equational theory for control and state. It uses the idea of contexts into which terms can be plugged.
233 |
234 | 3/ Big Step Semantics or Natural Semantics
235 |
236 | This method was invented by Gilles Kahn. It describes, in a divide-and-conquer manner, how the final evaluation results of language constructs can be obtained by combining the evaluation results of their syntactic subparts.
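The contrast between the styles above can be sketched in Python (my own toy encoding, not from any of the cited papers) on arithmetic expressions: small-step rewrites a term one transition at a time via congruence rules and an axiom, while big-step computes the final result of a construct directly from the results of its subterms.

```python
# Terms: an int is a value; ("add", l, r) is a composite term.

def step(t):
    """Small-step: perform exactly ONE transition."""
    _, l, r = t
    if isinstance(l, tuple):
        return ("add", step(l), r)   # congruence rule: reduce left subterm
    if isinstance(r, tuple):
        return ("add", l, step(r))   # congruence rule: reduce right subterm
    return l + r                     # axiom: add two values

def run(t):
    while isinstance(t, tuple):      # iterate transitions down to a value
        t = step(t)
    return t

def eval_big(t):
    """Big-step: final result from the results of sub-evaluations."""
    if isinstance(t, tuple):
        _, l, r = t
        return eval_big(l) + eval_big(r)
    return t

term = ("add", ("add", 1, 2), 4)
assert run(term) == eval_big(term) == 7
```

The two agree on final values; small-step additionally exposes every intermediate configuration, which is what makes it suitable for reasoning about interleaving and non-termination.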
237 |
238 | ** Denotational Semantics
239 |
240 | Denotational Semantics was pioneered by Christopher Strachey and Dana Scott in the 1970s.
241 |
242 | Denotational Semantics is powered by Domain Theory and provides a Tarskian-style semantics for understanding programming language constructs. By providing models for these constructs, it acts as a ground for understanding what each kind of construct translates into in a mathematical setting. This has the advantage that models of PL constructs can be interpreted as mathematical entities, and any two languages can be compared for equivalence and relationships by looking at the mathematical objects they desugar/translate into. The mathematical setting initially (currently?) adopted is that of set-theoretic objects, but I think it's not fully tied to that, and the language could be changed to, say, hypersets or categories.
243 |
244 | Denotational Semantics uses this framework for understanding programming language semantics. Programming language constructs are understood as objects with well-defined extensional meaning in terms of sets.
245 |
246 | We make use of structural induction to provide denotational semantics for expressions.
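The structural-induction point can be made concrete with a toy denotation function (a Python sketch with my own constructor names): each syntactic construct is mapped compositionally to a mathematical object, here a function from environments to numbers, with the meaning of a composite term built only from the meanings of its parts.

```python
# Syntax: ("num", n) | ("var", x) | ("plus", e1, e2)
# Denotation: [[e]] : Env -> Value, defined by structural induction on e.

def denote(e):
    tag = e[0]
    if tag == "num":
        n = e[1]
        return lambda env: n                  # constants denote themselves
    if tag == "var":
        x = e[1]
        return lambda env: env[x]             # variables denote lookups
    if tag == "plus":
        d1, d2 = denote(e[1]), denote(e[2])   # induction hypotheses
        return lambda env: d1(env) + d2(env)  # meaning of whole from parts
    raise ValueError(tag)

meaning = denote(("plus", ("var", "x"), ("num", 1)))
assert meaning({"x": 41}) == 42
```

Note that `denote` never inspects how a subterm will be used: compositionality is exactly this restriction.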
247 |
248 | TODO: Somehow diagonalization proof of D → (D → E) is used to hint at the idea almost all functions are uncomputable. I need to understand this proof to address this idea of how it connects up with the well-foundedness(?) nature of (co)domains. Hint from [[https://www.cs.cornell.edu/courses/cs6110/2013sp/lectures/lec19-sp13.pdf][document]]
249 |
250 | ** Deductive Approach
251 | Pioneered by R. W. Floyd in 1967, it linked logical statements to the steps of the program thereby specifying its behaviour as well as providing a means of verifying the program.
252 |
253 | They used it to understand different programming language constructs popular at the time. Landin came up with operational semantics and Scott/Strachey with denotational semantics that modelled programming languages by mapping them to mathematical models.
254 |
255 | Using these formalizations, one can start to reason about what different constructs in programming language mean (operation wise / structure preserving mapping wise) and conduct studies on them for discovering their properties and complexity parameters.
256 |
257 | In “Toward a Formal Semantics” Strachey distinguished between L-values and R-values. The computer’s memory was seen as a finite set of objects, which is well ordered in some way by a mapping that assigns each of them a name, their L-value. And also, each object is a binary array which may be seen as the R-value. A computer program can thus be seen as a mapping from a set of values and names to another set of values and names.
258 |
259 | Scott set the stage for the work of semantics with his paper: [[https://www.cs.ox.ac.uk/files/3222/PRG02.pdf][Outline of a Mathematical Theory of Computation]]
260 |
261 | Scott’s work resulted in domain theory where lambda calculus was interpreted as modelling [[https://epubs.siam.org/doi/abs/10.1137/0205037?journalCode=smjcat][continuous lattices]].
262 |
263 | ** Domain Theory
264 |
265 | - TODO: Understand how CPO figures in here.
266 |
267 | Domain theory resulted from the attempt of Dana Scott to supply Lambda Calculus with a model.
268 |
269 | He arrived at this by using a particular kind of partial order (picture a directed acyclic graph) called a lattice.
270 |
271 | Within this theory, we are trying to construct a model, or a type of space (decide which), in which one can give an
272 | interpretation for the lambda term morphisms. That is, the Lambda Calculus under composition takes one lambda term as input
273 | and generates another by way of evaluation. Domain Theory tries to give this a model-theoretic interpretation.
274 |
275 | - TODO: Rewrite the above paragraph once you achieve more clarity.
276 |
277 | A semilattice is a structure in which each pair of elements has either a common ancestor or a common descendant; a lattice has both. A complete lattice extends this from pairs to arbitrary subsets.
278 | If you think about these structures as rivers that originate from a common point and then finally culminate in a common end point, that would be a somewhat close metaphor.
279 |
280 | The central idea with a lattice is that for any pair of elements, you would be able to find both a common ancestor node upstream and a common descendant node downstream.
281 |
282 | - TODO: Add an illustrated image of a lattice here.
283 |
284 | Scott identified continuous partial orders as the domain he wanted to work with and equipped them with a bottom element ⊥, which stood for undefined values. This undefined value enables one to represent computations that are partial: that is, ones that have not terminated or have no value, like 1 divided by 0.
285 |
286 | Domains are structures which are characterized as directed-complete partially ordered sets.
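One way to see the role of ⊥ in code (a Python sketch under my own conventions, with `None` standing in for the bottom element): partial operations return ⊥ instead of diverging, and functions lifted to the domain are strict, mapping ⊥ to ⊥.

```python
BOT = None  # stands for the bottom element ⊥ (undefined / non-terminating)

def div(a, b):
    """Partial division on the lifted integers: 1 divided by 0 denotes ⊥."""
    return BOT if b == 0 else a // b

def lift(f):
    """Make a total function strict on the lifted domain: ⊥ maps to ⊥."""
    return lambda x: BOT if x is BOT else f(x)

succ = lift(lambda n: n + 1)
assert succ(div(6, 2)) == 4
assert succ(div(1, 0)) is BOT   # strictness: ⊥ propagates through
```

This is only the flat-domain picture; the ordering ⊥ ⊑ n for every n is what makes such functions monotone.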
287 |
288 | *** Supremum/Join/Least Upper Bound and Infimum/Meet/Greatest Lower Bound
289 | To get an idea of what joins and meets are:
290 |
291 | Say we have 3 elements, each carrying some information.
292 | The join roughly corresponds to the smallest element which contains all the information present in the three nodes.
293 | The meet roughly corresponds to the largest element whose information is contained in each of the three elements.
294 |
295 | If you think in set-theoretic terms, joins correspond to union and meets to intersection.
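In the powerset lattice ordered by inclusion this reads directly as code (a Python sketch, my own helper names): the join of some elements is the least set containing all of their information, and the meet the greatest set contained in all of them.

```python
from functools import reduce

def join(*xs):
    """Least upper bound in the powerset lattice: union."""
    return reduce(frozenset.union, xs, frozenset())

def meet(*xs):
    """Greatest lower bound in the powerset lattice: intersection."""
    return reduce(frozenset.intersection, xs[1:], xs[0])

a, b, c = frozenset({1, 2}), frozenset({2, 3}), frozenset({2, 4})
assert join(a, b, c) == {1, 2, 3, 4}   # contains all three
assert meet(a, b, c) == {2}            # contained in all three
```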
296 |
297 | - TODO: I think there’s something to be talked about distributivity here on how it impinges on the nature of information.
298 |
299 | A directed set is a subset of a partial order in which every pair of elements has an upper bound within the subset; it need not contain its own supremum. Think of a total order (which is also a partial order) without a top element, such as the natural numbers: there is no top element, yet the set is directed. If we adjoin a top element, every directed subset now has a supremum, and the order is complete.
300 |
301 | By having a supremum for any two elements, we have a system in which there is a third element encapsulating the information content of both.
302 |
303 | Any finite directed set contains a largest element, so the supremum property is automatic; the interesting cases arise when you move to infinite domains.
304 |
305 | The next property needed is continuity. Besides the ordering >=, there is a relation >> which corresponds to approximation: x approximates y iff for every directed set A with supremum(A) >= y, there is a z in A such that z >= x. An element that approximates itself is called compact.
306 |
307 | A continuous directed-complete partial order is one in which every point is the supremum of the directed set of elements approximating it.
308 |
309 | These dcpos are also equipped with a ⊥ element which lies below every other element, making the set pointed. So domains are continuous pointed dcpos, that is, continuous directed-complete partially ordered pointed sets, with ⊥ as the basepoint.
310 |
311 | - TODO: Clarify, what it means for a supremum to approximate it.
312 |
313 | This is a [[https://www.lesswrong.com/posts/4C4jha5SdReWgg7dF/a-brief-intro-to-domain-theory][nice post]] to get an understanding of some of the basics.
314 |
315 | *** Ideas to Explain
316 |
317 | **** Partial Orders
318 |
319 | These are some of the properties commonly assumed of the partial orders used in Domain Theory. One or more of these properties can inhere in the structures studied; for example, there can be a pointed complete partial order or a lifted discrete partial order as the context demands.
320 |
321 | - Discrete
322 | - Lifting
323 | - Flat
324 | - Pointed
325 | - Completed
326 | - Chain
327 | A chain is a subset whose elements are all pairwise comparable, i.e. a totally ordered subset.
328 |
329 | - Antichain
330 |
331 | TODO: Insert image
332 | An antichain is a collection of elements that are pairwise incomparable. It can roughly be thought of as measuring the “width” of the partial order.
333 | Think of elements in two separate branches of a tree, such as Chennai and London in (Countries, (India, (Tamil Nadu, Chennai)), (United Kingdom, (England, London))).
334 |
335 | - ⍵-Chain
336 |
337 | Infinite chains indexed by natural numbers.
338 |
339 | - Algebraic
340 |
341 | **** Properties
342 |
343 | - Ordering
344 | - Partial Ordering
345 | - Continuity
346 | - Monotonicity
347 | - Completeness
348 | - Compactness
349 | - Compositionality
350 |
351 | ** Work in automata theory
352 |
353 | Inspired by Stephen Kleene's characterization of the events definable in Warren McCulloch and Walter Pitts's paper (which birthed the model of neural networks), Michael Rabin and Dana Scott showed that finite automata defined in the manner of Moore machines accept exactly the regular languages (which have an algebraic characterization via finite monoids).
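A finite acceptor in this style can be sketched in a few lines of Python (the machine and language here are my own toy choices, not from the Rabin-Scott paper): a finite set of states, a transition function, and an accepting set, recognizing the regular language of binary strings with an even number of 1s.

```python
# DFA for "even number of 1s" over {0,1}:
# states 0 (even so far) and 1 (odd so far); start state and accept set {0}.
def accepts(word, start=0, accepting=frozenset({0})):
    state = start
    for ch in word:
        state = state ^ (ch == "1")   # flip parity on reading a 1
    return state in accepting

assert accepts("1010")       # two 1s: accepted
assert not accepts("111")    # three 1s: rejected
```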
354 |
355 | There was a flurry of work in understanding how control flow constructs work post 1960s which is documented in the work of John Reynolds (See Resources section). There ensued work on denotational models of effectful (state, control flow, I/O) and non-deterministic (concurrency/parallelism) languages.
356 |
357 | This rise in complexity and clarity would lead to the use of topological/metric spaces to be brought to bear on studying computational structures.
358 |
359 | #+BEGIN_HTML
360 |
361 | #+END_HTML
362 |
363 | In Definitional Interpreters for Higher-Order Programming Languages (1972), John Reynolds brings out the relationship between the Lambda Calculus, SECD, the Morris-Wadsworth method, and his own definition of GEDANKEN.
364 | This work introduces the idea of defunctionalization: a method of converting a program that uses higher-order functions into one that uses only first-order data structures.
365 |
366 | Defunctionalization allows one to treat programming languages as algebraic structures. In this sense, it is related to F-algebras.
367 |
368 | Reynolds also distinguishes in this paper between trivial and serious functions, a distinction that would later develop into the duality between values and computations. The parallel here is that values are results acquired from processes that have terminated, while computations are processes that remain to be computed. This idea is emphasized in [[https://link.springer.com/chapter/10.1007%2F978-1-4612-4118-8_4][Essence of Algol (1997)]]. Continuation is the term for the computation that remains to be processed; defunctionalization is the method by which you turn a computation into a value, and refunctionalization the reverse process. Defunctionalization, so to speak, gives a handle on the underlying computation that is active at runtime.
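A classic instance of Reynolds's transformation, sketched in Python (the example and names are mine): replace the higher-order argument of a function with a first-order data structure naming each closure that could occur, plus an `apply` function that interprets those names.

```python
# Higher-order version: filter takes an arbitrary function.
def filter_ho(f, xs):
    return [x for x in xs if f(x)]

# Defunctionalized version: closures become first-order tags + apply.
# ("gt", n) represents lambda x: x > n; ("even",) represents lambda x: x % 2 == 0.
def apply(tag, x):
    if tag[0] == "gt":
        return x > tag[1]
    if tag[0] == "even":
        return x % 2 == 0
    raise ValueError(tag)

def filter_defun(tag, xs):
    return [x for x in xs if apply(tag, x)]

xs = [1, 2, 3, 4, 5]
assert filter_ho(lambda x: x > 2, xs) == filter_defun(("gt", 2), xs) == [3, 4, 5]
```

The tags are exactly the "handle on the underlying computation": they are inspectable, serializable data, where the closures they replace were opaque.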
369 |
370 | An important paper in this direction seems to be [[http://homepages.inf.ed.ac.uk/gdp/publications/Category_Theoretic_Solution.pdf][The Category-Theoretic Solution of Recursive Domain Equations]]
371 |
372 | #+BEGIN_HTML
373 |
374 | #+END_HTML
375 |
376 | Eugenio Moggi brought together [[https://www.irif.fr/~mellies/mpri/mpri-ens/articles/moggi-computational-lambda-calculus-and-monads.pdf][monads and control flow constructs in the Lambda Calculus in the late 1980s]]. This was further developed in his works: [[https://www.ics.uci.edu/~jajones/INF102-S18/readings/09_Moggi.pdf][An Abstract View of Programming Languages (1989)]] and [[http://www.cs.cmu.edu/~crary/819-f09/Moggi91.pdf][Notions of Computation and Monads (1991)]]. The latter characterizes various kinds of computation, such as partial, non-deterministic, side-effecting, exception-raising, continuation-based, and interactive I/O, and supplies a framework in which they can be analyzed.
377 |
378 | Moggi’s semantics was used by Philip Wadler to simplify the API of Haskell from [[http://doi.acm.org/10.1145/143165.143169][CPS-based to monad-based]]. A good read in this direction, to understand how monads can be used, is the work on [[https://arxiv.org/abs/1702.08409][Query Combinators]] by Clark Evans and Kyrylo Simonov. They describe how their work on creating a database query language led them to understand its denotation in terms of (co)monads and (bi-)Kleisli arrows. Fong and Spivak, in their book [[https://arxiv.org/abs/1803.05316][Seven Sketches in Compositionality]], also describe similar ideas.
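A minimal sketch of the pattern Moggi identified (Python, my own encoding rather than any paper's notation): the exception (Maybe) monad threads possible failure through a pipeline, with `unit` and `bind` supplying the composition infrastructure so that the failure check is written once instead of at every step.

```python
# Maybe monad: a computation either yields a value or failure (None).
def unit(x):
    """return / eta: inject a pure value into the monad."""
    return x

def bind(m, f):
    """>>= : short-circuit on failure, otherwise continue with f."""
    return None if m is None else f(m)

def safe_div(a, b):
    """An effectful step: division that can fail."""
    return None if b == 0 else a / b

# Compose effectful steps without writing failure checks by hand:
result = bind(safe_div(10, 2), lambda x: safe_div(x, 5))
assert result == 1.0
assert bind(safe_div(10, 0), lambda x: safe_div(x, 5)) is None
```

Swapping in a different `bind`/`unit` pair (lists for non-determinism, state-threading functions for state) changes the notion of computation without changing the pipeline's shape, which is exactly Moggi's point.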
379 |
380 | - TODO: Discuss about how modal logic and monads are related. I got this idea from reading Data and Interpretation article here: https://medium.com/@bblfish/data-interpretation-b9273520735c
381 | What needs to be figured out is how this idea of bringing in determinacy in the computational context is linked to the geometrical idea of creating a standard construction as per Godement.
382 | Is the idea of creating a tree like structure(?) from an interconnected directed graph (possibly with loops) linked to how we study geometrical objects using these same ideas?
383 |
384 | I would have to understand the connection between analysis and geometry more to bring these insights back into a computational context.
385 |
386 | Explore how monadic API which makes state tractable is related to the semantic aspect of how functional programming has a syntactic notion of unfolding like a derivation tree of a grammar.
387 |
388 | ** Algebra of Programming School
389 |
390 | TODO: Add some details on the Dutch School
391 |
392 | ** [[http://www.ii.uib.no/~wagner/MNotes/adjrun.ps][Algebraic Specifications: some old history, and new thoughts]]
393 | Eric G. Wagner
394 |
395 | Paper on the history by Gibbons:
396 |
397 | Videos by Bird and Merteens: http://podcasts.ox.ac.uk/series/algebra-programming
398 |
399 | ** Coalgebra
400 |
401 | The area of coalgebra aims to bring together the various formal structures that capture the essence of state-based computational dynamics, such as automata, transition systems, Petri nets, event systems, etc.
402 |
403 | It promises a perspective on uniting, say, the theory of differential equations with automata and process theory and with biological and quantum computing, by providing an appropriate semantical basis with associated logic.
404 |
405 | Coalgebras are about behaviour and dual to algebras which are about structure.
406 |
407 | The central emphasis is between observables and internal states.
408 |
409 | If a program can be understood as an element in an inductively defined set P of terms, with a structure map
410 | F(P) -> P, where the functor F captures the signature of the operations for forming programs,
411 |
412 | then a coalgebra is the dual, P -> G(P), where the functor G captures the kind of behaviour that can be displayed, such as deterministic behaviour or behaviour with exceptions.
413 |
414 | Generating computer behaviour amounts to the repeated evaluation of an (inductively defined) coalgebra structure on an algebra of terms.
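The duality in the paragraphs above can be sketched in Python over lists (my own first-order encoding, with `None`/pairs standing in for the functor): an algebra F(P) -> P is consumed by a fold that collapses structure into a value, while a coalgebra P -> G(P) is driven by an unfold that generates observable behaviour from a state.

```python
# Algebra side: collapse structure to a value (fold over lists).
def fold(alg_nil, alg_cons, xs):
    acc = alg_nil
    for x in reversed(xs):
        acc = alg_cons(x, acc)       # apply the algebra at each layer
    return acc

# Coalgebra side: one observation step, seed -> None | (output, next seed).
def unfold(coalg, seed):
    out = []
    step = coalg(seed)
    while step is not None:
        x, seed = step               # observe one output, move to next state
        out.append(x)
        step = coalg(seed)
    return out

countdown = lambda n: None if n == 0 else (n, n - 1)   # a coalgebra
assert unfold(countdown, 3) == [3, 2, 1]
assert fold(0, lambda x, acc: x + acc, unfold(countdown, 3)) == 6
```

The fold is determined by where the data comes from (constructors); the unfold by what can be observed of a state, which is the structure/behaviour duality in miniature.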
415 |
416 | VERIFY: OOP is coalgebraic, FP is algebraic
417 |
418 | Every programming language consists of an algebra of structured elements (the so-called initial algebra), and each language corresponds to certain dynamical behaviour captured by a coalgebra acting on the state space of the computer.
419 |
420 | Structural operational semantics is used to study this coalgebraic behaviour.
421 |
422 | In coalgebra, it could be the case that internal states are different, but the observables are indistinguishable. This is called bisimilarity or observational equivalence.
423 |
424 | There could also be the inverse case: the internal states are the same, but the observable properties are different, as in an algebra which has two different valid interpretive frames.
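Behavioural indistinguishability can be illustrated with two toy systems (a Python sketch of my own, not from the coalgebra literature): two counters with different internal state representations but identical observable transitions, so a bisimulation relates their states.

```python
# Two state machines, each with one observation and one transition per step.
# System A stores an int; system B stores a list whose length is the count.
step_a = lambda s: (s, s + 1)            # (observation, next state)
step_b = lambda s: (len(s), s + [None])  # (observation, next state)

def trace(step, s, n):
    """Observable behaviour: the first n observations from state s."""
    out = []
    for _ in range(n):
        obs, s = step(s)
        out.append(obs)
    return out

# Different internal states (0 vs []), indistinguishable observations:
assert trace(step_a, 0, 4) == trace(step_b, [], 4) == [0, 1, 2, 3]
```

No experiment built from observations and transitions can tell `0` from `[]`, which is what bisimilarity of the two initial states means.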
425 |
426 | - TODO: Is this called congruence?
427 |
428 | - TODO: Describe about bialgebras
429 |
430 | ** Historical Sketch
431 |
432 | *** Categorical approach to mathematical system theory
433 | Work of Arbib, Manes, and Goguen, and also Adámek, who analyzed Kalman's work on linear dynamical systems in relation to automata theory. This led to a formulation placing sequential machines and control systems in a unified framework, by developing a notion of “machine in a category”, which in turn led to general notions of state, behaviour, reachability, observability, and realization of behaviour. The notion of coalgebra did not emerge here, probably because the setting of modules and vector spaces from which this work arose provided too little categorical infrastructure (especially: no cartesian closure) to express these results purely coalgebraically.
434 |
435 | **** [[https://core.ac.uk/download/pdf/82763466.pdf][Machines in a Category (1980)]]
436 | Michael A. Arbib and Ernest G. Manes
437 |
438 | **** [[https://dml.cz/bitstream/handle/10338.dmlcz/105583/CommentatMathUnivCarol_015-1974-4_2.pdf][Free algebras and automata realizations in the language of categories (1974)]]
439 | Jiří Adámek
440 |
441 |
442 | ** Non-well-founded sets
443 | Aczel took a crucial step with his set theory that allows infinitely descending ∈-chains, because it used coalgebraic terminology right from the beginning. The development of this theory was motivated by the desire to provide meaning to Milner's CCS, a theory of concurrent processes with potentially infinite behaviour; hence the notion of bisimulation from process theory played a crucial role. Aczel showed how to treat bisimulation in a coalgebraic setting, establishing the first link between proofs by bisimulation and finality of coalgebras.
444 |
445 | *** [[https://link.springer.com/chapter/10.1007%2FBFb0018361][A final coalgebra theorem (1989)]]
446 | Peter Aczel, Nax Mendler
447 |
448 | *** [[https://www.escholar.manchester.ac.uk/uk-ac-man-scw:2h4][Final universes of processes (February 16, 1994)]]
449 | Peter Aczel
450 |
451 | ** Data types of infinite objects
452 |
453 | The first approaches to data types in computing relied on initiality of algebras.
454 | The use of final coalgebras in
455 |
456 | *** [[https://core.ac.uk/download/pdf/82297461.pdf][Parametrized data types do not need highly constrained parameters (1982)]]
457 | Michael A. Arbib, Ernest G. Manes
458 |
459 | *** [[http://www.lfcs.inf.ed.ac.uk/reports/88/ECS-LFCS-88-44/ECS-LFCS-88-44.pdf][A Typed Lambda Calculus with Categorical Type Constructors (1988)]]
460 | Tatsuya Hagino
461 |
462 | *** [[http://it.mmcs.sfedu.ru/_files/ct_seminar/articles/The%20continuum%20as%20a%20final%20coalgebra.pdf][The continuum as a final coalgebra (2002)]]
463 | Dusko Pavlović, Vaughan Pratt
464 |
465 | *** [[https://www.sciencedirect.com/science/article/pii/0022000079900114][Final algebra semantics and data type extensions (1979)]]
466 | Mitchell Wand
467 |
468 | to capture infinite structures provided an important next step. Such infinite structures can be represented using lazy evaluation or in logic programming languages.
469 |
470 | *** [[https://personal.utdallas.edu/~gupta/iclp07paper.pdf][Coinductive programming and its applications (2007)]]
471 | Gopal Gupta, Ajay Bansal, Richard Min, Luke Simon, and Ajay Mallya
472 |
473 | *** [[https://personal.utdallas.edu/~gupta/calco11.pdf][Infinite computation, co-induction and computational logic (2011)]]
474 | Gopal Gupta, Neda Saeedloei, Brian DeVries, Richard Min, Kyle Marple, Feliks Kluźniak
475 |
476 | Talk available here: https://www.microsoft.com/en-us/research/video/logic-co-induction-and-infinite-computation/
477 |
478 | *** [[https://www.researchgate.net/publication/220985840_Coinductive_Logic_Programming][Coinductive Logic Programming (2006)]]
479 | Luke Simon, Ajay Mallya, Ajay Bansal, Gopal Gupta
480 |
481 | ** Initial and final semantics
482 | In the semantics of program and process languages it appeared that the relevant semantic domains carry the structure of a final coalgebra (sometimes in combination with an initial algebra structure), especially in the metric-space-based tradition [50].
483 |
484 | *** [[https://mitpress.mit.edu/books/control-flow-semantics][Control Flow Semantics (1996)]]
485 | J. de Bakker and E. de Vink
486 |
487 | This technique was combined with Aczel’s techniques by Rutten and Turi.
488 |
489 | - TODO: Find out the work in which Rutten and Turi combined these techniques.
490 |
491 | It culminated in the recognition that “compatible” algebra-coalgebra pairs (called bialgebras) are highly relevant structures, described via distributive laws. The basic observation of
492 |
493 | *** [[http://www.dcs.ed.ac.uk/home/dt/thesis.html][Functional operational semantics and its denotational dual (1996)]]
494 | Daniele Turi
495 |
496 |
497 | *** [[http://www.dcs.ed.ac.uk/home/dt/towards.html][Towards a mathematical operational semantics (1997)]]
498 | Daniele Turi and Gordon Plotkin
499 |
500 | further elaborated in:
501 |
502 | *** [[https://research.vu.nl/en/publications/on-generalised-coinduction-and-probabilistic-specification-format-2][On generalised coinduction and probabilistic specification formats: Distributive laws in coalgebraic modelling (2004)]]
503 | F. Bartels
504 |
505 | , is that such laws correspond to specification formats for operational rules on (inductively defined) programs.
506 |
507 | *** [[https://core.ac.uk/download/pdf/82824249.pdf][Bialgebras for structural operational semantics: An introduction (2011)]]
508 | B. Klin
509 |
510 | These bialgebras satisfy elementary properties like: observational equivalence (i.e. bisimulation wrt. the coalgebra) is a congruence (wrt. the algebra).
511 |
512 | *** [[https://link.springer.com/chapter/10.1007%2FBFb0084215][Algebraically complete categories (1991)]]
513 | P. Freyd
514 |
515 | *** [[https://era.ed.ac.uk/handle/1842/406][Axiomatic Domain Theory in Categories of Partial Maps (1994)]]
516 | M. Fiore
517 |
518 | ** Behavioural approach in specification
519 |
520 | Horst Reichel in [[https://www.researchgate.net/publication/266957938_Behavioural_equivalence_-_a_unifying_concept_for_initial_and_final_specification_methods][Behavioural Equivalence — a unifying concept for initial and final specifications (1981)]] was the first to use so-called behavioural validity of equations in the specification of computationally relevant algebraic structures. The basic idea is to divide the types (also called sorts) into ‘visible’ and ‘hidden’ ones. The latter are supposed to capture states and are not directly accessible. Equality is only used for the “observable” elements of visible types. The idea is further elaborated in what has become known as hidden algebra.
521 |
522 | *** [[https://www.sciencedirect.com/science/article/pii/S0304397599002753?via%3Dihub][A Hidden Agenda (2000)]]
523 | Joseph Goguen, Grant Malcolm
524 |
525 | *** [[https://link.springer.com/chapter/10.1007%2F3-540-07854-1_231][Observability concepts in abstract data specifications (1976)]]
526 | V. Giarrantana, F. Gimona, U. Montanari
527 |
528 | There seems to be a retrospective from 30 years later [[https://www.semanticscholar.org/paper/Observability-Concepts-in-Abstract-Data-Type-30-Sannella-Tarlecki/7c26d5071be3a815877ce0baeb7e12219e5541ce][here]].
529 |
530 | *** [[https://www.sciencedirect.com/science/article/pii/016764239090057K][Behavioural correctness of data representations (1990)]]
531 | Oliver Schoett
532 |
533 | *** [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.4105&rep=rep1&type=pdf][Proving the correctness of behavioural implementations (1995)]]
534 | Michel Bidoit, Rolf Hennicker
535 |
536 | and has been applied to describe classes in OOP languages, which have an encapsulated state space. It was later realised that behavioural equality is essentially bisimilarity in a coalgebraic context
537 | *** [[https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.5273&rep=rep1&type=pdf][Behavioural Equivalence, Bisimulation, and Minimal Realisation (19 September, 1995)]]
538 | Grant Malcolm
539 |
540 | The original title of this paper was apparently “Objects as algebra-coalgebra pairs” which was replaced on the suggestion of Rod Burstall.
541 |
542 | and it was again Reichel
543 | *** [[https://www.researchgate.net/publication/220173547_An_Approach_to_Object_Semantics_based_on_Terminal_Co-Algebras][An approach to object semantics based on terminal co-algebras (1995)]]
544 | who first used coalgebras for the semantics of OOP languages.
545 |
546 | *** [[https://www.sciencedirect.com/science/article/pii/S0304397502003663/pdf][Coalgebras and monads in the semantics of Java (2003)]]
547 | Bart Jacobs
548 |
549 | ** Modal logic
550 |
551 | Modal logics qualify the truth conditions of statements, concerning knowledge, belief and time. Temporal logic is a part of modal logic which is particularly suitable for reasoning about (reactive) state-based systems.
552 |
553 | *** [[https://fi.ort.edu.uy/innovaportal/file/20124/1/49-pnueli_temporal_logic_of_programs.pdf][The temporal logic of programs (1977)]]
554 |
555 | Amir Pnueli
556 |
557 | *** The temporal semantics of concurrent programs (1981)
558 |
559 | Amir Pnueli
560 |
561 | Lawrence S. Moss in [[https://www.sciencedirect.com/science/article/pii/S0168007298000426][Coalgebraic Logic (1999)]] first associated a suitable modal logic to coalgebras which inspired much subsequent work.
562 |
563 | *** [[https://www.sciencedirect.com/science/article/pii/S1571066105803536][Coalgebras and modal logic (2000)]]
564 | Martin Rößiger
565 |
566 | *** [[https://www.sciencedirect.com/science/article/pii/S0304397500001286/pdf][From modal logics to terminal coalgebras (2001)]]
567 | Martin Rößiger
568 |
569 | *** [[https://www.sciencedirect.com/science/article/pii/S1571066104000532/pdf][Specifying coalgebras with modal logic (2001)]]
570 | Alexander Kurz
571 |
572 | *** [[https://www.sciencedirect.com/science/article/pii/S1571066104809095/pdf][Modal operators for coequations (2001)]]
573 | Jesse Hughes
574 |
575 | *** [[https://www.semanticscholar.org/paper/The-temporal-logic-of-coalgebras-via-Galois-Jacobs/d58370736cc5063f1af99580a87cdfdbccfe06b4][The temporal logic of coalgebras via Galois algebras (2002)]]
576 | Bart Jacobs
577 |
578 | *** [[https://www.sciencedirect.com/science/article/pii/S0304397503002019][Coalgebraic modal logic: Soundness, completeness and decidability of local consequence (2003)]]
579 | Dirk Pattinson
580 |
581 | *** [[https://staff.science.uva.nl/y.venema/papers/stone.pdf][Stone Coalgebras]]
582 | Clemens Kupke, Alexander Kurz, Yde Venema
583 |
584 | Overview in
585 | *** [[https://www.sciencedirect.com/science/article/pii/S0304397511003215][Coalgebraic semantics of modal logic: An overview (2011)]]
586 | Clemens Kupke, Dirk Pattinson
587 |
588 | The idea is that the role of equational formulas in algebra is played by modal formulas in coalgebra.
589 |
590 | ** Coalgebra and Category Theory
591 |
592 | - TODO: Give example of a multicoded / many-sorted? syntactical representation of an algebra
593 | Different process, same structure: 3 + 5 = 4 * 2 = 8
594 | Same process, multiple structure: sqrt(4) = 2 in Z+ and sqrt(4) = -2 in Z-
595 |
596 |
597 | - TODO: Learn about the distributive laws connecting algebra-coalgebra pairs
598 |
599 | - TODO: I need to understand the algebra/co-algebra duality deeply and how it connects with
600 | model theory, modal logic, linear logic, and topology
601 |
602 | Investigations into the computational setting for abstract algebra saw the emergence of fields of study like Universal Coalgebra, which captures the duality between computation and values. This neat table from J.J.M.M. Rutten’s [[https://homepages.cwi.nl/~janr/papers/files-of-papers/universal_coalgebra.pdf][paper on Universal Coalgebra: a theory of systems]] helps in understanding the duality between the ideas of universal algebra and universal coalgebra.
603 | [[./img/universal-co-algebra-chart.png]]
604 |
605 | The term “bisimulation” was coined by David Park and Robin Milner during a walk, after Park had earlier that day shown Milner a mistake in his work on CCS. The story is told in [[https://users.sussex.ac.uk/~mfb21/interviews/milner/][Milner’s interview with Martin Berger]].
606 |
607 | - TODO: Detail about full abstraction and how it is related to game semantics. I might also have to link it up with CCS.
608 |
609 | *** [[https://homepages.cwi.nl/~janr/papers/files-of-papers/2011_Jacobs_Rutten_new.pdf][An introduction to (co)algebra and (co)induction]]
610 |
611 | - TODO: Detail about bisimulation and coinduction
612 | - TODO: Frame how hypersets and non-well founded set theory are used to provide a foundation for bisimulation
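The TODOs above concern coinduction, so here is a minimal sketch of the idea, using Python generators to stand in for coinductively defined (infinite) streams. The names `unfold` and `take` are my own; a stream is given not by construction but by observation: its head, and the seed for its tail.

```python
def unfold(step, seed):
    """Anamorphism for streams: produce an infinite stream from a seed.
    `step` maps a seed to (observation, next_seed)."""
    while True:
        value, seed = step(seed)
        yield value

def take(n, stream):
    """Make finitely many observations of an infinite stream."""
    return [next(stream) for _ in range(n)]

# The stream of natural numbers, defined purely by its behaviour.
nats = unfold(lambda n: (n, n + 1), 0)

# The Fibonacci stream: the seed carries the two latest observations.
fibs = unfold(lambda p: (p[0], (p[1], p[0] + p[1])), (0, 1))

print(take(5, nats))  # [0, 1, 2, 3, 4]
print(take(7, fibs))  # [0, 1, 1, 2, 3, 5, 8]
```

Two streams are bisimilar when no sequence of observations can tell them apart, which is exactly the equivalence the papers below study.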
613 |
614 | [[http://www.cs.unibo.it/~sangio/DOC_public/history_bis_coind.pdf][On the Origins of Bisimulation and Coinduction (2007)]] - Davide Sangiorgi
615 |
616 | [[https://www.cs.cornell.edu/~kozen/Papers/Structural.pdf][Practical Coinduction (2016)]]
617 |
618 | [[https://www.brics.dk/RS/94/6/BRICS-RS-94-6.pdf][Bisimulation, Games, and Logic (1994)]]
619 | Mogens Nielsen
620 | Christian Clausen
621 |
622 | [[https://www.sciencedirect.com/science/article/pii/S016800720300023X][Introduction to Computability Logic (2003)]]
623 | Giorgio Japaridze
624 |
625 | [[https://arxiv.org/pdf/cs/0507045.pdf][In the beginning was game semantics (2008)]]
626 | Giorgio Japaridze
627 |
628 | - TODO: Discuss about sequent calculus and cirquent calculus
629 |
630 | [[https://www.researchgate.net/publication/227278992_Why_Play_Logical_Games][Why Play Logical Games (2009)]]
631 | Mathieu Marion
632 |
633 | Abramsky’s Game Theoretic Interpretation of Linear Logic
634 |
635 | Andrzej Filinski and Olivier Danvy worked on [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.6.960&rep=rep1&type=pdf][unifying control concepts]].
636 |
637 | Filinski found out about the Symmetric Lambda Calculus during his Ph.D. work. [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.43.8729&rep=rep1&type=pdf][This paper]] detailed the duality existing between values and continuations.
638 |
639 | Expressions can be thought of as [[http://www.cs.ox.ac.uk/ralf.hinze/WG2.8/27/slides/kenichi1.pdf][producing data and continuations as consuming data]].
640 | Matija Pretnar uses Filinski’s representation theorem to [[https://homepages.inf.ed.ac.uk/slindley/papers/handlers.pdf][invent effect handlers]].
641 |
642 | These works lead up to [[http://lambda-the-ultimate.org/node/4481][formalizing computational effects]] in languages like Eff and Koka.
643 |
644 | A good bibliography of this chain of work has been catalogued by Jeremy Yallop (see Resources).
645 |
646 | A nice overview on the work of John Reynolds towards his program for logical relations is [[https://www.cs.bham.ac.uk/~udr/papers/logical-relations-and-parametricity.pdf][given by Uday Reddy]].
647 | - TODO: Include Uday Reddy et al.’s Category Theory programme for programming languages.
648 |
649 | ** Monads vs. Continuations
650 |
651 | There is a parallel between creating a continuation and building a monadic architecture around a program. Monads help in composing functions and give control over their execution: when to call them and when to discard them. This architecture around the code enables performance-minded changes, such as discarding a fork of the program’s search tree if it grows beyond a certain complexity, or accepting interrupts from outside the program execution so that a computation proceeds no further. These are the sorts of tractable differences that monadic architecture and continuations grant the programmer.
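A minimal sketch of the early-discard behaviour described above, modelling call/cc in Python with a private exception. All names here are my own, and this only models the escaping use of a continuation, not full continuation capture. Note in passing that the type of call/cc, ((A → B) → A) → A, mirrors Peirce’s Law.

```python
class _Escape(Exception):
    def __init__(self, value):
        self.value = value

def call_cc(f):
    """Call f with (an escape-only model of) the current continuation k.
    Invoking k(v) aborts the rest of f and makes call_cc return v."""
    class Escape(_Escape):  # fresh class per call, so nested escapes don't mix
        pass
    def k(value):
        raise Escape(value)
    try:
        return f(k)
    except Escape as e:
        return e.value

def product(xs):
    """Multiply a list, but abandon the whole computation on seeing 0 --
    the kind of early discard a continuation (or monad) makes tractable."""
    def go(k):
        acc = 1
        for x in xs:
            if x == 0:
                k(0)  # jump straight out; no further multiplications run
            acc *= x
        return acc
    return call_cc(go)

print(product([1, 2, 3, 4]))     # 24
print(product([1, 2, 0, 4, 5]))  # 0
```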
652 |
653 | - TODO: I need to describe how call/cc is connected with classical logic and how double elimination / law of excluded middle / Peirce’s Law figures in here.
654 |
655 | ** Logical investigations
656 |
657 | A good entry point for understanding the link between logic and computation is this article by John F. Sowa: http://www.jfsowa.com/logic/theories.htm
658 |
659 | The idea of creating models and the metalogical implications of constructing such intricate lattices are detailed in an accessible manner in this post.
660 |
661 | The link with computation comes from the idea that when you construct a computational object it can resemble such a lattice from which you equationally/implicationally extract out the truths consistent in that system.
662 |
663 | - TODO: Link this with Curry-Howard isomorphism
664 | - TODO: Seek out if there’s a Curry-Howard isomorphism identified for classical logic
665 |
666 | Sowa also links the idea of meaning preserving transformations and Chomsky’s linguistic attempts here: http://users.bestweb.net/~sowa/logic/meaning.htm
667 | The new version of the article which locates it in a logical system is present here: http://www.jfsowa.com/logic/proposit.htm
668 |
669 | ** Linear Logic
670 |
671 | Girard’s work can be thought of as an attempt to create types out of the structure arising from the dynamical interactions among players. It is possible to reconstruct Martin Löf’s type theory within the Linear Logic framework.
672 |
673 | Recreating MLTT in Ludics: https://arxiv.org/abs/1402.2511
674 |
675 | - TODO: Can the move from Ludics to Transcendental Syntax be thought of as a move from thinking in trees to thinking in graphs?
676 |
677 | - TODO: Document how Girard arrived at the work on linear logic
678 |
679 | - TODO: Detail how linear logic is a logic of resources
680 |
681 | - TODO: Discuss the link between linear logic and constructive mathematics
682 | https://arxiv.org/pdf/1805.07518.pdf
683 |
684 | ** Type Theory
685 |
686 | *** Origins of Type Theory
687 |
688 | Type theory was devised by Bertrand Russell to solve problems associated with impredicativity in the foundations of mathematics.
689 |
690 | **** Law of Excluded Middle
691 |
692 | How does removing this result in constructive algorithms?
693 |
694 | - TODO: Brief history of how Law of Excluded Middle figures in the history of logic with emphasis on computational aspects
695 |
696 | - TODO: Include the role of Brouwer here
697 |
698 | *** Connection between type theory and language
699 |
700 | Type-Theoretical Grammar (1994) — Aarne Ranta
701 |
702 | [[https://www.researchgate.net/publication/307858446_Type_Theory_for_Natural_Language_Semantics][Type Theory for Natural Language Semantics (2016)]]
703 | Stergios Chatzikyriakidis, Robin Cooper
704 |
705 | *** Martin Löf’s Intuitionistic Type Theory
706 |
707 | - TODO: Discuss about how Martin Löf’s work was inspired by Automath
708 |
709 | - TODO: Discuss about the connection between game semantics and Martin Löf Type Theory
710 | https://arxiv.org/pdf/1610.01669.pdf
711 |
712 |
713 | There’s [[https://www.youtube.com/watch?v=xRUPr322COQ&t=589s][a talk]] by Joseph Abrahamson on the paper “On the Meanings of the Logical Constants” by Martin Löf.
714 |
715 | [[http://archive-pml.github.io/][Collected Works of Per Martin Löf]]
716 |
717 | [[https://web.archive.org/web/20160304130949/http://okmij.org/ftp/Computation/lem.html][Constructive Law of Excluded Middle]]
718 |
719 | [[http://www.cllc.vuw.ac.nz/talk-papers/whatisit.ps][Just What is it that Makes Martin Löf’s Type Theory so Different, so Appealing?]]
720 | Neil Leslie (1999)
721 |
722 | [[http://math.andrej.com/2008/08/13/intuitionistic-mathematics-for-physics/][Intuitionistic Mathematics for Physicists]]
723 |
724 | [[http://www.nuprl.org/documents/Constable/PrincipiaArticle.pdf][The Triumph of Types: Principia Mathematica’s Influence on Computer Science]]
725 |
726 | [[http://www.cs.uoregon.edu/research/summerschool/summer11/lectures/Triumph-of-Types-Extended.pdf][The Triumph of Types: Creating a Logic of Computational Reality]]
727 |
728 | [[http://www.cse.chalmers.se/~bengt/papers/vatican.pdf][Constructivism: A Computing Science Perspective]]
729 |
730 | [[https://math.vanderbilt.edu/schectex/papers/difficult.html][Constructivism is Difficult]]
731 |
732 | [[https://www.jstor.org/stable/2321650?seq=1][Meaning and Information in Constructive Mathematics]]
733 | Fred Richman
734 |
735 | - TODO: Find out how Kolmogorov’s work figures in here
736 |
737 | [[https://towardsdatascience.com/gradient-descend-with-free-monads-ebf9a23bece5][Continuity in Type Theory Slides]]
738 | Martín Escardó
739 |
740 | *** Homotopy Type Theory
741 |
742 | - TODO: Discuss Homotopy Hypothesis and Grothendieck’s work
743 |
744 | - TODO: Discuss the work in [[www.math.mcgill.ca/makkai/folds/foldsinpdf/FOLDS.pdf][FOLDS paper]]. How it was inspired from Martin Löf’s work
745 |
746 | ** Process Algebras and Calculi
747 |
752 |
753 | The etymology of “algebra” is “to join together broken parts”. “Calculus” means “small pebble”; the etymology comes from counting stones that stood for things like sheep.
754 |
755 | The terms process algebra and process calculus are used interchangeably, though there is some distinction to be gained by understanding their etymological and mathematical viewpoints. Mathematically, algebras have closure, that is, they are limited to their domain of algebraic operations, while a calculus is constructed for computation without algebraic laws in mind.
756 |
757 | In other words, a calculus is used for computation, while an algebra is a mapping between the different structures under study in its domain. There is a way in which Lambda Calculus can be seen as both: you can use it to map values, and it can then be seen as an algebra that follows certain rules; but if you use these properties to perform computations, that is, follow the entailments of the laws to calculate, then it becomes a calculus.
758 |
759 | ** Utility of algebraic properties in computation
760 |
761 | *** Associativity
762 | Allows you to put the brackets anywhere. A chain of operations gives the same result regardless of how it is grouped or within which contextual boundaries it is executed.
763 |
764 | *** Commutativity
765 | Wearing your undergarments first and then your pants is the normal style (a op b), but superheroes for some reason prefer wearing their pants first and then the undergarment (b op a).
766 |
767 | If both orders give the same end result, the operation is said to be commutative; otherwise, it is non-commutative.
768 |
769 | In terms of computational processes, these allow you to perform an operation in any order.
770 | This could be important when asynchrony is present. If you don't know when your inputs are going to arrive, but you know the operation is commutative, you can arrange for the processes to be executed in any order.
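A small illustration of both properties (the chunked reduction and all names are my own): associativity licenses regrouping a reduction into chunks, as in a parallel reduce, while commutativity licenses reordering, as with asynchronously arriving inputs.

```python
from functools import reduce
import random

def chunked_reduce(op, xs, chunk=3, identity=0):
    """Regroup: reduce each chunk, then reduce the partial results.
    Only valid when `op` is associative -- the brackets move."""
    parts = [reduce(op, xs[i:i + chunk], identity) for i in range(0, len(xs), chunk)]
    return reduce(op, parts, identity)

xs = list(range(1, 11))
add = lambda a, b: a + b
sub = lambda a, b: a - b

# Associative: regrouping does not change the result.
assert chunked_reduce(add, xs) == reduce(add, xs, 0)  # both 55
# Not associative: regrouping changes the result.
assert chunked_reduce(sub, xs) != reduce(sub, xs, 0)

# Commutative: any arrival order (e.g. asynchronous inputs) gives the same sum.
shuffled = xs[:]
random.shuffle(shuffled)
assert reduce(add, shuffled, 0) == reduce(add, xs, 0)
```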
771 |
772 | *** Transitivity
773 | Enables you to travel through chains of links: if a relates to b and b relates to c, then a relates to c.
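As a sketch, “travelling through the links” amounts to computing a transitive closure. This naive closure algorithm (the names are my own) makes the derived links explicit:

```python
def transitive_closure(pairs):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present,
    until no new links appear -- a naive Warshall-style closure."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

links = {("a", "b"), ("b", "c"), ("c", "d")}
print(transitive_closure(links))
# Contains ("a", "c"), ("a", "d"), ("b", "d") in addition to the originals.
```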
774 |
777 | ** Geometry of Interaction
778 |
779 | A semantics of linear logic proofs.
780 |
781 | It acts as a framework for studying the trade-off between time and space efficiency.
782 |
783 | *** [[https://dl.acm.org/doi/10.1145/199448.199483][The Geometry of Interaction machine]]
784 | I. Mackie (1995)
785 |
786 | *** [[http://sro.sussex.ac.uk/id/eprint/69302/][A Geometry of Interaction Machine for Gödel’s System T]]
787 | I. Mackie (2017)
788 |
789 | *** [[https://www.researchgate.net/publication/257642501_Reversible_Irreversible_and_Optimal_l-machines][Reversible, Irreversible, and Optimal Lambda-Machines]]
790 | Vincent Danos and Laurent Regnier (1996)
791 |
792 | ** Game Semantics
793 |
794 | - TODO: Document the Dana Scott manuscript to LCF to PCF story
795 |
796 | - TODO: Document the role of Kohei Honda: http://mrg.doc.ic.ac.uk/kohei/koheis_games.pdf
797 |
798 | - TODO: Detail a bit about full abstraction problem
799 |
800 | - TODO: Create a visualization of the influential papers in this domain
801 |
802 | We know that many expressions can evaluate to the same output.
803 | For example, 1 + 5 = 4 + 2 = 3 + 3 = 2 + 4 = 5 + 1 = 6
804 |
805 | What about sequential programs? How do we understand equivalence between two sequential programs that generate the same output?
806 | What is the underlying mathematical object here?
807 |
808 | With denotational semantics, we understand programs as continuous functions on topological spaces called Scott domains.
809 |
810 | But there are sequential, parallel, and non-sequential computations in this space.
811 |
812 | A fully abstract model tries to capture just the sequential programs and to identify the mathematical object they correspond to.
813 |
814 | - TODO: Detail about parallel or and or tester
815 |
816 | In 1993, full abstraction was achieved using Game Semantics.
817 |
818 | Games can be quotiented to give a topological space a la Scott.
819 |
820 | [[http://moscova.inria.fr/~levy/courses/X/M1/lambda/bib/90abramskylazy.pdf][The Lazy Lambda Calculus]] was introduced by Abramsky in 1987. See also [[https://www.sciencedirect.com/science/article/pii/S0890540183710448][Full Abstraction in the Lazy Lambda Calculus]] by C.H. Luke Ong and Samson Abramsky
821 |
822 | In it, function application was identified as the fundamental interaction between contexts and fragments. After this work, the full abstraction problem was solved.
823 |
824 | Since game semantics solved the full abstraction problem for PCF, it was adapted to accommodate ground state in Call-by-Value games (1998), control by Laird in Full abstraction for functional languages with control (1997), and general references by Abramsky, Kohei Honda, and G. McCusker in A fully abstract game semantics for general references (1998).
825 |
826 | While ground state only allows data, such as natural numbers, to be stored, general references (also called higher-order state) place no restriction on what can be stored.
827 |
828 | In 1993, the models created by Abramsky, Jagadeesan and Malacaria; Hyland and Ong; and Nickau solved the question for call-by-name computations. Full abstraction for call-by-value was solved by Kohei Honda and Nobuko Yoshida in 1997.
829 |
830 | For logical relations, a type-based inductive proof method for observational equivalence, higher-order state poses a challenge by introducing types that are not inductive. To deal with these non-inductive types, namely recursive and quantified types, logical relations equipped with step indices were introduced.
831 |
832 | [[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.28.5695][An Indexed Model of Recursive Types for Foundational Proof-Carrying Code]] - Andrew W. Appel, David Mcallester (2000)
833 | [[https://www.ccs.neu.edu/home/amal/papers/lr-recquant-techrpt.pdf][Step-indexed syntactic logical relations for recursive and quantified types]] — A. Ahmed (2006)
834 |
835 | Step-indexed logical relations were then used to model higher-order state together with abstract types in [[http://www.ccs.neu.edu/home/amal/papers/sdri.pdf][State-Dependent Representation Independence]] in 2009 by Amal Ahmed, Derek Dreyer, and Andreas Rossberg, and to model higher-order state as well as control in 2012 by Derek Dreyer, Georg Neis, and Lars Birkedal in [[https://people.mpi-sws.org/~dreyer/papers/stslr/icfp.pdf][The Impact of Higher-Order State and Control Effects on Local Reasoning]].
836 |
837 | Environmental bisimulations, in contrast with applicative bisimulations, were developed to deal with the greater distinguishing power of contexts caused, for instance, by abstract types and recursive types, in [[https://www.cis.upenn.edu/~bcpierce/papers/infohide5-jacm.pdf][A bisimulation for type abstraction and recursion]] by Eijiro Sumii and Benjamin C. Pierce.
838 |
839 | Environmental bisimulations were used to study higher-order state in [[http://www.cs.unibo.it/~sangio/DOC_public/env.pdf][Environmental Bisimulations for Higher-Order Languages]] in 2007 by Davide Sangiorgi, Naoki Kobayashi, and Eijiro Sumii. Another paper in this direction is [[https://www.ccs.neu.edu/home/wand/papers/popl-06.pdf][Small Bisimulations for Reasoning About Higher-Order Imperative Programs]] by Vasileios Koutavas and Mitchell Wand.
840 |
841 | - TODO: Understand what higher order imperative programs are.
842 |
843 | Environmental bisimulation for state and polymorphism was studied in [[https://www.researchgate.net/publication/220370562_From_Applicative_to_Environmental_Bisimulation][From Applicative to Environmental Bisimulation]] in 2011 by Vasileios Koutavas, Paul Levy and Eijiro Sumii.
844 |
845 | Another variant of environmental bisimulation appears in [[https://link.springer.com/chapter/10.1007%2F978-3-319-47958-3_10][A Sound and Complete Bisimulation for Contextual Equivalence in Lambda-Calculus with Call/cc]] in 2016 by Taichi Yachi and Eijiro Sumii.
846 |
847 | The detailed studies in game semantics resulted in the so-called Abramsky’s cube, first proposed in Linearity, Sharing and State by Samson Abramsky and G. McCusker and developed in their Marktoberdorf Summer School lectures of 1997. This was condensed and released as [[https://www.irif.fr/~mellies/mpri/mpri-ens/articles/abramsky-mccusker-game-semantics.pdf][Game Semantics (1999)]].
848 |
849 | Abramsky’s cube was also studied in terms of logical relations in [[https://people.mpi-sws.org/~dreyer/papers/stslr/icfp.pdf][The impact of higher-order state and control effects on local relational reasoning]] by Derek Dreyer, Georg Neis, and Lars Birkedal in 2010
850 |
851 | *** [[https://www.dpmms.cam.ac.uk/~martin/Research/Oldpapers/gamesemantics97scan.pdf][Game Semantics]]
852 | Martin Hyland (2007)
853 |
854 | *** [[https://www.cs.bham.ac.uk/~drg/papers/lics09tut.pdf][Applications of Game Semantics: From Program Analysis to Hardware Synthesis (2009)]]
855 | Dan Ghica
856 |
857 | *** [[https://arxiv.org/pdf/1908.04291.pdf][The Far Side of the Cube: An elementary introduction to game semantics (2019)]]
858 | Dan Ghica
859 |
860 | *** [[https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.676.7186&rep=rep1&type=pdf][Notes on Game Semantics]]
861 | Pierre-Louis Curien (February 28, 2015)
862 |
863 | ** Abstract Machines
864 |
865 | A taxonomy of the complexity of abstract machines was given by Beniamino Accattoli in [[https://arxiv.org/abs/1701.00649][The complexity of abstract machines (2016)]].
866 |
867 | ** Hypernet semantics
868 |
869 | Graphs provide a convenient formalism for giving operational semantics and for reasoning about observational equivalence. Translating inductively structured programs into graphs enables fine control over resources and introduces the novel concept of locality in program execution.
870 |
871 | Due to the control the token holds over graph rewriting, program execution can be described locally in terms of the token and its neighbourhood. The rewrites happen around the regions through which the token passes.
872 |
873 | - TODO: Elaborate a bit about robustness here.
874 |
875 | Robustness provides a sufficient condition for observational equivalence.
876 |
877 | *** Dynamic Geometry of Interaction Machine
878 | Different specifications of time and space cost can be given in a uniform fashion.
879 |
880 | Cost measure of a DGoIM can be used as a generic measure for programming languages.
881 |
882 | **** [[https://arxiv.org/abs/1803.00427][The Dynamic Geometry of Interaction Machine: A Token-guided Graph Rewriter]]
883 | Dan Ghica, Koko Muroya (2018)
884 |
885 | *** Universal Abstract Machine
886 |
887 | Abstract semantic graph
888 |
889 | - TODO: Discuss about characterisation theorem
890 |
891 | ** Recursion Schemes / Morphisms of F-algebras
892 |
893 | Morphism of F-Algebras
894 |
895 | Anamorphism: From co-algebra to a final co-algebra
896 | Used as unfolds
897 |
898 | Catamorphism: Initial algebra to an algebra
899 | Used as folds
900 |
901 | Hylomorphism: Anamorphism followed by a Catamorphism (Use Gibbons’ image)
902 |
903 | Paramorphism: Extension of Catamorphism
904 | Apomorphism: Extension of Anamorphism
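A minimal sketch of the first three schemes, specialised to lists. The encoding is my own: a coalgebra maps a seed to `None` (stop) or a `(head, next_seed)` pair, and an algebra combines a head with an accumulated tail result, with `nil` for the empty case.

```python
def ana(coalg, seed):
    """Anamorphism: unfold a seed into a list."""
    out = []
    step = coalg(seed)
    while step is not None:
        head, seed = step
        out.append(head)
        step = coalg(seed)
    return out

def cata(alg, nil, xs):
    """Catamorphism: fold a list down to a value."""
    acc = nil
    for x in reversed(xs):
        acc = alg(x, acc)
    return acc

def hylo(alg, nil, coalg, seed):
    """Hylomorphism: an unfold followed by a fold, fused so the
    intermediate list is never materialised."""
    step = coalg(seed)
    if step is None:
        return nil
    head, next_seed = step
    return alg(head, hylo(alg, nil, coalg, next_seed))

# factorial as a hylomorphism: unfold n into [n, n-1, ..., 1], fold with (*).
count_down = lambda n: None if n == 0 else (n, n - 1)
mult = lambda x, acc: x * acc

print(ana(count_down, 5))              # [5, 4, 3, 2, 1]
print(cata(mult, 1, [5, 4, 3, 2, 1]))  # 120
print(hylo(mult, 1, count_down, 5))    # 120
```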
905 |
906 | There is a speculative article by Chris Olah on the relation between neural network architectures and functional programming type signatures:
907 | https://colah.github.io/posts/2015-09-NN-Types-FP/
908 |
909 | [[./img/nn-types-fp.png]]
910 |
911 | Proof Nets vs. Pi Calculus
912 | http://perso.ens-lyon.fr/olivier.laurent/picppn.pdf
913 |
914 | ** Constraint Programming
915 |
916 | ** Answer Set Programming
917 | ** Logic for Computable Functions
918 |
919 | ** Topology and Computation
920 |
921 | ** Program Analysis
922 |
923 | There is a neat way in which Abstract Interpretation ties together a lot of fields in computer science.
924 |
925 | This is a good article on it: https://gist.github.com/serras/4370055d8e9acdd3270f5cee879898ed
926 |
927 | *** Constructive Mathematics
928 |
929 | Employing constructive logic ensures that the law of excluded middle is not used.
930 | The axiom of choice is also restricted in this framework (TODO: Have to clarify exactly how).
931 |
932 | Avoiding the use of these ensures that the propositions (is this the right term?) in this logic result in the “construction” of objects, which guarantees an existence proof. This is in stark contrast with classical logic, where you can let propositions stand for truth values and then prove the existence of objects by reductio ad absurdum: you start with a set of postulates and derive a contradiction from them. If the postulates asserted the non-existence of some mathematical object, such a contradiction establishes that the contrary holds, i.e., that the object exists. This flipping of logic to establish existence is thought to be insufficient; constructive logic requires that the existence of an object be ensured by supplying a construction of the object within some specified precision or assumed semantics (TODO: Verify if this is the right terminology).
933 |
934 | *** [[http://math.andrej.com/2006/03/27/sometimes-all-functions-are-continuous/][Sometimes all functions are continuous]]
935 | Blogpost detailing how all computable functions are continuous
936 |
937 | *** [[http://www.cse.chalmers.se/~coquand/esop.pdf][Constructive Mathematics and Functional Programming]]
938 |
939 | *** [[https://www.youtube.com/watch?v=zmhd8clDd_Y][Five stages of accepting constructive mathematics]]
940 |
941 | ** Automatic Differentiation
942 |
943 | - TODO: The role of dual numbers
944 |
945 | - TODO: The link with nilpotents developed by Benjamin Peirce
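A minimal sketch of the dual-number idea flagged in the TODOs above (class and function names are my own): adjoin a nilpotent ε with ε² = 0, overload arithmetic on numbers of the form a + b·ε, and the ε-coefficient of a function’s result carries its derivative.

```python
class Dual:
    """A dual number a + b*eps, with eps nilpotent (eps**2 == 0)."""
    def __init__(self, value, deriv=0.0):
        self.value, self.deriv = value, deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 == 0
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + 1*eps; the eps-coefficient of the result is f'(x)."""
    return f(Dual(x, 1.0)).deriv

f = lambda x: x * x + 3 * x  # f'(x) = 2x + 3
print(derivative(f, 2.0))    # 7.0
```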
946 |
947 | ** Categorical Logic
948 |
949 | *** [[https://en.wikipedia.org/wiki/Pregroup_grammar][Pregroup grammar]]
950 |
951 | *** [[https://www.cs.cmu.edu/~fp/courses/15816-f16/misc/Lambek58.pdf][The Mathematics of Sentence Structure (1958)]]
952 |
953 | ** Quantum Mechanics
954 |
955 | *** ZX Calculus
956 |
957 | **** [[https://arxiv.org/pdf/0908.1787.pdf][Quantum Picturalism (2009)]]
958 | Bob Coecke
959 |
960 | * Resources
961 |
962 | ** Posts
963 |
964 | *** [[https://jlongster.com/Whats-in-a-Continuation][Whats in a Continuation]]
965 | James Longster
966 |
967 | *** [[https://garlandus.co/OfTablesChairsBeerMugsAndComputing.html][Of Tables, Chairs, Beer Mugs and Computing]]
968 | A really nice essay by Garlandus outlining the role of Hilbert and Göttingen in influencing the history of Computer Science
969 |
970 | *** [[http://pllab.is.ocha.ac.jp/~asai/cw2011tutorial/main-e.pdf][Introduction to Programming with Shift/Reset]]
971 | Kenichi Asai, Oleg Kiselyov (2011)
972 |
973 | *** [[http://comonad.com/reader/2009/recursion-schemes/][Recursion Schemes: A Field Guide]]
974 | Edward Kmett (2009)
975 |
976 | *** Introduction to Recursion Schemes [[https://blog.sumtypeofway.com/posts/introduction-to-recursion-schemes.html][Part 1]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-2.html][Part 2]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-3.html][Part 3]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-4.html][Part 4]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-4-point-5.html][Part 4.5]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-5.html][Part 5]], [[https://blog.sumtypeofway.com/posts/recursion-schemes-part-6.html][Part 6]]
977 |
978 | *** [[https://robotlolita.me/diary/2018/10/why-pls-need-effects/][Why PLs should have effect handlers]]
979 |
980 | ** Slides
981 |
982 | *** [[https://www.ccs.neu.edu/home/types/resources/notes/call-by-name-call-by-value/extended-intro.pdf][An introduction to Call By Name, Call By Value and Lambda Calculus]]
983 |
984 | ** Talks
985 | *** [[https://www.youtube.com/watch?v=Ssx2_JKpB3U][A Categorical View of Computational Effects]]
986 |
987 | *** Hoare’s talks on unifying process calculus
988 | Hoare has given a set of three talks at Heidelberg Laureate Conferences where he talks about the coherence of logic, algebra, and geometry in Computer Science
989 |
990 | **** [[https://www.heidelberg-laureate-forum.org/video/lecture-pioneers-of-computer-science-aristotle-and-euclid.html][Talk 1: Pioneers of Computer Science: Aristotle and Euclid]]
991 | **** [[https://www.youtube.com/watch?v=wzd8BeVpQpw][Talk 2: A finite geometric representation of computer program behaviour]]
992 | **** [[https://www.youtube.com/watch?v=S_mmMVoSW30][Talk 3: Algebra, Logic, Geometry at the Foundation of Computer Science]]
993 |
994 | ** Surveys
995 |
996 | *** [[http://okmij.org/ftp/continuations/][Oleg Kiselyov’s compilation on continuations]]
997 |
998 | *** [[https://homepages.inf.ed.ac.uk/wadler/papers/papers-we-love/reynolds-discoveries.pdf][Discovery of Continuations]]
999 | John Reynolds
1000 |
1001 | *** [[https://dl.acm.org/doi/10.5555/22584.24311][Monads and theories: a survey for computation]]
1002 | D. E. Rydeheard
1003 |
1004 | *** [[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.41.9551&rep=rep1&type=pdf][Histories of Discoveries of Continuations: Belles-Lettres with Equivocal Tenses]]
1005 | Peter Landin (1996)
1006 |
1007 | *** [[https://github.com/yallop/effects-bibliography][Effects Bibliography]]
1008 | Jeremy Yallop
1009 |
1010 | *** [[http://comonad.com/reader/2018/computational-quadrinitarianism-curious-correspondences-go-cubical/][A catalogue of the picture emerging among the Curry-Howard-Lambek-Stone-Scott-Tarski correspondences]]
1011 |
1012 |
1013 | *** [[https://github.com/rain-1/continuations-study-group][Continuations Reading List]]
1014 | A great set of papers for reading about continuations.
1015 |
1016 | ** Original Works
1017 |
1018 | *** [[https://www.cs.cmu.edu/~./epxing/Class/10715/reading/McCulloch.and.Pitts.pdf][A Logical Calculus of Ideas Immanent in Nervous Activity]]
1019 | Warren McCulloch, Walter Pitts (1943)
1020 |
1021 | *** Representation of events in nerve nets and finite automata (1956)
1022 | Stephen Kleene
1023 |
1024 | *** Finite automata and their decision problems (1959)
1025 | Michael Rabin and Dana Scott
1026 |
1027 | *** [[https://www.cs.tau.ac.il/~nachumd/term/FloydMeaning.pdf][Assigning Meanings to Programs]]
1028 | R. W. Floyd
1029 |
1030 | *** [[http://www-formal.stanford.edu/jmc/towards.ps][Towards a Mathematical Theory of Computation (1961)]]
1031 | John McCarthy
1032 |
1033 | *** [[https://ropas.snu.ac.kr/~kwang/4190.310/mccarthy63basis.pdf][A Basis for a Mathematical Theory of Computation (1963)]]
1034 |
1035 | Another version: http://www.cs.cornell.edu/courses/cs4860/2018fa/lectures/Mathematical-Theory-of-Computation_McCarthy.pdf
1036 |
1037 | *** [[https://www.cs.cmu.edu/afs/cs/user/crary/www/819-f09/Landin64.pdf][The mechanical evaluation of expressions]]
1038 |
1039 | ** Books
1040 |
1041 | #+BEGIN_HTML
1042 |
1043 | Intermediate
1044 | #+END_HTML
1045 |
1046 | - [[Essentials of Programming Languages]]
1047 | - [[Design Concepts of Programming Languages]]
1048 |
1049 | #+BEGIN_HTML
1050 |
1051 | #+END_HTML
1052 |
1053 | #+BEGIN_HTML
1054 |
1055 | Advanced
1056 | #+END_HTML
1057 |
1058 | - [[https://www.irif.fr/~jep/PDF/MPRI/MPRI.pdf][Mathematical Foundations of Automata Theory]]
1059 | J. E. Pin
1060 |
1061 | - [[http://www.sci.brooklyn.cuny.edu/~noson/TCStext.html][Theoretical Computer Science for the Working Category Scientist]]
1062 |
1063 | Noson Yanofsky
1064 |
1065 | #+BEGIN_HTML
1066 |
1067 | #+END_HTML
1068 |
--------------------------------------------------------------------------------