├── .gitignore
├── Makefile
├── README.md
├── argus.html
├── argus.tex
├── emerald.html
├── emerald.pdf
├── emerald.tex
├── guesstimate.html
├── guesstimate.tex
├── hermes.html
├── hermes.pdf
├── hermes.tex
├── pl.bib
├── pl.html
├── pl.md
├── plits.html
├── plits.tex
├── pmldc.pdf
├── pmldc.tex
├── promises.html
├── promises.tex
├── rpc.html
├── rpc.pdf
├── rpc.tex
└── todo.tex
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.texpadtmp
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
pandoc:
	pandoc promises.tex -o promises.html --bibliography pl.bib
	pandoc emerald.tex -o emerald.html --bibliography pl.bib
	pandoc hermes.tex -o hermes.html --bibliography pl.bib
	pandoc rpc.tex -o rpc.html --bibliography pl.bib
	pandoc plits.tex -o plits.html --bibliography pl.bib
	pandoc argus.tex -o argus.html --bibliography pl.bib
	pandoc guesstimate.tex -o guesstimate.html --bibliography pl.bib
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
## Programming Models and Languages for Distributed Computing

This repository is a work-in-progress curriculum on models and languages
for distributed computing.

I try to write one section a week, but I can't guarantee that, because
this is a side project to my Ph.D. research.

If you use any of this content in a course or workshop, please let me
know; I'd love to know where it's being used and to reference it from
this page.

### Book

Download all of the posts here:
https://github.com/cmeiklejohn/PMLDC/blob/master/pmldc.pdf

### Blog

Or, view them on my blog:
http://christophermeiklejohn.com

### Contributing

TODOs for sections to be written are in `todo.tex`.

## Copyright

Copyright 2016 Christopher Meiklejohn
--------------------------------------------------------------------------------
/argus.html:
--------------------------------------------------------------------------------

Relevant Reading

- Abstraction Mechanisms in CLU, Liskov, Barbara and Snyder, Alan and Atkinson, Russell and Schaffert, Craig, CACM 1977.
- Guardians and Actions: Linguistic Support for Robust, Distributed Programs, Liskov, Barbara and Scheifler, Robert, TOPLAS 1983.
- Orphan Detection in the Argus System, Walker, Edward Franklin, DTIC 1984.
- Implementation of Argus, Liskov, Barbara and Curtis, Dorothy and Johnson, Paul and Scheifler, Robert, SIGOPS 1987.
- Distributed Programming in Argus, Liskov, Barbara, CACM 1988.

Commentary


“However, regardless of advances in hardware, we believe atomic actions are necessary and are a natural model for a large class of applications. If the language does not provide actions, the user will be compelled to implement them, perhaps unwittingly reimplementing with each new application, and may implement them incorrectly.”


Overview


The focus of these papers is the ARGUS system (and its roots in the programming language CLU), developed in the Laboratory for Computer Science at the Massachusetts Institute of Technology. ARGUS was designed to provide programming language support for the construction and maintenance of distributed programs built from modules executing at, and communicating between, geographically distinct nodes. ARGUS was designed with the following goals in mind:

- Service: Programs should have localized failures, and be geographically distributed with replicated data for both fault-tolerance and availability.
- Reconfiguration: Software should be able to be reconfigured while the system is running: this allows capacity to be added to increase processing power, decrease response times, or increase the availability of data.
- Autonomy: Nodes in the system may be owned by individuals or organizations that need to control what data is replicated at the node for political or sociological reasons.
- Distribution: Programs should be able to control explicit placement of data to ensure responsiveness and cost-effectiveness of hardware in the system.
- Concurrency: Distribution should be able to exploit the available concurrency in the system.
- Consistency: Consistency must be maintained, specifically for invariant preservation: for instance, conservation of funds during a transfer between two accounts.

Of all of the aforementioned concerns, the authors posit that the most difficult of these to provide is consistency: consistency becomes much more difficult to preserve in a system where coordination is minimized to avoid interference and mask failures of the network. Therefore, the authors present a technique for integrating atomicity as a fundamental concept in a programming language.


Atomicity and Actions


Since data resiliency only guarantees consistency in a quiescent environment, it is necessary to make some operations in the system atomic. When the authors say atomic, they mean two things:

- Indivisibility: the execution of an activity in the system never appears to overlap with another activity in the system.
- Recoverability: the overall effect of an activity is all or nothing: in the event of a failure, either the activity must be completed after recovery or all objects must be restored to their initial state.

ARGUS introduces actions: atomic transactions that either commit or abort and provide both the indivisibility and recoverability properties. If an action aborts, it is as if the effects had never happened; if an action commits, all modified objects in the system take on their new state. However, since ARGUS is a concurrent system, access to shared objects manipulated by concurrent atomic actions must be synchronized, using read and write locks.


To achieve this, ARGUS introduces a type of atomic object, or as they call it, atomic abstract data types: sets of objects and primitive operations for interacting with those objects. These objects are similar to normal data types, but are extended with operations that ensure indivisibility and recoverability. Access to objects within an atomic abstract data type is done through locking: read/write locks with versioning are used to facilitate concurrent access in the system. Write operations operate against a copy of the current version, and when the action completes, the new version of the object is written, replacing the old.
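
To make this versioning discipline concrete, here is a minimal sketch in Python (the names are illustrative, not ARGUS syntax, and a single mutex stands in for ARGUS's read/write locks): each action mutates a private copy of the current version, and the copy replaces the committed version only if the action commits.

```python
import copy
import threading

class AtomicObject:
    """Sketch of an atomic object: writes go to a per-action shadow
    copy; commit installs the new version, abort discards it."""

    def __init__(self, initial_state):
        self._committed = initial_state   # last committed version
        self._lock = threading.Lock()     # stand-in for read/write locks
        self._tentative = {}              # per-action shadow copies

    def read(self, action_id):
        with self._lock:
            # An action sees its own tentative version, if any.
            return self._tentative.get(action_id, self._committed)

    def write(self, action_id, mutate):
        with self._lock:
            version = self._tentative.setdefault(
                action_id, copy.deepcopy(self._committed))
            mutate(version)               # mutate the copy, not the base

    def commit(self, action_id):
        with self._lock:
            if action_id in self._tentative:
                # The new version replaces the old.
                self._committed = self._tentative.pop(action_id)

    def abort(self, action_id):
        with self._lock:
            # Discard the copy: as if the effects never happened.
            self._tentative.pop(action_id, None)
```

An aborted action simply drops its shadow copy, which is the recoverability half of the atomicity contract.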


Nested Actions


The authors argue for indivisibility as a desired property, even if it reduces the amount of potential concurrency in the system:


“It has been argued that indivisibility is too strong a property for certain applications because it limits the amount of potential concurrency. We believe that indivisibility is the desired property for most application, if it is required only at the appropriate levels of abstraction. ARGUS provides a mechanism for user-defined atomic data types.”


Nested actions, or subactions, can be used to further divide actions and introduce concurrency within an action: each action can contain any number of subactions that can be executed either sequentially or concurrently. Subactions can commit or abort independently, and this behavior has no impact on the parent action. However, an abort of a parent will force the abort of all of its subactions.


Nested actions provide support for the composition of actions into a larger action where some of the nested actions can run in parallel or might fail and have to be compensated for. For instance, if one subaction has to contact a remote guardian that may be unavailable, that subaction can abort and the operation can be retried against another remote guardian: the authors refer to this as “several ways to accomplish a task.”
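
A minimal sketch of that retry pattern in Python (the guardian objects, their deposit handler, and the failure type are all hypothetical stand-ins): a subaction abort is contained, so the parent action can simply try the next replica.

```python
class SubactionAborted(Exception):
    """Raised when a subaction aborts instead of committing."""

def run_subaction(operation, *args):
    # Run an operation as a subaction: translate a communication
    # failure into an abort that the parent can observe and contain.
    try:
        return operation(*args)
    except ConnectionError as exc:
        raise SubactionAborted(exc)

def deposit_at_any_replica(replicas, amount):
    # Parent action: "several ways to accomplish a task" -- an aborted
    # subaction does not abort the parent, which retries elsewhere.
    for guardian in replicas:
        try:
            return run_subaction(guardian.deposit, amount)
        except SubactionAborted:
            continue  # try the next replica guardian
    raise SubactionAborted("no replica could service the request")
```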


Because nested actions may fail, interact with several guardians, and have “checkpointing” behavior as they either abort or succeed, each nested action's ancestor needs to track metadata during execution (a sketch of the commit-time bookkeeping follows the list):

  1. Each subaction will inherit locks from its ancestor: read locks are directly inherited, but write locks can only be inherited if all read locks can also be inherited.
  2. Each subaction can abort or succeed: if the action aborts, the ancestor will restore the state, or “checkpoint”, of the execution to before the operation was executed.
  3. Each version created by a subaction needs to be propagated, to ensure that the effects of a previous write by a successful nested action are available to a subsequent subaction.
  4. Each top-level action will track what's referred to as a plist: a list of all of the guardians contacted during the execution of the action, used in the final two-phase commit protocol to commit the action and write to stable storage.
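
Here is a minimal sketch of the plist bookkeeping and the final two-phase commit in Python (the guardian methods prepare, commit, and abort are hypothetical stand-ins for the actual ARGUS protocol):

```python
class TopLevelAction:
    """Sketch of plist tracking: every guardian touched by the action
    (or its subactions) is recorded and then two-phase committed."""

    def __init__(self):
        self.plist = set()   # guardians contacted during execution

    def call(self, guardian, handler, *args):
        self.plist.add(guardian)   # record the participant
        return handler(*args)      # perform the handler call

    def commit(self):
        # Phase one: every participant writes its new versions to
        # stable storage and votes; one refusal aborts the action.
        if not all(guardian.prepare() for guardian in self.plist):
            for guardian in self.plist:
                guardian.abort()
            return False
        # Phase two: participants install the prepared versions.
        for guardian in self.plist:
            guardian.commit()
        return True
```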

Remote Procedure Call and the Network


The authors argue for Remote Procedure Call and state that nested actions can be used for masking the communication failures inherent in the paradigm:


“In fact, we believe the form of communication that is needed is remote procedure call, with at-most-once semantics, namely, that (effectively) either the message is delivered and acts on exactly once, with exactly one reply received, or the message is never delivered and the sender is so informed.”


The authors believe that low-level issues, such as packet retransmission, should be shielded from the user, but that the system should make a reasonable attempt to deliver messages. However, the authors believe that the possibility of long delays, and of ultimate failure, should be made visible to the user. The authors propose that subactions be used for this: individual subactions can abort without causing the entire parent action to abort; when these subactions do abort, the user should be able to take action, such as trying an alternative replica to service the request.


In this model, some top-level actions cannot be aborted. Consider the case of an airline reservation system: the clerk can initiate operations for making a reservation, and subactions can try multiple replicas if, for instance, the primary cannot be contacted. However, an external event performed by a top-level action, such as printing a check, cannot be undone and needs a compensating action to deal with this behavior. To alleviate this problem, the authors suggest breaking these into two separate top-level actions, sequenced to ensure the completion of the first before the execution of the second.
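
A sketch of that sequencing in Python (the reservation system, printer, and their methods are hypothetical): the unrecoverable external event runs only after the first action has committed, and a compensating action covers the case where the second cannot complete.

```python
def reserve_then_print(reservations, printer, request):
    # Top-level action 1: commits (or aborts) on its own.
    booking = reservations.reserve(request)

    # Top-level action 2: an external event that cannot be undone.
    try:
        printer.print_check(booking)
    except IOError:
        # The print cannot be "aborted", so compensate instead.
        reservations.cancel(booking)
        raise
```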


Finally, the authors acknowledge that even in this model timeouts are necessary to resolve issues that might arise from contention or deadlocks between atomic resources.


Semantics


We won't provide the full semantics of the ARGUS language, but a brief write-up should give you the general idea of how the system works.


In ARGUS, distributed programs are composed of a group of guardians. Guardians encapsulate and control access to one or more resources and make them available through operations called handlers, which are called by other guardians. Guardians contain both data and processes: processes execute handlers and are spawned for each handler call from another guardian. Processes have access to the objects that make up the state of the guardian; this is how access to objects is controlled.


Guardians contain both stable and volatile objects: volatile objects are lost when the node running a guardian fails. When a guardian fails, the language support system is responsible for recreating the guardian with the objects that were persisted to stable storage. Stable state is versioned when modifications are performed, and volatile objects live on the heap. Guardians are created dynamically, and the node where a guardian runs is specified by the programmer: guardians represent a logical node of the system, abstracting both state and the physical network, with communication performed through handlers.
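
The following Python sketch (threads standing in for ARGUS processes; all names illustrative) shows the shape of a guardian: a handler call spawns a process, stable state is what recovery restores, and volatile state starts empty after a failure.

```python
import threading

class Guardian:
    """Sketch of a guardian: stable objects survive node failures,
    volatile objects do not, and each handler call runs in its own
    process (modeled here as a thread)."""

    def __init__(self, stable_state):
        self.stable = stable_state   # persisted via the commit protocol
        self.volatile = {}           # e.g., caches; lost on failure

    def handle(self, handler, *args):
        # A process is spawned for each call to a handler.
        process = threading.Thread(target=handler, args=(self, *args))
        process.start()
        return process

    @classmethod
    def recover(cls, stable_state):
        # After a node failure, the language support system recreates
        # the guardian from stable storage alone.
        return cls(stable_state)
```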


Processes in guardians execute concurrently, and the language controls access to atomic objects using the locking mechanisms described above: in addition, a coroutining facility allows subactions to be further divided into concurrent executions, yielding more concurrency from the system.


ARGUS provides static type checking at compile time and supports loading new guardians and handlers while the system is running: to ensure this happens safely, types must be known to the system before loading a handler that uses them. Guardians and handlers are first class and can be used as both arguments and return values in handler invocations.


As the authors highlight, the major drawbacks of the system are in security and scheduling. ARGUS provides no mechanism for securing guardians from other guardians or preventing nodes from launching certain types of guardians. ARGUS also has no mechanism for priority servicing of calls, backpressure, or load-shedding, leaving the system susceptible to scheduling problems when overloaded.


Finally, the authors acknowledge that concurrent programming is hard. While ARGUS goes to great efforts to simplify the problem of maintaining consistency in a distributed application, as highlighted by the banking example, the ARGUS system made it all too easy to introduce deadlocks by having concurrent processes in a guardian, or concurrent actions, take out read/write locks in the wrong order.


Impact and Implementations


ARGUS was arguably the first system to provide a distributed programming paradigm where consistency was integrated into the language as a first-class concept. ARGUS was built on top of CLU and provided a series of abstractions directly assisting programmers in writing code that maintained consistency, as the authors felt this was the main challenge in distributed programming.


ARGUS served as the platform where other ideas would blossom: Remote Procedure Call was used as the primary abstraction for making remote calls, with subactions used to mask omission failures. The optimizations that Promises provided to pipeline RPCs and ensure ordering of requests and responses were also originally developed in the ARGUS system.


The idea of compensating transactions in ARGUS for top-level actions that cannot be “recovered” also appeared later in the SAGA design (Garcia-Molina and Salem 1987).


Finally, languages like Bloom (Alvaro et al. 2011), while not providing abstractions for consistency, take an alternative approach by providing analysis tools for determining where consistency could be violated and cause program anomalies. The authors propose to plug in a distributed consensus mechanism, such as ZooKeeper, to ensure consistency while performing these changes.


Alvaro, Peter, Neil Conway, Joseph M Hellerstein, and William R Marczak. 2011. “Consistency Analysis in Bloom: A Calm and Collected Approach.” In CIDR, 249–60. Citeseer.


Garcia-Molina, Hector, and Kenneth Salem. 1987. “Sagas.” ACM SIGMOD Record 16 (3). ACM.


Liskov, Barbara. 1988. “Distributed Programming in Argus.” Communications of the ACM 31 (3). ACM: 300–312.


Liskov, Barbara, and Robert Scheifler. 1983. “Guardians and Actions: Linguistic Support for Robust, Distributed Programs.” ACM Transactions on Programming Languages and Systems (TOPLAS) 5 (3). ACM: 381–404.


Liskov, Barbara, Dorothy Curtis, Paul Johnson, and Robert Scheifler. 1987. “Implementation of Argus.” ACM SIGOPS Operating Systems Review 21 (5). ACM: 111–22.


Liskov, Barbara, Alan Snyder, Russell Atkinson, and Craig Schaffert. 1977. “Abstraction Mechanisms in CLU.” Communications of the ACM 20 (8). ACM: 564–76.


Walker, Edward Franklin. 1984. “Orphan Detection in the Argus System.” DTIC Document.

--------------------------------------------------------------------------------
/argus.tex:
--------------------------------------------------------------------------------
\subsection{Relevant Reading}

\begin{itemize}
\item \textit{Abstraction Mechanisms in CLU}, Liskov, Barbara and Snyder, Alan and Atkinson, Russell and Schaffert, Craig, CACM 1977~\cite{liskov1977abstraction}.
\item \textit{Guardians and Actions: Linguistic Support for Robust, Distributed Programs}, Liskov, Barbara and Scheifler, Robert, TOPLAS 1983~\cite{liskov1983guardians}.
\item \textit{Orphan Detection in the Argus System}, Walker, Edward Franklin, DTIC 1984~\cite{walker1984orphan}.
\item \textit{Implementation of Argus}, Liskov, Barbara and Curtis, Dorothy and Johnson, Paul and Scheifler, Robert, SIGOPS 1987~\cite{liskov1987implementation}.
\item \textit{Distributed Programming in Argus}, Liskov, Barbara, CACM 1988~\cite{liskov1988distributed}.
\end{itemize}

\subsection{Commentary}

\begin{quote}
``However, regardless of advances in hardware, we believe atomic actions are necessary and are a natural model for a large class of applications. If the language does not provide actions, the user will be compelled to implement them, perhaps unwittingly reimplementing with each new application, and may implement them incorrectly.''
\end{quote}

\subsubsection{Overview}

The focus of these papers is the ARGUS system (and its roots in the programming language CLU), developed in the Laboratory for Computer Science at the Massachusetts Institute of Technology. ARGUS was designed to provide programming language support for the construction and maintenance of distributed programs built from modules executing at, and communicating between, geographically distinct nodes. ARGUS was designed with the following goals in mind:

\begin{itemize}
\item \textbf{Service:} Programs should have localized failures, and be geographically distributed with replicated data for both fault-tolerance and availability.
\item \textbf{Reconfiguration:} Software should be able to be reconfigured while the system is running: this allows capacity to be added to increase processing power, decrease response times, or increase the availability of data.
\item \textbf{Autonomy:} Nodes in the system may be owned by individuals or organizations that need to control what data is replicated at the node for political or sociological reasons.
\item \textbf{Distribution:} Programs should be able to control explicit placement of data to ensure responsiveness and cost-effectiveness of hardware in the system.
\item \textbf{Concurrency:} Distribution should be able to exploit the available concurrency in the system.
\item \textbf{Consistency:} Consistency must be maintained, specifically for invariant preservation: for instance, conservation of funds during a transfer between two accounts.
\end{itemize}

Of all of the aforementioned concerns, the authors posit that the most difficult of these to provide is \textbf{consistency}: consistency becomes much more difficult to preserve in a system where coordination is minimized to avoid interference and mask failures of the network. Therefore, the authors present a technique for integrating \textbf{atomicity} as a fundamental concept in a programming language.
\subsubsection{Atomicity and Actions}

Since data resiliency only guarantees consistency in a quiescent environment, it is necessary to make some operations in the system \textbf{atomic}. When the authors say \textbf{atomic}, they mean two things:

\begin{itemize}
\item \textbf{Indivisibility:} the execution of an activity in the system never appears to overlap with another activity in the system.
\item \textbf{Recoverability:} the overall effect of an activity is all or nothing: in the event of a failure, either the activity must be completed after recovery or all objects must be restored to their initial state.
\end{itemize}

ARGUS introduces \textbf{actions}: atomic transactions that either commit or abort and provide both the indivisibility and recoverability properties. If an action aborts, it is as if the effects had never happened; if an action commits, all modified objects in the system take on their new state. However, since ARGUS is a concurrent system, access to shared objects manipulated by concurrent atomic actions must be synchronized, using read and write locks.

To achieve this, ARGUS introduces a type of atomic object, or as they call it, \textbf{atomic abstract data types}: sets of objects and primitive operations for interacting with those objects. These objects are similar to normal data types, but are extended with operations that ensure indivisibility and recoverability. Access to objects within an atomic abstract data type is done through locking: read/write locks with versioning are used to facilitate concurrent access in the system. Write operations operate against a copy of the current version, and when the action completes, the new version of the object is written, replacing the old.

\subsubsection{Nested Actions}

The authors argue for indivisibility as a desired property, even if it reduces the amount of potential concurrency in the system:

\begin{quote}
``It has been argued that indivisibility is too strong a property for certain applications because it limits the amount of potential concurrency. We believe that indivisibility is the desired property for most application, \textit{if} it is required only at the appropriate levels of abstraction. ARGUS provides a mechanism for \textit{user-defined} atomic data types.''
\end{quote}

Nested actions, or subactions, can be used to further divide actions and introduce concurrency within an action: each action can contain any number of subactions that can be executed either sequentially or concurrently. Subactions can commit or abort independently, and this behavior has no impact on the parent action. However, an abort of a parent will force the abort of all of its subactions.

Nested actions provide support for the composition of actions into a larger action where some of the nested actions can run in parallel or might fail and have to be compensated for.
For instance, if one subaction has to contact a remote guardian that may be unavailable, that subaction can abort and the operation can be retried against another remote guardian: the authors refer to this as ``several ways to accomplish a task.''

Because nested actions may fail, interact with several guardians, and have ``checkpointing'' behavior as they either abort or succeed, each nested action's ancestor needs to track metadata during execution:

\begin{enumerate}
\item Each subaction will inherit locks from its ancestor: read locks are directly inherited, but write locks can only be inherited if all read locks can also be inherited.
\item Each subaction can abort or succeed: if the action aborts, the ancestor will restore the state, or ``checkpoint'', of the execution to before the operation was executed.
\item Each version created by a subaction needs to be propagated, to ensure that the effects of a previous write by a successful nested action are available to a subsequent subaction.
\item Each top-level action will track what's referred to as a \textit{plist}: a list of all of the guardians contacted during the execution of the action, used in the final two-phase commit protocol to commit the action and write to stable storage.
\end{enumerate}

\subsubsection{Remote Procedure Call and the Network}

The authors argue for Remote Procedure Call and state that nested actions can be used for masking the communication failures inherent in the paradigm:

\begin{quote}
``In fact, we believe the form of communication that is needed is \textit{remote procedure call,} with \textit{at-most-once} semantics, namely, that (effectively) either the message is delivered and acts on exactly once, with exactly one reply received, or the message is never delivered and the sender is so informed.''
\end{quote}

The authors believe that low-level issues, such as packet retransmission, should be shielded from the user, but that the system should make a reasonable attempt to deliver messages. However, the authors believe that the possibility of long delays, and of ultimate failure, should be made visible to the user. The authors propose that subactions be used for this: individual subactions can abort without causing the entire parent action to abort; when these subactions do abort, the user should be able to take action, such as trying an alternative replica to service the request.

In this model, some top-level actions cannot be aborted. Consider the case of an airline reservation system: the clerk can initiate operations for making a reservation, and subactions can try multiple replicas if, for instance, the primary cannot be contacted. However, an external event performed by a top-level action, such as printing a check, cannot be undone and needs a compensating action to deal with this behavior. To alleviate this problem, the authors suggest breaking these into two separate top-level actions, sequenced to ensure the completion of the first before the execution of the second.

Finally, the authors acknowledge that even in this model timeouts are necessary to resolve issues that might arise from contention or deadlocks between atomic resources.
\subsubsection{Semantics}

We won't provide the full semantics of the ARGUS language, but a brief write-up should give you the general idea of how the system works.

In ARGUS, distributed programs are composed of a group of \textbf{guardians}. Guardians encapsulate and control access to one or more resources and make them available through operations called \textbf{handlers}, which are called by other guardians. Guardians contain both data and processes: processes execute handlers and are spawned for each handler call from another guardian. Processes have access to the objects that make up the state of the guardian; this is how access to objects is controlled.

Guardians contain both stable and volatile objects: volatile objects are lost when the node running a guardian fails. When a guardian fails, the language support system is responsible for recreating the guardian with the objects that were persisted to stable storage. Stable state is versioned when modifications are performed, and volatile objects live on the heap. Guardians are created dynamically, and the node where a guardian runs is specified by the programmer: guardians represent a logical node of the system, abstracting both state and the physical network, with communication performed through handlers.

Processes in guardians execute concurrently, and the language controls access to atomic objects using the locking mechanisms described above: in addition, a coroutining facility allows subactions to be further divided into concurrent executions, yielding more concurrency from the system.

ARGUS provides static type checking at compile time and supports loading new guardians and handlers while the system is running: to ensure this happens safely, types must be known to the system before loading a handler that uses them. Guardians and handlers are first class and can be used as both arguments and return values in handler invocations.

As the authors highlight, the major drawbacks of the system are in security and scheduling. ARGUS provides no mechanism for securing guardians from other guardians or preventing nodes from launching certain types of guardians. ARGUS also has no mechanism for priority servicing of calls, backpressure, or load-shedding, leaving the system susceptible to scheduling problems when overloaded.

Finally, the authors acknowledge that \textbf{concurrent programming is hard}. While ARGUS goes to great efforts to simplify the problem of maintaining consistency in a distributed application, as highlighted by the banking example, the ARGUS system made it all too easy to introduce deadlocks by having concurrent processes in a guardian, or concurrent actions, take out read/write locks in the wrong order.

\subsection{Impact and Implementations}

ARGUS was arguably the first system to provide a distributed programming paradigm where consistency was integrated into the language as a first-class concept. ARGUS was built on top of CLU and provided a series of abstractions directly assisting programmers in writing code that maintained consistency, as the authors felt this was the main challenge in distributed programming.

ARGUS served as the platform where other ideas would blossom: \textbf{Remote Procedure Call} was used as the primary abstraction for making remote calls, with subactions used to mask omission failures.
The optimizations that \textbf{Promises} provided to pipeline RPCs and ensure ordering of requests and responses were also originally developed in the ARGUS system.

The idea of compensating transactions in ARGUS for top-level actions that cannot be ``recovered'' also appeared later in the SAGA design~\cite{garcia1987sagas}.

Finally, languages like Bloom~\cite{alvaro2011consistency}, while not providing abstractions for consistency, take an alternative approach by providing analysis tools for determining where consistency could be violated and cause program anomalies. The authors propose to plug in a distributed consensus mechanism, such as ZooKeeper, to ensure consistency while performing these changes.
--------------------------------------------------------------------------------
/emerald.html:
--------------------------------------------------------------------------------

Relevant Reading

- The development of the Emerald programming language, Black, Andrew P and Hutchinson, Norman C and Jul, Eric and Levy, Henry M, HOPL 2007.
- Distribution and Abstract Types in Emerald, A. Black and N. Hutchinson and E. Jul and H. Levy and L. Carter, IEEE 1987.
- Emerald: A general-purpose programming language, Raj, Rajendra K. and Tempero, Ewan and Levy, Henry M. and Black, Andrew P. and Hutchinson, Norman C. and Jul, Eric, Software Practice and Experience, 1991.
- Object Structure in the Emerald System, Black, Andrew and Hutchinson, Norman and Jul, Eric and Levy, Henry, OOPSLA '86.
- Typechecking Polymorphism in Emerald, Black, Andrew P. and Hutchinson, Norman, Technical Report CRL 91/1, Digital Cambridge Research Laboratory, 1991.
- Getting to Oz, Hank Levy, Norm Hutchinson, and Eric Jul, April 1984.

These texts, and more, are available from the language's website, http://www.emeraldprogramminglanguage.org.


Commentary


The Eden Programming Language (EPL) was a distributed programming language developed on top of Concurrent Euclid (Holt 1982) that extended the existing language with support for remote method invocations. However, this support was far from ideal: incoming method invocation requests would have to be received and dispatched by a single thread, while the programmer making the request would have to manually inspect error codes to ensure that the remote invocation succeeded.


Eden also provided location-independent mobile objects, but the implementation was extremely costly. In the implementation, each object was a full Unix process, and objects sent and received messages to and from one another: these messages would be sent using interprocess communication if the objects were located on the same node, resulting in latencies in the milliseconds. Eden additionally implemented a “kernel” object for dispatching messages between processes, resulting in a single message between two objects on the same system taking over 100 milliseconds, the cost of two context switches at the time. To make applications developed in EPL more efficient, application developers would use lightweight heap-based objects implemented in Concurrent Euclid (which appeared as a single Eden object consuming a single Unix process) for objects that needed to communicate but were located on the same machine; these objects communicated through shared memory. The next problem follows naturally: the single abstraction provided resulted in extremely slow applications, so a new abstraction was provided to compensate, leaving the user with two different object models.


In a legendary memo entitled “Getting to Oz”, the language designers of Eden, soon to be the language designers of Emerald, began discussions to improve the design of Eden. This new language would be entitled “Emerald” [1].


We enumerate here the list of specific goals the language designers had for Emerald, outside of the general improvements they wanted to make on Eden.

  1. Convinced distributed objects were a good idea and the right way to construct distributed programs, they sought to improve the performance of distributed objects.
  2. Objects should stay relatively cost-free if, for instance, they do not take advantage of distribution: no-use, no-cost.
  3. Simplify and reduce the dual object model; remove explicit dispatching, error handling, and other warts in the Eden model.
  4. Support the principle of information hiding and have a single semantics for both large and small, local or distributed, objects.
  5. Distributed programs can fail: the network can be down, a service can be unavailable; therefore, a language for building distributed applications needs to provide the programmer tools for dealing with these failures.
  6. Minimization of the language by removing many of the features seen in other languages and building abstractions that could be used to extend the language.
  7. Object location needs to be explicit, even as much as the authors wanted to follow the principle of information hiding, as it directly impacts performance. Objects should be able to be moved, but moving an object should not change the operational semantics of the language [2].

Impact and Implementations


The technical innovations of Emerald, a system that was under primary development from 1983 to 1987 (and later continued by various graduate students), are numerous. We highlight a few of the most important technical innovations below:

  1. Emerald presented a single object model for both distributed and local objects. Each object has a globally unique identifier, internal state, and a set of methods that could be invoked. These objects could run in their own process, if necessary, or not. (In fact, objects encapsulated their processes and launched them on object invocation.)
  2. Objects could exist with different implementations in this unified model: global objects, or objects that could be accessed either locally or remotely; local objects, which were optimized for local access only, as best as could be determined at compile time; and direct, or objects that represented primitive types such as integers, booleans, etc.
  3. Emerald was a statically-typed language that had dynamic type checking for objects that were received over the wire. This was achieved using a notion of protocols and conformity-based typing. Dynamic type checking would be performed by ensuring types at runtime were compatible based on their interfaces. This was done by forming a type lattice and computing both the join and meet based on the abstract type, provided at compile time, and the concrete implementation, provided at runtime.
  4. Emerald's type system also provided capabilities, where types could either be restricted to a higher type in the type lattice, or viewed at a lower type in the type lattice.
  5. Synchronization between processes in Emerald was achieved using monitors to achieve mutual exclusion with condition signaling and waiting (Hoare 1974).
  6. Mobility in Emerald was provided using explicit placement primitives. Processes could be moved to a new location, fixed at a precise location, and located. Emerald also provided two new parameter evaluation modes based on mobility: call-by-move [3], which moves the parameter object to the invocation's location, and call-by-visit, which remotely accesses the parameter object from the invocation's location. When objects were moved, the old placement would store a forwarding address that would be used to route messages onward; timestamps were used to detect routing loops and reference the most recent object, and, to avoid keeping forwarding addresses in stable storage, reliable broadcast was used to find lost pointers. (A sketch of this forwarding scheme follows the list.)
  7. Errors related to network availability were not considered exceptions; therefore, special notation for handling these errors was provided to the programmer.
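
A minimal Python sketch of the forwarding-address scheme from the mobility item above (class and method names are illustrative, and the network is elided): a moved object leaves a forwarding reference behind, timestamps identify the most recent location, and invocations chase the chain.

```python
class ObjectRef:
    """Sketch of Emerald-style mobility: a reference either holds the
    object or a forwarding address left behind when the object moved."""

    def __init__(self, obj, node, timestamp=0):
        self.obj = obj
        self.node = node
        self.timestamp = timestamp   # higher = more recent location
        self.forward_to = None       # set once the object moves away

    def move(self, new_node):
        # Leave a forwarding address at the old placement.
        target = ObjectRef(self.obj, new_node, self.timestamp + 1)
        self.obj = None
        self.forward_to = target
        return target

    def invoke(self, method, *args):
        # Chase forwarding addresses until the current placement.
        ref = self
        while ref.forward_to is not None:
            ref = ref.forward_to
        return getattr(ref.obj, method)(*args)
```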

Emerald's (and Eden's) influence throughout the history of programming languages is paramount. Emerald specifically innovated in two main areas: distributed objects and type systems, which is interesting because the innovations in type systems were only done to support the development of distributed objects in the Emerald system.


The idea of type conformity over a type lattice with both concrete and abstract types influenced the further development of protocols, mechanisms to specify the external behavior of an object, in the ANSI 1997 Smalltalk standard.
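
At its core, conformity asks whether a concrete implementation supplies at least the operations an abstract type requires. A toy Python version of that check (the operation names below are invented for illustration) might look like this:

```python
def conforms(concrete_ops, abstract_ops):
    # A concrete type may be bound where an abstract type is expected
    # if it provides at least the operations the protocol requires.
    return abstract_ops <= concrete_ops

# The declared (abstract) type of an incoming object requires these...
abstract_type = {"get_name", "set_name"}
# ...and the implementation received over the wire provides more.
received_type = {"get_name", "set_name", "get_salary"}

assert conforms(received_type, abstract_type)  # safe to bind
```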


Developers of the Modula-3 Network Objects (Birrell et al. 1993) system took what they felt was the most essential and the best of both the Emerald and SOS (Shapiro et al. 1989) systems. This system forewent the mobility of objects in favor of marshalling. In the authors' own words:


“We believe it is better to provide powerful marshaling than object mobility. The two facilities are similar, because both of them allow the programmer the option of communicating objects by reference or by copying. Either facility can be used to distribute data and computation as needed by applications. Object mobility offers slightly more flexibility, because the same object can be either sent by reference or moved; while with our system, network objects are always sent by reference and other objects are always sent by copying. However, this extra flexibility doesn’t seem to us to be worth the substantial increase in complexity of mobile objects.” (A. P. Black et al. 2007)


Both Java's RMI system and Jini (and their predecessor, OMG's CORBA) were influenced by Emerald as well, though not without a fierce discussion on the merits of distinguishing between remote and local method invocations, motivated by a legendary technical report (Kendall et al. 1994) by Sun Microsystems' research division [4].


Jim Waldo, author of the aforementioned technical report, writes:


“The RMI system (and later the Jini system) took many of the ideas pioneered in Emerald having to do with moving objects around the network. We introduced these ideas to allow us to deal with the problems found in systems like CORBA with type truncation (and which were dealt with in Network Objects by doing the closest-match); the result was that passing an object to a remote site resulted in passing [a copy of] exactly that object, including when necessary a copy of the code (made possible by Java bytecodes and the security mechanisms). This was exploited to some extent in the RMI world, and far more fully in the Jini world, making both of those systems more Emerald-like than we realized at the time.” (A. P. Black et al. 2007)


The authors eventually come to a conclusion similar to that of many distributed systems practitioners today and other critics in their research area: that availability, reliability, and the network remain the paramount challenges, and add fuel to the fire against any location-transparent semantics provided by mobile objects. These remain as much of a challenge for the developers of distributed programs today as they did in 1983, at the start of the development lineage from Eden to Emerald:


“Mobile objects promise to make that same simplicity available in a distributed setting: the same semantics, the same parameter mechanisms, and so on. But this promise must be illusory. In a distributed setting the programmer must deal with issues of availability and reliability. So programmers have to replicate their objects, manage the replicas, and worry about ’one copy semantics’. Things are not so simple any more, because the notion of object identity supported by the programming language is no longer the same as the application’s notion of identity. We can make things simple only by giving up on reliability, fault tolerance, and availability — but these are the reasons that we build distributed systems.” (A. P. Black et al. 2007)


The authors of Eden and Emerald express an extremely interesting point early on in their paper on the history of Emerald (A. P. Black et al. 2007): the reason for the poor abstractions requiring manual dispatch of method invocations to threads, and the explicit error handling from network anomalies, was that, as researchers working on a language, they were not implementing distributed applications themselves. To quote the authors:


“Eden team had real experience with writing distributed applications, we had not yet learned what support should be provided. For example, it was not clear to us whether or not each incoming call should be run in its own thread (possibly leading to excessive resource contention), whether or not all calls should run in the same thread (possibly leading to deadlock), whether or not there should be a thread pool of a bounded size (and if so, how to choose it), or whether or not there was some other, more elegant solution that we hadn’t yet thought of. So we left it to the application programmer to build whatever invocation thread management system seemed appropriate: EPL was partly a language, and partly a kit of components. The result of this approach was that there was no clear separation between the code of the application and the scaffolding necessary to implement remote calls.” (A. P. Black et al. 2007)


I will close this section on Emerald with a quote from the authors.


“We are all proud of Emerald, and feel that it is one of the most significant pieces of research we have ever undertaken. People who have never heard of Emerald are surprised that a language that is so old, and was implemented by so small a team, does so much that is ’modern’. If asked to describe Emerald briefly, we sometimes say that it’s like Java, except that it has always had generics, and that its objects are mobile.” (A. P. Black et al. 2007)


Birrell, Andrew, Greg Nelson, Susan Owicki, and Edward Wobber. 1993. “Network Objects.” In Proceedings of the Fourteenth Acm Symposium on Operating Systems Principles, 217–30. SOSP ’93. New York, NY, USA: ACM. doi:10.1145/168619.168637.


Black, A., N. Hutchinson, E. Jul, H. Levy, and L. Carter. 1987. “Distribution and Abstract Types in Emerald.” IEEE Transactions on Software Engineering SE-13 (1): 65–76. doi:10.1109/TSE.1987.232836.


Black, Andrew P, Norman C Hutchinson, Eric Jul, and Henry M Levy. 2007. “The Development of the Emerald Programming Language.” In Proceedings of the Third Acm Sigplan Conference on History of Programming Languages, 11–11. ACM.


Black, Andrew, Norman Hutchinson, Eric Jul, and Henry Levy. 1986. “Object Structure in the Emerald System.” In Conference Proceedings on Object-Oriented Programming Systems, Languages and Applications, 78–86. OOPSLA ’86. New York, NY, USA: ACM. doi:10.1145/28697.28706.


Hoare, Charles Antony Richard. 1974. Monitors: An Operating System Structuring Concept. Springer.


Holt, Richard C. 1982. “A Short Introduction to Concurrent Euclid.” ACM Sigplan Notices 17 (5). ACM: 60–79.


Kendall, Samuel C, Jim Waldo, Ann Wollrath, and Geoff Wyant. 1994. “A Note on Distributed Computing.” Sun Microsystems, Inc.


Liskov, Barbara. 1988. “Distributed Programming in Argus.” Communications of the ACM 31 (3). ACM: 300–312.


Raj, Rajendra K., Ewan Tempero, Henry M. Levy, Andrew P. Black, Norman C. Hutchinson, and Eric Jul. 1991. “Emerald: A General-Purpose Programming Language.” Software: Practice and Experience 21 (1). John Wiley & Sons, Ltd.: 91–118. doi:10.1002/spe.4380210107.


Shapiro, Marc, Yvon Gourhant, Sabine Habert, Laurence Mosseri, Michel Ruffin, and Celine Valot. 1989. “SOS: An Object-Oriented Operating System – Assessment and Perspectives.” Computing Systems 2 (4): 287–338.

  1. As in, the Emerald City from “The Wonderful Wizard of Oz”, referencing the original runtime for the Oz language, “Toto”, and the nickname for Seattle.
  2. This dichotomy is presented as the semantics vs. the locatics of the language, and the authors soon realized that one aspect of the language influenced both of these: failures.
  3. This is a departure from systems like Argus (Liskov 1988) that assumed all arguments were passed using call-by-value.
  4. While we acknowledge the lineage here beginning with systems like Eden and Emerald, the majority of the criticisms of this technical report are targeted towards OMG’s CORBA system.
--------------------------------------------------------------------------------
/emerald.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/cmeiklejohn/PMLDC/d9bbfe20f87699aa5c6c09815ac0eb20c2e1d141/emerald.pdf
--------------------------------------------------------------------------------
/emerald.tex:
--------------------------------------------------------------------------------
\subsection{Relevant Reading}

\begin{itemize}
\item \textit{The development of the Emerald programming language}, Black, Andrew P and Hutchinson, Norman C and Jul, Eric and Levy, Henry M, HOPL 2007~\cite{black2007development}.
\item \textit{Distribution and Abstract Types in Emerald}, A. Black and N. Hutchinson and E. Jul and H. Levy and L. Carter, IEEE 1987~\cite{1702134}.
\item \textit{Emerald: A general-purpose programming language}, Raj, Rajendra K. and Tempero, Ewan and Levy, Henry M. and Black, Andrew P. and Hutchinson, Norman C. and Jul, Eric, Software Practice and Experience, 1991~\cite{SPE:SPE4380210107}.
\item \textit{Object Structure in the Emerald System}, Black, Andrew and Hutchinson, Norman and Jul, Eric and Levy, Henry, OOPSLA '86~\cite{Black:1986:OSE:28697.28706}.
\item \textit{Typechecking Polymorphism in Emerald}, Black, Andrew P. and Hutchinson, Norman, Technical Report CRL 91/1, Digital Cambridge Research Laboratory, 1991.
\item \textit{Getting to Oz}, Hank Levy, Norm Hutchinson, and Eric Jul, April 1984.
\end{itemize}

These texts, and more, are available from the language's website, \url{http://www.emeraldprogramminglanguage.org}.

\subsection{Commentary}

The Eden Programming Language (EPL) was a distributed programming language developed on top of Concurrent Euclid~\cite{holt1982short} that extended the existing language with support for remote method invocations. However, this support was far from ideal: incoming method invocation requests would have to be received and dispatched by a single thread, while the programmer making the request would have to manually inspect error codes to ensure that the remote invocation succeeded.

Eden also provided location-independent mobile objects, but the implementation was extremely costly. In the implementation, each object was a full Unix process, and objects sent and received messages to and from one another: these messages would be sent using interprocess communication if the objects were located on the same node, resulting in latencies in the milliseconds. Eden additionally implemented a ``kernel'' object for dispatching messages between processes, resulting in a single message between two objects on the same system taking over 100 milliseconds, the cost of two context switches at the time. To make applications developed in EPL more efficient, application developers would use lightweight heap-based objects implemented in Concurrent Euclid (which appeared as a single Eden object consuming a single Unix process) for objects that needed to communicate but were located on the same machine; these objects communicated through shared memory. The next problem follows naturally: the single abstraction provided resulted in extremely slow applications, so a new abstraction was provided to compensate, leaving the user with two different object models.

In a legendary memo entitled ``Getting to Oz'', the language designers of Eden, soon to be the language designers of Emerald, began discussions to improve the design of Eden.
This new language would be entitled ``Emerald''\footnote{As in, the Emerald City from ``The Wonderful Wizard of Oz'', referencing the original runtime for the Oz language, ``Toto'', and the nickname for Seattle.}.

We enumerate here the list of specific goals the language designers had for Emerald, outside of the general improvements they wanted to make on Eden.
\begin{enumerate}
\item Convinced distributed objects were a good idea and the right way to construct distributed programs, they sought to \textit{improve the performance of distributed objects.}
\item Objects should stay relatively cost-free if, for instance, they do not take advantage of distribution: \textit{no-use, no-cost}.
\item \textit{Simplify} and reduce the dual object model; remove explicit dispatching, error handling, and other warts in the Eden model.
\item Support \textit{the principle of information hiding} and have a single semantics for both large and small, local or distributed, objects.
\item Distributed programs can fail: the network can be down, a service can be unavailable; therefore, a language for building distributed applications needs to \textit{provide the programmer tools for dealing with these failures.}
\item \textit{Minimization of the language} by removing many of the features seen in other languages and building abstractions that could be used to extend the language.
\item \textit{Object location needs to be explicit}, even as much as the authors wanted to follow the \textit{principle of information hiding}, as it directly impacts performance. Objects should be able to be moved, but moving an object should not change the operational semantics of the language\footnote{This dichotomy is presented as the \textit{semantics} vs. the \textit{locatics} of the language, and the authors soon realized that one aspect of the language influenced both of these: failures.}.
\end{enumerate}

\subsection{Impact and Implementations}

The technical innovations of Emerald, a system that was under primary development from 1983 to 1987 (and later continued by various graduate students), are numerous. We highlight a few of the most important technical innovations below:

\begin{enumerate}
\item Emerald presented a \textit{single object model} for both distributed and local objects. Each object has a globally unique identifier, internal state, and a set of methods that could be invoked. These objects could run in their own process, if necessary, or not. (In fact, objects encapsulated their processes and launched them on object invocation.)
\item Objects could exist with different implementations in this unified model: \textit{global} objects, or objects that could be accessed either locally or remotely; \textit{local} objects, which were optimized for local access only, as best as could be determined at compile time; and \textit{direct}, or objects that represented primitive types such as integers, booleans, etc.
\item Emerald was a statically-typed language that had dynamic type checking for objects that were received over the wire. This was achieved using a notion of \textit{protocols} and \textit{conformity}-based typing. Dynamic type checking would be performed by ensuring types at runtime were compatible based on their interfaces.
This was done by forming a type lattice and computing both the \textit{join} and \textit{meet} based on the \textit{abstract} type, provided at compile time, and the \textit{concrete} implementation, provided at runtime.
\item Emerald's type system also provided \textit{capabilities}, where types could either be \textit{restrict}ed to a higher type in the type lattice, or \textit{view}ed at a lower type in the type lattice.
\item Synchronization between processes in Emerald was achieved using \textit{monitors} to achieve mutual exclusion with condition signaling and waiting~\cite{hoare1974monitors}.
\item Mobility in Emerald was provided using explicit placement primitives. Processes could be \textit{moved} to a new location, \textit{fix}ed at a precise location, and \textit{locate}d. Emerald also provided two new parameter evaluation modes based on mobility: \textit{call-by-move}\footnote{This is a departure from systems like Argus~\cite{liskov1988distributed} that assumed all arguments were passed using call-by-value.}, which moves the parameter object to the invocation's location, and \textit{call-by-visit}, which remotely accesses the parameter object from the invocation's location. When objects were moved, the old placement would store a \textit{forwarding address} that would be used to route messages onward; timestamps were used to detect routing loops and reference the most recent object, and, to avoid keeping forwarding addresses in stable storage, reliable broadcast was used to find lost pointers.
\item Errors related to network availability were not considered exceptions; therefore, special notation for handling these errors was provided to the programmer.
\end{enumerate}

Emerald's (and Eden's) influence throughout the history of programming languages is paramount. Emerald specifically innovated in two main areas: distributed objects and type systems, which is interesting because the innovations in type systems were only done to support the development of distributed objects in the Emerald system.

The idea of type \textit{conformity} over a type lattice with both \textit{concrete} and \textit{abstract} types influenced the further development of \textit{protocols}, mechanisms to specify the external behavior of an object, in the ANSI 1997 Smalltalk standard.

Developers of the Modula-3 Network Objects~\cite{Birrell:1993:NO:168619.168637} system took what they felt was the most essential and the best of both the Emerald and SOS~\cite{shapiro1989sos} systems. This system forewent the mobility of objects in favor of marshalling. In the authors' own words:

\begin{quote}
``We believe it is better to provide powerful marshaling than object mobility. The two facilities are similar, because both of them allow the programmer the option of communicating objects by reference or by copying. Either facility can be used to distribute data and computation as needed by applications. Object mobility offers slightly more flexibility, because the same object can be either sent by reference or moved; while with our system, network objects are always sent by reference and other objects are always sent by copying.
However, this extra flexibility doesn’t seem to us to be worth the substantial increase in complexity of mobile objects.''~\cite{black2007development}
\end{quote}

Both Java's RMI system and Jini (and their predecessor, OMG's CORBA) were influenced by Emerald as well, though not without a fierce discussion on the merits of distinguishing between remote and local method invocations, motivated by a legendary technical report~\cite{kendall1994note} by Sun Microsystems' research division\footnote{While we acknowledge the lineage here beginning with systems like Eden and Emerald, the majority of the criticisms of this technical report are targeted towards OMG's CORBA system.}.

Jim Waldo, author of the aforementioned technical report, writes:

\begin{quote}
``The RMI system (and later the Jini system) took many of the ideas pioneered in Emerald having to do with moving objects around the network. We introduced these ideas to allow us to deal with the problems found in systems like CORBA with type truncation (and which were dealt with in Network Objects by doing the closest-match); the result was that passing an object to a remote site resulted in passing [a copy of] exactly that object, including when necessary a copy of the code (made possible by Java bytecodes and the security mechanisms). This was exploited to some extent in the RMI world, and far more fully in the Jini world, making both of those systems more Emerald-like than we realized at the time.''~\cite{black2007development}
\end{quote}

The authors eventually come to a conclusion similar to that of many distributed systems practitioners today and other critics in their research area: that availability, reliability, and the network remain the paramount challenges, and add fuel to the fire against any location-transparent semantics provided by mobile objects. These remain as much of a challenge for the developers of distributed programs today as they did in 1983, at the start of the development lineage from Eden to Emerald:

\begin{quote}
``Mobile objects promise to make that same simplicity available in a distributed setting: the same semantics, the same parameter mechanisms, and so on. But this promise must be illusory. In a distributed setting the programmer must deal with issues of availability and reliability. So programmers have to replicate their objects, manage the replicas, and worry about 'one copy semantics'. Things are not so simple any more, because the notion of object identity supported by the programming language is no longer the same as the application’s notion of identity. We can make things simple only by giving up on reliability, fault tolerance, and availability — but these are the reasons that we build distributed systems.''~\cite{black2007development}
\end{quote}

The authors of Eden and Emerald express an extremely interesting point early on in their paper on the history of Emerald~\cite{black2007development}: the reason for the poor abstractions requiring manual dispatch of method invocations to threads, and the explicit error handling from network anomalies, was that, as researchers working on a language, \textit{they were not implementing distributed applications themselves.} To quote the authors:

\begin{quote}
``Eden team had real experience with writing distributed applications, we had not yet learned what support should be provided.
For example, it was not clear to us whether or not each incoming call should be run in its own thread (possibly leading to excessive resource contention), whether or not all calls should run in the same thread (possibly leading to deadlock), whether or not there should be a thread pool of a bounded size (and if so, how to choose it), or whether or not there was some other, more elegant solution that we hadn’t yet thought of. So we left it to the application programmer to build whatever invocation thread management system seemed appropriate: EPL was partly a language, and partly a kit of components. The result of this approach was that there was no clear separation between the code of the application and the scaffolding necessary to implement remote calls.''~\cite{black2007development} 75 | \end{quote} 76 | 77 | I will close this section on Emerald with a quote from the authors. 78 | 79 | \begin{quote} 80 | ``We are all proud of Emerald, and feel that it is one of the most significant pieces of research we have ever undertaken. People who have never heard of Emerald are surprised that a language that is so old, and was implemented by so small a team, does so much that is 'modern'. If asked to describe Emerald briefly, we sometimes say that it’s like Java, except that it has always had generics, and that its objects are mobile.''~\cite{black2007development} 81 | \end{quote} 82 | -------------------------------------------------------------------------------- /guesstimate.html: --------------------------------------------------------------------------------
-------------------------------------------------------------------------------- /guesstimate.tex: -------------------------------------------------------------------------------- 1 | \subsection{Relevant Reading} 2 | 3 | \begin{itemize} 4 | \item \textit{Guesstimate: A Programming Model for Collaborative Distributed Systems}, Rajan, Rajamani, Yaduvanshi, PLDI 2010~\cite{rajan2010guesstimate}. 5 | \end{itemize} 6 | 7 | \subsection{Commentary} 8 | 9 | As we've seen previously, distributed applications have to be shoehorned into either the CP or AP model as outlined by the CAP theorem. Because the CAP theorem states that applications cannot be simultaneously consistent and available under partition, an application has to choose whether to remain consistent or to sacrifice consistency for availability when partitions inevitably occur. 10 | 11 | Guesstimate is a programming model for collaborative applications that aims to reduce latency by replicating objects on participating nodes in a network. Guesstimate provides the user with an object-oriented programming model: objects can be replicated and shared by different users on the network, and side-effecting method invocations are eventually replicated to all users that hold an instance, or replica, of the object. Replication is provided transparently by the runtime, so the user doesn't need their own backing store, differing from approaches such as the previously discussed IPA from Holt et al. 12 | 13 | Each object in Guesstimate stores two states: a ``guesstimated'' state, which is the result of local side-effecting method invocations, and a ``committed'' state, which is the result of atomically committing the updates each node has made to its own ``guesstimated'' state. Method invocations that cause side effects must take a particular form: they must validate the change against the current state using a guard and return either true or false, depending on whether the state mutation has taken effect. The Boogie verification language is used to analyze each implementation and verify that methods take this form. Each such invocation consists of two parts: a method that modifies the local ``guesstimated'' state and a delegate (read: anonymous function) to be invoked when the state is finally committed on all of the nodes. 14 | 15 | As nodes modify their local ``guesstimated'' state, a designated ``master'' node periodically begins synchronization rounds. These synchronization rounds begin at the master, walk each of the nodes in turn, preventing updates from occurring while the synchronization round runs, and aggregate the pending updates from each node's ``guesstimated'' state. These updates are stored in a tentative log that is aggregated at the master. Once the master node has aggregated all of the updates, it orders them, each tagged with a pair consisting of the node identifier and operation identifier, using lexicographical ordering on the set of pairs. This determines the commit order for each update; once established, this commit order and the associated updates are sent to each of the nodes. Once these updates are accepted by all nodes, they are applied to the ``committed'' state, the delegates are invoked, and the lock on updates is released, so local updates are allowed to resume.
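To make the shape of these operations concrete, here is a minimal sketch of the two-state object model in Python; the paper's system is a .NET library, so every name below is invented for illustration:

\begin{verbatim}
# Hypothetical sketch of a Guesstimate-style object; all names are
# invented (the real system is a .NET library, not Python).

class GuesstimateObject:
    def __init__(self, initial):
        self.guesstimated = initial  # optimistic, locally updated state
        self.committed = initial     # state agreed upon by every node
        self.pending = []            # operations awaiting the commit order

    def issue(self, operation, on_complete):
        # Run the guarded operation against the guesstimated state and
        # queue it for re-execution against the committed state later.
        ok, new_state = operation(self.guesstimated)
        if ok:
            self.guesstimated = new_state
            self.pending.append((operation, on_complete))
        return ok

    def commit(self, operation, on_complete):
        # Invoked by the runtime, in the global commit order; the guard
        # may now refuse an update that succeeded locally.
        ok, new_state = operation(self.committed)
        if ok:
            self.committed = new_state
        on_complete(ok)

def add_passenger(name, capacity):
    # Guard-plus-update: succeed only while seats remain, else refuse.
    def operation(passengers):
        if len(passengers) < capacity:        # the guard
            return True, passengers + [name]  # the update
        return False, passengers
    return operation

ride = GuesstimateObject([])
ride.issue(add_passenger("alice", capacity=2),
           lambda ok: print("alice:", "committed" if ok else "refused"))

# Later, when the commit order arrives, the runtime re-executes the
# operation against the committed state and fires the delegate:
op, cb = ride.pending.pop(0)
ride.commit(op, cb)  # prints: alice: committed
\end{verbatim}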
16 | 17 | The ideal way of programming with Guesstimate is to have operations modify the ``guesstimated'' state first and update an associated UI afterwards. When the delegate is later invoked, it should update the UI with a notification of whether the update was successfully applied or was refused during commitment and must be tried again. 18 | 19 | One problem that can occur is update interleaving at commitment time. Consider the case of a ride-sharing application from the paper: a guard for a method invocation stating that a user should get a ride from driver X may yield the user getting a ride from driver Y under a particular update reordering; therefore, guards should be written so that any acceptable outcome is allowed. 20 | 21 | To address the problems of ordering, Guesstimate provides atomic operations: if two updates have to happen together, or there is a causal relationship between updates (think: references, pointers, secondary indexes), these operations can be grouped together in the programming model to ensure that they commit together. For operations that must commit before the program proceeds, Guesstimate provides a primitive for blocking an operation until commitment. 22 | 23 | Users of Guesstimate also have to deal with the reality that updates will be executed multiple times, although against different components of the object's state: ``guesstimated'' and ``committed''. 24 | 25 | Guesstimate's runtime takes care of most of the synchronization concerns: nodes are allowed to leave and join the system, and failed nodes are evicted after a certain amount of time and forced to rejoin the cluster and repopulate their state. Nodes form a full mesh, where each node can talk to each other node, and there is currently no mechanism outlined for handling the failure of the ``master'' node that begins and coordinates the synchronization rounds. 26 | 27 | We can see here the genesis of ideas that made it into systems such as the Global Sequence Protocol and CAPtain, which are covered in other articles. 28 |
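The synchronization round described earlier is likewise small enough to sketch. Assuming the invented \texttt{GuesstimateObject} from the previous sketch plus an equally invented \texttt{Node} wrapper, the master's round might look like this:

\begin{verbatim}
# Hypothetical sketch of a Guesstimate synchronization round, reusing
# the GuesstimateObject sketch above; the Node wrapper is also invented.

class Node:
    def __init__(self, node_id, replica):
        self.node_id, self.replica = node_id, replica
        self.updates_blocked = False

    def drain_pending(self):
        ops, self.replica.pending = self.replica.pending, []
        return ops

def synchronization_round(nodes):
    # Walk each node in turn, pausing local updates and pulling its
    # pending operations into the tentative log held at the master.
    log = []
    for node in nodes:
        node.updates_blocked = True
        for op_id, (operation, on_complete) in enumerate(node.drain_pending()):
            log.append(((node.node_id, op_id), operation, on_complete))

    # Lexicographic order on (node identifier, operation identifier)
    # pairs fixes the commit order for this round.
    log.sort(key=lambda entry: entry[0])

    # Every node applies the same operations, in the same order, to its
    # committed state. (In this simplified sketch the delegate fires on
    # every node; the real system notifies the issuing user.)
    for node in nodes:
        for _, operation, on_complete in log:
            node.replica.commit(operation, on_complete)
        node.updates_blocked = False
\end{verbatim}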
-------------------------------------------------------------------------------- /hermes.html: --------------------------------------------------------------------------------
-------------------------------------------------------------------------------- /hermes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cmeiklejohn/PMLDC/d9bbfe20f87699aa5c6c09815ac0eb20c2e1d141/hermes.pdf -------------------------------------------------------------------------------- /hermes.tex: -------------------------------------------------------------------------------- 1 | \subsection{Relevant Reading} 2 | 3 | \begin{itemize} 4 | \item \textit{Implementing Location Independent Invocation}, Black, Andrew P and Artsy, Yeshayahu, IEEE Transactions on Parallel and Distributed Systems, 1990~\cite{black1990implementing}. 5 | \item \textit{Customizable and extensible deployment for mobile/cloud applications}, Zhang, Irene and Szekeres, Adriana and Van Aken, Dana and Ackerman, Isaac and Gribble, Steven D and Krishnamurthy, Arvind and Levy, Henry M, 2014~\cite{zhang2014customizable}. 6 | \end{itemize} 7 | 8 | \subsection{Commentary} 9 | 10 | The general idea behind the Remote Procedure Call (RPC) paradigm is that it supports the transfer of control between address spaces. This paradigm allows programmers to write distributed applications without needing knowledge of data representations or specific network protocols. Even though we know that there are significant semantic differences between remote and local calls~\cite{kendall1994note, black1990implementing}, the authors posit that the most fundamental difference is that of \textit{binding}, or how to figure out which address space to direct the call to. 11 | 12 | Traditionally, this has been done in one of two ways: \textit{default or automatic} binding, where the RPC system makes the choice for the programmer; or \textit{clerks}, application-specific modules used for determining where to place the call. Default binding is fairly straightforward when there is only one server (or a group of semantically equivalent servers) to service the request. Clerks are fairly expensive, as one must be written for each type of request that needs to be serviced. If the service the RPC call is being made to is \textit{pure}, for instance providing a fast Fourier transform, as the authors put it, it is easy to choose automatic binding to select a server based on latency or availability. However, it is more challenging if services host application data. In their example, they consider an employee directory at Digital where application data is partitioned by company, and further by other groupings. If this mapping changes infrequently, a static mapping can be distributed to all of the clients; but what happens if objects are mobile and this mapping changes more frequently? 13 | 14 | One of the fantastic things about this paper is how forward-thinking the design is for an actual industrial problem at Digital Equipment Corporation. I consider this one of the early versions of what we now call an ``industry'' research report, even though the system was never productized and the work was mainly performed by researchers in a lab. The application deals with expense vouchers for employees: each form needs to be filled in by an employee, approved by various managers, filed, and eventually results in a payout of actual cash. The managers involved in approving a form may be located in different buildings on different continents. 
The application design assumes Digital's global network of 36,000 machines, on which centralizing the records for each form in a single database is infeasible. Instead, the design is based on mobile objects for both data and code; forms should be able to move around the network as required by the application. 15 | 16 | The Hermes system is broken into three components: a naming service, a persistent store known as a collection of \textit{storesites}, and a routing layer that sits above the RPC system. Each object in the system is given a globally unique identifier, a source \textit{storesite}, and a \textit{temporal address descriptor}, or \textit{tad}. The \textit{temporal address descriptor} is a pair composed of a Hermes node identifier and a monotonically advancing timestamp: this pair represents where an object is located at a given time. This information is also persisted in the object's \textit{storesite}. As objects move around the network, the \textit{tad} is updated at the source node and 2PC is used to coordinate the change with the record at the object's \textit{storesite}. 17 | 18 | When remote procedure calls are issued, the caller attempts to issue the call locally if the object is local. If not, and a forwarding pointer, or \textit{tad}, exists, the message is routed to that node. Forwarding pointers are followed a number of times until a maximum hop count is reached; at this point the call is returned to the caller, who begins the process again with the last known forwarding pointer. Along the path of forwarding, the \textit{tad} is updated as each hop occurs, reducing the number of hops needed for the next request through that node. This is possible because of the monotonicity of the temporal addresses. 19 | 20 | If a node has no local knowledge of where an object is, either because the object is not running locally or because there exists no temporal address, a request is made to the naming service for the object's \textit{storesite}, and the address of the current location is retrieved from the \textit{storesite}. 21 |
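The routing behavior just described is easy to picture in code. The following is a minimal, hypothetical sketch in Python of hop-limited forwarding over monotonic \textit{tads}; Hermes itself was written in Modula-2+, and the names and hop limit below are invented:

\begin{verbatim}
# Hypothetical sketch of Hermes-style forwarding; Hermes was written in
# Modula-2+, so every name here (and the hop limit) is invented.
from collections import namedtuple

Tad = namedtuple("Tad", ["node_id", "timestamp"])  # temporal address descriptor

class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.objects = {}  # object id -> locally hosted object
        self.tads = {}     # object id -> Tad (forwarding pointer)

def invoke(cluster, caller, obj_id, method, max_hops=8):
    tad = caller.tads[obj_id]  # assume some forwarding pointer is known
    for _ in range(max_hops):
        target = cluster[tad.node_id]
        if obj_id in target.objects:             # found it: run the call here
            return method(target.objects[obj_id])
        newer = target.tads.get(obj_id)
        if newer and newer.timestamp > tad.timestamp:
            caller.tads[obj_id] = newer          # monotonicity: keep only
            tad = newer                          # the freshest pointer
    # Hop limit reached: the call goes back to the caller, which retries
    # from the last known pointer or consults the storesite.
    raise RuntimeError("retry with tad %r" % (tad,))
\end{verbatim}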
22 | However, in this model failures may occur. If the RPC arrives at the destination of the object, and the call is invoked and completed, but the response packets are dropped, what happens? In this case, an invocation sequencer is required to ensure that the operation is only performed if it has not previously completed. The authors suggest developers write operations that are idempotent, to ensure they can be replayed without issue or additional overhead. 23 | 24 | \subsection{Impact and Implementations} 25 | 26 | The Eden~\cite{Black:1985:SDA:323647.323646} and Emerald~\cite{black2007development} programming languages both had notions of distributed objects. Eden used hints to identify where to route messages for objects, but timed them out quickly. Once timed out, a durable storage location called a \textit{checksite} would be checked, and if that yielded no results, broadcast messages would be used. Emerald, a predecessor to Hermes, used forwarding addresses, but used a broadcast mechanism to find objects when forwarding addresses were not available. In the event the broadcast yielded no results, an exhaustive search of every node in the cluster was performed. All of these decisions were fine for a language and operating system designed mainly for research. 27 | 28 | Emerald was more advanced in several ways. Emerald's type system allowed for the introduction of new types of objects, whereas the Hermes system assumed that all possible object types were known at system start. Emerald could also migrate processes during invocation, something that the Hermes system could not. 29 | 30 | While the system could tolerate some failures when following forwarding addresses, by falling back to the information located at the \textit{storesite}, it had no way to handle partitions, where an invocation may fail because the object is inaccessible. However, given the relative independence of objects in the system, this would only affect objects (or users) located on the partitioned machine. 31 | 32 | The design of Hermes was completed in a year and a half, written in Modula-2+, and was demonstrated functional in the laboratory on a LAN composed of a small number of nodes. According to one of the authors of the paper, the system was never turned into a product, mainly because Digital did not have a team at the time responsible for turning advanced research projects into actual distributed systems products\footnote{Andrew P. Black, personal communication.}. 33 | 34 | The Sapphire~\cite{zhang2014customizable} system presented at OSDI '14 bears a strong resemblance to the Hermes system and its Emerald roots. While Sapphire focuses on the separation of application logic from deployment logic through the use of interfaces and interface inheritance in object-oriented programming languages, Sapphire uses many of the techniques presented in both Emerald and Hermes: transparent relocation based on annotations or for load balancing; location-independent method invocation through the use of forwarding pointers; and fallback to a persistent data store to find the canonical location of a particular object. 35 | 36 | Today, idempotence~\cite{Helland:2012:IMC:2181796.2187821} has been a topic of study in distributed systems, as it assists in designing deterministic computations that must happen on unreliable, asynchronous networks, where it is impossible to reliably detect failures~\cite{fischer1985impossibility}. Shapiro \textit{et al.}~\cite{shapiro2011comprehensive} propose the use of data structures that are associative, commutative, and idempotent as the basis for shared state in distributed databases. Meiklejohn and Van Roy~\cite{meiklejohn2015lasp} propose something similar for large-scale distributed computations, whereas Conway \textit{et al.}~\cite{conway2012logic} do so for protocol development. Lee \textit{et al.} propose a system called RIFL for ensuring exactly-once semantics for remote procedure calls by uniquely identifying each call and providing fault-tolerant storage of the results~\cite{lee2015implementing}. 
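To make the exactly-once discussion concrete, here is a minimal sketch of deduplicating retried calls by unique call identifier, in the spirit of the invocation sequencer above and of RIFL; the names are invented, and a real system would store the recorded results durably:

\begin{verbatim}
# Hypothetical sketch of retry-safe invocation by call id; invented names.
import uuid

class InvocationSequencer:
    def __init__(self):
        self.completed = {}  # call id -> recorded result (durable in practice)

    def handle(self, call_id, operation, *args):
        if call_id in self.completed:       # a retry of a finished call
            return self.completed[call_id]  # replays the recorded result
        result = operation(*args)
        self.completed[call_id] = result
        return result

sequencer = InvocationSequencer()
call_id = uuid.uuid4()  # the client reuses one id across retries
print(sequencer.handle(call_id, lambda x: x + 1, 41))  # executes: 42
print(sequencer.handle(call_id, lambda x: x + 1, 41))  # dropped reply,
                                                       # retried: still 42
\end{verbatim}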
-------------------------------------------------------------------------------- /pl.bib: -------------------------------------------------------------------------------- 1 | @book{liskov1988promises, 2 | title={Promises: linguistic support for efficient asynchronous procedure calls in distributed systems}, 3 | author={Liskov, Barbara and Shrira, Liuba}, 4 | volume={23}, 5 | number={7}, 6 | year={1988}, 7 | publisher={ACM} 8 | } 9 | 10 | @article{kendall1994note, 11 | title={A note on distributed computing}, 12 | author={Kendall, Samuel C and Waldo, Jim and Wollrath, Ann and Wyant, Geoff}, 13 | year={1994}, 14 | publisher={Sun Microsystems, Inc.} 15 | } 16 | 17 | @book{tanenbaum1987critique, 18 | title={A critique of the remote procedure call paradigm}, 19 | author={Tanenbaum, Andrew Stuart and van Renesse, Robbert}, 20 | year={1987}, 21 | publisher={Vrije Universiteit, Subfaculteit Wiskunde en Informatica} 22 | } 23 | 24 | @incollection{Sandberg:1988:DIS:59309.59338, 25 | author = {Sandberg, R. and Golgberg, D. and Kleiman, S. and Walsh, D. and Lyon, B.}, 26 | chapter = {Design and Implementation of the Sun Network Filesystem}, 27 | title = {Innovations in Internetworking}, 28 | editor = {Partridge, C.}, 29 | year = {1988}, 30 | isbn = {0-89006-337-0}, 31 | pages = {379--390}, 32 | numpages = {12}, 33 | url = {http://dl.acm.org/citation.cfm?id=59309.59338}, 34 | acmid = {59338}, 35 | publisher = {Artech House, Inc.}, 36 | address = {Norwood, MA, USA}, 37 | } 38 | 39 | @article{halpern1990knowledge, 40 | title={Knowledge and common knowledge in a distributed environment}, 41 | author={Halpern, Joseph Y and Moses, Yoram}, 42 | journal={Journal of the ACM (JACM)}, 43 | volume={37}, 44 | number={3}, 45 | pages={549--587}, 46 | year={1990}, 47 | publisher={ACM} 48 | } 49 | 50 | @article{black1990implementing, 51 | title={Implementing location independent invocation}, 52 | author={Black, Andrew P and Artsy, Yeshayahu}, 53 | journal={Parallel and Distributed Systems, IEEE Transactions on}, 54 | volume={1}, 55 | number={1}, 56 | pages={107--119}, 57 | year={1990}, 58 | publisher={IEEE} 59 | } 60 | 61 | @inproceedings{wollrath1996distributed, 62 | title={A distributed object model for the java TM system}, 63 | author={Wollrath, Ann and Riggs, Roger and Waldo, Jim}, 64 | booktitle={Proceedings of the 2nd conference on USENIX Conference on Object-Oriented Technologies (COOTS)-Volume 2}, 65 | pages={17--17}, 66 | year={1996}, 67 | organization={USENIX Association} 68 | } 69 | 70 | @article{birrell1984implementing, 71 | title={Implementing remote procedure calls}, 72 | author={Birrell, Andrew D and Nelson, Bruce Jay}, 73 | journal={ACM Transactions on Computer Systems (TOCS)}, 74 | volume={2}, 75 | number={1}, 76 | pages={39--59}, 77 | year={1984}, 78 | publisher={ACM} 79 | } 80 | 81 | @article{liskov1988distributed, 82 | title={Distributed programming in Argus}, 83 | author={Liskov, Barbara}, 84 | journal={Communications of the ACM}, 85 | volume={31}, 86 | number={3}, 87 | pages={300--312}, 88 | year={1988}, 89 | publisher={ACM} 90 | } 91 | 92 | @incollection{miller2014spores, 93 | title={Spores: A type-based foundation for closures in the age of concurrency and distribution}, 94 | author={Miller, Heather and Haller, Philipp and Odersky, Martin}, 95 | booktitle={ECOOP 2014--Object-Oriented Programming}, 96 | pages={308--333}, 97 | year={2014}, 98 | publisher={Springer} 99 | } 100 | 101 | @article{vinoski2008convenience, 102 | title={Convenience over correctness}, 103 | author={Vinoski, Steve}, 104 | 
journal={Internet Computing, IEEE}, 105 | volume={12}, 106 | number={4}, 107 | pages={89--92}, 108 | year={2008}, 109 | publisher={IEEE} 110 | } 111 | 112 | @inproceedings{alvaro2011consistency, 113 | title={Consistency Analysis in Bloom: a CALM and Collected Approach.}, 114 | author={Alvaro, Peter and Conway, Neil and Hellerstein, Joseph M and Marczak, William R}, 115 | booktitle={CIDR}, 116 | pages={249--260}, 117 | year={2011}, 118 | organization={Citeseer} 119 | } 120 | 121 | @inproceedings{Black:1985:SDA:323647.323646, 122 | author = {Black, Andrew P.}, 123 | title = {Supporting Distributed Applications: Experience with Eden}, 124 | booktitle = {Proceedings of the Tenth ACM Symposium on Operating Systems Principles}, 125 | series = {SOSP '85}, 126 | year = {1985}, 127 | isbn = {0-89791-174-1}, 128 | location = {Orcas Island, Washington, USA}, 129 | pages = {181--193}, 130 | numpages = {13}, 131 | url = {http://doi.acm.org/10.1145/323647.323646}, 132 | doi = {10.1145/323647.323646}, 133 | acmid = {323646}, 134 | publisher = {ACM}, 135 | address = {New York, NY, USA}, 136 | } 137 | 138 | @article{fischer1985impossibility, 139 | title={Impossibility of distributed consensus with one faulty process}, 140 | author={Fischer, Michael J and Lynch, Nancy A and Paterson, Michael S}, 141 | journal={Journal of the ACM (JACM)}, 142 | volume={32}, 143 | number={2}, 144 | pages={374--382}, 145 | year={1985}, 146 | publisher={ACM} 147 | } 148 | 149 | @inproceedings{conway2012logic, 150 | title={Logic and lattices for distributed programming}, 151 | author={Conway, Neil and Marczak, William R and Alvaro, Peter and Hellerstein, Joseph M and Maier, David}, 152 | booktitle={Proceedings of the Third ACM Symposium on Cloud Computing}, 153 | pages={1}, 154 | year={2012}, 155 | organization={ACM} 156 | } 157 | 158 | @phdthesis{shapiro2011comprehensive, 159 | title={A comprehensive study of convergent and commutative replicated data types}, 160 | author={Shapiro, Marc and Pregui{\c{c}}a, Nuno and Baquero, Carlos and Zawirski, Marek}, 161 | year={2011}, 162 | school={Inria--Centre Paris-Rocquencourt} 163 | } 164 | 165 | @article{liskov1988distributed, 166 | title={Distributed programming in Argus}, 167 | author={Liskov, Barbara}, 168 | journal={Communications of the ACM}, 169 | volume={31}, 170 | number={3}, 171 | pages={300--312}, 172 | year={1988}, 173 | publisher={ACM} 174 | } 175 | 176 | @inproceedings{lee2015implementing, 177 | title={Implementing linearizability at large scale and low latency}, 178 | author={Lee, Collin and Park, Seo Jin and Kejriwal, Ankita and Matsushita, Satoshi and Ousterhout, John}, 179 | booktitle={Proceedings of the 25th Symposium on Operating Systems Principles}, 180 | pages={71--86}, 181 | year={2015}, 182 | organization={ACM} 183 | } 184 | 185 | @inproceedings{Black:1986:OSE:28697.28706, 186 | author = {Black, Andrew and Hutchinson, Norman and Jul, Eric and Levy, Henry}, 187 | title = {Object Structure in the Emerald System}, 188 | booktitle = {Conference Proceedings on Object-oriented Programming Systems, Languages and Applications}, 189 | series = {OOPLSA '86}, 190 | year = {1986}, 191 | isbn = {0-89791-204-7}, 192 | location = {Portland, Oregon, USA}, 193 | pages = {78--86}, 194 | numpages = {9}, 195 | url = {http://doi.acm.org/10.1145/28697.28706}, 196 | doi = {10.1145/28697.28706}, 197 | acmid = {28706}, 198 | publisher = {ACM}, 199 | address = {New York, NY, USA}, 200 | } 201 | 202 | @inproceedings{claessen2005semantics, 203 | title={A semantics for distributed Erlang}, 
204 | author={Claessen, Koen and Svensson, Hans}, 205 | booktitle={Proceedings of the 2005 ACM SIGPLAN workshop on Erlang}, 206 | pages={78--87}, 207 | year={2005}, 208 | organization={ACM} 209 | } 210 | 211 | @inproceedings{svensson2007more, 212 | title={A more accurate semantics for distributed Erlang}, 213 | author={Svensson, Hans and Fredlund, Lars-Ake}, 214 | booktitle={Erlang Workshop}, 215 | pages={43--54}, 216 | year={2007}, 217 | organization={Citeseer} 218 | } 219 | 220 | @inproceedings{svensson2007programming, 221 | title={Programming distributed erlang applications: Pitfalls and recipes}, 222 | author={Svensson, Hans and Fredlund, Lars-{\AA}ke}, 223 | booktitle={Proceedings of the 2007 SIGPLAN workshop on ERLANG Workshop}, 224 | pages={37--42}, 225 | year={2007}, 226 | organization={ACM} 227 | } 228 | 229 | @article{shapiro1989sos, 230 | title={SOS: An Object-Oriented Operating System―-Assessment and Perspectives}, 231 | author={Shapiro, Marc and Gourhant, Yvon and Habert, Sabine and Mosseri, Laurence and Ruffin, Michel and Valot, Celine}, 232 | journal={Computing Systems}, 233 | volume={2}, 234 | number={4}, 235 | pages={287--338}, 236 | year={1989} 237 | } 238 | 239 | @article{SPE:SPE4380210107, 240 | Author = {Raj, Rajendra K. and Tempero, Ewan and Levy, Henry M. and Black, Andrew P. and Hutchinson, Norman C. and Jul, Eric}, 241 | Doi = {10.1002/spe.4380210107}, 242 | Issn = {1097-024X}, 243 | Journal = {Software: Practice and Experience}, 244 | Keywords = {Programming languages, Programming methodology, Object-oriented programming, Abstract data types, Inheritance, Object-based concurrency}, 245 | Number = {1}, 246 | Pages = {91--118}, 247 | Publisher = {John Wiley & Sons, Ltd.}, 248 | Title = {Emerald: A general-purpose programming language}, 249 | Url = {http://dx.doi.org/10.1002/spe.4380210107}, 250 | Volume = {21}, 251 | Year = {1991}, 252 | Bdsk-Url-1 = {http://dx.doi.org/10.1002/spe.4380210107}} 253 | 254 | @inproceedings{Birrell:1993:NO:168619.168637, 255 | author = {Birrell, Andrew and Nelson, Greg and Owicki, Susan and Wobber, Edward}, 256 | title = {Network Objects}, 257 | booktitle = {Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles}, 258 | series = {SOSP '93}, 259 | year = {1993}, 260 | isbn = {0-89791-632-8}, 261 | location = {Asheville, North Carolina, USA}, 262 | pages = {217--230}, 263 | numpages = {14}, 264 | url = {http://doi.acm.org/10.1145/168619.168637}, 265 | doi = {10.1145/168619.168637}, 266 | acmid = {168637}, 267 | publisher = {ACM}, 268 | address = {New York, NY, USA}, 269 | } 270 | 271 | @ARTICLE{1702134, 272 | author={A. Black and N. Hutchinson and E. Jul and H. Levy and L. 
Carter}, 273 | journal={IEEE Transactions on Software Engineering}, 274 | title={Distribution and Abstract Types in Emerald}, 275 | year={1987}, 276 | volume={SE-13}, 277 | number={1}, 278 | pages={65-76}, 279 | keywords={Abstract data types;distributed operating system;distributed programming;object-oriented programming;process migration;type checking;Art;Computer languages;Local area networks;Object oriented modeling;Object oriented programming;Operating systems;Packaging;Programming profession;Prototypes;Workstations;Abstract data types;distributed operating system;distributed programming;object-oriented programming;process migration;type checking}, 280 | doi={10.1109/TSE.1987.232836}, 281 | ISSN={0098-5589}, 282 | month={Jan},} 283 | 284 | @inproceedings{henz1993oz, 285 | title={Oz-a programming language for multi-agent systems}, 286 | author={Henz, Martin and Smolka, Gert and W{\"u}rtz, J{\"o}rg}, 287 | booktitle={IJCAI}, 288 | pages={404--409}, 289 | year={1993} 290 | } 291 | 292 | @book{hoare1974monitors, 293 | title={Monitors: An operating system structuring concept}, 294 | author={Hoare, Charles Antony Richard}, 295 | year={1974}, 296 | publisher={Springer} 297 | } 298 | 299 | @inproceedings{haridi1997overview, 300 | title={An overview of the design of Distributed Oz}, 301 | author={Haridi, Seif and Van Roy, Peter and Smolka, Gert}, 302 | booktitle={Proceedings of the second international symposium on Parallel symbolic computation}, 303 | pages={176--187}, 304 | year={1997}, 305 | organization={ACM} 306 | } 307 | 308 | @article{holt1982short, 309 | title={A short introduction to Concurrent Euclid}, 310 | author={Holt, Richard C.}, 311 | journal={ACM Sigplan Notices}, 312 | volume={17}, 313 | number={5}, 314 | pages={60--79}, 315 | year={1982}, 316 | publisher={ACM} 317 | } 318 | 319 | @misc{ wiki:futures, 320 | author = "Wikipedia", 321 | title = "Futures and promises --- Wikipedia{,} The Free Encyclopedia", 322 | year = "2016", 323 | url = "https://en.wikipedia.org/w/index.php?title=Futures_and_promises&oldid=708150517", 324 | note = "[Online; accessed 4-March-2016]" 325 | } 326 | 327 | @article{Baker:1977:IGC:872734.806932, 328 | author = {Baker,Jr., Henry C. and Hewitt, Carl}, 329 | title = {The Incremental Garbage Collection of Processes}, 330 | journal = {SIGPLAN Not.}, 331 | issue_date = {August 1977}, 332 | volume = {12}, 333 | number = {8}, 334 | month = aug, 335 | year = {1977}, 336 | issn = {0362-1340}, 337 | pages = {55--59}, 338 | numpages = {5}, 339 | url = {http://doi.acm.org/10.1145/872734.806932}, 340 | doi = {10.1145/872734.806932}, 341 | acmid = {806932}, 342 | publisher = {ACM}, 343 | address = {New York, NY, USA}, 344 | keywords = {Eager evaluation, Garbage collection, Lazy evaluation, Multiprocessing systems, Processor scheduling}, 345 | } 346 | 347 | @ARTICLE{1675100, 348 | author={D. P. Friedman and D. S. 
Wise}, 349 | journal={IEEE Transactions on Computers}, 350 | title={Aspects of Applicative Programming for Parallel Processing}, 351 | year={1978}, 352 | volume={C-27}, 353 | number={4}, 354 | pages={289-296}, 355 | keywords={Compiling;Lisp;functional combinations;multiprocessing;recursion;suspensions;Automatic control;Computer architecture;Computer languages;Data structures;Hardware;Modems;Parallel processing;Parallel programming;Programming profession;Suspensions;Compiling;Lisp;functional combinations;multiprocessing;recursion;suspensions}, 356 | doi={10.1109/TC.1978.1675100}, 357 | ISSN={0018-9340}, 358 | month={April},} 359 | 360 | @article{halstead1985multilisp, 361 | title={Multilisp: A language for concurrent symbolic computation}, 362 | author={Halstead Jr, Robert H}, 363 | journal={ACM Transactions on Programming Languages and Systems (TOPLAS)}, 364 | volume={7}, 365 | number={4}, 366 | pages={501--538}, 367 | year={1985}, 368 | publisher={ACM} 369 | } 370 | 371 | @inproceedings{Bravo:2014:DDD:2633448.2633451, 372 | author = {Bravo, Manuel and Li, Zhongmiao and Van Roy, Peter and Meiklejohn, Christopher}, 373 | title = {Derflow: Distributed Deterministic Dataflow Programming for Erlang}, 374 | booktitle = {Proceedings of the Thirteenth ACM SIGPLAN Workshop on Erlang}, 375 | series = {Erlang '14}, 376 | year = {2014}, 377 | isbn = {978-1-4503-3038-1}, 378 | location = {Gothenburg, Sweden}, 379 | pages = {51--60}, 380 | numpages = {10}, 381 | url = {http://doi.acm.org/10.1145/2633448.2633451}, 382 | doi = {10.1145/2633448.2633451}, 383 | acmid = {2633451}, 384 | publisher = {ACM}, 385 | address = {New York, NY, USA}, 386 | keywords = {dynamo, erlang, riak}, 387 | } 388 | 389 | @article{Hoare:1974:MOS:355620.361161, 390 | author = {Hoare, C. A. R.}, 391 | title = {Monitors: An Operating System Structuring Concept}, 392 | journal = {Commun. ACM}, 393 | issue_date = {Oct. 1974}, 394 | volume = {17}, 395 | number = {10}, 396 | month = oct, 397 | year = {1974}, 398 | issn = {0001-0782}, 399 | pages = {549--557}, 400 | numpages = {9}, 401 | url = {http://doi.acm.org/10.1145/355620.361161}, 402 | doi = {10.1145/355620.361161}, 403 | acmid = {361161}, 404 | publisher = {ACM}, 405 | address = {New York, NY, USA}, 406 | keywords = {monitors, mutual exclusion, operating systems, scheduling, structured multiprogramming, synchronization, system implementation languages}, 407 | } 408 | 409 | @article{Hoare:1978:CSP:359576.359585, 410 | author = {Hoare, C. A. R.}, 411 | title = {Communicating Sequential Processes}, 412 | journal = {Commun. ACM}, 413 | issue_date = {Aug. 1978}, 414 | volume = {21}, 415 | number = {8}, 416 | month = aug, 417 | year = {1978}, 418 | issn = {0001-0782}, 419 | pages = {666--677}, 420 | numpages = {12}, 421 | url = {http://doi.acm.org/10.1145/359576.359585}, 422 | doi = {10.1145/359576.359585}, 423 | acmid = {359585}, 424 | publisher = {ACM}, 425 | address = {New York, NY, USA}, 426 | keywords = {classes, concurrency, conditional critical regions, coroutines, data representations, guarded commands, input, iterative arrays, monitors, multiple entries, multiple exits, nondeterminacy, output, parallel programming, procedures, program structures, programming, programming languages, programming primitives, recursion}, 427 | } 428 | 429 | @article{Feldman:1979:HLP:359114.359127, 430 | author = {Feldman, Jerome A.}, 431 | title = {High Level Programming for Distributed Computing}, 432 | journal = {Commun. 
ACM}, 433 | issue_date = {June 1979}, 434 | volume = {22}, 435 | number = {6}, 436 | month = jun, 437 | year = {1979}, 438 | issn = {0001-0782}, 439 | pages = {353--368}, 440 | numpages = {16}, 441 | url = {http://doi.acm.org/10.1145/359114.359127}, 442 | doi = {10.1145/359114.359127}, 443 | acmid = {359127}, 444 | publisher = {ACM}, 445 | address = {New York, NY, USA}, 446 | keywords = {assertions, distributed computing, messages, modules}, 447 | } 448 | 449 | @article{Liskov:1977:AMC:359763.359789, 450 | author = {Liskov, Barbara and Snyder, Alan and Atkinson, Russell and Schaffert, Craig}, 451 | title = {Abstraction Mechanisms in CLU}, 452 | journal = {Commun. ACM}, 453 | issue_date = {Aug. 1977}, 454 | volume = {20}, 455 | number = {8}, 456 | month = aug, 457 | year = {1977}, 458 | issn = {0001-0782}, 459 | pages = {564--576}, 460 | numpages = {13}, 461 | url = {http://doi.acm.org/10.1145/359763.359789}, 462 | doi = {10.1145/359763.359789}, 463 | acmid = {359789}, 464 | publisher = {ACM}, 465 | address = {New York, NY, USA}, 466 | keywords = {control abstractions, data abstractions, data types, programming languages, programming methodology, separate compilation}, 467 | } 468 | 469 | @article{Liskov:1983:GAL:2166.357215, 470 | author = {Liskov, Barbara and Scheifler, Robert}, 471 | title = {Guardians and Actions: Linguistic Support for Robust, Distributed Programs}, 472 | journal = {ACM Trans. Program. Lang. Syst.}, 473 | issue_date = {July 1983}, 474 | volume = {5}, 475 | number = {3}, 476 | month = jul, 477 | year = {1983}, 478 | issn = {0164-0925}, 479 | pages = {381--404}, 480 | numpages = {24}, 481 | url = {http://doi.acm.org/10.1145/2166.357215}, 482 | doi = {10.1145/2166.357215}, 483 | acmid = {357215}, 484 | publisher = {ACM}, 485 | address = {New York, NY, USA}, 486 | } 487 | 488 | @article{Halstead:1985:MLC:4472.4478, 489 | author = {Halstead,Jr., Robert H.}, 490 | title = {MULTILISP: A Language for Concurrent Symbolic Computation}, 491 | journal = {ACM Trans. Program. Lang. Syst.}, 492 | issue_date = {Oct. 1985}, 493 | volume = {7}, 494 | number = {4}, 495 | month = oct, 496 | year = {1985}, 497 | issn = {0164-0925}, 498 | pages = {501--538}, 499 | numpages = {38}, 500 | url = {http://doi.acm.org/10.1145/4472.4478}, 501 | doi = {10.1145/4472.4478}, 502 | acmid = {4478}, 503 | publisher = {ACM}, 504 | address = {New York, NY, USA}, 505 | } 506 | 507 | @article{Liskov:1988:DPA:42392.42399, 508 | author = {Liskov, Barbara}, 509 | title = {Distributed Programming in Argus}, 510 | journal = {Commun. ACM}, 511 | issue_date = {March 1988}, 512 | volume = {31}, 513 | number = {3}, 514 | month = mar, 515 | year = {1988}, 516 | issn = {0001-0782}, 517 | pages = {300--312}, 518 | numpages = {13}, 519 | url = {http://doi.acm.org/10.1145/42392.42399}, 520 | doi = {10.1145/42392.42399}, 521 | acmid = {42399}, 522 | publisher = {ACM}, 523 | address = {New York, NY, USA}, 524 | } 525 | 526 | @article{Dijkstra:1968:SLS:363095.363143, 527 | author = {Dijkstra, Edsger W.}, 528 | title = {The Structure of the \&Ldquo;THE\&Rdquo;-multiprogramming System}, 529 | journal = {Commun. 
ACM}, 530 | issue_date = {May 1968}, 531 | volume = {11}, 532 | number = {5}, 533 | month = may, 534 | year = {1968}, 535 | issn = {0001-0782}, 536 | pages = {341--346}, 537 | numpages = {6}, 538 | url = {http://doi.acm.org/10.1145/363095.363143}, 539 | doi = {10.1145/363095.363143}, 540 | acmid = {363143}, 541 | publisher = {ACM}, 542 | address = {New York, NY, USA}, 543 | keywords = {cooperating sequential processes, input-output buffering, multiprocessing, multiprogramming, multiprogramming system, operating system, processor sharing, program verification, real-time debugging, synchronizing primitives, system hierarchy, system levels, system structure}, 544 | } 545 | 546 | @inproceedings{black2007development, 547 | title={The development of the Emerald programming language}, 548 | author={Black, Andrew P and Hutchinson, Norman C and Jul, Eric and Levy, Henry M}, 549 | booktitle={Proceedings of the third ACM SIGPLAN conference on History of programming languages}, 550 | pages={11--1}, 551 | year={2007}, 552 | organization={ACM} 553 | } 554 | 555 | @article{Helland:2012:IMC:2181796.2187821, 556 | author = {Helland, Pat}, 557 | title = {Idempotence Is Not a Medical Condition}, 558 | journal = {Queue}, 559 | issue_date = {April 2012}, 560 | volume = {10}, 561 | number = {4}, 562 | month = apr, 563 | year = {2012}, 564 | issn = {1542-7730}, 565 | pages = {30:30--30:46}, 566 | articleno = {30}, 567 | numpages = {17}, 568 | url = {http://doi.acm.org/10.1145/2181796.2187821}, 569 | doi = {10.1145/2181796.2187821}, 570 | acmid = {2187821}, 571 | publisher = {ACM}, 572 | address = {New York, NY, USA}, 573 | } 574 | 575 | @article{vinoski2003s, 576 | title={It's just a mapping problem [computer application adaptation]}, 577 | author={Vinoski, Steve}, 578 | journal={Internet Computing, IEEE}, 579 | volume={7}, 580 | number={3}, 581 | pages={88--90}, 582 | year={2003}, 583 | publisher={IEEE} 584 | } 585 | 586 | @inproceedings{meiklejohn2015lasp, 587 | title={Lasp: a language for distributed, eventually consistent computations with CRDTs}, 588 | author={Meiklejohn, Christopher and Van Roy, Peter}, 589 | booktitle={Proceedings of the First Workshop on Principles and Practice of Consistency for Distributed Data}, 590 | pages={7}, 591 | year={2015}, 592 | organization={ACM} 593 | } 594 | 595 | @inproceedings{zhang2014customizable, 596 | title={Customizable and extensible deployment for mobile/cloud applications}, 597 | author={Zhang, Irene and Szekeres, Adriana and Van Aken, Dana and Ackerman, Isaac and Gribble, Steven D and Krishnamurthy, Arvind and Levy, Henry M}, 598 | booktitle={11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14)}, 599 | pages={97--112}, 600 | year={2014} 601 | } 602 | 603 | @article{feldman1979high, 604 | title={High level programming for distributed computing}, 605 | author={Feldman, Jerome A}, 606 | journal={Communications of the ACM}, 607 | volume={22}, 608 | number={6}, 609 | pages={353--368}, 610 | year={1979}, 611 | publisher={ACM} 612 | } 613 | 614 | @article{liskov1977abstraction, 615 | title={Abstraction mechanisms in CLU}, 616 | author={Liskov, Barbara and Snyder, Alan and Atkinson, Russell and Schaffert, Craig}, 617 | journal={Communications of the ACM}, 618 | volume={20}, 619 | number={8}, 620 | pages={564--576}, 621 | year={1977}, 622 | publisher={ACM} 623 | } 624 | 625 | @article{liskov1988distributed, 626 | title={Distributed programming in Argus}, 627 | author={Liskov, Barbara}, 628 | journal={Communications of the ACM}, 629 | volume={31}, 
630 | number={3}, 631 | pages={300--312}, 632 | year={1988}, 633 | publisher={ACM} 634 | } 635 | 636 | @article{liskov1983guardians, 637 | title={Guardians and actions: Linguistic support for robust, distributed programs}, 638 | author={Liskov, Barbara and Scheifler, Robert}, 639 | journal={ACM Transactions on Programming Languages and Systems (TOPLAS)}, 640 | volume={5}, 641 | number={3}, 642 | pages={381--404}, 643 | year={1983}, 644 | publisher={ACM} 645 | } 646 | 647 | @techreport{walker1984orphan, 648 | title={Orphan Detection in the Argus System.}, 649 | author={Walker, Edward Franklin}, 650 | year={1984}, 651 | institution={DTIC Document} 652 | } 653 | 654 | @article{liskov1987implementation, 655 | title={Implementation of argus}, 656 | author={Liskov, Barbara and Curtis, Dorothy and Johnson, Paul and Scheifer, Robert}, 657 | journal={ACM SIGOPS Operating Systems Review}, 658 | volume={21}, 659 | number={5}, 660 | pages={111--122}, 661 | year={1987}, 662 | publisher={ACM} 663 | } 664 | 665 | @book{garcia1987sagas, 666 | title={Sagas}, 667 | author={Garcia-Molina, Hector and Salem, Kenneth}, 668 | volume={16}, 669 | number={3}, 670 | year={1987}, 671 | publisher={ACM} 672 | } 673 | 674 | @article{armstrong2010erlang, 675 | title={erlang}, 676 | author={Armstrong, Joe}, 677 | journal={Communications of the ACM}, 678 | volume={53}, 679 | number={9}, 680 | pages={68--75}, 681 | year={2010}, 682 | publisher={ACM} 683 | } 684 | 685 | @inproceedings{rajan2010guesstimate, 686 | title={Guesstimate: a programming model for collaborative distributed systems}, 687 | author={Rajan, Kaushik and Rajamani, Sriram and Yaduvanshi, Shashank}, 688 | booktitle={ACM Sigplan Notices}, 689 | volume={45}, 690 | number={6}, 691 | pages={210--220}, 692 | year={2010}, 693 | organization={ACM} 694 | } 695 | -------------------------------------------------------------------------------- /pl.html: -------------------------------------------------------------------------------- 1 |

Promises

2 |

Relevant Reading

3 | - Promises: linguistic support for efficient asynchronous procedure calls in distributed systems, Liskov and Shrira, PLDI 1988 Liskov and Shrira (1988).
- Multilisp: A language for concurrent symbolic computation, Halstead, TOPLAS 1985 Halstead Jr (1985).
7 |

Commentary

8 |

Outside of early mentions from Friedman and Wise of a cons cell with placeholder values Friedman and Wise (1978) and Baker and Hewitt’s work on incremental garbage collection Baker and Hewitt (1977), futures originally appeared as one of the two principal constructs for parallel operations in MultiLisp. MultiLisp attempted to solve a main challenge of designing a language for parallel computation: how can parallel computation be introduced into a language in a way that fits with the existing programming paradigm? This problem is motivated by the fact that programmers will need to introduce concurrency into applications themselves, because automated analysis may not be able to identify all of the opportunities for parallelism. Halstead decides there is quite a natural fit with Lisp/Scheme: expression evaluation can be done in parallel. MultiLisp introduces two main concepts: pcall, to evaluate the expressions being passed to a function in parallel and introduce concurrency into the evaluation of arguments to a function, and futures, to introduce concurrency between the computation of a value and the use of that value. Halstead also notes that futures closely resemble the “eventual values” in Hibbard’s Algol 68; however, those were typed distinctly from the values they eventually produced and represented. Halstead Jr (1985)

9 |

In 1988, Liskov and Shrira introduce the concept of a promise: an efficient way to perform asynchronous remote procedure calls in a type-safe way Liskov and Shrira (1988). Simply put, a promise is a placeholder for a value that will be available in the future. When the initial call is made, a promise is created and the asynchronous call to compute the value of the promise runs in parallel with the rest of the program. When the call completes, the value can be “claimed” by the caller.

10 |

An excerpt of the motivation from Promises: linguistic support for efficient asynchronous procedure calls in distributed systems (Liskov and Shrira, PLDI 1988):

11 |
12 |

“Remote procedure calls have come to be the preferred method of communication in a distributed system because programs that use procedures are easier to understand and reason about than those that explicitly send and receive messages. However, remote calls require the caller to wait for a reply before continuing, and therefore can lead to lower performance than explicit message exchange.”

13 |
14 |

The general motivation behind the work by Liskov and Shrira can be thought of as the following critiques of two models of distributed programming.

15 | - The Remote Procedure Call (RPC) paradigm is preferred by programmers because it is a familiar programming model. However, because of the synchronous nature of RPC, this model does not scale in terms of performance.
- The message passing paradigm is harder for programmers to reason about, but provides the benefit of decoupling of request and response, allowing for asynchronous programming and the subsequent performance benefits.
19 |

Promises attempts to bridge this gap by combining the remote procedure call style of building applications with the asynchronous execution model seen in systems that primarily use message passing.

20 |

The first challenge in combining these two programming paradigms for distributed programming is that of order. Synchronous RPC imposes a total order across all of the calls in an application: one call will fully complete, from request to response, before moving to the next call, given a single thread of execution. If we move to an asynchronous model of RPC, we must have a way to block for a given value, or result, of an asynchronous RPC if required for further processing.

21 |

Promises does this by introducing the concept of a call-stream. A call-stream is nothing more than a stream of placeholder values for each asynchronous RPC issued by a client. Once an RPC is issued, the promise is considered blocked while asynchronous execution is performed; once the value has been computed, the promise is considered ready and the value can be claimed by the caller. If an attempt to claim the value is issued before the value is computed, execution blocks until the value is available. The stream of placeholder values serves as an implicit ordering of the requests that are issued; in the Argus system that served as the implementation platform for this work, multiple streams were used and related operations were sequenced together in the same stream1.

22 |

Impact and Implementations

23 |

While promises originated as a technique for decoupling values from the computations that produced them, promises, as proposed by Liskov and Shrira, mainly focused on reducing latency and improving the performance of distributed computations. The majority of programming languages in use by practitioners today contain some notion of futures or promises. Below, we highlight a few examples.

24 |

The Oz Henz, Smolka, and Würtz (1993) language, designed for the education of programmers in several different programming paradigms, provides a functional programming model with single-assignment variables, streams, and promises. Every variable in Oz is a dataflow variable, and therefore every single value in the system is a promise. Both Distributed Oz Haridi, Van Roy, and Smolka (1997) and Derflow (an implementation of Oz in the Erlang programming language) Bravo et al. (2014) provide distributed versions of the Oz programming model. The Akka library for Scala also provides Oz-style dataflow concurrency with Futures.

25 |

More recently, promises have been repurposed by the JavaScript community to allow asynchronous programs to be written in direct style instead of continuation-passing style. ECMAScript 6 contains a native Promise object that can be used to perform asynchronous computation and to register callback functions that will fire once the computation either succeeds or fails.

26 |
27 |
28 |

Baker, Henry C., Jr., and Carl Hewitt. 1977. “The Incremental Garbage Collection of Processes.” SIGPLAN Not. 12 (8). New York, NY, USA: ACM: 55–59. doi:10.1145/872734.806932.

29 |
30 |
31 |

Bravo, Manuel, Zhongmiao Li, Peter Van Roy, and Christopher Meiklejohn. 2014. “Derflow: Distributed Deterministic Dataflow Programming for Erlang.” In Proceedings of the Thirteenth ACM SIGPLAN Workshop on Erlang, 51–60. Erlang ’14. New York, NY, USA: ACM. doi:10.1145/2633448.2633451.

32 |
33 |
34 |

Friedman, D. P., and D. S. Wise. 1978. “Aspects of Applicative Programming for Parallel Processing.” IEEE Transactions on Computers C-27 (4): 289–96. doi:10.1109/TC.1978.1675100.

35 |
36 |
37 |

Halstead Jr, Robert H. 1985. “Multilisp: A Language for Concurrent Symbolic Computation.” ACM Transactions on Programming Languages and Systems (TOPLAS) 7 (4). ACM: 501–38.

38 |
39 |
40 |

Haridi, Seif, Peter Van Roy, and Gert Smolka. 1997. “An Overview of the Design of Distributed Oz.” In Proceedings of the Second International Symposium on Parallel Symbolic Computation, 176–87. ACM.

41 |
42 |
43 |

Henz, Martin, Gert Smolka, and Jörg Würtz. 1993. “Oz-a Programming Language for Multi-Agent Systems.” In IJCAI, 404–9.

44 |
45 |
46 |

Liskov, Barbara, and Liuba Shrira. 1988. Promises: Linguistic Support for Efficient Asynchronous Procedure Calls in Distributed Systems. Vol. 23. 7. ACM.

47 |
48 |
49 |
50 |
51 |
    52 |
  1. Promises also provide a way for stream composition, where processes read values from one or more streams once they are ready, fulfilling placeholder blocked promises in other streams. One classic implementation of stream composition using promises is the Sieve of Eratosthenes.

53 |
54 |
55 | -------------------------------------------------------------------------------- /pl.md: -------------------------------------------------------------------------------- 1 | Promises 2 | ======== 3 | 4 | Relevant Reading 5 | ---------------- 6 | 7 | - *Promises: linguistic support for efficient asynchronous procedure 8 | calls in distributed systems*, Liskov and Shrira, PLDI 9 | 1988 @liskov1988promises. 10 | 11 | - *Multilisp: A language for concurrent symbolic computation*, 12 | Halstead, TOPLAS 1985 @halstead1985multilisp. 13 | 14 | Commentary 15 | ---------- 16 | 17 | Outside of early mentions from Friedman and Wise on a *cons* cell with 18 | placeholder values @1675100 and Baker and Hewitt’s work on incremental 19 | garbage collection @Baker:1977:IGC:872734.806932, *futures* originally 20 | appeared as one of the two principal constructs for parallel operations 21 | in MultiLisp. MultiLisp attempted to solve a main challenge of designing 22 | a language for parallel computation: how can parallel computation be 23 | introduced into a language in a way that fits with the existing 24 | programming paradigm. This problem is motivated by the fact that 25 | computer programmers will need to introduce concurrency into 26 | applications because automated analysis may not be able to identify all 27 | of the points for parallelism. Halstead decides there is quite a natural 28 | fit with a Lisp/Scheme: expression evaluation can be done in parallel. 29 | MultiLisp introduces two main concepts: *pcall*, to evaluate the 30 | expressions being passed to a function in parallel and introduce 31 | concurrency into evaluation of arguments to a function, and *futures*, 32 | to introduce concurrency between the computation of a value and the use 33 | of that value. Halstead also notes that futures closely resemble the 34 | “eventual values” in Hibbard’s Algol 68, however were typed distinctly 35 | from the values they produced and later 36 | represented. @halstead1985multilisp 37 | 38 | In 1988, Liskov and Shrira introduce the concept of a *promise*: an 39 | efficient way to perform asynchronous remote procedure calls in a 40 | type-safe way @liskov1988promises. Simply put, a promise is a 41 | placeholder for a value that will be available in the future. When the 42 | initial call is made, a promise is created and the asynchronous call to 43 | compute the value of the promise runs in parallel with the rest of the 44 | program. When the call completes, the value can be “claimed“ by the 45 | caller. 46 | 47 | An excerpt motivation from *Promises: linguistic support for efficient 48 | asynchronous procedure calls in distributed systems (Liskov and Shrira, 49 | PLDI 1988)*: 50 | 51 | > “Remote procedure calls have come to be the preferred method of 52 | > communication in a distributed system because programs that use 53 | > procedures are easier to understand and reason about than those that 54 | > explicitly send and receive messages. However, remote calls require 55 | > the caller to wait for a reply before continuing, and therefore can 56 | > lead to lower performance than explicit message exchange.” 57 | 58 | The general motivation behind the work by Liskov and Shrira can be 59 | thought as the following critiques of two models of distributed 60 | programming. 61 | 62 | - The Remote Procedure Call (RPC) paradigm is preferable by 63 | programmers because it is a familiar programming model. However, 64 | because of the synchronous nature of RPC, this model does not scale 65 | in terms of performance. 
66 | 67 | - The message passing paradigm is harder for programmers to reason 68 | about, but provides the benefit of decoupling of request and 69 | response, allowing for asynchronous programming and the subsequent 70 | performance benefits. 71 | 72 | *Promises* attempts to bridge this gap by combining the remote procedure 73 | call style of building applications with the asynchronous execution 74 | model seen in systems that primarily use message passing. 75 | 76 | The first challenge in combining these two programming paradigms for 77 | distributed programming is that of order. Synchronous RPC imposes a 78 | total order across all of the calls in an application: one call will 79 | fully complete, from request to response, before moving to the next 80 | call, given a single thread of execution. If we move to an asynchronous 81 | model of RPC, we must have a way to block for a given value, or result, 82 | of an asynchronous RPC if required for further processing. 83 | 84 | Promises does this by introducing the concept of a *call-stream*. A 85 | *call-stream* is nothing more than a stream of placeholder values for 86 | each asynchronous RPC issued by a client. Once an RPC is issued, the 87 | *promise* is considered *blocked* while asynchronous execution is 88 | performed, and once the value has been computed, the *promise* is 89 | considered *ready* and the value can be *claimed* by the caller. If an 90 | attempt to *claim* the value is issued before the value is computed, 91 | execution blocks until the value is available. The stream of placeholder 92 | values serves as an implicit ordering of the requests that are issued; 93 | in the Argus system that served as the implementation platform for this 94 | work, multiple streams were used and related operations sequenced 95 | together in the same stream[^1]. 96 | 97 | Impact and Implementations 98 | -------------------------- 99 | 100 | While promises originated as a technique for decoupling values from the 101 | computations that produced them, promises, as proposed by Liskov and 102 | Shrira, mainly focused on reducing latency and improving performance of 103 | distributed computations. The majority of programming languages in use 104 | today by practitioners contain some notion of *futures* or *promises*. 105 | Below, we highlight a few examples. 106 | 107 | The Oz @henz1993oz language, designed for the education of programmers 108 | in several different programming paradigms, provides a functional 109 | programming model with single assignment variables, streams, and 110 | promises. Every variable in Oz is a dataflow variable, and therefore 111 | every single value in the system is a promise. Both Distributed 112 | Oz @haridi1997overview and Derflow (an implementation of Oz in the 113 | Erlang programming language) @Bravo:2014:DDD:2633448.2633451 provide 114 | distributed versions of the Oz programming model. The Akka library for 115 | Scala also provides Oz-style dataflow concurrency with Futures. 116 | 117 | More recently, promises have been repurposed by the JavaScript community 118 | to allow for asynchronous programs to be written in direct style instead 119 | of continuation-passing style. ECMAScript 6 contains a native Promise 120 | object that can be used to perform asynchronous computation and 121 | register callback functions that will fire once the computation either 122 | succeeds or fails. 
123 | 124 | [^1]: Promises also provide a way for stream composition, where 125 | processes read values from one or more streams once they are 126 | *ready*, fulfilling placeholder *blocked* promises in other streams. 127 | One classic implementation of stream composition using *promises* is 128 | the Sieve of Eratosthenes. 129 | -------------------------------------------------------------------------------- /plits.html: -------------------------------------------------------------------------------- 1 |

Relevant Reading

2 | - High Level Programming for Distributed Computing, Feldman, CACM 1979 (Feldman 1979).
5 |

Commentary

6 |
7 |

“A significant conclusion was that parallelism and data sharing are inherently difficult to combine effectively.”

8 |
9 |

The Programming Language in the Sky (PLITS) project was an effort by the University of Rochester Computer Science department, started in 1974, to take a serious look at how to make programming languages more declarative and to see what benefits could be gained from using state-of-the-art compiler technology. This paper specifically focuses on the need for programming language constructs for distributed computing that normally do not arise in conventional programming1.

10 |

At a high level, PASCAL-PLITS views distributed computing as a group of computers communicating over low-bandwidth, unreliable communication paths; applications consist of communication between modules via an asynchronous message protocol, instead of through the use of subroutines, to avoid having to wait for a response to a request. Modules only communicate through message passing, and each message is composed of a set of name-value pairs called slots: names are uninterpreted strings and values are elements of some primitive type (for instance, integers).

11 |

To avoid discussion of many of the PASCAL-specific details outlined in the formal specification section of the paper, the intuition for how message passing between two modules operates is rather straightforward:

12 | - Messages can be sent between two modules where slots are compatible: for instance, both a server and client must define the types of messages they can receive and send, and they must specify the slots of the message they are going to read. It is perfectly fine in this system to define a message that underspecifies the message slots -- readers can only access what they know about through their local specification.
- Messages are sent to recipients by specifying the recipient’s module identifier and can be selectively received at the recipient by the sender’s module identifier.
- Messages in PASCAL-PLITS arrive in FIFO order between modules, and waiting on a message that never arrives within a particular time interval will throw an application-level exception.
- Messages support a grouping behavior that allows for fixed-size repetitions of values by type, nested one level deep, as values in slots.
18 |

An interesting idea that the paper presents is that of forwarding a message to another module for processing. For instance, a server might need to accept a message and pass it to another client to handle specific processing of that request. In that case, the path from original sender, to server, to the client handling the request must be retraced when the request is complete and the response is being returned to the original caller. PASCAL-PLITS offers an alternative method: a unique transaction identifier is transmitted with the original message, so the client handling the request can send the response directly back to the original sender, bypassing the coordinating server module. In PASCAL-PLITS, this is referred to as a transaction key, and receivers can specifically request to selectively receive messages about a key, instead of from a specific recipient; because of this proxy-like behavior, modules need to be able to read only particular slots of the message, and ignore, but forward, the remainder2.
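
To make the slot and transaction-key machinery concrete, here is a minimal TypeScript sketch of selective receive by key; Mailbox, receiveAbout, and the message shapes are invented for illustration and are not PLITS syntax.

type Message = Record<string, unknown> & { transactionKey?: string };

class Mailbox {
  private queue: Message[] = [];

  send(message: Message): void {
    this.queue.push(message);
  }

  // Selectively receive the next message *about* a transaction key,
  // leaving unrelated messages queued for later.
  receiveAbout(key: string): Message | undefined {
    const index = this.queue.findIndex((m) => m.transactionKey === key);
    return index >= 0 ? this.queue.splice(index, 1)[0] : undefined;
  }
}

const mailbox = new Mailbox();
// A worker that handled transaction "t-17" replies directly to the
// original sender, bypassing the coordinating server module.
mailbox.send({ kind: "reply", value: 42, transactionKey: "t-17" });
mailbox.send({ kind: "reply", value: 7, transactionKey: "t-99" });
console.log(mailbox.receiveAbout("t-17")); // the reply about t-17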

19 |

The authors argue for message passing over subroutines:

20 |
21 |

“The message paradigm has several advantages over subroutine calls. If the modules were in different languages, the subroutine call mechanisms would have to be made compatible. Any sophisticated lockout procedure would require the internal coding of queues equivalent to what the message switcher provides. In the subroutine discipline, a module which tries to execute a locked subroutine is unable to proceed with other computation. The total picture on the relative value of messages and calls is much more complex...”

22 |
23 |

The authors also argue for the encapsulation provided by message passing as the only interface to data:

24 |
25 |

“There are other interesting features that arise when messages are combined with the idea of modules. The most obvious feature of PLITS programming is the high degree of locality and protection it provides. Each PLITS module is totally self-contained and communicates solely through messages. This means that no local variables can even be examined from the outside, no procedures invoked, etc. A module can be asked to return or update a value, execute a function, etc. It now becomes quite natural to screen requests for validity (much more than type checking), to guard against conflicting demands on a data structure, etc. This does not solve all the problems attacked by structured programming strictures, but does make it clear what has to be done and where.”

26 |
27 |

The authors use this encapsulation to solve the exclusion problem: if module B is storing data that A wants to compare and swap, B simply ignores messages that do not contain the transaction key generated by A until the swap is complete (or does not handle messages related to the keys being swapped until the operation is complete). However, this ultimately brings additional complexity, as the module itself must now deal with the queue management of messages that are waiting to be processed while resources are locked.
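
A minimal sketch of this exclusion discipline in TypeScript, assuming an invented GuardedCell module (the paper specifies no such API): messages without the lock’s key are queued rather than serviced.

type Request = { op: "read" | "swap"; key?: string; value?: number };

class GuardedCell {
  private value = 0;
  private lockKey: string | null = null;
  private waiting: Request[] = [];

  lock(key: string): void {
    this.lockKey = key;
  }

  handle(request: Request): number | undefined {
    if (this.lockKey !== null && request.key !== this.lockKey) {
      // Not part of the in-flight transaction: queue it. This is the
      // queue-management burden the text points out.
      this.waiting.push(request);
      return undefined;
    }
    if (request.op === "swap" && request.value !== undefined) {
      this.value = request.value;
      this.lockKey = null; // the swap is complete; release the lock
    }
    return this.value;
  }
}

const cell = new GuardedCell();
cell.lock("t-17");
console.log(cell.handle({ op: "swap", key: "t-99", value: 1 })); // undefined: queued
console.log(cell.handle({ op: "swap", key: "t-17", value: 9 })); // 9: the swap completes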

28 |

Perhaps the most interesting part of the PLITS system is the distribution system for running distributed jobs. Distribution is tackled through a layered approach:

29 | - Networks are made up of a set of machines.
- Machines are made up of multiple sites, each with a kernel that keeps track of scheduling and which modules are waiting to receive messages: within a site, messages must have the same format and the same primitive data representation.
- Kernels are responsible for the distribution of messages within a site, forwarding messages between sites, and resource allocation.
- Co-located with a kernel is a Host Control Program (HCP) that is responsible for forwarding within sites on the same machine, and for forwarding messages between machines.
- Finally, the HCP is divided into two components, a distributed job manager and a communication manager: one for job-lifecycle related operations and one for managing communication between machines3.
36 |

Messages to recipients that are overloaded are buffered by the sender: this is done on the sender side through process suspension. If the nodes are located at the same site, the kernel is responsible for this process; if not, the distributed communication manager is responsible for flow control.

37 |

Identification of where a module is located is done through the module name, an incarnation number, a site number, and a local module number: this means that the identifier alone can locate where a process is running on the network, without the need for a global registry. However, given that this is done on a per-module-instance basis, a module only ever lives on a single machine. To get around this, the authors suggest using equivalent modules across machines and having the caller decide where to send the message. As the authors state quite succinctly, this is “contrary to current fantasies about distributed computing”.
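
A sketch of such a self-locating identifier in TypeScript; the field names are guesses at the paper’s module name, incarnation number, site number, and local module number, not its actual syntax.

interface ModuleId {
  name: string;        // module name
  incarnation: number; // distinguishes incarnations of the same module
  site: number;        // the site (and hence machine) it runs on
  localModule: number; // index within that site
}

// Routing is derived purely from the identifier itself: no registry.
function routeTo(id: ModuleId): string {
  return `site-${id.site}/module-${id.localModule}@${id.incarnation}`;
}

console.log(routeTo({ name: "server", incarnation: 2, site: 5, localModule: 9 }));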

38 |

Impact and Implementations

39 |

So, what’s the contribution of this paper?

40 |

Well, as the authors describe it, it’s the “module-message” paradigm: something that you’ve probably seen before and the reason this paper has sounded so familiar: Erlang, or more specifically Distributed Erlang.

41 |

PLITS, only briefly mentioned in Joe Armstrong’s thesis (Armstrong 2010), bears a striking resemblance to Erlang and brings some of the features that are used on a daily basis by Erlang developers. We quickly summarize those features now.

42 | - Selective receive through the use of a transaction identifier has a strong resemblance to Erlang’s ability to generate a globally unique reference and use it as criteria to select from the process mailbox. This supports the forwarding behavior of processes that serve mainly for the routing of a request.
- Repetition in slot values, where each value is tagged in the initial position by type, bears a strong resemblance to how tagged tuples are used in records to specify a typed value purely as tuples.
- The idea of encapsulation and locking or queueing messages for exclusion is similar to how processes control access to resources using the “generic abstractions” provided by Erlang: specifically, the “generic server” or “gen_server”. The generic server and generic finite state machine both exist to control access to some state, and have the ability to either select specific messages for processing related to a message identifier, or interleave requests if possible based on selective receive and asynchronous messaging.
- The design of both the PLITS kernel, as a process scheduler responsible for waking and suspending processes waiting on messages, and the Host Control Program works very similarly to the implementation of Distributed Erlang: processes running on remote machines are uniquely identified by a process identifier that encodes the machine name, and messages are transparently delivered to them under what aims to be reliable transmission, but sometimes falls short.
48 |

We should be clear, however, that PLITS is not an actor-based language such as Erlang. While PLITS shares many of the same design patterns and abstractions, the language is fundamentally a module-message system: modules are identified by name, which includes a location and an incarnation number, and it is modules, not processes, that are the targets of messages.

49 |
50 |
51 |

Armstrong, Joe. 2010. “Erlang.” Communications of the ACM 53 (9). ACM: 68–75.

52 |
53 |
54 |

Feldman, Jerome A. 1979. “High Level Programming for Distributed Computing.” Communications of the ACM 22 (6). ACM: 353–68.

55 |
56 |
57 |
58 |
59 |
    60 |
  1. It is easy to argue today that conventional programming is distributed computing, but I digress.

61 |
  2. The diligent reader will notice that the PASCAL-PLITS system argues that it provides stronger guarantees than “very strong typing” through this slot system, but it’s unclear how slot compatibility and forwarding can be verified without either dataflow analysis or type checking at compile time: the slot system is supposed to be more expressive because it describes application-level behavior.

62 |
  3. Not discussed in the literature is how the HCP is required to guarantee reliable transmission and handle flow control and error handling.

63 |
64 |
65 | -------------------------------------------------------------------------------- /plits.tex: -------------------------------------------------------------------------------- 1 | \subsection{Relevant Reading} 2 | 3 | \begin{itemize} 4 | \item \textit{High Level Programming for Distributed Computing}, Feldman, Jerome A, CACM 1979~\cite{feldman1979high}. 5 | \end{itemize} 6 | 7 | \subsection{Commentary} 8 | 9 | \begin{quote} 10 | ``A significant conclusion was that parallelism and data sharing are inherently difficult to combine effectively.'' 11 | \end{quote} 12 | 13 | The Programming Language in the Sky (PLITS) project was an effort by the University of Rochester Computer Science department started in 1974 to take a serious look at how to make programming languages more declarative, and to see what benefits could be gained from using state-of-the-art compiler technology. This paper, specifically focuses on the problem of addressing the need for programming language constructs for distributed computing, that normally do not arise in conventional programming\footnote{It is easy to argue today that conventional programming, \textbf{is} distributed computing, but I digress.}. 14 | 15 | At a high level, PASCAL-PLITS views distributed computing as a group of computers communicating over low bandwidth, unreliable communication paths, applications would consist of communication between modules via an asynchronous message protocol instead of through the use of subroutines to avoid having to wait for a response to a request. Modules only communicate through message passing, and each message is composed of a set of name-value pairs which are called slots: names are uninterpreted string and values are an element of some primitive type (for instance, integers.) 16 | 17 | To avoid discussion of many of the PASCAL specific details outlined in the formal specification section of the paper, the intuition on how message passing between two modules operates is rather straightforward: 18 | \begin{itemize} 19 | \item Messages can be sent between two modules where slots are compatible: for instance, both a server and client must define the types of messages they can receive and send, and they must specify the slots of the message they are going to read. It's perfectly fine in this system to define a message that underspecifies the message slots -- readers can only access what they know about through their local specification. 20 | \item Messages are sent to recipients by specifying the recipient's module identifier and can be selectively received at the recipient by sender's module identifier. 21 | \item Messages in PASCAL-PLITS arrive in FIFO order between modules and waiting on a message that never arrives after a particular time interval will throw an application-level exception. 22 | \item Messages support a grouping behavior that allow for fixed size repetitions of values by type, nested one level deep, as values in slots. 23 | \end{itemize} 24 | 25 | An interesting idea that the paper presents is the idea of forwarding the message to another module for handling processing. For instance, a server might need to accept a message and pass that message to another client to handle specific processing of that request. In that case, the path from original sender, to server, to client handling the request must be retraced when the request is complete and the response is being returned to the original caller. 
PASCAL-PLITS offers an alternative method: a unique transaction identifier is transmitted with the original message, so the client handling the request can send the response directly back to the original sender, bypassing the coordinating server module. In PASCAL-PLITS, this is referred to as a transaction key, and receivers can specifically request to selectively receive messages \textbf{about} a key, instead of \textbf{from} a specific recipient; because of this proxy-like behavior, modules need to be able to read only particular slots of the message, and ignore, but forward, the remainder\footnote{The diligent reader will notice that the PASCAL-PLITS system argues that it provides stronger guarantees than ``very strong typing'' through this slot system, but it's unclear how slot compatibility and forwarding can be verified without either dataflow analysis or type checking at compile time: the slot system is supposed to be more expressive because it describes application-level behavior.}. 26 | 27 | The authors argue for message passing over subroutines: 28 | 29 | \begin{quote} 30 | ``The message paradigm has several advantages over subroutine calls. If the modules were in different languages, the subroutine call mechanisms would have to be made compatible. Any sophisticated lockout procedure would require the internal coding of queues equivalent to what the message switcher provides. In the subroutine discipline, a module which tries to execute a locked subroutine is unable to proceed with other computation. The total picture on the relative value of messages and calls is much more complex...'' 31 | \end{quote} 32 | 33 | The authors also argue for the encapsulation provided by message passing as the only interface to data: 34 | 35 | \begin{quote} 36 | ``There are other interesting features that arise when messages are combined with the idea of modules. The most obvious feature of PLITS programming is the high degree of locality and protection it provides. Each PLITS module is totally self-contained and communicates solely through messages. This means that no local variables can even be examined from the outside, no procedures invoked, etc. A module can be asked to return or update a value, execute a function, etc. It now becomes quite natural to screen requests for validity (much more than type checking), to guard against conflicting demands on a data structure, etc. This does not solve all the problems attacked by structured programming strictures, but does make it clear what has to be done and where.'' 37 | \end{quote} 38 | 39 | The authors use this encapsulation to solve the exclusion problem: if module B is storing data that A wants to compare and swap, B simply ignores messages that do not contain the transaction key generated by A until the swap is complete (or does not handle messages related to the keys being swapped until the operation is complete). However, this ultimately brings additional complexity, as the module itself must now deal with the queue management of messages that are waiting to be processed while resources are locked. 40 | 41 | Perhaps the most interesting part of the PLITS system is the distribution system for running distributed jobs. Distribution is tackled through a layered approach: 42 | 43 | \begin{itemize} 44 | \item Networks are made up of a set of machines. 
45 | \item Machines are made up of multiple sites, each with a kernel that keeps track of scheduling and which modules are waiting to receive messages: within a site, messages must have the same format and the same primitive data representation. 46 | \item Kernels are responsible for the distribution of messages within a site, forwarding messages between sites, and resource allocation. 47 | \item Co-located with a kernel is a Host Control Program (HCP) that is responsible for forwarding within sites on the same machine, and for forwarding messages between machines. 48 | \item Finally, the HCP is divided into two components, a distributed job manager and a communication manager: one for job-lifecycle related operations and one for managing communication between machines\footnote{Not discussed in the literature is how the HCP is required to guarantee reliable transmission and handle flow control and error handling.}. 49 | \end{itemize} 50 | 51 | Messages to recipients that are overloaded are buffered by the sender: this is done on the sender side through process suspension. If the nodes are located at the same site, the kernel is responsible for this process; if not, the distributed communication manager is responsible for flow control. 52 | 53 | Identification of where a module is located is done through the module name, an incarnation number, a site number, and a local module number: this means that the identifier alone can locate where a process is running on the network, without the need for a global registry. However, given that this is done on a per-module-instance basis, a module only ever lives on a single machine. To get around this, the authors suggest using equivalent modules across machines and having the caller decide where to send the message. As the authors state quite succinctly, this is ``contrary to current fantasies about distributed computing''. 54 | 55 | \subsection{Impact and Implementations} 56 | So, what's the contribution of this paper? 57 | 58 | Well, as the authors describe it, it's the ``module-message'' paradigm: something that you've probably seen before and the reason this paper has sounded so familiar: \textbf{Erlang}, or more specifically \textbf{Distributed Erlang}. 59 | 60 | PLITS, only briefly mentioned in Joe Armstrong's thesis~\cite{armstrong2010erlang}, bears a striking resemblance to Erlang and brings some of the features that are used on a daily basis by Erlang developers. We quickly summarize those features now. 61 | 62 | \begin{itemize} 63 | \item Selective receive through the use of a transaction identifier has a strong resemblance to Erlang's ability to generate a globally unique reference and use it as criteria to select from the process mailbox. This supports the forwarding behavior of processes that serve mainly for the routing of a request. 64 | \item Repetition in slot values, where each value is tagged in the initial position by type, bears a strong resemblance to how tagged tuples are used in records to specify a typed value purely as tuples. 65 | \item The idea of encapsulation and locking or queueing messages for exclusion is similar to how processes control access to resources using the ``generic abstractions'' provided by Erlang: specifically, the ``generic server'' or ``gen\_server''. 
The generic server and generic finite state machine are both to control access to some state and has the ability to either select specific messages for processing related to a message identifier, or interleave requests if possible based on selective receive and asynchronous messaging. 66 | \item The design of both the PLITS kernel, as a process scheduler responsible for waking and suspending processes waiting on messages, as well as the Host Control Program work very similar to the implementation of Distributed Erlang: processes running on remote machines are uniquely identified by a process identifier that encodes the machine name and messages are transparently delivered to them under what aims to achieve reliable transmission, but sometimes falls short. 67 | \end{itemize} 68 | 69 | We should be clear, however, that PLITS is \textbf{not} an actor-based language such as Erlang. While PLITS shares many of the same design patterns and abstractions, the language is fundamentally a module-message system: modules are identified by name, which includes a location and incarnation number and are the target of messages, not processes. 70 | -------------------------------------------------------------------------------- /pmldc.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cmeiklejohn/PMLDC/d9bbfe20f87699aa5c6c09815ac0eb20c2e1d141/pmldc.pdf -------------------------------------------------------------------------------- /pmldc.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt,twoside,a4paper]{article} 2 | 3 | \usepackage[english]{babel} 4 | 5 | \usepackage{amsthm} 6 | \usepackage{todonotes} 7 | \usepackage{amsmath} 8 | \usepackage{amssymb} 9 | \usepackage{url} 10 | \usepackage{listings} 11 | \usepackage{balance} 12 | \usepackage{epigraph} 13 | \usepackage{fixltx2e} 14 | \usepackage{graphicx} 15 | \usepackage{algorithm} 16 | \usepackage{algpseudocode} 17 | \usepackage{caption} 18 | \usepackage{color} 19 | \usepackage{hyperref} 20 | \usepackage{breakurl} 21 | 22 | \usepackage{textcomp} 23 | 24 | \usepackage{fancyhdr} 25 | \pagestyle{fancy} 26 | \fancyfoot{} 27 | \fancyfoot[LE,LO]{MY TEXT} 28 | \renewcommand{\footrulewidth}{0.4pt} 29 | \fancyfoot[L]{Programming Models and Languages for Distributed Computation\\ \textcopyright\ 2016 Christopher S. 
Meiklejohn} 30 | 31 | \usepackage{cleveref} 32 | \crefname{section}{§}{§§} 33 | \Crefname{section}{§}{§§} 34 | 35 | \setlength{\belowcaptionskip}{0pt} 36 | % \setlength{\belowcaptionskip}{-10pt} 37 | % \setlength{\abovecaptionskip}{-5pt} 38 | 39 | \renewcommand\floatpagefraction{.9} 40 | \renewcommand\topfraction{.9} 41 | \renewcommand\bottomfraction{.9} 42 | \renewcommand\textfraction{.1} 43 | 44 | \usepackage[labelformat=simple]{subcaption} 45 | \renewcommand\thesubfigure{(\alph{subfigure})} 46 | 47 | \hypersetup{ 48 | colorlinks=true, 49 | citebordercolor=cyan, 50 | filebordercolor=red, 51 | linkbordercolor=blue, 52 | citecolor=cyan, 53 | linkcolor=red, 54 | urlcolor=blue} 55 | 56 | \captionsetup[figure]{labelfont=bf,font={small,it},margin=10pt} 57 | \captionsetup[figure]{labelfont=bf} 58 | 59 | \captionsetup[subfigure]{labelfont=bf, 60 | textfont=normalfont, 61 | singlelinecheck=off, 62 | justification=raggedright, 63 | font=small} 64 | 65 | \usepackage{MnSymbol} 66 | 67 | \newtheorem{theorem}{Theorem}[section] 68 | \newtheorem{corollary}{Corollary}[theorem] 69 | \newtheorem{lemma}[theorem]{Lemma} 70 | 71 | \theoremstyle{definition} 72 | \newtheorem{definition}{Definition}[section] 73 | 74 | \theoremstyle{definition} 75 | \newtheorem{property}{Property}[section] 76 | 77 | \theoremstyle{remark} 78 | \newtheorem*{remark}{Remark} 79 | 80 | \lstset{basicstyle=\ttfamily\footnotesize, 81 | numbers=left, 82 | numbersep=5pt, 83 | numberstyle=\tiny, 84 | columns=fullflexible, 85 | showstringspaces=false, 86 | keywordstyle=\color{blue}, 87 | stringstyle=\color{red}, 88 | commentstyle=\color{purple}, 89 | morecomment=[l][\color{magenta}]{\#}} 90 | 91 | \newcommand*{\myprime}{^{\prime}\mkern-1.2mu} 92 | \newcommand*{\mydprime}{^{\prime\prime}\mkern-1.2mu} 93 | \newcommand*{\mytrprime}{^{\prime\prime\prime}\mkern-1.2mu} 94 | 95 | %% Fix numbering to be compatible with IEEE style. 96 | \renewcommand\thetheorem{\arabic{section}.\arabic{theorem}} 97 | \renewcommand\thedefinition{\arabic{section}.\arabic{definition}} 98 | 99 | \begin{document} 100 | 101 | \title{Programming Models and Languages for Distributed Computation} 102 | 103 | \author{Christopher S. Meiklejohn\\ 104 | Universit\'e catholique de Louvain\\ 105 | Louvain-la-Neuve, Belgium\\ 106 | \texttt{christopher.meiklejohn@uclouvain.be}} 107 | 108 | \date{\today} 109 | \maketitle 110 | \clearpage 111 | 112 | \tableofcontents 113 | 114 | \clearpage 115 | 116 | \section{Remote Procedure Call} 117 | \input{rpc} 118 | \clearpage 119 | 120 | \section{PLITS} 121 | \input{plits} 122 | \clearpage 123 | 124 | \section{ARGUS} 125 | \input{argus} 126 | \clearpage 127 | 128 | \section{Promises} 129 | \input{promises} 130 | \clearpage 131 | 132 | \section{Emerald} 133 | \input{emerald} 134 | \clearpage 135 | 136 | \section{Hermes} 137 | \input{hermes} 138 | \clearpage 139 | 140 | \input{todo} 141 | 142 | \section*{Acknowledgements} 143 | I would like to thank the following people who provided helpful editing with the material: Sean Cribbs, Stuart Marks, and Steve Vinoski. 144 | 145 | \clearpage 146 | 147 | \bibliographystyle{abbrv} 148 | \bibliography{pl} 149 | 150 | \end{document} -------------------------------------------------------------------------------- /promises.html: -------------------------------------------------------------------------------- 1 |

Relevant Reading

2 | - Promises: linguistic support for efficient asynchronous procedure calls in distributed systems, Liskov and Shrira, PLDI 1988 (Liskov and Shrira 1988).
- Multilisp: A language for concurrent symbolic computation, Halstead, TOPLAS 1985 (Halstead Jr 1985).
6 |

Commentary

7 |

Outside of early mentions from Friedman and Wise of a cons cell with placeholder values (Friedman and Wise 1978) and Baker and Hewitt’s work on incremental garbage collection for speculative execution in parallel processes (Baker and Hewitt 1977), futures originally appeared as one of the two principal constructs for parallel operations in MultiLisp. MultiLisp attempted to solve a main challenge of designing a language for parallel computation: how can parallel computation be introduced into a language in a way that fits with the existing programming paradigm? This problem is motivated by the fact that programmers will need to introduce concurrency into applications themselves, because automated analysis may not be able to identify all of the opportunities for parallelism. Halstead decides there is quite a natural fit with Lisp/Scheme: expression evaluation can be done in parallel. MultiLisp introduces two main concepts: pcall, to evaluate the expressions being passed to a function in parallel and introduce concurrency into the evaluation of arguments to a function, and futures, to introduce concurrency between the computation of a value and the use of that value. Halstead also notes that futures closely resemble the “eventual values” in Hibbard’s Algol 68; however, those were typed distinctly from the values they eventually produced and represented. (Halstead Jr 1985)

8 |

In 1988, Liskov and Shrira introduce the concept of a promise: an efficient way to perform asynchronous remote procedure calls in a type-safe way (Liskov and Shrira 1988). Simply put, a promise is a placeholder for a value that will be available in the future. When the initial call is made, a promise is created and the asynchronous call to compute the value of the promise runs in parallel with the rest of the program. When the call completes, the value can be “claimed” by the caller.
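
To make the claim model concrete, here is a minimal sketch in TypeScript; remoteRandom, its node argument, and the timings are invented for illustration and are not APIs from the paper.

function remoteRandom(node: string): Promise<number> {
  // Stand-in for an asynchronous remote call that resolves later.
  return new Promise((resolve) => setTimeout(() => resolve(Math.random()), 100));
}

async function main(): Promise<void> {
  // The call returns immediately with a placeholder (a promise)...
  const placeholder = remoteRandom("node-a");

  // ...so the caller continues with local work while the call runs.
  const local = 21 * 2;

  // "Claiming" the value suspends the caller until the value is ready.
  const value = await placeholder;
  console.log(local, value);
}

main();

The key property is that creating the placeholder and claiming it are separate events, with arbitrary local computation permitted in between.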

9 |

An excerpt of the motivation from Promises: linguistic support for efficient asynchronous procedure calls in distributed systems (Liskov and Shrira, PLDI 1988):

10 |
11 |

“Remote procedure calls have come to be the preferred method of communication in a distributed system because programs that use procedures are easier to understand and reason about than those that explicitly send and receive messages. However, remote calls require the caller to wait for a reply before continuing, and therefore can lead to lower performance than explicit message exchange.”

12 |
13 |

The general motivation behind the work by Liskov and Shrira can be thought of as the following critiques of two models of distributed programming.

14 | - The Remote Procedure Call (RPC) paradigm is preferred by programmers because it is a familiar programming model. However, because of the synchronous nature of RPC, this model does not scale in terms of performance.
- The message passing paradigm is harder for programmers to reason about, but provides the benefit of decoupling of request and response, allowing for asynchronous programming and the subsequent performance benefits.
18 |

Promises attempts to bridge this gap by combining the remote procedure call style of building applications with the asynchronous execution model seen in systems that primarily use message passing.

19 |

The first challenge in combining these two programming paradigms for distributed programming is that of order. Synchronous RPC imposes a total order across all of the calls in an application: one call will fully complete, from request to response, before moving to the next call, given a single thread of execution. If we move to an asynchronous model of RPC, we must have a way to block for a given value, or result, of an asynchronous RPC if required for further processing.

20 |

Promises does this by introducing the concept of a call-stream. A call-stream is nothing more than a stream of placeholder values for each asynchronous RPC issued by a client. Once an RPC is issued, the promise is considered blocked while asynchronous execution is performed; once the value has been computed, the promise is considered ready and the value can be claimed by the caller. If an attempt to claim the value is issued before the value is computed, execution blocks until the value is available. The stream of placeholder values serves as an implicit ordering of the requests that are issued; in the Argus (Liskov 1988) system that served as the implementation platform for this work, multiple streams were used and related operations were sequenced together in the same stream1.
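
The ordering argument can be sketched in TypeScript as well; callStream is an invented helper, not an Argus interface.

type RemoteCall<T> = () => Promise<T>;

function callStream<T>(calls: RemoteCall<T>[]): Promise<T>[] {
  // Issue every call immediately; each returns a blocked placeholder.
  return calls.map((call) => call());
}

async function main(): Promise<void> {
  const replies = callStream([
    async () => "reply-1",
    async () => "reply-2",
    async () => "reply-3",
  ]);

  // Claiming the placeholders in stream order recovers the sequencing
  // of synchronous RPC, even though the calls ran concurrently.
  for (const reply of replies) {
    console.log(await reply);
  }
}

main();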

21 |

Impact and Implementations

22 |

While promises originated as a technique for decoupling values from the computations that produced them, promises, as proposed by Liskov and Shrira, mainly focused on reducing latency and improving the performance of distributed computations. The majority of programming languages in use by practitioners today contain some notion of futures or promises. Below, we highlight a few examples.

23 |

The Oz (Henz, Smolka, and Würtz 1993) language, designed for the education of programmers in several different programming paradigms, provides a functional programming model with single-assignment variables, streams, and promises. Every variable in Oz is a dataflow variable, and therefore every single value in the system is a promise. Both Distributed Oz (Haridi, Van Roy, and Smolka 1997) and Derflow (an implementation of Oz in the Erlang programming language) (Bravo et al. 2014) provide distributed versions of the Oz programming model. The Akka library for Scala also provides Oz-style dataflow concurrency with Futures.

24 |

More recently, promises have been repurposed by the JavaScript community to allow asynchronous programs to be written in direct style instead of continuation-passing style. ECMAScript 6 contains a native Promise object that can be used to perform asynchronous computation and to register callback functions that will fire once the computation either succeeds or fails (Wikipedia 2016).
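
As a small illustration of the shift from continuation-passing style to direct style, consider the following minimal sketch; fetchValueCps and fetchValue are hypothetical functions, not part of any standard library.

// Continuation-passing style: the rest of the program is a callback.
function fetchValueCps(k: (err: Error | null, value?: number) => void): void {
  setTimeout(() => k(null, 42), 10);
}

// Direct style with a native Promise: the same computation, written
// linearly, with callbacks registered for success and failure.
function fetchValue(): Promise<number> {
  return new Promise((resolve) => setTimeout(() => resolve(42), 10));
}

fetchValueCps((err, value) => {
  if (err) console.error(err);
  else console.log(value);
});

fetchValue()
  .then((value) => console.log(value)) // fires on success
  .catch((err) => console.error(err)); // fires on failure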

25 |
26 |
27 |

Baker, Henry C., Jr., and Carl Hewitt. 1977. “The Incremental Garbage Collection of Processes.” SIGPLAN Not. 12 (8). New York, NY, USA: ACM: 55–59. doi:10.1145/872734.806932.

28 |
29 |
30 |

Bravo, Manuel, Zhongmiao Li, Peter Van Roy, and Christopher Meiklejohn. 2014. “Derflow: Distributed Deterministic Dataflow Programming for Erlang.” In Proceedings of the Thirteenth ACM SIGPLAN Workshop on Erlang, 51–60. Erlang ’14. New York, NY, USA: ACM. doi:10.1145/2633448.2633451.

31 |
32 |
33 |

Friedman, D. P., and D. S. Wise. 1978. “Aspects of Applicative Programming for Parallel Processing.” IEEE Transactions on Computers C-27 (4): 289–96. doi:10.1109/TC.1978.1675100.

34 |
35 |
36 |

Halstead Jr, Robert H. 1985. “Multilisp: A Language for Concurrent Symbolic Computation.” ACM Transactions on Programming Languages and Systems (TOPLAS) 7 (4). ACM: 501–38.

37 |
38 |
39 |

Haridi, Seif, Peter Van Roy, and Gert Smolka. 1997. “An Overview of the Design of Distributed Oz.” In Proceedings of the Second International Symposium on Parallel Symbolic Computation, 176–87. ACM.

40 |
41 |
42 |

Henz, Martin, Gert Smolka, and Jörg Würtz. 1993. “Oz-a Programming Language for Multi-Agent Systems.” In IJCAI, 404–9.

43 |
44 |
45 |

Liskov, Barbara. 1988. “Distributed Programming in Argus.” Communications of the ACM 31 (3). ACM: 300–312.

46 |
47 |
48 |

Liskov, Barbara, and Liuba Shrira. 1988. Promises: Linguistic Support for Efficient Asynchronous Procedure Calls in Distributed Systems. Vol. 23. 7. ACM.

49 |
50 |
51 |

Wikipedia. 2016. “Futures and Promises — Wikipedia, the Free Encyclopedia.” https://en.wikipedia.org/w/index.php?title=Futures_and_promises&oldid=708150517.

52 |
53 |
54 |
55 |
56 |
    57 |
  1. Promises also provide a way for stream composition, where processes read values from one or more streams once they are ready, fulfilling placeholder blocked promises in other streams. One classic implementation of stream composition using promises is the Sieve of Eratosthenes.

58 |
59 |
60 | -------------------------------------------------------------------------------- /promises.tex: -------------------------------------------------------------------------------- 1 | \subsection{Relevant Reading} 2 | 3 | \begin{itemize} 4 | \item \textit{Promises: linguistic support for efficient asynchronous procedure calls in distributed systems}, Liskov and Shrira, PLDI 1988~\cite{liskov1988promises}. 5 | \item \textit{Multilisp: A language for concurrent symbolic computation}, Halstead, TOPLAS 1985~\cite{halstead1985multilisp}. 6 | \end{itemize} 7 | 8 | \subsection{Commentary} 9 | 10 | Outside of early mentions from Friedman and Wise on a \textit{cons} cell with placeholder values~\cite{1675100} and Baker and Hewitt's work on incremental garbage collection for speculative execution in parallel processes~\cite{Baker:1977:IGC:872734.806932}, \textit{futures} originally appeared as one of the two principal constructs for parallel operations in MultiLisp. MultiLisp attempted to solve a main challenge of designing a language for parallel computation: how can parallel computation be introduced into a language in a way that fits with the existing programming paradigm. This problem is motivated by the fact that computer programmers will need to introduce concurrency into applications because automated analysis may not be able to identify all of the points for parallelism. Halstead decides there is quite a natural fit with a Lisp/Scheme: expression evaluation can be done in parallel. MultiLisp introduces two main concepts: \textit{pcall}, to evaluate the expressions being passed to a function in parallel and introduce concurrency into evaluation of arguments to a function, and \textit{futures}, to introduce concurrency between the computation of a value and the use of that value. Halstead also notes that futures closely resemble the ``eventual values'' in Hibbard's Algol 68, however were typed distinctly from the values they produced and later represented.~\cite{halstead1985multilisp} 11 | 12 | In 1988, Liskov and Shrira introduce the concept of a \textit{promise}: an efficient way to perform asynchronous remote procedure calls in a type-safe way~\cite{liskov1988promises}. Simply put, a promise is a placeholder for a value that will be available in the future. When the initial call is made, a promise is created and the asynchronous call to compute the value of the promise runs in parallel with the rest of the program. When the call completes, the value can be ``claimed'' by the caller. 13 | 14 | An excerpt motivation from \textit{Promises: linguistic support for efficient asynchronous procedure calls in distributed systems (Liskov and Shrira, PLDI 1988)}: 15 | 16 | \begin{quote} 17 | ``Remote procedure calls have come to be the preferred method of communication in a distributed system because programs that use procedures are easier to understand and reason about than those that explicitly send and receive messages. However, remote calls require the caller to wait for a reply before continuing, and therefore can lead to lower performance than explicit message exchange.'' 18 | \end{quote} 19 | 20 | The general motivation behind the work by Liskov and Shrira can be thought as the following critiques of two models of distributed programming. 21 | \begin{itemize} 22 | \item The Remote Procedure Call (RPC) paradigm is preferable by programmers because it is a familiar programming model. However, because of the synchronous nature of RPC, this model does not scale in terms of performance. 
23 | \item The message passing paradigm is harder for programmers to reason about, but provides the benefit of decoupling of request and response, allowing for asynchronous programming and the subsequent performance benefits. 24 | \end{itemize} 25 | 26 | \textit{Promises} attempts to bridge this gap by combining the remote procedure call style of building applications with the asynchronous execution model seen in systems that primarily use message passing. 27 | 28 | The first challenge in combining these two programming paradigms for distributed programming is that of order. Synchronous RPC imposes a total order across all of the calls in an application: one call will fully complete, from request to response, before moving to the next call, given a single thread of execution. If we move to an asynchronous model of RPC, we must have a way to block for a given value, or result, of an asynchronous RPC if required for further processing. 29 | 30 | Promises does this by introducing the concept of a \textit{call-stream}. A \textit{call-stream} is nothing more than a stream of placeholder values for each asynchronous RPC issued by a client. Once an RPC is issued, the \textit{promise} is considered \textit{blocked} while asynchronous execution is performed, and once the value has been computed, the \textit{promise} is considered \textit{ready} and the value can be \textit{claimed} by the caller. If an attempt to \textit{claim} the value is issued before the value is computed, execution blocks until the value is available. The stream of placeholder values serves as an implicit ordering of the requests that are issued; in the Argus~\cite{liskov1988distributed} system that served as the implementation platform for this work, multiple streams were used and related operations sequenced together in the same stream\footnote{Promises also provide a way for stream composition, where processes read values from one or more streams once they are \textit{ready}, fulfilling placeholder \textit{blocked} promises in other streams. One classic implementation of stream composition using \textit{promises} is the Sieve of Eratosthenes.}. 31 | 32 | \subsection{Impact and Implementations} 33 | 34 | While promises originated as a technique for decoupling values from the computations that produced them, promises, as proposed by Liskov and Shrira, mainly focused on reducing latency and improving performance of distributed computations. The majority of programming languages in use today by practitioners contain some notion of \textit{futures} or \textit{promises}. Below, we highlight a few examples. 35 | 36 | The Oz~\cite{henz1993oz} language, designed for the education of programmers in several different programming paradigms, provides a functional programming model with single assignment variables, streams, and promises. Every variable in Oz is a dataflow variable, and therefore every single value in the system is a promise. Both Distributed Oz~\cite{haridi1997overview} and Derflow (an implementation of Oz in the Erlang programming language)~\cite{Bravo:2014:DDD:2633448.2633451} provide distributed versions of the Oz programming model. The Akka library for Scala also provides Oz-style dataflow concurrency with Futures. 37 | 38 | More recently, promises have been repurposed by the JavaScript community to allow for asynchronous programs to be written in direct style instead of continuation-passing style. 
ECMAScript 6 contains a native Promise object, that can be used to perform asynchronous computation and register callback functions that will fire once the computation either succeeds or fails~\cite{wiki:futures}. 39 | -------------------------------------------------------------------------------- /rpc.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/cmeiklejohn/PMLDC/d9bbfe20f87699aa5c6c09815ac0eb20c2e1d141/rpc.pdf -------------------------------------------------------------------------------- /rpc.tex: -------------------------------------------------------------------------------- 1 | \subsection{Relevant Reading} 2 | 3 | \begin{itemize} 4 | \item \textit{A Critique of the Remote Procedure Call Paradigm}, Tanenbaum and van Renesse, 1987~\cite{tanenbaum1987critique}. 5 | \item \textit{A Note On Distributed Computing}, Kendall, Waldo, Wollrath, Wyant, 1994~\cite{kendall1994note}. 6 | \item \textit{It's Just A Mapping Problem}, Vinoski, 2003~\cite{vinoski2003s}. 7 | \item \textit{Convenience Over Correctness}, Vinoski, 2008~\cite{vinoski2008convenience}. 8 | \end{itemize} 9 | 10 | \subsection{Commentary} 11 | 12 | \begin{quote} 13 | ``Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?''~\cite{vinoski2008convenience} 14 | \end{quote} 15 | 16 | \subsubsection{Timeline} 17 | 18 | \begin{itemize} 19 | 20 | \item{1974:} RFC 674, \\ ``Procedure Call Protocol Documents, Version 2'' \\ 21 | RFC 674 attempts to define a general way to share resources across \textbf{all 70 nodes} of the Internet. This work is performed at Bolt, Beranek and Newman (BBN Technologies).\footnote{Fun fact: BBN's Internet division would later become Genuity and then Level3 out of bankruptcy and has autonomous system number 1 (ASN 1).} 22 | 23 | \item{1975:} RFC 684, \\ ``A Commentary on Procedure Calling as a Network Protocol'' \\ 24 | First outlines the problems of RPC and how they related to fundamental problems in distributed systems. 25 | 26 | \item{1976:} RFC 707, \\ ``A High-Level Framework for Network-Based Resource Sharing'' \\ 27 | An attempt to ``mechanize'' the TELNET and FTP protocols through a generalization to functions. 28 | 29 | \item{1984:} ``Implementing Remote Procedure Calls''~\cite{birrell1984implementing} \\ 30 | In this work, the \textbf{Cedar RPC} mechanism is designed at Xerox PARC. 31 | 32 | \item{1987:} Distribution and Abstract Types in Emerald~\cite{1702134} \\ 33 | One of the first major implementations of distributed objects containing distribution-specific calling conventions, such as \textit{call-by-move}. 34 | 35 | \item{1987:} A Critique of the Remote Procedure Call Paradigm~\cite{tanenbaum1987critique} \\ 36 | Tanenbaum and van Renesse provide a criticism, similar to the one in RFC 684, on why the RPC model is the wrong model for distributed computing. 37 | 38 | \item{1988:} Distributed Programing in Argus~\cite{liskov1988distributed} \\ 39 | In Argus, atomic units of computation referred to as ``guardians'' coordinate application of effects and rollback. 40 | 41 | \item{1988:} RFC 1057, \\ ``Remote Procedure Call Protocol Specification, Version 2'' \\ 42 | Defines \textbf{sunrpc} as a standard along with its supporting infrastructure, such as portmapper, that's used in the creation of NFS. 43 | 44 | \item{1991:} CORBA 1.0 \\ 45 | CORBA 1.0 introduces distributed objects. 
46 | 47 | \item{1994:} A Note On Distributed Computing~\cite{kendall1994note} \\ 48 | Kendall et al.\ also talk at great length about why the RPC model, extended to objects, is problematic. 49 | 50 | \item{1996:} A Distributed Object Model for the Java System~\cite{wollrath1996distributed} \\ 51 | Introduces the Java RMI system. 52 | 53 | \item{1997:} CORBA 2.0 \\ 54 | CORBA 2.0 represents the major release of CORBA that most people are familiar with. 55 | 56 | \item{1999-:} EJB, XML-RPC, SOAP, REST, Thrift, Finagle, gRPC, etc. \\ 57 | Modern-day RPC mechanisms. 58 | 59 | \end{itemize} 60 | 61 | \subsubsection{Overview} 62 | 63 | Remote Procedure Call (RPC) is a general term for executing a subroutine in a different address space without writing the actual code used to perform the remote execution. To provide an example, we can imagine a user wishing to invoke the random number generator function on another machine, where the only difference between the local and remote invocation is supplying an additional node identifier for where it should occur. While not the first implementation, because it was preceded by Apollo Computer's Network Computing System (NCS), the first major implementation to be widely known and adopted was the SunRPC mechanism, from Sun Microsystems, used to back their Network File System (NFS). 64 | 65 | Remote Procedure Call mechanisms you may be more familiar with are Java's Remote Method Invocation (RMI), its predecessor, Modula-3's Network Objects, XML-RPC, SOAP, CORBA, Avro, Facebook's (now Apache) Thrift, Google's Protocol Buffers with Stubby, Twitter's Finagle, and Google's gRPC. 66 | 67 | \subsubsection{RFC 684} 68 | 69 | \begin{quote} 70 | ``Rather, we take exception to PCP's underlying premise: that the procedure calling discipline is the starting point for building multi-computer systems.'' 71 | \end{quote} 72 | 73 | RFC 684 is commentary on RFC 674, which introduced the Procedure Call Paradigm, Version 2 (PCP). This commentary highlights what boils down to three major points from a critical analysis of the Procedure Call Paradigm. 74 | 75 | \begin{itemize} 76 | \item Procedure calling is usually a primitive operation; by primitive, it should be an extremely fast context switch operation performed by the underlying abstraction. 77 | \item Local and remote calls each have different cost profiles; remote calls can be delayed, and in the event of failure, may \textbf{never return}. 78 | \item Asynchronous message passing, or sending a message and waiting for a response when the response is needed, is a much better model because it makes the passing of messages \textbf{explicit}. 79 | \end{itemize} 80 | 81 | Following from these three points, we see a series of concerns develop about this programming paradigm, all of which become a common theme across the 40+ years of RPC's history. These are: 82 | 83 | \begin{itemize} 84 | \item Difficulty in recovery after malfunction or error. For instance, do we roll back or throw exceptions? How do we handle these errors? Can we just try again?\footnote{Some systems that attempt to address this include Liskov's promises~\cite{liskov1988promises} and Lee's~\cite{lee2015implementing} work on a reusable framework for linearizability.} 85 | \item Difficulty in sequencing operations. If all calls are synchronous and some of these calls can fail, it can require a significant amount of code to ensure correct re-execution to preserve order moving forward. 
86 | \item Remote Procedure Call forces \textbf{synchronous programming}: a method is invoked and the invoking process waits for a response.
87 | \item Backpressure (blocking on previous actions completing), load-shedding (dropping messages on the floor when the system is overloaded), and priority servicing all become more difficult with the call-and-response model of Remote Procedure Call.
88 | \end{itemize}
89 |
90 | \subsubsection{RFC 707}
91 |
92 | \begin{quote}
93 | ``Because of this cost differential, the applications programmer must exercise discretion in his use of remote resources, even though the mechanics of their use will have been greatly simplified by the RTE. Like virtual memory, the procedure call model offers great convenience, and therefore power, in exchange for reasonable alertness to the possibilities of abuse.''
94 | \end{quote}
95 |
96 | RFC 707 generalizes the ideas from RFC 684 and discusses the problem of resource sharing for services such as TELNET and FTP: each of these services presents a \textit{different} interface for interacting with it, which requires the operator to know the specific protocol for interacting with that service. Observing that services like TELNET and FTP both follow the call-and-response model, the authors propose an alternative idea: rather than needing to know all of the available commands and protocols on the remote machine, can we define a generic interface for executing a remote procedure that takes an argument list and follows the call-and-response model?
97 |
98 | While we can, the problems of control flow and priority servicing outlined in RFC 684 remain; they were not, however, enough to prevent this model from being adopted by many later systems.
99 |
100 | \subsubsection{``A Critique of the Remote Procedure Call Paradigm''}
101 |
102 | \begin{quote}
103 | ``We propose the following test for a general-purpose RPC system. Imagine that two programmers are working on a project. Programmer 1 is writing the main program. Programmer 2 is writing a collection of procedures to be called by the main program. The subject of RPC has never been mentioned and both programmers assume that all their code will be compiled and linked together into a single executable binary program and run on a free-standing computer, not connected to any networks.''
104 |
105 | ``At the very last minute, after all the code has been thoroughly tested, debugged, and documented and both programmers have quit their jobs and left the country, the project management is forced by unexpected, external circumstances to run the program on a distributed system. The main program must run on one computer, and each procedure must run on a different computer. We also assume that all the stub procedures are produced mechanically by a stub generating program.''
106 |
107 | ``It is our contention that a large number of things may now go wrong due to the fact that RPC tries to make remote procedure calls look exactly like local ones, but is unable to do it perfectly. Many of the problems can be solved by modifying the code in various ways, but then the transparency is lost.
Once we admit that true transparency is impossible, and that programmers must know which calls are remote and which ones are local, we are faced with the question of whether a partially transparent mechanism is really better than one that was designed specifically for remote access and makes no attempt to make remote computations look local at all.''
108 | \end{quote}
109 |
110 | Tanenbaum and van Renesse directly attack the Remote Procedure Call paradigm, stating that it is fundamentally wrong to treat local and remote calls the same. They state that the transparency RPC tries to achieve is impossible and posit that a protocol designed specifically for remote access is better.
111 |
112 | The authors go on to describe an alternative paradigm: a ``virtual circuit''. This alternative sounds very similar to Distributed Erlang running over TCP: non-blocking send and receive operations across a sliding-window network protocol.
113 |
114 | \begin{quote}
115 | ``An alternative model that does not attempt any transparency in the first place is the virtual circuit model (e.g., the ISO OSI reference model [Zimmermann, 1980]). In this model, a full-duplex virtual circuit using a sliding window protocol is set up between the client and server. If nonblocking SEND and RECEIVE primitives are used, incoming messages can be signalled by interrupts to allow the maximum amount of parallelism between communication and computation.''
116 | \end{quote}
117 |
118 | Tanenbaum and van Renesse outline many of the same criticisms as RFC 684: latency, lack of parallelism, lack of streaming, exception handling, and failure detection. In addition, they raise a few additional criticisms that we will highlight below.
119 |
120 | \paragraph{Unexpected Messages} With a synchronous call-and-response protocol between client and server, how is one supposed to send an unexpected message to a client that is not waiting for a message?
121 | \paragraph{Single Threaded Servers} How does one handle the situation where the server does not have a response ready for a client immediately -- for instance, if it needs to wait on input from another server? Not only does this block the server, but it also blocks the client from proceeding further with local computation.\footnote{Our section on promises talks about this problem in depth~\cite{liskov1988promises}.}
122 |
123 | \begin{quote}
124 | ``There is, in fact, no protocol that guarantees that both sides definitely and unambiguously know that the RPC is over in the face of a lossy network.''~\cite{tanenbaum1987critique}
125 | \end{quote}
126 |
127 | \paragraph{The Two Army Problem} How do we handle requesting irreplaceable data (or, more generally, have two servers come to agreement that some RPC was successfully executed and its response received)? Well, we can acknowledge the receipt of that information, but then we would also have to acknowledge the acknowledgements to be sure the acknowledgement was delivered, right? This topic, central to the agreement problem, has been discussed extensively in distributed systems literature~\cite{halpern1990knowledge}.
128 |
129 | \paragraph{Parameters} Tanenbaum and van Renesse discuss the problem of parameter passing and parameter marshalling. This issue is exacerbated with references in object systems such as CORBA, where specific distributed references have to be used to ensure they remain accessible and valid over time.
130 |
131 | \begin{quote}
132 | ``However, if there are reference parameters or pointers, things are more complicated. While it is obviously possible to copy pointers into the message, when the server tries to use them, it will not work correctly because the object pointed to will not be present.''
133 | ``Two possible solutions suggest themselves, each with major drawbacks. The first solution is to have the client stub not only put the pointer itself in the message, but also the thing pointed to. However, if the thing pointed to is the middle of a complex list structure containing pointers in both directions, sublists, etc., copying the entire structure into the message will be expensive. Furthermore, when it arrives, the structure will have to be reassembled at the same memory addresses that it had on the client side, because the server code will just perform indirection operations on the pointers as though it were working on local variables.''
134 |
135 | ...
136 |
137 | ``The other solution is just to pass the pointer itself. Every time the pointer is used, a message is sent back to the client to read or write the relevant word. The problem here is that we violate one of the basic rules: the compiler should not have to know that it is dealing with RPC. Normally the code produced for reading from a pointer is just to indirect from it. If remote pointers work differently from local pointers, the transparency of the RPC is lost.''
138 | \end{quote}
139 |
140 | \paragraph{Idempotence} Finally, the authors highlight the problem of providing exactly-once semantics across the network and the power of idempotence. Tanenbaum and van Renesse state it very concisely:
141 |
142 | \begin{quote}
143 | ``Suppose a client does a nonidempotent RPC and the server crashes one machine instruction after finishing the operation, but before the server stub has had a chance to reply. The client stub times out and sends the request again. If the server has rebooted by then, there is a chance that the operation will be performed two or more times and thus fail.''
144 | \end{quote}
145 |
146 | \subsubsection{CORBA}
147 |
148 | The Common Object Request Broker Architecture (CORBA) is an abstraction for object-oriented languages, popularized by C++, that allows communication between different languages and different address spaces running on different machines. CORBA relied on an Interface Definition Language (IDL) for specifying the interfaces of remote classes of objects; this IDL was used to generate stubs representing how the remote system's object interfaces appear on the local machine. These IDLs were used to generate mappings between the abstract interfaces they provide and the actual implementations in languages such as C++ and Java.
149 |
150 | CORBA attempted to provide several benefits to the application developer: language independence, OS independence, architecture independence, static typing through a mapping of abstract types in the IDL to machine- and language-specific implementations of those types, and object transfer, where objects can be migrated over the wire between different machines. CORBA's promise was that, through the use of mappings, remote calls could appear as local calls, and that distributed-systems-related exceptions could be mapped into local exceptions and handled by local exception-handling mechanisms.
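
To make the shape of this mapping concrete, the following is a minimal sketch, in Scala, of the kind of code an IDL compiler generates: the client programs against a local-looking interface, while a stub (hand-written here, standing in for generated code) performs the network call and maps transport failures into a local exception. All of the names are hypothetical; this illustrates the idea and is not CORBA's actual generated code.

\begin{verbatim}
import java.io.{DataInputStream, DataOutputStream}
import java.net.Socket
import scala.util.{Failure, Success, Try}

// The "IDL": an interface that looks entirely local.
trait Calculator {
  def add(a: Int, b: Int): Int
}

// The local exception that distribution-related failures map into.
final case class RemoteCallException(cause: Throwable)
  extends RuntimeException(cause)

// The "generated" client stub: marshals the arguments, ships them over
// a socket, and unmarshals the reply; the caller sees only Calculator.
final class CalculatorStub(host: String, port: Int) extends Calculator {
  def add(a: Int, b: Int): Int =
    Try {
      val socket = new Socket(host, port)
      try {
        val out = new DataOutputStream(socket.getOutputStream)
        val in  = new DataInputStream(socket.getInputStream)
        out.writeInt(a); out.writeInt(b); out.flush()
        in.readInt() // blocks until the server replies (possibly forever)
      } finally socket.close()
    } match {
      case Success(result) => result
      // A network failure surfaces as a local exception, exactly the
      // transparency that the critiques above take issue with.
      case Failure(e) => throw RemoteCallException(e)
    }
}
\end{verbatim}

Note how a caller of \texttt{add} cannot tell, from the interface alone, that the call may block indefinitely or fail for reasons unrelated to arithmetic.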
151 |
152 | However, as Vinoski points out in 2003, evaluating programming languages and abstractions on transparency alone is flawed:
153 |
154 | \begin{quote}
155 | ``The goal is to merge middleware abstractions directly into the realm of the programming language, minimizing the impedance mismatch between the programming language world and the middleware world. For example, mappings make request invocations on distributed objects and services appear as normal programming-language function calls, and they map distributed system exceptions into native programming language exception-handling mechanisms.''~\cite{vinoski2003s}
156 | \end{quote}
157 |
158 | \subsubsection{``A Note On Distributed Computing''}
159 |
160 | \begin{quote}
161 | ``It is the thesis of this note that this unified view of objects is mistaken.''~\cite{kendall1994note}
162 | \end{quote}
163 |
164 | In this seminal paper, Waldo et al.\ argue that ``it is perilous to ignore the differences'' between local and distributed computing and that the unified view of objects is flawed~\cite{kendall1994note}. They cite two independent groups of work, the systems of Emerald and Argus and their modern equivalents, Microsoft's DCOM and OMG's CORBA: all systems that extended the RPC mechanism to objects and method invocation.
165 |
166 | We can summarize the ``promise'' of the unified view of objects, as Waldo does in the paper.
167 |
168 | \begin{itemize}
169 | \item Applications are designed using interfaces on the local machine.
170 | \item Objects are relocated, because of the transparency of location, to gain the desired application performance.
171 | \item The application is then tested with ``real bullets.''
172 | \end{itemize}
173 |
174 | This strategy for the design of a distributed application has two fundamental flaws. First, that the design of an application can be done with interfaces alone, and that this design will be discovered during the development of the application. Second, that application correctness does not depend on object location, but only on the interfaces to each object.
175 |
176 | Waldo swats down this design with the ``three false principles''~\cite{kendall1994note}:
177 |
178 | \begin{quote}
179 | ``there is a single natural object-oriented design for a given application, regardless of the context in which that application will be deployed''
180 | \end{quote}
181 |
182 | \begin{quote}
183 | ``failure and performance issues are tied to the implementation of the components of an application, and consideration of these issues should be left out of an initial design''
184 | \end{quote}
185 |
186 | \begin{quote}
187 | ``the interface of an object is independent of the context in which that object is used''
188 | \end{quote}
189 |
190 | \subsubsection{``Every 10 years...''}
191 |
192 | \begin{quote}
193 | ``The hard problems in distributed computing are not the problems of getting things on and off the wire.''~\cite{kendall1994note}
194 | \end{quote}
195 |
196 | Waldo argues that every ten years we approach the problem of attempting to unify the view of local and remote computing and run into the same problems, again and again: \textbf{local and remote computing are fundamentally different.}
197 |
198 | \paragraph{Latency} Waldo argues that the most obvious difference is latency: ignore it, and you will directly impact software performance.
He states that it is wrong to ``rely on steadily increasing speed of the underlying hardware'' and that it is not always possible to test with ``real bullets''. Performance analysis and object relocation are non-trivial, and a design that is optimal at one point will not necessarily stay optimal.
199 |
200 | \paragraph{Memory Access} His criticisms of memory access are very specific to CORBA and its predecessors in the object space: objects can retain pointers to objects in the same address space, but once moved, these pointers will no longer be valid. He states that one approach to solving the problem is distributed shared memory; more practically, techniques such as marshalling, or replacement by CORBA references that are marshalled for distributed access, are used.
201 |
202 | \paragraph{Partial Failure} Finally, the most fundamental problem: partial failure. In local computing, he argues, failures are detectable, total, and result in a return of control. This is not true of distributed computing: independent components may fail, failures are partial, and the failure of a link is indistinguishable from the failure of a remote processor.
203 |
204 | As always, Waldo says it best:
205 |
206 | \begin{quote}
207 | ``The question is not `can you make remote method invocation look like local method invocation?' but rather `what is the price of making remote method invocation identical to local method invocation?'''~\cite{kendall1994note}
208 | \end{quote}
209 |
210 | Waldo argues that there are only two paths forward if we want to achieve the goal of the unified object model.
211 |
212 | \begin{itemize}
213 | \item Treat all objects as local.
214 | \item Treat all objects as remote.
215 | \end{itemize}
216 |
217 | However, he states that if the real goal is to ``make distributed computing as simple as local computing'', the only real path forward is the first. He believes this approach is flawed: distribution is fundamentally different and must be treated as such.
218 |
219 | \begin{quote}
220 | ``This approach would also defeat the overall purpose of unifying the object models. The real reason for attempting such a unification is to make distributed computing more like local computing and thus make distributed computing easier. This second approach to unifying the models makes local computing as complex as distributed computing.''~\cite{kendall1994note}
221 | \end{quote}
222 |
223 | The paper provides two examples of where this paradigm is problematic, but we will highlight one case that builds upon RPC.
224 |
225 | \subsubsection{Network File System}
226 |
227 | Sun Microsystems' \textbf{Network File System (NFS)}, built upon RPC, is one of the first distributed file systems to gain popularity. Network File System adhered to the existing filesystem API, but introduced an entirely new class of failures resulting from network partitions, partial failure, and high latency. Network File System is a stateless protocol implemented over UDP; the decision to implement it this way was motivated by avoiding crash recovery and simplifying the protocol~\cite{Sandberg:1988:DIS:59309.59338}.
228 |
229 | Network File System operated in two modes: soft mounting and hard mounting. Soft mounting introduced a set of new error codes for the additional ways file operations could fail: these error codes were not known to existing UNIX applications, which led to limited adoption of this approach.
Hard mounting introduced the opposite behavior for failures related to the network: \textbf{operations would block until they could be completed successfully}.
230 |
231 | \textit{It's just a mapping problem, right?}~\cite{vinoski2003s}
232 |
233 | \subsubsection{``Convenience Over Correctness''}
234 |
235 | \begin{quote}
236 | ``We have a general-purpose imperative programming-language hammer, so we treat distributed computing as just another nail to bend to fit the programming models.''~\cite{vinoski2008convenience}
237 | \end{quote}
238 |
239 | Vinoski highlights three very important points in ``Convenience Over Correctness'', his criticism of RPC many years later.
240 |
241 | \begin{itemize}
242 | \item \textbf{Interface Definition Language (IDL) ``impedance mismatch''}: base types may be easy to map, but more complex types may be less so.
243 | \item \textbf{Scalability:} the RPC paradigm has no first-class support for caching, nor mechanisms for mitigating high latency, and remains a rather primitive operation with which to build distributed applications.
244 | \item \textbf{Representational State Transfer (REST)}: REST is good: it specifically addresses the problem of managing distributed resources; but most frameworks built on top of REST alter the abstraction and present something that repeats the problem.\footnote{For instance, if one were to build an object model on top of REST.}
245 | \end{itemize}
246 |
247 | \subsection{Impact and Implementations}
248 |
249 | Remote Procedure Call (RPC) has been around for a very long time, and while many opponents have been extremely critical of it, it remains one of the most widely used ways of writing distributed applications. RPC frameworks such as Google's Protocol Buffers and Apache Thrift see an unprecedented amount of use in production applications.
250 |
251 | Frameworks such as Google's gRPC for HTTP/2.0 and Twitter's Finagle continue to reduce the complexity of building applications with RPC, attempting to bring it to an even wider audience. For instance, Twitter's Finagle is protocol-independent and attempts to deal with the problems of distribution directly. Finagle does this through the use of futures, which allow composition and explicit sequencing (sketched below); Google's gRPC does the same. These frameworks claim that, because they do not attempt to hide the fact that calls are remote, they provide a better abstraction. However, we have now returned to the aforementioned problem of soft mounting in NFS: explicitly handling the control flow arising from the array of possible network exceptions, though this is mitigated through the use of promises and futures.\footnote{Our related post on promises and futures challenges whether that abstraction is right either.}
252 |
253 | But the question we have to ask ourselves is whether the abstraction of an individual method invocation or function call is the correct paradigm for building distributed applications. Is the idea of treating all remote objects as local, and making distribution as transparent as possible, the correct decision moving forward? Does it mask failure modes that will allow developers to build applications that will not operate correctly under partial failure?
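
To illustrate the explicit style that futures enable, here is a minimal sketch using the Scala standard library's \texttt{Future} (rather than Finagle's own \texttt{Future} type): remoteness is visible in the types, sequencing is explicit, and failure is a first-class, recoverable outcome. The service calls are hypothetical stand-ins for remote invocations.

\begin{verbatim}
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FuturesSketch extends App {
  // Hypothetical stand-ins for remote calls: each returns a Future,
  // making it visible in the type that the result may be delayed or fail.
  def fetchUserId(name: String): Future[Int] =
    Future(name.length)
  def fetchBalance(userId: Int): Future[BigDecimal] =
    Future(BigDecimal(100) * userId)

  // Explicit sequencing: fetchBalance runs only after fetchUserId succeeds.
  val balance: Future[BigDecimal] =
    for {
      id  <- fetchUserId("alice")
      bal <- fetchBalance(id)
    } yield bal

  // Failure is a value to be handled, not a local-looking exception:
  // a failed remote call is recovered into a default here.
  val safe: Future[BigDecimal] =
    balance.recover { case _: Exception => BigDecimal(0) }

  println(Await.result(safe, 5.seconds))
}
\end{verbatim}

The trade-off is exactly the one discussed above: nothing is hidden, but every call site must now acknowledge delay and failure.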
254 |
255 | When we talk about distributed programming languages today, many developers equate this to \textbf{programming languages} that can be, and have been, used to build \textbf{distributed systems}. For example, any language with concurrency primitives and the ability to open a network socket would suffice to build these systems; this does not imply that these languages are distributed programming languages.
256 |
257 | But a \textbf{distributed programming language} is one where distribution is \textbf{first class}. Languages like Go are more closely related to \textbf{concurrent} languages, where concurrency is first class; while concurrency is a requirement for distribution, these are different topics. CORBA is an example of trying to make distribution first class in languages such as C++.
258 |
259 | Erlang~\cite{claessen2005semantics, svensson2007more, svensson2007programming} is one language where distribution is first class. Erlang has an RPC mechanism, but prefers the use of asynchronous message passing between processes; in fact, the RPC mechanism in Erlang is implemented using Erlang's native asynchronous message passing. While you can peek under the covers and see \textbf{where} processes are running, Erlang tries to make the programmer assume that each process could be executing on a different node. Motivated by the expressiveness of this design, both Distributed Process, from the Cloud Haskell group, and Akka, in Scala, attempt to bring Erlang-style semantics to Haskell and Scala, respectively.
260 |
261 | One approach taken in the Scala community for distributed programming is serializable closures~\cite{miller2014spores}, also known as the function-shipping paradigm. In this model, entire functions are moved across the network, and the type system is used to ensure that all of the values in scope can be properly serialized or marshalled as these closures move across the network. While this solves some of the problematic points in systems like CORBA and DCOM, it offers no solution to the problem of ensuring exactly-once execution of functions, or of handling partial failure, where you cannot distinguish the failure of a remote node from a failure of the network.
262 |
263 | Languages like Bloom, Bloom\textsubscript{L}, and Lasp~\cite{alvaro2011consistency, conway2012logic, meiklejohn2015lasp} take an alternative approach: can we build abstractions that rely on asynchronous programming and very weak ordering, and structure our applications so that they are tolerant to network anomalies such as message duplication and re-ordering? While this approach is more expensive in terms of state transmission, and more restrictive in the types of computations that can be expressed, it supports the creation of \textit{correct-by-construction} distributed applications. These applications are highly tolerant to anomalies resulting from network failures because they assume all actors in the system are distributed. The restrictions, however, might prohibit wide adoption of these techniques.
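
To give a flavor of this style, here is a minimal sketch in Scala of state structured as a join-semilattice: a grow-only set whose merge operation (set union) is commutative, associative, and idempotent, so duplicated or re-ordered messages cannot corrupt the result. This illustrates the underlying idea only; it is not Bloom's or Lasp's actual API.

\begin{verbatim}
// A grow-only set (G-Set): the simplest convergent replicated data type.
final case class GSet[A](elements: Set[A]) {
  def add(a: A): GSet[A] = GSet(elements + a)
  // Merge is set union: commutative, associative, and idempotent, so
  // applying updates twice, or in any order, yields the same state.
  def merge(other: GSet[A]): GSet[A] = GSet(elements ++ other.elements)
}

object GSetDemo extends App {
  val replicaA = GSet(Set.empty[String]).add("x")
  val replicaB = GSet(Set.empty[String]).add("y")

  // Deliver the updates in different orders, with a duplicate delivery:
  val merged1 = replicaA.merge(replicaB).merge(replicaB)
  val merged2 = replicaB.merge(replicaA)

  assert(merged1 == merged2) // both replicas converge to {x, y}
  println(merged1)
}
\end{verbatim}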
264 | 265 | So, we ask again: 266 | 267 | \begin{quote} 268 | ``Does developer convenience really trump correctness, scalability, performance, separation of concerns, extensibility, and accidental complexity?''~\cite{vinoski2008convenience} 269 | \end{quote} -------------------------------------------------------------------------------- /todo.tex: -------------------------------------------------------------------------------- 1 | % A View of Cloud Computing 2 | 3 | % RPC: 4 | % Implementing Remote Procedure Calls (1984) 5 | % A Distributed Object Model for the Java System (1996) 6 | % A Note on Distributed Computing (1994) 7 | % A Critique of the Remote Procedure Call Paradigm (1988) 8 | % Convenience Over Correctness (2008) 9 | 10 | % Futures: 11 | % Multilisp: A language for concurrent symbolic computation (1985) 12 | % Promises: linguistic support for efficient asynchronous procedure calls in distributed systems (1988) 13 | % Oz dataflow concurrency. Selected sections from the textbook Concepts, Techniques, and Models of Computer Programming. 14 | % Sections to read: 1.11: Dataflow, 2.2: The single-assignment store, 4.93-4.95: Dataflow variables as communication channels ...etc. 15 | % The F# asynchronous programming model (2011) 16 | % Your Server as a Function (2013) 17 | 18 | % Message passing: 19 | % Concurrent Object-Oriented Programming (1990) 20 | % Concurrency among strangers (2005) 21 | % Scala actors: Unifying thread-based and event-based programming (2009) 22 | % Erlang (2010) 23 | % Orleans: cloud computing for everyone (2011) 24 | 25 | % Distributed programming languages: 26 | % Distributed Programming in Argus (1988) 27 | % Distribution and Abstract Types in Emerald (1987) 28 | % The Linda alternative to message-passing systems (1994) 29 | % Orca: A Language For Parallel Programming of Distributed Systems (1992) 30 | % Ambient-Oriented Programming in AmbientTalk (2006) 31 | 32 | % CRDTs: 33 | % Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services (2002) 34 | % Conflict-free Replicated Data Types (2011) 35 | % A comprehensive study of Convergent and Commutative Replicated Data Types (2011) 36 | % CAP Twelve Years Later: How the "Rules" Have Changed (2012) 37 | % Cloud Types for Eventual Consistency (2012) 38 | 39 | % Languages and Consistency: 40 | % Consistency Analysis in Bloom: a CALM and Collected Approach (2011) 41 | % Logic and Lattices for Distributed Programming (2012) 42 | % Consistency Without Borders (2013) 43 | % Lasp: A language for distributed, coordination-free programming (2015) 44 | 45 | % Languages Extended for Distribution: 46 | % Distributed Erlang 47 | % Cloud Haskell 48 | % Alice ML 49 | % Termite Scheme 50 | % ML5 51 | % MBrace 52 | 53 | % Global Sequence Protocol 54 | 55 | % Gosling's Oak 56 | % Multicast invocation 57 | % RPC "missing the point" post. 58 | % Concurrent programming analogy 59 | % StarOS 60 | % SOS 61 | % Accent 62 | % Gaggles 63 | % Virtual Synchrony 64 | % Hidden Hand 65 | % V 66 | % Actors 67 | % Linda 68 | % SAIL 69 | % Avalon, Herlihy 70 | 71 | % Michael Scott paper 72 | % Liskov paper 73 | 74 | % Distributed unification via Oz 75 | --------------------------------------------------------------------------------