├── papers
    └── whitepaper
    │   ├── media
    │       ├── image10.png
    │       ├── image6.png
    │       ├── image7.png
    │       ├── image8.png
    │       └── image9.png
    │   ├── the-graph-whitepaper.pdf
    │   ├── README.md
    │   └── the-graph-whitepaper.tex
├── .gitignore
├── specs
    ├── graph-protocol-hybrid-network
    │   ├── graphql-api
    │   │   └── README.md
    │   ├── graph-protocol-v1-spec.pdf
    │   ├── assets
    │   │   ├── query-node-architecture.png
    │   │   └── query-client-architecture.png
    │   ├── CHANGELOG.md
    │   ├── SUMMARY.md
    │   ├── DOCUMENT-CONVERSION.md
    │   ├── deeplists.tex
    │   ├── README.md
    │   ├── datasets
    │   │   └── README.md
    │   ├── subgraph-manifest
    │   │   └── README.md
    │   ├── data-modeling
    │   │   └── README.md
    │   ├── mappings-api
    │   │   └── README.md
    │   ├── rpc-api
    │   │   └── README.md
    │   ├── query-processing
    │   │   └── README.md
    │   ├── payment-channels
    │   │   └── README.md
    │   ├── architecture-overview
    │   │   └── README.md
    │   ├── mechanism-design
    │   │   └── README.md
    │   ├── messages
    │   │   └── README.md
    │   └── read-interface
    │   │   └── README.md
    └── book.toml
├── .travis.yml
└── README.md


/papers/whitepaper/media/image10.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/media/image10.png


--------------------------------------------------------------------------------
/papers/whitepaper/media/image6.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/media/image6.png


--------------------------------------------------------------------------------
/papers/whitepaper/media/image7.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/media/image7.png


--------------------------------------------------------------------------------
/papers/whitepaper/media/image8.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/media/image8.png


--------------------------------------------------------------------------------
/papers/whitepaper/media/image9.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/media/image9.png


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | **/*.aux
3 | **/*.fdb_latexmk
4 | **/*.fls
5 | **/*.log
6 | **/*.out
7 | **/*.synctex.gz
8 | specs/book
9 | 


--------------------------------------------------------------------------------
/papers/whitepaper/the-graph-whitepaper.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/papers/whitepaper/the-graph-whitepaper.pdf


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/graphql-api/README.md:
--------------------------------------------------------------------------------
1 | This spec has been moved to: https://github.com/graphprotocol/specs/tree/master/graphql-api
2 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/graph-protocol-v1-spec.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/specs/graph-protocol-hybrid-network/graph-protocol-v1-spec.pdf


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/assets/query-node-architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/specs/graph-protocol-hybrid-network/assets/query-node-architecture.png


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/assets/query-client-architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/graphprotocol/research/HEAD/specs/graph-protocol-hybrid-network/assets/query-client-architecture.png


--------------------------------------------------------------------------------
/specs/book.toml:
--------------------------------------------------------------------------------
 1 | [book]
 2 | title = "Hybrid Network"
 3 | authors = ["Developers of The Graph project"]
 4 | description = "Hybrid Network Spec."
 5 | src = "graph-protocol-hybrid-network"
 6 | 
 7 | [build]
 8 | create-missing = false
 9 | 
10 | [output.html]
11 | mathjax-support = true
12 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/CHANGELOG.md:
--------------------------------------------------------------------------------
 1 | # Changelog
 2 | ## Unreleased
 3 | ### Changed
 4 | - Participation reward for Indexers is now a function of curation signal, rather than query volume. See [#85](https://github.com/graphprotocol/research/issues/85) for additional context.
 5 | - Curation reward now paid through fees rather than inflation.
 6 | - `v` field on Attestation and Balance Proof messages changed to type `uint8`
 7 | 
 8 | ## 0.0.1 - 2019-01-25
 9 | 
10 | ### Added
11 |  - Specification open-sourced!
12 |  - Changelog created
13 | 


--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
 1 | language: rust
 2 | sudo: false
 3 | 
 4 | cache:
 5 |   - cargo
 6 | 
 7 | rust:
 8 |   - stable
 9 | 
10 | before_script:
11 |   - (test -x $HOME/.cargo/bin/cargo-install-update || cargo install cargo-update)
12 |   - (test -x $HOME/.cargo/bin/mdbook || cargo install mdbook)
13 |   - cargo install-update -a
14 | 
15 | script:
16 |   - mdbook build specs && mdbook test specs
17 | 
18 | deploy:
19 |   provider: pages
20 |   skip-cleanup: true
21 |   github-token: $GITHUB_TOKEN
22 |   local-dir: specs
23 |   keep-history: false
24 |   on:
25 |     branch: master
26 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/SUMMARY.md:
--------------------------------------------------------------------------------
 1 | [Hybrid Network](README.md)
 2 | - [Architecture Overview](architecture-overview/README.md)
 3 | - [Mechanism Design](mechanism-design/README.md)
 4 | - [Query Processing](query-processing/README.md)
 5 | - [Payment Channels](payment-channels/README.md)
 6 | - [Read Interface](read-interface/README.md)
 7 | - [Messages](messages/README.md)
 8 | - [JSON-RPC API](rpc-api/README.md)
 9 | - [Datasets](datasets/README.md)
10 | - [Data Modeling](data-modeling/README.md)
11 | - [Subgraph Manifest](subgraph-manifest/README.md)
12 | - [Mappings API](mappings-api/README.md)
13 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/DOCUMENT-CONVERSION.md:
--------------------------------------------------------------------------------
 1 | # Document Conversion
 2 | 
 3 | Below are instructions for converting the markdown specification to a PDF.
 4 | 
 5 | 1. Install [Pandoc](https://pandoc.org/installing.html)
 6 | 1. Install [LaTeX](https://www.latex-project.org/) on your machine
 7 | 1. (macOS) You may also need to install `rsvg-convert` by running `brew install librsvg` in the terminal.
 8 | 1. Navigate to *this folder* in your terminal.
 9 | 1. Run the following command:
10 |  ```bash
11 |  pandoc -f gfm -H deeplists.tex --resource-path ./architecture-overview -s \
12 |  -o graph-protocol-v1-spec.pdf README.md ./architecture-overview/README.md \
13 |  ./mechanism-design/README.md ./query-processing/README.md ./payment-channels/README.md \
14 |  ./read-interface/README.md ./messages/README.md ./rpc-api/README.md \
15 |  ./datasets/README.md ./data-modeling/README.md ./subgraph-manifest/README.md \
16 |  ./mappings-api/README.md
17 |  ```
18 | 
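The input order in the command above matches the chapter order in `SUMMARY.md`. As a minimal sketch, the argument list could be generated from that chapter list so the two stay in sync. The helper name `pandoc_command` is hypothetical; the flags are only those already used above.

```python
# Chapter directories, in the order they appear in SUMMARY.md.
CHAPTERS = [
    "architecture-overview",
    "mechanism-design",
    "query-processing",
    "payment-channels",
    "read-interface",
    "messages",
    "rpc-api",
    "datasets",
    "data-modeling",
    "subgraph-manifest",
    "mappings-api",
]


def pandoc_command(chapters=CHAPTERS, output="graph-protocol-v1-spec.pdf"):
    """Build the pandoc argument list (hypothetical helper, same flags as above)."""
    inputs = ["README.md"] + [f"./{name}/README.md" for name in chapters]
    return [
        "pandoc", "-f", "gfm", "-H", "deeplists.tex",
        "--resource-path", "./architecture-overview", "-s",
        "-o", output, *inputs,
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` from this folder should be equivalent to running the shell command by hand.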


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/deeplists.tex:
--------------------------------------------------------------------------------
 1 |    \usepackage{enumitem}
 2 |    \setlistdepth{9}
 3 | 
 4 |    \setlist[itemize,1]{label=$\bullet$}
 5 |    \setlist[itemize,2]{label=$\bullet$}
 6 |    \setlist[itemize,3]{label=$\bullet$}
 7 |    \setlist[itemize,4]{label=$\bullet$}
 8 |    \setlist[itemize,5]{label=$\bullet$}
 9 |    \setlist[itemize,6]{label=$\bullet$}
10 |    \setlist[itemize,7]{label=$\bullet$}
11 |    \setlist[itemize,8]{label=$\bullet$}
12 |    \setlist[itemize,9]{label=$\bullet$}
13 |    \renewlist{itemize}{itemize}{9}
14 | 
15 |    \setlist[enumerate,1]{label=$\arabic*.$}
16 |    \setlist[enumerate,2]{label=$\alph*.$}
17 |    \setlist[enumerate,3]{label=$\roman*.$}
18 |    \setlist[enumerate,4]{label=$\arabic*.$}
19 |    \setlist[enumerate,5]{label=$\alph*.$}
20 |    \setlist[enumerate,6]{label=$\roman*.$}
21 |    \setlist[enumerate,7]{label=$\arabic*.$}
22 |    \setlist[enumerate,8]{label=$\alph*.$}
23 |    \setlist[enumerate,9]{label=$\roman*.$}
24 |    \renewlist{enumerate}{enumerate}{9}
25 | 


--------------------------------------------------------------------------------
/papers/whitepaper/README.md:
--------------------------------------------------------------------------------
 1 | # The Graph Whitepaper [Deprecated]
 2 | 
 3 | See the [v1 protocol specification](../../specs/graph-protocol-hybrid-network/README.md) for the latest protocol design.
 4 | 
 5 | ## Install MacTeX / TeX Live
 6 | 
 7 | ```sh
 8 | brew install --cask mactex
 9 | ```
10 | 
11 | After this, log in to a fresh shell to make sure all TeX Live commands are
12 | available in your `$PATH`.
13 | 
14 | ## Build PDF
15 | 
16 | ```sh
17 | pdflatex the-graph-whitepaper
18 | ```
19 | 
20 | ## Fonts
21 | 
22 | The font packages that ship with TeX Live out of the box are listed
23 | here: https://tex.stackexchange.com/questions/59403/what-font-packages-are-installed-in-tex-live
24 | 
25 | To use any of these, you'd typically add
26 | ```latex
27 | \usepackage{gentium}
28 | ```
29 | to the preamble of `the-graph-whitepaper.tex`. The package documentation often
30 | describes additional parameters that are worth paying attention to.
32 | 
33 | These are typically found by going to the package on CTAN and following the
34 | `Package Documentation` link under `Documentation`. See
35 | https://ctan.org/pkg/psnfss for an example.
36 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/README.md:
--------------------------------------------------------------------------------
 1 | # Graph Protocol Specification
 2 | 
 3 | **Version**: 0.0.1
 4 | 
 5 | **Stage**:
 6 | ![WIP Badge](https://img.shields.io/badge/stage-wip-%23C25F38.svg)
 7 | 
 8 | **Authors**:
 9 |  - [Brandon Ramirez](https://github.com/zerim)
10 | 
11 | ## Abstract
12 | This document presents *Graph Protocol* ("the protocol"), a protocol for indexing public blockchain data and querying this data via a decentralized network. The canonical network implementing the protocol is referred to as *The Graph* ("the network").
13 | 
14 | Graph Protocol falls into a category we refer to as a *layer 2 read-scalability* solution. Its purpose is to enable decentralized applications (dApps) to query public blockchain data efficiently and trustlessly via a service that, like blockchains and the Internet itself, operates as a public utility. This is in the interest of minimizing the role of brittle, centralized infrastructure seen in many "decentralized" application architectures today.
15 | 
16 | This specification covers the network architecture, protocol interfaces, algorithms, and economic incentives required to build a network that is robust, performant, cost-efficient, and enables a high margin of economic security for queries processed via the network.
17 | 
18 | ## Philosophy
19 | This spec defines a hybrid network design in which the core mechanisms are decentralized and run on the blockchain, but some building blocks are still centralized. A future version of this specification will target full decentralization. This is in keeping with our team's philosophy of shipping early and delivering immediate value, while incrementally decentralizing, as research and the state of external ecosystem dependencies progress.
20 | 
21 | See [this slide](https://www.slideshare.net/secret/AnB7pWnqZhiW2d/17) from this [recent research talk](https://www.youtube.com/watch?v=eRnYgXHQnlA&t=586s) for more info on this approach.
22 | 
23 | ## Disclaimer
24 | This spec defines a protocol that is still being implemented. Until a fully stable reference implementation exists, the specification is likely to change in breaking ways.
25 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/datasets/README.md:
--------------------------------------------------------------------------------
 1 | # Datasets
 2 | 
 3 | ## Object Diagram
 4 | ```ascii
 5 |            +--------------+     +-------------------+
 6 |            |              |     |                   |
 7 | Ethereum   | GNS Registry |     | Staking Contract  |
 8 |            |              |     | (Indexing Records)|
 9 |            |              |     |                   |
10 |            +--------------+     +-------------------+
11 |                   |1                    |1
12 |                   |                     |
13 | +---------------------------------------------------------------------------------+
14 |                   |                     |
15 |                   |     +------------+  |  +------------+
16 |                   |     |            |  |  |            |
17 |                   |    *|  Subgraph  | *|  |  Index     |
18 | IPFS              +----->  Manifest  +--v--+  Records   |
19 |                         |            |1   *|            |
20 |                         +-----+------+     +------------+
21 |                               |
22 |                  +------------+----------+
23 |                  |                       |
24 |         +--------v-------+       +-------v-------+
25 |         |                |       |               |
26 |         | Mapping        |       | Data Model    |
27 |         | (WASM Module)  |       | (GraphQL IDL) |
28 |         |                |       |               |
29 |         +----------------+       +---------------+
30 | 
31 | ```
32 | 
33 | ## Overview
34 | Datasets that may be queried through The Graph are referred to as *subgraphs* because they represent a subset of the data that is available to query in the network. Subgraphs are defined in a *subgraph manifest*, which is a top-level IPLD document that defines how Ethereum and IPFS data is ingested and loaded into The Graph. Importantly, while the subgraph manifest includes a logical data model for the dataset, it does not prescribe a storage format, database model, or index method. These are defined as *Index Records* and are associated with a subgraph manifest by indexing nodes in the Staking Contract on-chain.
35 | 
36 | Subgraph manifests are immutable and referenced according to the [IPLD CID v1 specification](https://github.com/ipld/cid#cidv1). This CID is referred to in this specification as a *Subgraph ID*. Mutable names may be assigned to subgraph IDs via the Graph Name Service (GNS). These are not consumed on-chain anywhere in the protocol and are mainly a convenience for users interacting with The Graph. These names may also be used in composing a unified global schema in the query interface of Query Nodes. See [Query Processing](../query-processing) for more information. In future versions of the protocol, names will play a more useful role in various forms of subgraph composition.
37 | 
38 | ## Subgraph Creation
39 | Creating a subgraph involves the following steps, in no specific order:
40 | - Create subgraph manifest ([Subgraph Manifest](../subgraph-manifest))
41 | - Define data model ([Data Modeling](../data-modeling))
42 | - Define mappings ([Mappings API](../mappings-api))
43 | 
44 | ## Subgraph Deployment
45 | Deploying a subgraph involves the following steps:
46 | 1. Deploy subgraph manifest to IPLD and get subgraph ID.
47 | 1. Curate or index the subgraph, referenced by subgraph ID, in the Staking Contract ([Mechanism Design](../mechanism-design)).
48 | 1. (Optional) Associate a human-friendly name with the subgraph ID in the Graph Name Service.
49 | 
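The relationship between immutable subgraph IDs and mutable names can be sketched with a small in-memory model. Everything here is illustrative: `subgraph_id` uses a plain SHA-256 hex digest as a stand-in for a real IPLD CID v1, and `NameRegistry` is a hypothetical class, not the Graph Name Service interface.

```python
import hashlib


def subgraph_id(manifest_bytes: bytes) -> str:
    """Stand-in content address (NOT a real IPLD CID): SHA-256 hex digest."""
    return hashlib.sha256(manifest_bytes).hexdigest()


class NameRegistry:
    """Hypothetical model of mutable names pointing at immutable subgraph IDs."""

    def __init__(self):
        self._names = {}

    def set_name(self, name: str, sid: str) -> None:
        # Re-assigning a name repoints it; the old subgraph ID stays valid.
        self._names[name] = sid

    def resolve(self, name: str) -> str:
        return self._names[name]


# Two manifest versions produce two distinct IDs.
v1 = subgraph_id(b"manifest contents, version 1")
v2 = subgraph_id(b"manifest contents, version 2")

registry = NameRegistry()
registry.set_name("example/tokens", v1)
registry.set_name("example/tokens", v2)  # the name now resolves to v2
```

Because an ID is derived from the manifest contents, changing the manifest always yields a new ID, while the name gives users a stable handle that can follow the latest version.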


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/subgraph-manifest/README.md:
--------------------------------------------------------------------------------
 1 | # Subgraph Manifest
 2 | 
 3 | **TODO** This file has moved to https://github.com/graphprotocol/graph-node. Please confirm. Datasets references this file with ../subgraph-manifest, so it needs to remain in this repo.
 4 | 
 5 | ## Overview
 6 | The Subgraph manifest specifies all the information required to index and query a specific subgraph. It is the entry point to your subgraph, so to speak.
 7 | 
 8 | The subgraph manifest, and all the files linked from it, are what is deployed to IPFS, and hashed to produce a subgraph ID that can be referenced on Ethereum and used to retrieve your subgraph in The Graph.
 9 | 
10 | ## Format
11 | The subgraph manifest follows the IPLD specification, which defines a data model for linking decentralized and universally addressable data structures.[<sup>1</sup>](#footnotes) Supported formats include YAML and JSON. All examples in this section are written as YAML.
12 | 
13 | ## Top-Level API
14 | 
15 | | Field  | Type | Description   |
16 | | --- | --- | --- |
17 | | **specVersion** | *String*   | A semver version indicating which version of this API is being used.|
18 | | **schema**   | [*Schema*](#schema) | The GraphQL schema of this subgraph|
19 | | **dataSources**| [*[Data Source Spec]*](#data-source)| Each Data Source spec defines data which will be ingested, and transformation logic to derive the state of the subgraph's entities based on the source data.|
20 | 
21 | ## Schema
22 | 
23 | | Field | Type | Description |
24 | | --- | --- | --- |
25 | | **file**| [*Path*](#path) | The path of the GraphQL IDL file, either locally or on IPFS |
26 | 
27 | ## Data Source
28 | 
29 | | Field | Type | Description |
30 | | --- | --- | --- |
31 | | **kind** | *String* | The type of data source. Possible values: *ethereum/contract*|
32 | | **name** | *String* | The name of the source data. Will be used to generate APIs in mapping, and also for self-documentation purposes |
33 | | **source** | [*EthereumContractSource*](#ethereumcontractsource) | The source data on a blockchain such as Ethereum |
34 | | **mapping** | [*Mapping*](#mapping) | The transformation logic applied to the data prior to being indexed |
35 | 
36 | ### EthereumContractSource
37 | 
38 | | Field | Type | Description |
39 | | --- | --- | --- |
40 | | **address** | *String* | The address of the source data in its respective blockchain |
41 | | **abi** | *String* | The name of the ABI for this Ethereum contract (see `abis` in `mapping` manifest) |
42 | 
43 | ### Mapping
44 | The `mapping` field may be one of the following supported mapping manifests:
45 |  - [Ethereum Events Mapping](#ethereum-events-mapping)
46 | 
47 | #### Ethereum Events Mapping
48 | 
49 | | Field | Type | Description |
50 | | --- | --- | --- |
51 | | **kind** | *String* | Must be "ethereum/events" for Ethereum Events Mapping |
52 | | **apiVersion** | *String* | Semver string of the version of the Mappings API which will be used by the mapping script |
53 | | **language** | *String* | The language of the runtime for the Mapping API. Possible values: *wasm/assemblyscript* |
54 | | **entities** | *[String]* | A list of entities which will be ingested as part of this mapping. Must correspond to names of entities in the GraphQL IDL |
55 | | **abis** | *ABI* | ABIs for the contract classes which should be generated in the Mapping ABI. Name is also used to reference the ABI elsewhere in the manifest |
56 | | **eventHandlers** | *EventHandler* | Handlers for specific events, which will be defined in the mapping script |
57 | | **file** | [*Path*](#path) | The path of the mapping script |
58 | 
59 | #### EventHandler
60 | 
61 | | Field | Type | Description |
62 | | --- | --- | --- |
63 | | **event** | *String* | An identifier for an event which will be handled in the mapping script. For Ethereum contracts, this must be the full event signature to disambiguate from events which may share the same name. |
64 | | **handler** | *String* | The name of an exported function in the mapping script which should handle the specified event. |
65 | 
66 | ## Path
67 | A path has a single field, `path`, which refers either to a file on the local development machine or to an [IPLD link](#footnotes).
68 | 
69 | When using the Graph-CLI, local paths may be used during development, and then the tool will take care of deploying linked files to IPFS and replacing the local paths with IPLD links at deploy time.
70 | 
71 | | Field | Type | Description |
72 | | --- | --- | --- |
73 | | **path** | *String or IPLD Link* | A path to a local file or an IPLD link |
74 | 
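To make the field tables above concrete, here is a hedged sketch of a manifest expressed as a Python dictionary (the same shape a YAML or JSON manifest would parse into), plus a small check for absent fields. The contract address, file paths, names, and version strings are placeholders. The shape of each `abis` entry and the use of lists for `abis` and `eventHandlers` are assumptions, since the tables above do not define them. Treating every listed field as required is also an assumption.

```python
# Hypothetical manifest; all concrete values are placeholders.
manifest = {
    "specVersion": "0.0.1",
    "schema": {"file": {"path": "./schema.graphql"}},
    "dataSources": [
        {
            "kind": "ethereum/contract",
            "name": "Token",
            "source": {
                "address": "0x0000000000000000000000000000000000000000",
                "abi": "Token",
            },
            "mapping": {
                "kind": "ethereum/events",
                "apiVersion": "0.0.1",
                "language": "wasm/assemblyscript",
                "entities": ["Token"],
                # Assumed ABI shape: a name plus a Path.
                "abis": [{"name": "Token", "file": {"path": "./abis/Token.json"}}],
                "eventHandlers": [
                    {
                        "event": "Transfer(address,address,uint256)",
                        "handler": "handleTransfer",
                    }
                ],
                "file": {"path": "./mapping.ts"},
            },
        }
    ],
}

# Field names taken from the tables above.
REQUIRED = {
    "manifest": ["specVersion", "schema", "dataSources"],
    "dataSource": ["kind", "name", "source", "mapping"],
    "source": ["address", "abi"],
    "mapping": ["kind", "apiVersion", "language", "entities",
                "abis", "eventHandlers", "file"],
}


def _missing(obj, keys, prefix):
    return [prefix + key for key in keys if key not in obj]


def missing_fields(doc):
    """Return dotted paths of listed fields that are absent from the manifest."""
    missing = _missing(doc, REQUIRED["manifest"], "")
    for i, ds in enumerate(doc.get("dataSources", [])):
        base = f"dataSources[{i}]."
        missing += _missing(ds, REQUIRED["dataSource"], base)
        missing += _missing(ds.get("source", {}), REQUIRED["source"], base + "source.")
        missing += _missing(ds.get("mapping", {}), REQUIRED["mapping"], base + "mapping.")
    return missing
```

Calling `missing_fields(manifest)` returns an empty list for the example, and a list such as `["schema", "dataSources"]` for a manifest that only sets `specVersion`.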
75 | ## Footnotes
76 | - [1] https://github.com/ipld/specs
77 | - [2] https://github.com/ipld/specs/blob/master/Codecs/DAG-JSON.md
78 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Graph Protocol Research
 2 | This repo contains specifications and research papers related to The Graph, a decentralized query protocol for the decentralized web.
 3 | 
 4 | ## Stages
 5 | The following stages apply to papers and specs in this repo.
 6 | 
 7 | | Stage | Description | Badge |
 8 | | :----- | :----------- | -----: |
 9 | | **WIP**       | Specification is in progress. It has gaps and is actively being changed and added to. | ![WIP Badge](https://img.shields.io/badge/stage-wip-%23C25F38.svg)
10 | | **Draft**     | Specification is a complete draft. Subject to change heavily. | ![Draft Badge](https://img.shields.io/badge/stage-draft-%23E3CB63.svg)
11 | | **Stable** | Specification is at a state where it is relatively stable. Only slight improvements should be made to the spec. | ![Stable Badge](https://img.shields.io/badge/stage-stable-%233CBFC2.svg)
12 | | **Finished**     | Specification has been fully implemented and should not change, except to address critical issues | ![Final Badge](https://img.shields.io/badge/stage-finished-0075AB.svg)
13 | | **Deferred**  | Specification made it to at least the "Draft" stage but was later rejected. | ![Deferred Badge](https://img.shields.io/badge/stage-deferred-30324F.svg)
14 | 
15 | ## Papers
16 | - ![Deferred Badge](https://img.shields.io/badge/stage-deferred-30324F.svg): [The Graph Whitepaper V1 [Deprecated]](./papers/whitepaper) - This was the whitepaper we used to build interest in our protocol early in 2018, before we had a team or funding. It's a useful look into some of our early thinking, but should no longer be considered a source of truth for the protocol design.
17 | 
18 | ## Specs
19 |  - ![Draft Badge](https://img.shields.io/badge/stage-draft-%23E3CB63.svg):  [Hybrid Network Specification](./specs/graph-protocol-hybrid-network) - This specification is a hybrid protocol design, intended to bridge the gap between our [hosted service](http://thegraph.com) and our fully decentralized network design. Important elements of the decentralized network are covered here, including several economic mechanisms, interfaces and a high-level architecture. Several elements are notably still centralized, such as the dispute management process, payment channels and governance.
20 | 
21 | ## Implementations
22 | 
23 | 🔨 = In Progress
24 | 
25 | ✅ = Implemented
26 | 
27 | ❌ = Not implemented
28 | 
29 | 
30 | ### Indexing Nodes
31 | |                       | [Graph Node](https://github.com/graphprotocol/graph-node)   |
32 | | :-------------------- | :----------: |
33 | | **Hybrid Network spec**        | |
34 | | *Subgraphs*           | ✅ |
35 | | *GraphQL Schemas*             | ✅ |
36 | | *WASM Mappings*            | ✅ |
37 | | *Index Ethereum Solidity events*            | ✅ |
38 | | *Index IPFS data*            | ✅ |
39 | | *Read Interface*      | ❌ |
40 | | *RPC API*      | ❌ |
41 | | *Payment Channels*      | ❌ |
42 | | *Work Token Economics*      | ❌ |
43 | 
44 | ## Contributing
45 | If you have questions about the contents of this repo, feel free to ask questions in the #research channel of our [Discord](http://thegraph.com/discord).
46 | 
47 | We don't yet have a formal improvement proposal process, but feel free to submit ideas for improvements to the protocol as issues in this repo.
48 | 
49 | If you spot an error, gap or useful clarification in one of the specs or papers, feel free to file an issue (or if it is small, submit a PR directly), and you may be listed as a contributor on the spec, if your issue is accepted 🙂.
50 | 
51 | We look forward to some great contributions from our community, thank you in advance! 🗿✨🚀
52 | 
53 | ## License
54 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
55 | 
56 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
57 | 
58 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE GRAPH PROTOCOL INC  BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
59 | 
60 | Except as contained in this notice, the names "The Graph" or "Graph Protocol" shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization from the Graph Protocol Inc. team.
61 | 
62 | The Graph is a trademark of Graph Protocol Inc.
63 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/data-modeling/README.md:
--------------------------------------------------------------------------------
  1 | # Data Modeling
  2 | The schema of your dataset--that is, the entity types, values and relationships that are available to query--is defined through the GraphQL Interface Definition Language (IDL)[<sup>1</sup>](#footnotes).
  3 | 
  4 | ## Entities
  5 | 
  8 | Entities are defined as GraphQL types decorated with the `@entity` directive. All entities must have an `id: ID!` field defined.
  9 | 
 10 | #### Example
 11 | Define a `Token` entity:
 12 | 
 13 | ```graphql
 15 | type Token @entity {
 16 |   # The unique ID of this entity
 17 |   id: ID!
 18 |   name: String!
 19 |   symbol: String!
 20 |   decimals: Int!
 21 | }
 22 | ```
 23 | 
 24 | An attribute on an entity type may be specified as unique, in which case the value for that attribute must be unique among all instances of that entity type.
 25 | 
 26 | #### Example
 27 | Define a `File` entity with a unique content hash:
 28 | ```graphql
 30 | type File @entity {
 31 |   id: ID!
 32 |   name: String!
 33 |   bytes: Bytes!
 34 |   length: Int!
 35 |   # Only one File entity may be created with a given
 36 |   # content hash.
 37 |   hash: String! @unique
 38 | }
 39 | ```
 40 | 
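The uniqueness rule can be illustrated with a minimal in-memory sketch. `EntityStore` and `UniqueViolation` are hypothetical names used only for this example; the spec does not say how an indexing node enforces `@unique`.

```python
class UniqueViolation(Exception):
    """Raised when a unique attribute value is already used by another entity."""


class EntityStore:
    """Hypothetical in-memory store for a single entity type."""

    def __init__(self, unique_fields):
        self._unique_fields = list(unique_fields)
        self._entities = {}

    def save(self, entity):
        for field in self._unique_fields:
            for other in self._entities.values():
                # Updating an entity in place is fine; only other IDs conflict.
                if other["id"] != entity["id"] and other[field] == entity[field]:
                    raise UniqueViolation(
                        f"{field}={entity[field]!r} already used by {other['id']}"
                    )
        self._entities[entity["id"]] = entity

    def count(self):
        return len(self._entities)


files = EntityStore(unique_fields=["hash"])
files.save({"id": "1", "name": "a.txt", "hash": "abc"})
files.save({"id": "2", "name": "b.txt", "hash": "def"})
```

Saving a third `File` with `hash` equal to `"abc"` under a new `id` would raise `UniqueViolation`, while re-saving entity `"1"` with its own hash is accepted.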
 41 | ## Built-In Types
 42 | 
 43 | ### GraphQL Built-In Scalars
 44 | All the scalars defined in the GraphQL spec are supported: `Int`, `Float`, `String`, `Boolean`, and `ID`.
 45 | 
 46 | ### Bytes
 47 | There is a `Bytes` scalar for variable-length byte arrays.
 48 | 
 49 | Additionally, fixed-length byte scalar types between 1 and 32 bytes are supported: `Byte`, `Bytes1` (an alias for `Byte`), `Bytes2`, `Bytes3` ... `Bytes31`, and `Bytes32`.
 50 | 
 51 | ### Numbers
 52 | The GraphQL spec defines `Int` as a signed 32-bit integer and `Float` as a double-precision floating-point value.
 53 | 
 54 | This API additionally includes `BigInt` and `BigFloat` number types to represent arbitrarily large integer or floating point numbers, respectively.
 55 | 
 56 | There are also fixed-size number types to represent numbers between 1 and 32 bytes long. The suffix of each type name gives its size in bits.
 57 | 
 58 | Signed integers all share the `Int` prefix: `Int8`, `Int16`, `Int24`, `Int32` (an alias of `Int`) ... `Int240`, `Int248`, and `Int256`.
 59 | 
 60 | There are corresponding unsigned integer types prefixed with `UInt`: `UInt8`, `UInt16`, `UInt24`, `UInt32` ... `UInt240`, `UInt248`, and `UInt256`.
 61 | 
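As a rough sketch, the fixed-size integer type names and their value ranges can be enumerated programmatically. The ranges assume two's-complement signed integers, which matches Ethereum's integer types but is not stated in this spec; the function name is hypothetical.

```python
def fixed_size_int_types():
    """Map each fixed-size integer type name to its inclusive (min, max) range.

    Assumes two's-complement representation for the signed types.
    """
    ranges = {}
    for bits in range(8, 257, 8):  # 8, 16, ..., 256 bits (1 to 32 bytes)
        ranges[f"Int{bits}"] = (-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
        ranges[f"UInt{bits}"] = (0, 2 ** bits - 1)
    return ranges


INT_TYPES = fixed_size_int_types()
```

This yields 32 signed and 32 unsigned types, from `Int8` and `UInt8` through `Int256` and `UInt256`.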
 62 | All number types other than `Int` and `Float`, which are serialized as JSON number types, are serialized as strings.
 63 | 
 64 | Even though the serialization format is the same, having the sizes captured in the type system provides better self-documentation and enables tooling that generates convenient deserializers in statically typed languages.
 65 | 
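The serialization rule above can be sketched as follows. One practical motivation for string serialization is that many JSON parsers read numbers as 64-bit floats and lose precision for large integers. The helper name and the example field names are hypothetical, and whether the `Int32` alias counts as `Int` for serialization is not specified, so this sketch treats only the literal names `Int` and `Float` as JSON numbers.

```python
import json

# Only these two types are emitted as JSON numbers, per the rule above.
JSON_NUMBER_TYPES = {"Int", "Float"}


def serialize_number(type_name, value):
    """Return the JSON-ready representation of a number of the given type."""
    if type_name in JSON_NUMBER_TYPES:
        return value
    return str(value)


payload = json.dumps(
    {
        "decimals": serialize_number("Int", 18),
        "totalSupply": serialize_number("UInt256", 2 ** 256 - 1),
    }
)
decoded = json.loads(payload)
```

A client that knows `totalSupply` is a `UInt256` can then parse the string with an arbitrary-precision integer type instead of a float.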
 66 | ## Value Objects
 67 | All types without the `@entity` directive are value objects. Value object types may be used as the type of entity attributes, and do not have unique `id` attributes themselves.
 68 | 
 69 | ## Entity Relationships
 70 | An entity may have a relationship to one or more other entities in your data model. Relations are unidirectional.
 71 | 
 72 | Despite being unidirectional, attributes may be defined on entities which facilitate navigating relationships in the reverse direction. See [Reverse Lookups](#reverse-lookups).
 73 | 
 74 | ### Basics
 75 | 
 76 | Relationships are defined on entities just like any other field, except that the type specified is that of another entity.
 77 | 
 78 | #### Example
 79 | Define a `Transaction` entity type with an optional one-to-one relationship with a `TransactionReceipt` entity type:
 80 | ```graphql
 82 | type Transaction @entity {
 83 |   id: ID!
 84 |   transactionReceipt: TransactionReceipt
 85 | }
 86 | 
 88 | type TransactionReceipt @entity {
 89 |   id: ID!
 90 |   transaction: Transaction
 91 | }
 92 | ```
 93 | 
 94 | #### Example
 95 | Define a `Token` entity type with a required one-to-many relationship with a `TokenBalance` entity type:
 96 | ```graphql
 98 | type Token @entity {
 99 |   id: ID!
100 |   tokenBalances: [TokenBalance!]!
101 | }
102 | 
104 | type TokenBalance @entity {
105 |   id: ID!
106 |   amount: Int!
107 | }
108 | ```
109 | 
110 | ### Reverse Lookups
111 | Reverse lookups can be defined on an entity by annotating a field with the `@derivedFrom` directive. This creates a "virtual" field on the entity that may be queried but cannot be set manually through the mappings API. Rather, it is derived from the relationship defined on the other entity.
112 | 
113 | The type of a `@derivedFrom` field must be a collection since multiple entities may specify relationships to a single entity.
114 | 
115 | #### Example
116 | Define a reverse lookup from a `User` entity type to an `Organization` entity type:
117 | ```graphql
118 | @entity
119 | type Organization {
120 |   id: ID!
121 |   name: String!
122 |   members: [User]!
123 | }
124 | 
125 | @entity
126 | type User {
127 |   id: ID!
128 |   name: String!
129 |   organizations: [Organization!] @derivedFrom(field: "members")
130 | }
131 | ```
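
As a rough illustration of what "derived" means here, the following TypeScript sketch resolves the virtual `organizations` field by scanning the `members` field stored on each organization. The in-memory record shapes and the `deriveOrganizations` helper are assumptions made for this example; they are not part of the specification or of any store implementation.

```typescript
// Hypothetical in-memory shapes mirroring the schema above.
interface Organization {
  id: string;
  name: string;
  members: string[]; // IDs of User entities: the stored side of the relationship
}

interface User {
  id: string;
  name: string;
}

// Resolve the virtual `organizations` field of a user by looking at the
// `members` field of every organization. Nothing is stored on the user itself.
function deriveOrganizations(user: User, orgs: Organization[]): Organization[] {
  return orgs.filter((org) => org.members.indexOf(user.id) !== -1);
}

const sampleOrgs: Organization[] = [
  { id: "org-1", name: "Acme", members: ["user-1", "user-2"] },
  { id: "org-2", name: "Globex", members: ["user-2"] },
];
const alice: User = { id: "user-1", name: "Alice" };
const aliceOrgIds = deriveOrganizations(alice, sampleOrgs).map((org) => org.id);
```

Because the field is computed from the other side of the relationship, there is nothing for a mapping to write, which is why it cannot be set manually.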
132 | 
133 | ## Footnotes
134 | - [1] http://facebook.github.io/graphql/draft/#sec-Type-System
135 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/mappings-api/README.md:
--------------------------------------------------------------------------------
  1 | # Mappings API
  2 | 
  3 | ## Overview
  4 | Mappings define how data is extracted or ingested from one or more data sources, transformed, and then loaded in a format that follows a specific data model. At their core, mappings are simply WASM modules, and the mappings API is a set of host external functions that are injected into the WASM runtime and implement a specific interface. These function interfaces are in-protocol.
  5 | 
  6 | Additionally, extra-protocol APIs may be defined in userspace, which implement an API in a higher-level language that compiles to WASM. The Graph Protocol team created one such API, which is included here as a reference example.
  7 | 
  8 | ## WASM API
  9 | The Mappings API can be split into two portions, the *ingest* or *extract* API and the *store* API (how data is loaded). We present an ingest API tailored to event-sourcing Ethereum smart contract data, but future versions of the protocol will enable ingest APIs specific to other decentralized data sources. The *store API* will not need to change to support new data sources.
 10 | 
 11 | **Note:** There are also a number of utility functions that are currently injected into the WASM runtime for convenience. These will either be implemented natively in WASM or fully specified in a future version of this spec.
 12 | 
 13 | ### Ingest
 14 | #### Ethereum
 15 | Data is ingested from Ethereum by event-sourcing Solidity events, as well as other triggers, defined in the subgraph manifest. The WASM module referenced in the subgraph manifest is expected to have handlers that correspond to the handlers defined in the [subgraph manifest](../subgraph-manifest).
 16 | 
 17 | See this [reference implementation](https://github.com/graphprotocol/graph-node/blob/master/runtime/wasm/src/host.rs) for how these handlers should be called.
 18 | 
 19 | Additionally, we inject functions for calling Ethereum smart contracts for additional data that is not included in the Ethereum event:
 20 | - ethereum_call
 21 | 
 22 | See this [reference](https://github.com/graphprotocol/graph-node/blob/master/runtime/wasm/src/host_exports.rs) for these additional functions.
 23 | 
 24 | ### Store
 25 | The store API includes the following methods:
 26 | - set
 27 | - get
 28 | - remove
 29 | 
 30 | See this [reference implementation](https://github.com/graphprotocol/graph-node/blob/master/runtime/wasm/src/host_exports.rs) for these external functions.
 31 | 
 32 | ### Utilities
 33 | We currently inject the following utility functions into the WASM runtime, which may be changed or removed in a future version of the protocol:
 34 | - bytes_to_string
 35 | - bytes_to_hex
 36 | - big_int_to_hex
 37 | - big_int_to_i32
 38 | - json_to_i64
 39 | - json_to_u64
 40 | - json_to_f64
 41 | - json_to_big_int
 42 | - crypto_keccak_256
 43 | - big_int_plus
 44 | - big_int_minus
 45 | - big_int_times
 46 | - big_int_divided_by
 47 | - big_int_mod
 48 | - string_to_h160
 49 | 
 50 | See this [reference implementation](https://github.com/graphprotocol/graph-node/blob/master/runtime/wasm/src/host_exports.rs) for these external functions.
 51 | 
 52 | ## Higher-Level APIs
 53 | Higher-level APIs provide context for the low-level APIs, described above, in a higher-level programming language that compiles to WASM.
 54 | 
 55 | ### AssemblyScript
 56 | [AssemblyScript](https://github.com/AssemblyScript/assemblyscript/wiki) is a subset of TypeScript that compiles to WASM. It natively supports only a handful of types (32- and 64-bit floating-point and integer numeric types), but we extend the runtime with additional higher-level types, such as `TypedMap` and `BigInt`, to facilitate a more developer-friendly API.
 57 | 
 58 | See this [reference](https://github.com/graphprotocol/graph-ts/blob/master/index.ts) for all types, external functions, and utilities.
 59 | 
 60 | #### Types
 61 | ##### Basic Types
 62 | - `TypedMap<K, V>`
 63 | - `TypedMapEntry<K, V>`
 64 | - `BytesArray`
 65 | - `Bytes`
 66 | - `BigInt`
 67 | - `Value`
 68 | - `ValueKind`
 69 | - `ValuePayload`
 70 | 
 71 | ##### Serialization Formats
 72 | - `JSONValue`
 73 | - `JSONValueKind`
 74 | - `JSONValuePayload`
 75 | 
 76 | ##### Ethereum Types
 77 | - `EthereumValue`
 78 | - `EthereumValueKind`
 79 | - `EthereumBlock`
 80 | - `EthereumTransaction`
 81 | - `EthereumEvent`
 82 | - `EthereumEventParams`
 83 | - `SmartContractCall`
 84 | - `SmartContract`
 85 | 
 86 | ##### Store Types
 87 | - `Entity`
 88 | 
 89 | #### Ingest
 90 | ##### Ethereum
 91 | - `ethereum.call`
 92 | 
 93 | ##### IPFS
 94 | - `ipfs.cat`
 95 | 
 96 | #### Store
 97 | - `store.set`
 98 | - `store.get`
 99 | - `store.remove`
100 | 
101 | #### Utilities
102 | - `crypto.keccak256`
103 | - `json.fromBytes`
104 | - `json.toI64`
105 | - `json.toU64`
106 | - `json.toF64`
107 | - `json.toBigInt`
108 | - `typeConversion.bytesToString`
109 | - `typeConversion.bytesToHex`
110 | - `typeConversion.bigIntToString`
111 | - `typeConversion.stringToH160`
112 | - `typeConversion.i32ToBigInt`
113 | - `typeConversion.bigIntToI32`
114 | - `bigInt.plus`
115 | - `bigInt.minus`
116 | - `bigInt.times`
117 | - `bigInt.dividedBy`
118 | - `bigInt.mod`
119 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/rpc-api/README.md:
--------------------------------------------------------------------------------
  1 | # JSON RPC API
  2 | 
  3 | This API uses JSON-RPC 2.0[<sup>1</sup>](#footnotes), a lightweight, transport-agnostic RPC protocol.
  4 | 
  5 | ## Methods
  6 | 
  7 | - getPrices
  8 | - ping
  9 | - callReadOp
 10 | 
 11 | ## Reference
 12 | 
 13 | ### getPrices
 14 | Retrieves the gas and bandwidth pricing of an Indexing Node in a specific token denomination. Prices returned are informational only; they do not represent a commitment by the Indexing Node.
 15 | 
 16 | #### Parameters
 17 | 1. `String` (optional) - The symbol of the token in which prices are requested. Valid values are 'ETH' or 'DAI'. If not specified, prices will be returned for all tokens the node denominates prices in.
 18 | 
 19 | #### Returns
 20 | `Array<Object>`
 21 |  - `token`: `String` - The symbol of the token in which the prices are denominated.
 22 |  - `gasPrice`: `Number` | `null` - The price of a unit of gas. If no price is denominated in the specified token, `null`.
 23 |  - `bandwidthPrice`: `Number` | `null` - The price of sending one byte over the network. If no price is denominated in the specified token, `null`.
 24 | 
 25 | #### Example
 26 | ```js
 27 | {
 28 |   "method": "getPrices",
 29 |   "params": ["DAI"],
 30 |   "jsonrpc": "2.0"
 31 | }
 32 | 
 33 | // response
 34 | {
 35 |   "result": [
 36 |     {
 37 |       "token": "DAI",
 38 |       "bandwidthPrice": 0.01,
 39 |       "gasPrice": 0.025
 41 |     }
 42 |   ],
 43 |   "jsonrpc": "2.0"
 44 | }
 45 | ```
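
The request and result shapes described above could be typed roughly as follows. This is a sketch under stated assumptions: the `id` member comes from the JSON-RPC 2.0 specification and does not appear in the example above, and the helper names `buildGetPricesRequest` and `priceFor` are hypothetical.

```typescript
// Shape of one entry in the `getPrices` result, as described under "Returns".
interface PriceInfo {
  token: string;
  gasPrice: number | null;
  bandwidthPrice: number | null;
}

interface JsonRpcRequest {
  jsonrpc: "2.0";
  method: string;
  params?: unknown[];
  id: number; // assumption: JSON-RPC 2.0 request identifier, not shown in the example above
}

// Build a `getPrices` request; omitting `token` asks for every denomination.
function buildGetPricesRequest(id: number, token?: "ETH" | "DAI"): JsonRpcRequest {
  const request: JsonRpcRequest = { jsonrpc: "2.0", method: "getPrices", id: id };
  if (token !== undefined) {
    request.params = [token];
  }
  return request;
}

// Pick the entry for one token from a `getPrices` result, if present.
function priceFor(result: PriceInfo[], token: string): PriceInfo | undefined {
  for (const entry of result) {
    if (entry.token === token) {
      return entry;
    }
  }
  return undefined;
}

const pricesRequest = buildGetPricesRequest(1, "DAI");
const sampleResult: PriceInfo[] = [{ token: "DAI", bandwidthPrice: 0.01, gasPrice: 0.025 }];
const daiPrices = priceFor(sampleResult, "DAI");
```

Since the returned prices are informational only, a caller would still need to handle a `PRICE_TOO_LOW` response when it later issues a read operation.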
 46 | 
 47 | ### ping
 48 | Pings a node to check that it is available and to gauge the latency of the 'pong' response.
 49 | 
 50 | #### Parameters
 51 | None
 52 | 
 53 | #### Returns
 54 | `String` - The string "pong".
 55 | 
 56 | #### Example
 57 | ```js
 58 | // request
 59 | {
 60 |   "method": "ping",
 61 |   "jsonrpc": "2.0"
 62 | }
 63 | 
 64 | // response
 65 | {
 66 |   "result": "pong",
 67 |   "jsonrpc": "2.0"
 68 | }
 69 | ```
 70 | 
 71 | ### callReadOp
 72 | Calls a low-level read operation on a database index.
 73 | 
 74 | #### Parameters
 75 | 
 76 | 1. `Object`
 77 |  - `blockHash`: `String` - The hash of the Ethereum block as of which to read the data.
 78 |  - `subgraphID`: `String` - The ID of the subgraph to read from.
 79 |  - `index`: `Object` - The [Index Record](#indexes) of the index being read from.
 80 |  - `op`: `String` - The name of the read operation.
 81 |  - `params`: `[any]` - The parameters passed into the called read operation.
 82 | 2. `Object` - A [Locked Transfer](../messages#locked-transfer) message which serves as a conditional micropayment for the read operation.
 83 | 
 84 | #### Returns
 85 | Returns one of the following message types:
 86 | 
 87 | ##### Read Result
 88 | `Object`
 89 |  - `type`: `String` - The constant "READ_RESULT"
 90 |  - `data`: `any` - The data retrieved by the read operation, if any.
 91 |  - `attestation`: `Object` - An attestation that `data` and `type` are a correct response for the given read operation (see [Attestation](../messages#attestation)).
 92 | 
 93 | ##### Not Enough Gas
 94 | Indicates that the gas limit was consumed without completing the computation. Payment will still be made to the Indexing Node for computation performed. No data is returned.
 95 | `Object`
 96 |  - `type`: `String` - The constant "NOT_ENOUGH_GAS"
 97 |  - `attestation`: `Object` - An attestation that `type` is a correct response for the given read operation (see [Attestation](../messages#attestation)).
 98 | 
 99 | ##### Not Enough Bandwidth
100 | Indicates that the bandwidth limit is insufficient to cover the response size. Payment will still be made to the Indexing Node for computation performed (but not for bandwidth). No data is returned.
101 | `Object`
102 |  - `type`: `String` - The constant "NOT_ENOUGH_BANDWIDTH"
103 |  - `attestation`: `Object` - An attestation that `type` is a correct response for the given read operation (see [Attestation](../messages#attestation)).
104 | 
105 | ##### Insufficient Funds
106 | Indicates that there are insufficient funds in the payment channel to cover the maximum amount of tokens that may be spent completing the read operation.
107 | `Object`
108 |  - `type`: `String` - The constant "INSUFFICIENT_FUNDS"
109 | 
110 | ##### Price Too Low
111 | Indicates that the price offered for gas or bandwidth is lower than what the Indexing Node will accept. Response includes up-to-date prices.
112 | `Object`
113 |  - `type`: `String` - The constant "PRICE_TOO_LOW"
114 |  - `prices`: `Object` - The currently advertised prices for the Indexing Node
115 |    - `token`: `String` - The symbol of the token in which the prices are denominated.
116 |    - `gasPrice`: `Number` | `null` - The price of a unit of gas. If no price is denominated in the specified token, `null`.
117 |    - `bandwidthPrice`: `Number` | `null` - The price of sending one byte over the network. If no price is denominated in the specified token, `null`.
118 | 
119 | ##### Must Include Payment
120 | Indicates that the Indexing Node expects a conditional micropayment to be included with the request.
121 | 
122 | `Object`
123 |  - `type`: `String` - The constant "MUST_INCLUDE_PAYMENT"
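
The message types above lend themselves to a discriminated union keyed on `type`. The sketch below is one possible client-side modeling, not a normative definition: `attestation` is left as `unknown` because its structure is defined in the Messages section, and the `NextStep` policy and `nextStep` function are hypothetical.

```typescript
// Price entry as described under "Price Too Low".
interface Prices {
  token: string;
  gasPrice: number | null;
  bandwidthPrice: number | null;
}

// One variant per message type listed above.
type ReadOpResponse =
  | { type: "READ_RESULT"; data: unknown; attestation: unknown }
  | { type: "NOT_ENOUGH_GAS"; attestation: unknown }
  | { type: "NOT_ENOUGH_BANDWIDTH"; attestation: unknown }
  | { type: "INSUFFICIENT_FUNDS" }
  | { type: "PRICE_TOO_LOW"; prices: Prices }
  | { type: "MUST_INCLUDE_PAYMENT" };

// Hypothetical client-side policy: what a caller might do next.
type NextStep = "use-data" | "raise-limits" | "deposit" | "reprice" | "attach-payment";

function nextStep(response: ReadOpResponse): NextStep {
  switch (response.type) {
    case "READ_RESULT":
      return "use-data";
    case "NOT_ENOUGH_GAS":
    case "NOT_ENOUGH_BANDWIDTH":
      return "raise-limits";
    case "INSUFFICIENT_FUNDS":
      return "deposit";
    case "PRICE_TOO_LOW":
      return "reprice";
    case "MUST_INCLUDE_PAYMENT":
      return "attach-payment";
    default: {
      // Compile-time check that every message type is handled.
      const unreachable: never = response;
      return unreachable;
    }
  }
}
```

Switching on the `type` constant lets the compiler narrow each branch, so `prices` is only accessible on `PRICE_TOO_LOW` and `data` only on `READ_RESULT`.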
124 | 
125 | 
126 | 
127 | #### Example
128 | ###### Example - Entity exists
129 | ```js
130 | // request
131 | {
132 |   "method": "callReadOp",
133 |   "params": [
134 |     {
135 |       "blockHash": "0xbf133b670857b983fc1b8f08759bc860378179042a0dba30b30e26d6f7f919d1",
136 |       "subgraphID": "QmTeW79w7QQ6Npa3b1d5tANreCDxF2iDaAPsDvW6KtLmfB",
137 |       "index": {
138 |         "indexType": "kv"
139 |       },
140 |       "op": "get",
141 |       "params": ["User:1"]
142 |     }
143 |   ],
144 |   "jsonrpc": "2.0"
145 | }
146 | // response
147 | {
148 |   "data": {
149 |     "firstName": "Vitalik",
150 |     "lastName": "Buterin"
151 |   },
152 |   // TODO: Provide more realistic attestations
153 |   "attestation": "0x0122340"
154 | }
155 | ```
156 | 
157 | ###### Example - Entity doesn't exist
158 | ```js
159 | // request
160 | {
161 |   "method": "callReadOp",
162 |   "params": [
163 |     {
164 |       "blockHash": "0xbf133b670857b983fc1b8f08759bc860378179042a0dba30b30e26d6f7f919d1",
165 |       "index": {
166 |         "indexType": "kv"
167 |       },
168 |       "op": "get",
169 |       "params": ["User:1"]
170 |     }
171 |   ],
172 |   "jsonrpc": "2.0"
173 | }
174 | // response
175 | {
176 |   "data": null,
177 |   // TODO: Provide more realistic attestations
178 |   "attestation": "0x0122340"
179 | }
180 | ```
181 | 
182 | ## Footnotes
183 | - [1] https://www.jsonrpc.org/specification
184 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/query-processing/README.md:
--------------------------------------------------------------------------------
 1 | # Query Processing
 2 | 
 3 | ## Background
 4 | In some respects, The Graph resembles a traditional distributed query engine, where users may retrieve data that is distributed across a variety of stores via a single query interface, typically SQL.
 5 | 
 6 | In this analogy, Query Nodes play the role of query engine, and Indexing Nodes play the role of data stores.
 7 | 
 8 | Notably, the protocol only defines the *read interface* to the Indexing Nodes, which is consumed by the Query Nodes, while remaining agnostic to the actual implementation of the Query Nodes.
 9 | 
10 | Some Query Node implementations may provide an SQL interface, while others may provide a GraphQL interface. Query Nodes may implement different heuristics for balancing the requirements of price, performance, and economic security or make these algorithms pluggable. Indeed, some users may choose to forgo the Query Node altogether and directly consume the lower-level read interface exposed by the Indexing Nodes.
11 | 
12 | While this last case is certainly possible, we present the architecture and high-level algorithms for consuming the data indexed by The Graph via a distributed query engine because that is the intended usage pattern of the protocol.
13 | 
14 | The specification will omit detailed steps in the algorithms that are left to implementers. However, the Graph Protocol team will implement a reference Query Node/Client that provides a concrete example of how these algorithms might be implemented in order to provide a GraphQL interface to The Graph.
15 | 
16 | ## Query Processing Architecture
17 | 
18 | ### With Query Node
19 | ![Query Node Architecture](../assets/query-node-architecture.png)
20 | 
21 | 
22 | ### With Query Client
23 | ![Query Client Architecture](../assets/query-client-architecture.png)
24 | 
25 | ## Overview
26 | As shown in the diagrams above, the query processing may take place via a Query Client, which is embedded in the end-user application, or it may take place via a Query Node that is external to the application. In the latter case, the Query Node may be running locally on the user's machine or as an external service that is accessed via the Internet. Some extra-protocol Query Node providers may choose to also run Indexing Nodes to provide a low-latency GraphQL interface, optimized for specific datasets.
27 | 
28 | In either construction, query processing consists of the following steps:
29 | 1. Query Planning (Optional)
30 | 1. Service Discovery
31 | 1. Service Selection
32 | 1. Processing and Payment
33 | 1. Response Collation
34 | 
35 | ## Design
36 | 
37 | ### Query Planning
38 | In this stage, the Query Node transforms a query into a plan, consisting of an ordered set of lower-level read operations that may be used to retrieve the data specified by the query. These steps are encapsulated in some intermediate representation (IR).
39 | 
40 | | Implementor's Note |
41 | | ----------------- |
42 | | Query planning is optional. For example, producing a query plan is common for most SQL databases, but for GraphQL server implementations, it is common to directly process the query in field-level resolvers. |
43 | 
44 | ### Query Optimization
45 | Query plans may optionally be optimized based on a variety of heuristics and algorithms, which are out of the scope of this specification.
46 | 
47 | ### Service Discovery
48 | Processing a query plan, or processing a query directly, results in low-level read operations being made to Indexing Nodes. Each read operation corresponds to a specific dataset and, thus, needs to be made against an Indexing Node for that dataset. In the Service Discovery step, the Query Node locates Indexing Nodes for a specific dataset as well as important metadata that is useful in deciding which Indexing Node to issue read operations to, such as price, performance, and economic security margin.
49 | 
50 | #### Locating Indexing Nodes
51 | To locate Indexing Nodes with data for a specific dataset, the Query Node makes several calls to the service discovery layer, which is implemented as several smart contracts on the Ethereum mainnet:
52 | 
53 | 1. Resolve subgraph names via the Graph Name Service (GNS).
54 | 1. Identify per-subgraph Indexing Nodes via the Staking Contract.
55 | 1. Identify Indexing Node URLs via the Service Registry.
56 | 
57 | #### Collecting Indexing Node Metadata
58 | After identifying the URLs of all Indexing Nodes for a given dataset, the next step is to collect the metadata for price, performance, and economic security margin. This information should be cached for future Service Discovery steps for subsequent queries.
59 | 
60 | Fetching price and latency for a node is done via a single call to the Indexing Node RPC API and returns the following data: the latency required to fulfill the request; a `bandwidthPrice` measured in price per byte transmitted over the network; and a `gasPrice`, which captures the cost of compute and IO for a given read operation.
61 | 
62 | Economic security margin is the amount that an Indexing Node has staked and is willing to forfeit in the event that they provide an incorrect response to a read operation. The Query Node receives this in the previously made call to the Staking Contract.
63 | 
64 | | Implementor's Note |
65 | | ----------------- |
66 | | The Query Node does not need to make calls to every Indexing Node for a given dataset. It could choose to only contact a randomly selected subset or to keep contacting Indexing Nodes until it finds one that meets its selection criteria.|
67 | 
68 | ### Service Selection
69 | In the Service Selection stage, Query Nodes choose which Indexing Nodes to transact with for each read operation. An algorithm for this stage could incorporate `latency` (measured in ms), `economicSecurityMargin` (measured in Graph Tokens), `gasPrice`, and `bandwidthPrice` (the cost of sending a byte over the network).
70 | 
71 | A naive algorithm for service selection could look like the following:
72 | 1. Filter out Indexing Nodes where `economicSecurityMargin < minEconomicSecurityMargin`.
73 |     - If no Indexing Nodes remain, return an error to the sender of the query for this piece of data, and specify the reason.
74 | 2. Keep only Indexing Nodes where `latency < minLatency`.
75 |     - If no Indexing Nodes remain, increase `minLatency` by 33%, and repeat the step.
76 | 3. Estimate the cost of the read operation for each remaining Indexing Node.
77 |     - Assume 80% of the maximum possible entities returnable by the query will be returned.
78 |     - Assume 25% of the max field size (in bytes) for each entity field with variable size.
79 |     - Calculate the bandwidth and gas costs based on the above assumptions and the gas costs specified in the [Read Interface](../read-interface).
80 | 4. Choose the Indexing Node with the lowest estimated cost for the read operation.
81 | 
82 | In this example algorithm, `minLatency` and `minEconomicSecurityMargin` could be set per dataset or for all datasets. Additionally, they could be set by the Query Node or sent as metadata with an individual query.
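
The naive algorithm above can be sketched in TypeScript as follows. This is an illustration under stated assumptions rather than a reference implementation: the `ReadOpEstimate` fields (including `gasPerEntity`) are hypothetical stand-ins for the gas costs defined in the Read Interface, and `minLatency` is treated as the latency threshold a node must stay under, as in step 2.

```typescript
interface IndexingNode {
  url: string;
  latency: number; // ms
  economicSecurityMargin: number; // Graph Tokens
  gasPrice: number;
  bandwidthPrice: number; // price per byte
}

// Hypothetical description of a read operation, used only for estimation.
interface ReadOpEstimate {
  maxEntities: number;
  fixedBytesPerEntity: number;
  maxVariableBytesPerEntity: number;
  gasPerEntity: number; // assumed gas cost per entity read
}

interface SelectionCriteria {
  minEconomicSecurityMargin: number;
  minLatency: number; // must be positive
}

// Step 3: 80% of the maximum entities, 25% of the maximum variable field size.
function estimateCost(node: IndexingNode, op: ReadOpEstimate): number {
  const entities = 0.8 * op.maxEntities;
  const bytes = entities * (op.fixedBytesPerEntity + 0.25 * op.maxVariableBytesPerEntity);
  const gas = entities * op.gasPerEntity;
  return bytes * node.bandwidthPrice + gas * node.gasPrice;
}

function selectIndexingNode(
  nodes: IndexingNode[],
  op: ReadOpEstimate,
  criteria: SelectionCriteria
): IndexingNode {
  if (!(criteria.minLatency > 0)) {
    throw new Error("minLatency must be positive");
  }
  // Step 1: drop nodes with too little stake.
  const staked = nodes.filter(
    (n) => n.economicSecurityMargin >= criteria.minEconomicSecurityMargin
  );
  if (staked.length === 0) {
    throw new Error("no Indexing Node meets the minimum economic security margin");
  }
  // Step 2: relax the latency threshold by 33% until a node qualifies.
  let threshold = criteria.minLatency;
  let fast = staked.filter((n) => n.latency < threshold);
  while (fast.length === 0) {
    threshold *= 1.33;
    if (!isFinite(threshold)) {
      throw new Error("no Indexing Node reports a finite latency");
    }
    fast = staked.filter((n) => n.latency < threshold);
  }
  // Steps 3 and 4: pick the node with the lowest estimated cost.
  return fast.reduce((best, n) => (estimateCost(n, op) < estimateCost(best, op) ? n : best));
}

const candidates: IndexingNode[] = [
  { url: "https://node-a.example", latency: 50, economicSecurityMargin: 1000, gasPrice: 0.025, bandwidthPrice: 0.01 },
  { url: "https://node-b.example", latency: 120, economicSecurityMargin: 5000, gasPrice: 0.01, bandwidthPrice: 0.005 },
  { url: "https://node-c.example", latency: 40, economicSecurityMargin: 10, gasPrice: 0.001, bandwidthPrice: 0.001 },
];
const readOp: ReadOpEstimate = { maxEntities: 100, fixedBytesPerEntity: 32, maxVariableBytesPerEntity: 256, gasPerEntity: 10 };
const chosen = selectIndexingNode(candidates, readOp, { minEconomicSecurityMargin: 100, minLatency: 100 });
```

With these sample values, the cheapest node (`node-c`) is excluded for insufficient stake and `node-b` for latency, so `node-a` is chosen; raising `minLatency` to 200 would admit `node-b`, which has the lower estimated cost.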
83 | 
84 | ### Processing and Payment
85 | Available read operations are defined in the [Read Interface](../read-interface), and are sent to the Indexing Nodes via the [JSON-RPC API](../rpc-api). They are accompanied by Locked Transfers, conditional micropayments that may be unlocked by the Indexing Node producing a Read Response and a signed [Attestation](../messages#attestation) message certifying the response data is correct.
87 | 
88 | ### Response Collation
89 | Once all read operations have been processed, the resulting data must be collated into a response that fulfills the read schema of the query interface provided by the Query Node. This response is then returned to the sender of the query.
90 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/payment-channels/README.md:
--------------------------------------------------------------------------------
 1 | # Payment Channels
 2 | 
 3 | ## Overview
 4 | The protocol adopts payment channels as a means of facilitating micropayments that are paid to Indexing Nodes in exchange for reading from indexes.
 5 | 
 6 | The protocol's payment channel architecture follows the Raiden Network Specification[<sup>1</sup>](#footnotes), with a few notable differences:
 7 | 1. The Graph v1 implements a hub-and-spoke topology that dramatically simplifies payment routing, compared to a fully distributed network topology.
 8 | 1. The Graph v1 introduces a new concept, *minting channels*, to get around prohibitively large balance requirements for the payment channel hub.
 9 | 1. The Graph uses an alternate locking mechanism for mediated payments that is tailored to the domain of reading data from indexes.
10 | 1. Payment channels are one-way and may be withdrawn from by the payment receiver without closing the channel. See [Payment Channel](#payment-channel).
11 | 1. Balance Proofs may be exchanged off-chain directly between the sender and final recipient of a mediated transfer. See [Micropayment Routing](#micropayment-routing).
12 | 
13 | In this construction, the Payment Channel Hub acts as a *trusted intermediary*, although users of the protocol have no counterparty risk with respect to one another. A future version of this protocol will move away from the hub-and-spoke topology in favor of a decentralized payment channel network topology. At that point, minting channels will also likely be removed from the specification.
14 | 
15 | ## Hub-and-Spoke Topology
16 | ### Architecture
17 | ```ascii
18 |        +-------------------+-------------------+------------------+
19 |        |                   |                   |                  |
20 |        |                   |                   |                  |
21 |   +----+-----+        +----+-----+       +-----+----+             |
22 |   |          |        |          |       |          |             |
23 |   | Indexing |        | Indexing |       | Indexing |             |
24 |   | Node     |        | Node     |       | Node     |             |
25 |   |          |        |          |       |          |             |
26 |   +-----^----+        +----^-----+       +-----^----+         Sell GRT/
27 |         |                  |                   |              Buy ETH
28 |         |                  |                   |                  |
29 |         |                  |                   |                  |
30 |         +---GRT-----+     GRT     +-----GRT----+                  |
31 |                     |      |      |                               |
32 |  minting channels   |      |      |                               |
33 |                     |      |      |                               |
34 |                 +---+------+------+---+                  +--------v------+
35 |                 |                     |      Sell ETH/   |               |
36 |                 | Payment Channel Hub +------------------> Token Auction |
37 |                 |                     |      Buy GRT     |               |
38 |                 +---^------^------^---+                  +---------------+
39 |                     |      |      |
40 |  payment channels   |      |      |
41 |                     |      |      |
42 |       +-----ETH-----+     ETH     +----ETH---+
43 |       |                    |                 |
44 |       |                    |                 |
45 | +-----+------+      +------+-----+     +-----+------+
46 | |            |      |            |     |            |
47 | |  End user  |      |  End user  |     |  End user  |
48 | |            |      |            |     |            |
49 | +------------+      +------------+     +------------+
50 | ```
51 | ### High-Level Design
52 | End users pay Indexing Nodes via the Payment Channel Hub. Micropayments from end users to the Payment Channel Hub are denominated in ETH or DAI, while micropayments from the Payment Channel Hub to Indexing Nodes are denominated in Graph Tokens, which the Payment Channel Hub mints.
53 | 
54 | To determine the exchange rate between ETH or DAI and Graph Tokens, the Payment Channel Hub reads from an on-chain Token Auction contract that acts as a price feed. The Token Auction contract also acts as a sink for the Graph Tokens that are minted by the Payment Channel Hub and provides a mechanism for selling the ETH or DAI that the Payment Channel Hub collects.
55 | 
56 | ## Payment Channel Hub
57 | The Payment Channel Hub is a service, an externally owned Ethereum account operated by the Graph Protocol Team. It acts as a counterparty for payment channels with end users and for minting channels with Indexing Nodes.
58 | 
59 | To act as a counterparty for the minting channel contract, the Ethereum account corresponding to the Payment Channel Hub must be designated as a *treasurer* of the Graph Token (GRT) ERC-20 contract, which grants the hub the right to mint Graph Tokens.
60 | 
61 | ## Payment Channel
62 | The payment channel presented here is modeled on the payment channel design in the Raiden Specification[<sup>2</sup>](#footnotes), with the key difference that payments are only made in one direction, from the end user to the Payment Channel Hub. This difference enables other simplifications in the design:
63 | 1. Balance proofs may be settled on-chain continuously, without closing the channel.
64 | 1. Tokens may be withdrawn from the channel by the Payment Channel Hub continuously, without closing the channel.
65 | 
66 | As is the case with normal payment channel contracts, deposits may be made by participants, specifically the end-user, on an ongoing basis. For the end-user to withdraw tokens that they deposited, the channel must be closed and settled.
67 | 
68 | ## Minting Channel
69 | Traditional payment channels involve exchanging off-chain messages that are "backed" by a deposit in a channel on-chain, which may be used to settle the final balance when the payment channel is closed. We present a variation on this construction, where instead of being backed by a deposit, payments in the channel are backed by the ability of one participant, the sender, to mint the token that the micropayments are denominated in.
70 | 
71 | Specifically, the Payment Channel Hub has the ability to mint Graph Tokens to pay Indexing Nodes an amount equivalent to the amount of ETH or DAI paid toward that Indexing Node by an end user. The minting channel acts as the second leg of a mediated transfer.
72 | 
73 | The minting channel should be settled once per *round*. See [Mechanism Design](../mechanism-design) for more information.
74 | 
75 | ## Micropayment Routing
76 | Because of the hub-and-spoke payment channel topology, micropayment routing is trivial. Payments must always go through the Payment Channel Hub, and only deposits in the first leg of the mediated payment need be checked to confirm that there are sufficient funds to cover the transfer. As such, balance proofs may be exchanged directly between the sender and final recipient of a mediated micropayment, and the receiver may send messages to the Payment Channel Hub on the sender's behalf. This facilitates sending valid [Locked Transfer messages](../messages) in-band with requests to the [JSON RPC API](../rpc-api).
77 | 
78 | ## Token Auction
79 | With the minting channel construction, new Graph Tokens are minted in direct proportion to the amount of value exchanged between end users and Indexing Nodes in a given round. The Token Auction acts as a sink for these new Graph Tokens, whereby the Payment Channel Hub "buys back" the Graph Tokens previously minted in exchange for the ETH or DAI collected in the payment channels through an on-chain auction mechanism.
80 | 
81 | **TODO** [Select on-chain auction/ price feed mechanism.](https://github.com/graphprotocol/research/issues/79)
82 | 
83 | The Token Auctions implemented as smart contracts on the Ethereum blockchain also act as on-chain price feeds, indicating an exchange rate between the supported tokens and GRT. This may be used by the Payment Channel Hub to determine the amount of Graph Tokens to send on the second leg of the mediated transfer via the minting channel.
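
To make the second-leg calculation concrete, here is a minimal sketch of converting a first-leg payment into a Graph Token amount. The `ExchangeRate` shape and `grtForPayment` function are hypothetical, since the price feed mechanism has not been selected, and plain `number` arithmetic is used only for brevity: token base units on Ethereum exceed the safe integer range, so a real implementation would need arbitrary-precision integers.

```typescript
// Exchange rate as it might be read from the price feed (hypothetical shape):
// `grtNumerator` GRT base units per `tokenDenominator` base units of the paid
// token. A fraction avoids representing the rate as a floating-point value.
interface ExchangeRate {
  grtNumerator: number;
  tokenDenominator: number;
}

function isNonNegativeInteger(value: number): boolean {
  return isFinite(value) && value >= 0 && Math.floor(value) === value;
}

// Amount of GRT to send on the second leg of a mediated transfer.
// Rounds down so the hub never sends more than the first leg covers.
function grtForPayment(paidAmount: number, rate: ExchangeRate): number {
  if (!isNonNegativeInteger(paidAmount)) {
    throw new Error("paidAmount must be a non-negative integer of base units");
  }
  if (!isNonNegativeInteger(rate.grtNumerator) || !isNonNegativeInteger(rate.tokenDenominator) || rate.tokenDenominator === 0) {
    throw new Error("exchange rate must be a fraction of non-negative integers with a non-zero denominator");
  }
  return Math.floor((paidAmount * rate.grtNumerator) / rate.tokenDenominator);
}

const wholeRate: ExchangeRate = { grtNumerator: 250, tokenDenominator: 1 };
const fractionalRate: ExchangeRate = { grtNumerator: 3, tokenDenominator: 2 };
const secondLegWhole = grtForPayment(4, wholeRate);
const secondLegRoundedDown = grtForPayment(5, fractionalRate);
```

Rounding down is a design choice made for this sketch; the specification does not state a rounding rule.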
84 | 
85 | ## Data Retrieval Timelock
86 | Traditional mediated transfers via payment channels use a hash timelock, in which payments are unlocked by providing the pre-image to a hash. In The Graph, micropayments are unlocked by the Indexing Node performing useful work, that of reading from an index, and providing an attestation that the work was performed correctly. Rather than a fixed amount of tokens being locked, the amount of tokens unlocked is dynamic, based on the amount of bandwidth and computation required to fulfill the request and on the maximum computation and bandwidth limits, which are defined in the lock.
87 | 
88 | See [Data Retrieval Timelock](../messages#data-retrieval-timelock) in the Messages section of the specification.
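
The dynamic unlock amount could be computed along the following lines. This is a sketch, not the normative rule: the field names on `RetrievalLock` and `WorkPerformed` are hypothetical, and the billing behavior at the limits is inferred from the "Not Enough Gas" and "Not Enough Bandwidth" responses in the JSON RPC API (computation is paid for up to the gas limit; bandwidth is paid for only when a response is actually returned).

```typescript
// Hypothetical fields of a lock; the actual message is defined in the
// Messages section of the specification.
interface RetrievalLock {
  gasPrice: number;
  bandwidthPrice: number; // price per byte
  maxGas: number;
  maxBytes: number;
}

interface WorkPerformed {
  gasUsed: number; // gas the read operation needed
  responseBytes: number; // size of the response it would produce
}

function unlockedAmount(lock: RetrievalLock, work: WorkPerformed): number {
  // Computation is paid for, but never beyond the gas limit in the lock.
  const billableGas = Math.min(work.gasUsed, lock.maxGas);
  // A response is only sent if the computation finished within the gas limit
  // and the result fits within the bandwidth limit.
  const responseSent = work.gasUsed <= lock.maxGas && work.responseBytes <= lock.maxBytes;
  const billableBytes = responseSent ? work.responseBytes : 0;
  return billableGas * lock.gasPrice + billableBytes * lock.bandwidthPrice;
}

const sampleLock: RetrievalLock = { gasPrice: 2, bandwidthPrice: 1, maxGas: 100, maxBytes: 1000 };
const fullRead = unlockedAmount(sampleLock, { gasUsed: 40, responseBytes: 500 });
const outOfGas = unlockedAmount(sampleLock, { gasUsed: 150, responseBytes: 500 });
const tooLarge = unlockedAmount(sampleLock, { gasUsed: 40, responseBytes: 2000 });
```

In this model the lock's limits bound the sender's maximum exposure at `maxGas * gasPrice + maxBytes * bandwidthPrice`, which is the amount that must be available in the payment channel before the read is attempted.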
89 | 
90 | ## Footnotes
91 | - [1] https://github.com/raiden-network/spec
92 | - [2] https://raiden-network-specification.readthedocs.io/en/latest/smart_contracts.html#tokennetwork-channel-protocol-overview
93 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/architecture-overview/README.md:
--------------------------------------------------------------------------------
 1 | # Architecture Overview
 2 | 
 3 | ## High-Level Architecture
 4 | 
 5 | ```ascii
 6 |     +------------------------------------------------------------------+
 7 |     |                                                                  |
 8 |     | Decentralized Application (dApp)                                 |
 9 |     |                                                                  |
10 |     +-+---------------------------------^--------------------------+---+
11 |       |                                 |                          |
12 |       |                              Queries                       |
13 |       |                                 |                          |
14 |       |   +-----------------------------+--------------------+     |
15 |       |   |                                                  |     |
16 |       |   |  Query Nodes and Clients                         | Micropayments
17 |       |   |                                                  |     |
18 |       |   +---------+-------------------^--------------------+     |
19 |       |             |                   |                          |
20 | Transactions   Attestations     (Reads, Attestations)              |
21 |       |             |                   |                          |
22 |       |   +---------v-----------+  +----+--------------------------v---+
23 |       |   |                     |  |                                   |
24 |       |   |  Fisherman Service  |  | Indexing Nodes                    |
25 |       |   |                     |  |                                   |
26 |       |   +---------+-----------+  +----^-------------------^----------+
27 |       |             |                   |                   |
28 |       |         Disputes          (Events, Data)          Data
29 |       |             |                   |                   |
30 |     +-v-------------v-------------------+-----+ +-----------+----------+
31 |     |                                         | |                      |
32 |     |                Ethereum                 | |   IPFS               |
33 |     |                                         | |                      |
34 |     +-----------------------------------------+ +----------------------+
35 | ```
36 | 
37 | ## Overview
38 | The Graph supports a command query responsibility segregation (CQRS) pattern where dApps send commands (transactions) directly to the underlying Ethereum blockchain, but issue queries (reads) against the layer 2 Indexing Nodes, in exchange for micropayments, via a Query Node or Query Client. In addition to being able to scale reads independently from transactions, there is the added benefit of being able to specify read semantics which differ from the more limited write semantics supported by the Ethereum blockchain (see [Data Modeling](../data-modeling)). Indeed it allows the dApp to query a completely different view on the underlying blockchain data, which may be augmented by data that is stored off-chain on IPFS. Read responses are accompanied by [attestations](../messages#attestation), messages which certify the correctness of a response, which a Query Node may optionally provide to a Fisherman Service to improve the economic security guarantees as to the correctness of responses in the network (see [Mechanism Design](../mechanism-design)).
39 | 
40 | **Note:** While the above diagram conveys the full architecture of a dApp interacting with The Graph, the protocol is primarily concerned with the interface to the Indexing Nodes, as well as the mechanisms implemented by a series of smart contracts which shall be deployed to the Ethereum mainnet. Query Nodes, Query Clients, and Fisherman Services are "extra-protocol", which is to say that while they may be described in this document to add color, the protocol is agnostic to their specific interfaces and logic, and indeed we expect multiple implementations to arise with distinct interfaces and logic. Multiple implementations of the Indexing Nodes may also arise, in a variety of languages; however, the service interfaces of each implementation must adhere strictly to the protocol defined in this specification.
41 | 
42 | ## Components
43 | 
44 | ### Decentralized Application (dApp)
45 | This is an application run by an end user in their browser or on their device. Its data and business logic primarily live on the Ethereum blockchain and IPFS, meaning that it is safe from censorship and the economic risks of the developers shutting down or going out of business. In exchange for this robustness, a user assumes the cost of operating the infrastructure required to power the dApp by paying gas costs to transact against the Ethereum blockchain and making micropayments for metered usage of The Graph to query the data required to power the dApp. The dApp may interact with The Graph via an embedded Query Client or external Query Node.
46 | 
47 | ### Query [Nodes | Clients]
48 | Query Nodes provide an abstraction on top of the low-level read API provided by the Indexing Nodes. The Query Nodes may optionally choose to provide a GraphQL interface, SQL interface, or traditional REST interface, whatever is best-suited toward the respective domain in which it will be used. We include a reference JavaScript Query Node that provides a GraphQL interface and may be embedded and extended in a server or browser application as a Query Client.
49 | 
50 | In addition to providing an interface to dApps, the Query Node is responsible for discovering Indexing Nodes in the network that are indexing a specific dataset, and selecting an Indexing Node to read from based on factors such as price and performance (see [Query Processing](../query-processing)). It may also optionally forward attestations along to a Fisherman Service.
51 | 
52 | ### Indexing Nodes
53 | Indexing Nodes index one or more user-defined datasets, called *subgraphs*. These nodes perform a deterministic streaming extract, transform and load (ETL) of events emitted by the Ethereum blockchain. These events are processed by user-defined logic called *mappings*, which run deterministically inside a WASM runtime and are also able to load additional data from the Ethereum blockchain or IPFS in order to compute the current state of a subgraph. See [Datasets](../datasets) for more information.
54 | 
55 | **Note:** Here, and throughout this document, "event" is used in its standard usage, meaning data which is emitted asynchronously and may act as a trigger for computation. This is to disambiguate from "Solidity events," which build atop Ethereum's low-level logging facilities, and will be referred to throughout this specification as "Solidity events" or "Ethereum logs". Indexing Nodes will process events which include Ethereum logs, new blocks, as well as internal and external Ethereum transactions.
56 | 
57 | Indexing Nodes implement a standard interface for reading from indexes and for advertising compute and bandwidth prices for read operations. See [JSON-RPC API](../rpc-api) for the full interface.
58 | 
59 | ### Fisherman Service
60 | Fisherman Services accept read responses and attestations which they may verify, and in the event of an invalid response, may file a protocol-level dispute (see [Mechanism Design](../mechanism-design)). Note that whether or not the Fisherman Service actually verifies the response is completely opaque to the end-user and the protocol.
61 | 
62 | In v1 of the protocol, the Graph Protocol team will operate a Fisherman Service.
63 | 
64 | ### IPFS
65 | "IPFS" refers to the InterPlanetary File System, a decentralized content-addressed storage network. Data stored on IPFS is identified by a content ID (CID) which is computed by encoding and hashing the content being stored.[<sup>1</sup>](#footnotes) It has become a common pattern to store these CIDs in Ethereum contracts, providing a form of cheap, decentralized, off-chain storage. The Graph will index data stored on IPFS and referenced in Ethereum smart contracts to support this use case.
66 | 
67 | The Graph also uses IPFS in this way: subgraph manifests are stored on IPFS and referenced on-chain in the protocol's smart contracts (see [Subgraph Manifest](../subgraph-manifest)). In the future, we may store indexed data and even query results on IPFS, as having data stored in a global decentralized file system increases the chances of reuse, adds redundancy for widely used data, and effectively provides caching for free.
68 | 
69 | ### Ethereum
70 | Ethereum is a blockchain that can run small Turing-complete executable programs called smart contracts. Consensus is built around the results of these computations, using a Byzantine fault tolerant (BFT) consensus algorithm, meaning that a centralized actor cannot easily tamper with or rewrite the results of past computations.[<sup>2</sup>](#footnotes)
71 | 
72 | The Ethereum blockchain plays two principal roles in the protocol. First, dApps include business logic implemented as smart contracts deployed to the Ethereum blockchain, which in turn emit events and store data that is indexed by The Graph. Second, the mechanisms that define the incentives and economic security of The Graph are themselves implemented as smart contracts deployed to the Ethereum blockchain.
73 | 
74 | ## Footnotes
75 | - [1] https://ipfs.io/ipfs/QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7zrJa3LX/ipfs.draft3.pdf
76 | - [2] https://github.com/ethereum/wiki/wiki/White-Paper
77 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/mechanism-design/README.md:
--------------------------------------------------------------------------------
  1 | # Mechanism Design
  2 | 
  3 | ## Overview
  4 | The protocol implements a *work token* model[<sup>1</sup>](#footnotes) in which Indexing Node operators stake deposits of Graph Tokens for particular datasets, called subgraphs, to gain the right to participate in the data retrieval marketplace for that dataset, indexing data and responding to read requests in exchange for micropayments. This deposit is forfeited in the event that the work is not performed correctly, or is performed maliciously, as defined in the [slashing conditions](#dispute-resolution).
  5 | 
  6 | There are secondary mechanisms in the protocol that also require a staking of tokens, such as curation, stake delegation, and name registration, all of which will be expanded upon in their respective sections.
  7 | 
  8 | ## Graph Token
  9 | We introduce a native token for the protocol, the Graph Token, which is the only token that may be used for staking in the network. However, ETH or DAI is used for paying for read operations, thus reducing friction and balance sheet risk for end-users of dApps that query The Graph. Graph Tokens will have variable inflation to reward specific activities in the network, as described in [Inflation Rewards](#participation-adjusted-inflation-reward).
 10 | 
 11 | ## Governance
 12 | There are several parameters throughout this mechanism design that are set via a governance process. In the v1 specification, governance will consist of a multi-sig wallet contract controlled by the Graph Protocol team.
 13 | 
 14 | In future versions of the protocol, more decentralized forms of governance will be explored.
 15 | 
 16 | ## Staking
 17 | Indexing Nodes deposit a `stakingAmount` of Graph Tokens to process read requests for a specific dataset, which is identified by its `subgraphID`.
 18 | 
 19 | For a `stakingAmount` to be considered valid, it must meet the following requirements:
 20 |  - `stakingAmount >= minStakingAmount` where `minStakingAmount` is set via governance.
 21 |  - The `stakingAmount` must be in the set of the top N staking amounts, where N is determined by the `maxIndexers` parameter that is set via governance.
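
The two requirements above can be sketched as a single check. This is a minimal illustration, not the protocol's contract logic: the function name, the `existing_stakes` input, and the tie-breaking rule are assumptions introduced here.

```python
def is_valid_stake(staking_amount, existing_stakes, min_staking_amount, max_indexers):
    """Return True if staking_amount meets both requirements listed above.

    existing_stakes is a hypothetical list of the amounts already staked
    by other Indexing Nodes on the same subgraph.
    """
    # Requirement 1: meet the governance-set minimum.
    if staking_amount < min_staking_amount:
        return False
    # Requirement 2: be among the top N amounts, where N = maxIndexers.
    # Ties are resolved in the candidate's favor (an assumption; the
    # specification does not define tie-breaking).
    strictly_higher = sum(1 for s in existing_stakes if s > staking_amount)
    return strictly_higher < max_indexers
```

With `max_indexers = 3`, a stake of 100 is valid against existing stakes of 300 and 200, but not once a third, larger stake of 150 is present.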
 22 | 
 23 | Indexing Nodes that have staked for a dataset are not limited by the protocol in how many read requests they may process for that dataset. However, it may be assumed that Indexing Nodes with higher deposits will receive more read requests and, thus, collect more fees, if all else is equal, as this represents a greater economic security margin to the end user.
 24 | 
 25 | ## Data Retrieval Market
 26 | Indexing Nodes which have staked to index a particular dataset will be discoverable in the data retrieval market for that dataset.
 27 | 
 28 | Indexing Nodes compete to have the most compelling combination of economic security margin (the amount of tokens staked), performance and price to attract read requests from users of the network. See [Query Processing](../query-processing) for an example market interaction.
 29 | 
 30 | Indexing Nodes receive requests which include a [Read Operation](../messages#read-operation) and a Locked Transfer.
 31 | 
 32 | The Read Operation fully defines the data that is being requested, while the [Locked Transfer](../messages#locked-transfer) is a micropayment that is paid conditionally on the Indexing Node producing a [Read Response](../messages#read-response) along with a signed [Attestation](../messages#attestation) message which certifies that the response data is correct.
 33 | 
 34 | ## Data Retrieval Pricing
 35 | Pricing in the data retrieval market is set according to the bandwidth and compute required to process a request.
 36 | 
 37 | Compute is priced as a `gasPrice`, denominated in ETH or DAI, where the `gas` required for a request is determined by the specific read operation and parameters. See [Read Interface](../read-interface) for operation specific gas prices.
 38 | 
 39 | Bandwidth is priced in `bytesPrice`, denominated in ETH or DAI, where `bytes` refers to the size of the `data` portion of the response, measured in bytes.
 40 | 
 41 | Indexing Nodes respond with their compute and bandwidth costs in response to the `getPrices` method in the [JSON-RPC API](../rpc-api).
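
As a rough sketch of how the two price components might combine, the function below sums a compute charge and a bandwidth charge. The additive formula and the function name are assumptions made for illustration; the specification only states that pricing depends on compute and bandwidth.

```python
def read_cost(gas_used, response_bytes, gas_price, bytes_price):
    """Estimate the cost of one read, in the smallest unit of ETH or DAI.

    Assumes (hypothetically) that the total is the sum of the compute
    charge and the bandwidth charge, using integer prices.
    """
    compute_cost = gas_used * gas_price          # `gas` priced at `gasPrice`
    bandwidth_cost = response_bytes * bytes_price  # `bytes` priced at `bytesPrice`
    return compute_cost + bandwidth_cost
```

For example, 1000 gas at a `gasPrice` of 2 plus 200 bytes at a `bytesPrice` of 3 would cost 2600 units under this assumption.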
 42 | ## Verification
 43 | 
 44 | ### Fisherman Service
 45 | A Fisherman Service is an economic agent who verifies read responses in exchange for a reward in cases where they detect that an Indexing Node has attested to an incorrect response, and the Fisherman successfully disputes the response on-chain.
 46 | 
 47 | In v1 of the protocol, the Graph Protocol team will operate a Fisherman Service. This accommodates the fact that, in the absence of forced errors in the v1 protocol, Fisherman rewards should go to zero over time, and thus a Fisherman must have altruistic motives in order to perform the service.
 48 | 
 49 | ### Dispute Resolution
 50 | Dispute resolution is handled through an on-chain dispute resolution process. In future versions of the protocol, this may involve programmatically verifying proofs or using a Truebit-style verification game, but in the v1 specification, the outcome of a dispute will be decided by a centralized arbitrator interacting with the on-chain dispute resolution process.
 51 | 
 52 | To dispute a response, a Fisherman must submit the attestation of the response they are disputing as well as a deposit.
 53 | 
 54 | **TODO** [Define deposit amount for Fisherman disputes](https://github.com/graphprotocol/research/issues/76)
 55 | 
 56 | In the event of a successful dispute the Indexing Node forfeits the entire deposit of tokens they staked on the dataset for which they produced an incorrect response. The Fisherman, in turn, receives a reward equal to a percentage of the slashed deposit.
 57 | 
 58 | **TODO** [Define slashing reward for successful disputes](https://github.com/graphprotocol/research/issues/77)
 59 | 
 60 | In the event of an unsuccessful dispute, the Fisherman forfeits the entire deposit they submitted with their dispute.
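
The two dispute outcomes described above can be summarized in a small sketch. The reward percentage is still undefined in this specification (see the TODO above), so `reward_percent` is a hypothetical parameter, as are the function and key names.

```python
def resolve_dispute(successful, indexer_stake, fisherman_deposit, reward_percent):
    """Return the token movements for a resolved dispute.

    reward_percent is a placeholder for the as-yet-undefined slashing
    reward, expressed as an integer percentage of the slashed stake.
    """
    if successful:
        # The Indexing Node loses its entire stake on the dataset and the
        # Fisherman receives a percentage of the slashed amount.
        return {
            "indexer_slashed": indexer_stake,
            "fisherman_reward": indexer_stake * reward_percent // 100,
            "fisherman_forfeited": 0,
        }
    # Unsuccessful dispute: the Fisherman loses the whole deposit.
    return {
        "indexer_slashed": 0,
        "fisherman_reward": 0,
        "fisherman_forfeited": fisherman_deposit,
    }
```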
 61 | 
 62 | ## Market Discovery
 63 | Market discovery is the process by which Indexing Nodes choose which datasets to index and serve data on.
 64 | 
 65 | When the data retrieval market for a particular dataset is active, an Indexing Node may observe payment activity on-chain to decide if it would be profitable to participate in that market.
 66 | 
 67 | With little to no activity for a newly created dataset, however, payment activity provides a poor signal. Instead, this signal to the network is provided by a *Curation Market*.[<sup>2</sup>](#footnotes)
 68 | 
 69 | ### Curation Market
 70 | Curators are economic agents who earn rewards by betting on the future economic value of datasets, perhaps with the benefit of private information.
 71 | 
 72 | A Curator stakes a deposit of Graph Tokens for a particular dataset in exchange for dataset-specific *subgraph tokens*. These tokens entitle the holder to a portion of a curation reward, which is paid in Graph Tokens through inflation. See [Inflation Rewards](#participation-adjusted-inflation-reward) for how the curation reward is calculated for each dataset.
 73 | 
 74 | Subgraph tokens are issued according to a bonding curve, making it more expensive to mint subgraph tokens by locking up Graph Tokens as the amount of bonded tokens increases, thus making it more expensive to purchase a share of future curator inflation rewards.
 75 | 
 76 | **TODO** [Define bonding curve for curation market or bonding curve parameters](https://github.com/graphprotocol/research/issues/69).
 77 | 
 78 | #### Curator Rewards
 79 | Curators earn a percentage of the fees paid for queries on the subgraphs they curate. Each subgraph token minted corresponds to *one basis point (0.01%)* of the fees paid to Indexers of that subgraph.
 80 | 
 81 | ## Participation Adjusted Inflation Reward
 82 | To encourage Graph Token holders to participate in the network, the protocol implements a participation-adjusted inflation[<sup>3</sup>](#footnotes) reward.
 83 | 
 84 | The participation reward to the entire network is calculated as a function of a `targetParticipationRate` that is set via governance. If `actualParticipationRate == targetParticipationRate`, then `participationRewardRate = 0`. Conversely, the lower the actual participation rate is relative to the target participation rate, the higher the participation reward.
 85 | 
 86 | **TODO** [Decide on actual function for relating `participationRewardRate` to `targetParticipationRate`](https://github.com/graphprotocol/research/issues/70).
 87 | 
 88 | To incentivize actual work being provided to the network, not just staking, the participation reward will be distributed to Indexing Nodes who are staking for datasets with the strongest market signal from curators.
 89 | 
 90 |  - Let `totalStakedForCuration` be the amount of Graph Tokens staked for curation in the entire network
 91 |  - Let `stakedForCuration[s]` be the amount staked for curation for a particular dataset `s`.
 92 |  - Let `stakedForIndexing[s]` be the total amount staked for indexing on a particular dataset `s`.
 93 |  - Let `stakedForIndexing[s][i]` be the amount staked for indexing by Indexer `i` on dataset `s`.
 94 | 
 95 | Then we can compute `participationReward[s][i]`, the participation reward allotted to Indexer `i` staked on dataset `s`, as follows:
 96 | 
 97 | `participationReward[s][i] = (stakedForCuration[s] / totalStakedForCuration) * (stakedForIndexing[s][i] / stakedForIndexing[s]) * participationRewardRate * totalTokenSupply`.
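
The formula above translates directly into code. The sketch below uses exact rational arithmetic to avoid rounding; the function and argument names are illustrative, and how a contract would round the result is not specified here.

```python
from fractions import Fraction


def participation_reward(staked_for_curation_s, total_staked_for_curation,
                         staked_for_indexing_s_i, staked_for_indexing_s,
                         participation_reward_rate, total_token_supply):
    """Compute participationReward[s][i] exactly as written in the formula."""
    # Share of all curation stake that signals dataset s.
    curation_share = Fraction(staked_for_curation_s, total_staked_for_curation)
    # Share of the indexing stake on dataset s that belongs to Indexer i.
    indexing_share = Fraction(staked_for_indexing_s_i, staked_for_indexing_s)
    return (curation_share * indexing_share
            * participation_reward_rate * total_token_supply)
```

For instance, if dataset `s` holds 200 of 1000 curation tokens, Indexer `i` holds 50 of the 100 tokens staked for indexing on `s`, the reward rate is 1%, and the total supply is 1,000,000, the Indexer's reward is 1000 tokens.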
 98 | 
 99 | ## Rounds
100 | Inflation rewards are calculated over a period of time, measured in blocks, according to a `roundDuration` parameter that is set via governance.
101 | 
102 | **TODO** [How should round duration be set to balance gas costs and facilitating a dynamic market?](https://github.com/graphprotocol/research/issues/75)
103 | 
104 | For a given round `R`, the inflation rewards for that round are made available at the end of round `R+1`.
105 | 
106 | This provides adequate time for off-chain micropayments to be settled on-chain. This on-chain settlement also provides a [market signal](#market-signals). So, `roundDuration` should be set small enough to provide a good market signal, but large enough to reduce the number of on-chain transactions required to redeem inflation rewards on an ongoing basis.
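
The round timing can be illustrated with a short sketch. It assumes rounds are numbered from 0 and start at a `genesis_block`, and that "the end of round `R+1`" means the first block after that round; none of these details are fixed by the specification.

```python
def round_of_block(block_number, round_duration, genesis_block=0):
    """Return the index of the round that contains block_number."""
    return (block_number - genesis_block) // round_duration


def rewards_available_block(round_index, round_duration, genesis_block=0):
    """Return the first block at which rewards for round_index are available.

    Round R covers blocks [R * d, (R + 1) * d - 1] relative to genesis, so
    round R + 1 has ended once block (R + 2) * d is reached.
    """
    return genesis_block + (round_index + 2) * round_duration
```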
107 | 
108 | ## Stake Delegation
109 | Participation in the protocol is a specialized activity. In the case of Curators, it entails accurately predicting the future value of datasets to the network, while in the case of Indexing Nodes, it requires operating infrastructure to index and serve data.
110 | 
111 | Token holders who do not feel equipped to perform one of these functions may *delegate* their tokens to an Indexing Node that is staked for a particular dataset. In this case, the delegator is the residual claimant for their stake, earning participation rewards according to the activities of the delegatee Indexing Node but also forfeiting their stake in the event that the delegatee Indexing Node is slashed.
112 | 
113 | ## Footnotes
114 | - [1] https://multicoin.capital/2018/02/13/new-models-utility-tokens/
115 | - [2] https://medium.com/@simondlr/introducing-curation-markets-trade-popularity-of-memes-information-with-code-70bf6fed9881
116 | - [3] https://medium.com/@petkanics/inflation-and-participation-in-stake-based-token-protocols-1593688612bf
117 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/messages/README.md:
--------------------------------------------------------------------------------
  1 | # Messages
  2 | 
  3 | ## Off-Chain Messages
  4 | 
  5 | ### Encoding
  6 | Off-chain messages are encoded using JSON[<sup>1</sup>](#footnotes), a lightweight data interchange format and the most commonly used format for exchanging data on the web.
  7 | 
  8 | Off-chain messages may be referenced in an on-chain message via a Content ID (CID). These are produced according to the IPLD CID V1 specification[<sup>2</sup>](#footnotes).
  9 | 
 10 | CIDs must use the canonical CBOR encoding[<sup>3</sup>](#footnotes), and SHA-256 multi-hash.
 11 | 
 12 | In producing CIDs for JSON RPC messages, the optional `id` field from the JSON-RPC 2.0 specification should be omitted, as well as the optional conditional micropayment in the `readIndex` params list.
 13 | 
 14 | ### Message Types
 15 | 
 16 | #### Read Operation
 17 | 
 18 | ##### Fields
 19 | | Field Name | Field Type | Description |
 20 | | ---------- | ---------- | ----------- |
 21 | | blockHash | String | The hash of the Ethereum block as of which to read the data. |
 22 | | subgraphID | String | The ID of the subgraph to read from. |
 23 | | index | Object | The Index Record corresponding to the index being read from. |
 24 | | op | String | The name of the read operation. |
 25 | | params | Array<any> | The parameters passed into the called read operation. |
 26 | 
 27 | #### Index Record
 28 | 
 29 | ##### Fields
 30 | 
 31 | | Field Name | Field Type | Description |
 32 | | ---------- | ---------- | ----------- |
 33 | | db | String | The identifier of the database model being used. |
 34 | | indexType | String | An identifier of the index type used for the respective database model. |
 35 | | partition | String | The name of the entity or interface which should be covered by the index. |  
 36 | | options | Array<String> | Parameters specific to the type of index. |
 37 | 
 38 | #### Locked Transfer
 39 | A message intended to be exchanged off-chain as a conditional micropayment in the data retrieval market for a subgraph. Accompanied by a Payment Channel Balance Proof which may be redeemed on-chain.
 40 | 
 41 | ##### Fields
 42 | 
 43 | | Field Name | Field Type | Description |
 44 | | ---------- | ---------- | ----------- |
 45 | | chainID  | Number | EIP155 chain ID. |
 46 | | tokenDenomination | String | Token denomination. Must be "ETH" or "DAI". |
 47 | | transferredAmount | Number | A monotonically increasing amount of tokens which have been sent in the channel.|
 48 | | receiver | String | The Ethereum address of the final destination of the micropayment. Must be the address of an Indexing Node which is staked for the subgraph referenced in the payment. |
 49 | | subgraphID | String | The ID of the subgraph for which the receiver must be staked. |
 50 | | maxLockedAmount | Number | The maximum amount of tokens locked in pending transfers. |
 51 | | locksRoot | String | The root of a Merkle tree containing all locked data retrieval timelocks. |
 52 | | lock | Object | The [Off-chain Data Retrieval Timelock](#off-chain-data-retrieval-timelock) corresponding to the most recent lock added to the Balance Proof. |
 53 | | nonce | Number | A monotonically increasing nonce value starting at `1`. Used for strictly ordering balance proofs. |
 54 | | v | Number | The ECDSA recovery ID of the corresponding Payment Channel Balance Proof. |
 55 | | r | String | The ECDSA signature r of the corresponding Payment Channel Balance Proof. |
 56 | | s | String | The ECDSA signature s of the corresponding Payment Channel Balance Proof. |
 57 | 
 58 | #### Off-chain Data Retrieval Timelock
 59 | An off-chain representation of the [Data Retrieval Timelock](#data-retrieval-timelock)
 60 | 
 61 | ##### Fields
 62 | | Field Name | Field Type | Description |
 63 | | ---------- | ---------- | ----------- |
 64 | | expiration | Number | The block until which the locked transfer may be settled on-chain. |
 65 | | gasPrice | Number | The price to pay per unit of gas consumed. |
 66 | | maxGas | Number | The maximum amount of gas to be consumed in the read operation. |
 67 | | bytesPrice | Number | The price to pay per byte served. |
 68 | | maxBytes | Number | The maximum number of bytes to be sent over the wire. |
 69 | | maxTokens | Number | The maximum amount of tokens to be paid. |
 70 | | requestCID | String | The content ID of the read operation to which the Indexing Node must respond with a valid attestation, in order to unlock the payment. |
 71 | 
 72 | #### Read Response
 73 | There are several possible statuses for a read response. Read responses must update the nonce of the balance proof and may be accompanied by an attestation. They may update balances or other state in the state channel and may be used as a part of settling the channel.
 74 | 
 75 | ##### Success
 76 | Sent if the read operation was successful, within the gas and response size limits specified. Includes the return data and an attestation that the response is correct.
 77 | 
 78 | | Field Name | Field Type | Description |
 79 | | ---------- | ---------- | ----------- |
 80 | | status     | String     | The constant "SUCCESS"   |
 81 | | data       | any        | The result of calling the read operation. |
 82 | | attestation | Object | An Attestation, where the `responseCID` is the CID of the object containing the above fields. |
 83 | 
 84 | ##### Max Gas Exceeded
 85 | Sent if the maximum amount of gas specified was consumed before the read operation could complete. The caller of the read operation is responsible for paying for the computation, but not for any bandwidth.
 86 | 
 87 | | Field Name | Field Type | Description |
 88 | | ---------- | ---------- | ----------- |
 89 | | status | String | The constant "MAX_GAS_EXCEEDED" |
 90 | | attestation | Object | An Attestation, where the `responseCID` is the CID of the object containing the above field. |
 91 | 
 92 | ##### Max Bytes Exceeded
 93 | Sent if the result of calling the read operation is larger than the `maxBytes` parameter in the data retrieval timelock. The caller of the read operation is responsible for paying for the computation, but not for any bandwidth.
 94 | 
 95 | | Field Name | Field Type | Description |
 96 | | ---------- | ---------- | ----------- |
 97 | | status | String | The constant "MAX_BYTES_EXCEEDED" |
 98 | | attestation | Object | An Attestation, where the `responseCID` is the CID of the object containing the above field. |
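
The payment rules for the attested statuses above can be sketched as follows. The additive cost formula is an assumption, as is treating any status without an attestation as owing nothing; the function name is illustrative.

```python
def amount_owed(status, gas_used, response_bytes, gas_price, bytes_price):
    """Return what the caller owes for a read response with the given status."""
    compute_cost = gas_used * gas_price
    if status == "SUCCESS":
        # Assumed: a successful read is charged for compute and bandwidth.
        return compute_cost + response_bytes * bytes_price
    if status in ("MAX_GAS_EXCEEDED", "MAX_BYTES_EXCEEDED"):
        # Stated above: the caller pays for computation but not bandwidth.
        return compute_cost
    # Assumed: responses without an attestation do not unlock any payment.
    return 0
```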
 99 | 
100 | ##### Insufficient Funds
101 | Sent if the maximum amount of tokens which may be consumed by the read operation would exceed the balance in the payment channel.
102 | 
103 | | Field Name | Field Type | Description |
104 | | ---------- | ---------- | ----------- |
105 | | status | String | The constant "INSUFFICIENT_FUNDS". |
106 | 
107 | ##### Price Too Low
108 | Sent if the Indexing Node is unwilling to provide the service at the prices offered by the caller.
109 | 
110 | | Field Name | Field Type | Description |
111 | | ---------- | ---------- | ----------- |
112 | | status | String | The constant "PRICE_TOO_LOW". |
113 | | askingPrice | Object | A price listing object. |
114 | 
115 | #### Price Listing
116 | A price listing advertising an Indexing Node's asking price for computation and bandwidth, denominated in a specific token.
117 | 
118 | | Field Name | Field Type | Description |
119 | | ---------- | ---------- | ----------- |
120 | | token | String | The token in which the listed prices are denominated. |
121 | | gasPrice | Number | The price of a unit of gas, denominated in the token included in the listing. Must be an integer. |
122 | | bytesPrice | Number | The price per byte in the read operation result data, denominated in the token included in the listing. Must be an integer. |
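
As an illustration of how a price listing is embedded in a response, the sketch below builds a "Price Too Low" message using the field names from the tables above. The helper name and the example values are hypothetical.

```python
import json


def price_too_low_response(token, gas_price, bytes_price):
    """Build a PRICE_TOO_LOW read response carrying a price listing."""
    # Both prices must be integers, per the Price Listing table.
    if not isinstance(gas_price, int) or not isinstance(bytes_price, int):
        raise ValueError("gasPrice and bytesPrice must be integers")
    return {
        "status": "PRICE_TOO_LOW",
        "askingPrice": {
            "token": token,
            "gasPrice": gas_price,
            "bytesPrice": bytes_price,
        },
    }


# Example: serialize a response for an Indexing Node asking in DAI.
encoded = json.dumps(price_too_low_response("DAI", 2, 3))
```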
123 | 
124 | ## On-Chain Messages
125 | 
126 | ### Encoding
127 | Unsigned messages are encoded according to the ABIv2 specification[<sup>4</sup>](#footnotes), while signed messages are encoded according to the EIP 712 specification[<sup>5</sup>](#footnotes).
128 | 
129 | Signed message formats are accompanied by a typed structured data definition, which can be used to compute the type, the type hash, and the data of a message according to the EIP 712 specification. Types are written as Solidity code, but are intended to be compatible with any language that compiles to EVM bytecode.
130 | 
131 | #### EIP 712 Domain Separator
132 | The EIP 712 specification requires defining a domain separator to disambiguate signed messages intended for different chains, different protocols, or different versions of the same protocol.
133 | 
134 | The domain separator for the protocol has the following chain-agnostic parameters:
135 |  - **name** - 'graphprotocol'
136 |  - **version** - '0'
137 | 
138 | Additionally there are chain-specific parameters:
139 | - mainnet
140 |   - **chainid** - 1
141 |   - **verifyingContract** - TBD
142 | - ropsten
143 |   - **chainid** - 3
144 |   - **verifyingContract** - TBD
145 | - kovan
146 |   - **chainid** - 42
147 |   - **verifyingContract** - TBD
148 | - rinkeby
149 |   - **chainid** - 4
150 |   - **verifyingContract** - TBD
151 | 
152 | ### Message Types
153 | #### Attestation
154 | 
155 | ##### Fields
156 | | Field Name  | Field Type | Description |
157 | | ----------- | ---------- | ----------- |
158 | | requestCID | bytes    | The content ID of the message. |
159 | | responseCID | bytes   | The content ID of the response. |
160 | | gasUsed     | uint256    | The gas used to process the read operation. |
161 | | responseBytes     | uint256    | The size of the response data in bytes. |
162 | | v | uint8 | The ECDSA recovery ID. |
163 | | r | bytes32 | The ECDSA signature r. |
164 | | s | bytes32 | The ECDSA signature s. |
165 | 
166 | ###### EIP712 Struct Type
167 | ```solidity
168 | struct Attestation {
169 |   bytes requestCID;
170 |   bytes responseCID;
171 |   uint256 gasUsed;
172 |   uint256 responseBytes;
173 | }
174 | ```
175 | 
176 | #### Payment Channel Balance Proof
177 | The Payment Channel Balance Proof is a signed off-chain message which represents a micropayment between an end user of The Graph and the Payment Channel Hub via a payment channel. Because all payment channels have the Payment Channel Hub as the receiver, it is sufficient to be able to identify the token denomination and the sender's Ethereum address (this may be derived from the signature), as well as the subgraph on which they are staked, to uniquely identify the channel to which the balance proof applies.
178 | 
179 | ##### Fields
180 | | Field Name | Field Type | Description |
181 | | ---------- | ---------- | ----------- |
182 | | chainID  | uint256 | EIP155 chain ID. |
183 | | tokenDenomination | string | Token denomination. Must be "ETH" or "DAI". |
184 | | transferredAmount | uint256 | A monotonically increasing amount of tokens which have been sent in the channel.|
185 | | receiver| address | The Ethereum address of the final destination of the micropayment. Must be the address of an Indexing Node which is staked for the subgraph referenced in the payment. |
186 | | subgraphID | bytes | The ID of the subgraph for which the receiver must be staked. |
187 | | maxLockedAmount | uint256 | The maximum amount of tokens locked in pending transfers. |
188 | | locksRoot | bytes32 | The root of a Merkle tree containing all locked data retrieval timelocks. |
189 | | nonce | uint256 | A monotonically increasing nonce value starting at `1`. Used for strictly ordering balance proofs. |
190 | | v | uint8 | The ECDSA recovery ID. |
191 | | r | bytes32 | The ECDSA signature r. |
192 | | s | bytes32 | The ECDSA signature s. |
193 | 
194 | #### Minting Channel Balance Proof
195 | The Minting Channel Balance Proof is a signed off-chain message which represents a micropayment between the Payment Channel Hub and an Indexing Node in The Graph. Because all payment channels have the Payment Channel Hub as the sender, it is sufficient to be able to identify the token denomination and the receiver's Ethereum address to uniquely identify the channel to which the balance proof applies.
196 | 
197 | ##### Fields
198 | | Field Name | Field Type | Description |
199 | | ---------- | ---------- | ----------- |
200 | | chainID  | uint256 | EIP155 chain ID. |
201 | | tokenDenomination | string | Token denomination. Must be "ETH" or "DAI". |
202 | | transferredAmount | uint256 | A monotonically increasing amount of tokens which have been sent in the channel.|
203 | | maxLockedAmount | uint256 | The maximum amount of tokens locked in pending transfers. |
204 | | locksRoot | bytes32 | The root of a Merkle tree containing all locked data retrieval timelocks. |
205 | | nonce | uint256 | A monotonically increasing nonce value starting at `1`. Used for strictly ordering balance proofs. |
206 | | v | uint8 | The ECDSA recovery ID. |
207 | | r | bytes32 | The ECDSA signature r. |
208 | | s | bytes32 | The ECDSA signature s. |
209 | 
210 | ###### EIP712 Struct Type
211 | ```solidity
212 | struct MintingChannelBalanceProof {
213 |   uint256 chainID;
214 |   address tokenNetworkAddress;
215 |   uint256 channelID;
216 |   uint256 transferredAmount;
217 |   uint256 maxLockedAmount;
218 |   bytes32 locksRoot;
219 |   uint256 nonce;
220 | }
221 | ```
222 | 
223 | #### Data Retrieval Timelock
224 | 
225 | ##### Fields
226 | | Field Name | Field Type | Description |
227 | | ---------- | ---------- | ----------- |
228 | | expiration | uint256 | The block until which the locked transfer may be settled on-chain. |
229 | | gasPrice | uint256 | Amount of tokens locked.|
230 | | maxGas | uint256 | The maximum amount of gas to be consumed in the read operation. |
231 | | bytesPrice | uint256 | The price to pay per byte served.|
232 | | maxBytes | uint256 | The maximum number of bytes to be sent over the wire. |
233 | | maxTokens | uint256 | The maximum amount of tokens to be paid. |
234 | | requestCID | bytes | The content ID of the read operation to which the Indexing Node must respond with a valid attestation, in order to unlock the payment. |
235 | 
236 | ## Footnotes
237 | - [1] http://json.org
238 | - [2] https://github.com/ipld/cid#cidv1
239 | - [3] https://tools.ietf.org/html/rfc7049#section-3.9
240 | - [4] https://solidity.readthedocs.io/en/develop/abi-spec.html
241 | - [5] https://github.com/ethereum/EIPs/blob/master/EIPS/eip-712.md
242 | 


--------------------------------------------------------------------------------
/specs/graph-protocol-hybrid-network/read-interface/README.md:
--------------------------------------------------------------------------------
  1 | # Read Interface
  2 | 
  3 | ## Overview
  4 | To participate in the data retrieval market, Indexing Nodes implement a low-level read interface to the indexed data in their store. The read interface not only provides the means of retrieving data from an Indexing Node, but it also defines a contract that an Indexing Node is agreeing to uphold or else be slashed. This is enabled by attestations, which assert that a response was produced correctly and may be verified on-chain.
  5 | 
  6 | ## Calling Read Operations
  7 | Available read operations are defined by the respective interface of the index being read from. See [Index Abstract Data Structures](#index-abstract-data-structures) and [Index Types](#index-types) for more information.
  8 | 
  9 | While the read interfaces are described using a TypeScript notation, all the interfaces are language agnostic and defined in terms of JSON types.
 10 | 
 11 | Calling these read operations is done via JSON RPC 2.0[<sup>1</sup>](#footnotes). See the full [JSON RPC API](../rpc-api).
 12 | 
 13 | 
 14 | The method of interest here is `readIndex`, which accepts the following parameters:
 15 | 1. `Object`
 16 |  - `blockHash`: `String` - The hash of the Ethereum block from which to read the data.
 17 |  - `subgraphID`: `String` - The ID of the subgraph to read from.
 18 |  - `index`: `Object` - The [IndexRecord](#indexes) of the index being read from.
 19 |  - `op`: `String` - The name of the read operation.
 20 |  - `params`: `[any]` - The parameters passed into the called read operation.
 21 | 2. `Object` - A [Locked Transfer](../messages#locked-transfer) message which serves as a conditional micropayment for the read operation.
 22 | 
 23 | The `readIndex` method returns the following:
 24 | 1. `Object`
 25 |  - `data`: `any` - The data retrieved by the read operation.
 26 |  - `attestation`: `Object` - An attestation that `data` is a correct response for the given read operation (see [Attestation](#attestation)).
 27 | 
 28 | ```js
 29 | // request
 30 | {
 31 |   "method": "readIndex",
 32 |   "params": [
 33 |     {
 34 |       "blockHash": "0xbf133b670857b983fc1b8f08759bc860378179042a0dba30b30e26d6f7f919d1",
 35 |       "subgraphID": "QmTeW79w7QQ6Npa3b1d5tANreCDxF2iDaAPsDvW6KtLmfB",
 36 |       "index": {
 37 |         "indexType": "kv"
 38 |       },
 39 |       "op": "get",
 40 |       "params": ["User:1"]
 41 |     }
 42 |   ],
 43 |   "jsonrpc": "2.0"
 44 | }
 45 | // response
 46 | {
 47 |   "data": {
 48 |     "firstName": "Vitalik",
 49 |     "lastName": "Buterin"
 50 |   },
 51 |   // TODO: Provide more realistic attestations
 52 |   "attestation": 0x0122340
 53 | }
 54 | ```
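
As a rough illustration of how a client might assemble such a call, the TypeScript sketch below builds the request body. It uses the method name `readIndex` as in the example request above. The names `ReadOperation`, `JsonRpcRequest`, and `buildReadIndexRequest` are hypothetical, and the Locked Transfer is passed as an opaque placeholder object because its structure is defined in the messages specification. The `id` member does not appear in the example above; it is included here because JSON-RPC 2.0 uses it to correlate responses with requests.

```typescript
// Shape of the first `readIndex` parameter, as listed above.
interface ReadOperation {
  blockHash: string;
  subgraphID: string;
  index: { [key: string]: unknown }; // an IndexRecord
  op: string;
  params: unknown[];
}

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params: unknown[];
}

// Builds a JSON-RPC 2.0 request for a read operation. `lockedTransfer` is
// passed through unchanged; this sketch does not validate it.
function buildReadIndexRequest(
  id: number,
  readOp: ReadOperation,
  lockedTransfer: object
): JsonRpcRequest {
  return { jsonrpc: "2.0", id: id, method: "readIndex", params: [readOp, lockedTransfer] };
}

const readIndexRequest = buildReadIndexRequest(
  1,
  {
    blockHash: "0xbf133b670857b983fc1b8f08759bc860378179042a0dba30b30e26d6f7f919d1",
    subgraphID: "QmTeW79w7QQ6Npa3b1d5tANreCDxF2iDaAPsDvW6KtLmfB",
    index: { indexType: "kv" },
    op: "get",
    params: ["User:1"],
  },
  {} // placeholder for a Locked Transfer message
);

// The serialized body that would be sent to an Indexing Node.
const readIndexBody = JSON.stringify(readIndexRequest);
```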
 55 | 
 56 | ##### Example - Entity doesn't exist
 57 | 
 58 | ```js
 59 | // request
 60 | {
 61 |   "method": "readIndex",
 62 |   "params": [
 63 |     {
 64 |       "blockHash": "0xbf133b670857b983fc1b8f08759bc860378179042a0dba30b30e26d6f7f919d1",
 65 |       "index": {
 66 |         "indexType": "kv"
 67 |       },
 68 |       "op": "get",
 69 |       "params": ["User:1"]
 70 |     }
 71 |   ],
 72 |   "jsonrpc": "2.0"
 73 | }
 74 | // response
 75 | {
 76 |   "data": null,
 77 |   // TODO: Provide more realistic attestations
 78 |   "attestation": 0x0122340
 79 | }
 80 | ```
 81 | 
 82 | ## Indexes
 83 | All read operations require that the caller specify an index. Index data structures efficiently organize the data to support different read access patterns.
 84 | 
 85 | Indexes may include the entire dataset or cover only a subset. This is useful for enabling sharding, where different Indexing Nodes may store different subsets of the dataset to reduce the storage requirements for a single Indexing Node or enable better read performance.
 86 | 
 87 | Indexes are defined by an `IndexRecord` which has the following shape:
 88 | 
 89 | | Field Name | Field Type | Description |
 90 | | ---------- | ---------- | ----------- |
 91 | | db | String | The identifier of the database model being used. |
 92 | | indexType | String | An identifier of the index type used for the respective database model. |
 93 | | partition | String | The name of the entity or interface which should be covered by the index. |  
 94 | | options | Object | Options specific to the type of index. |
 95 | 
 96 | ###### Example Index Records
 97 | Given a dataset with the following data model:
 98 | 
 99 | ```graphql
100 | interface EthereumAccount {
101 |   id: ID!
102 |   address: String!
103 | }
104 | 
105 | type Contract implements EthereumAccount {
106 |   id: ID!
107 |   address: String!
108 | }
109 | 
110 | type User implements EthereumAccount {
111 |   id: ID!
112 |   address: String!
113 |   name: FullName
114 | }
115 | 
116 | type FullName {
117 |   first: String!
118 |   last: String!
119 | }
120 | ```
121 | 
122 | Then, the following would be valid index names for that dataset:
123 | 
124 | | Index Name | Description|
125 | | ---------- | ---------- |
126 | | `{ db: "entitydb", indexType: "dictionary" }`       | A basic key-value index supporting constant-time lookup of all entities in dataset. |
127 | | `{ db: "entitydb", indexType: "dictionary", partition: "User" }`  | A basic key-value index supporting constant-time lookup of `User` entities. |
128 | | `{ db: "entitydb", indexType: "searchTree", options: { sortBy: ["id"] } }` | A sorted key-value index supporting iteration through all entities, sorted by ID. |
129 | | `{ db: "entitydb", indexType: "searchTree", partition: "EthereumAccount", options: { sortBy: ["address"] }}` |  A sorted key-value index supporting iteration through all entities implementing the `EthereumAccount` interface, sorted by the `address` field. |
130 | | `{ db: "entitydb", indexType: "searchTree", partition: "User",  options: { sortBy: ["name.first", "name.last"] } }` | A sorted key-value index supporting iteration through all `User` entities, first sorted by the nested field `name.first`, then by the nested field `name.last` (i.e., a compound index). |
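
The shape of an Index Record can be written down as a TypeScript type, together with a small plausibility check. This is only a sketch under stated assumptions: the function name `isPlausibleIndexRecord` is hypothetical, and the rules it encodes (only `entitydb` is accepted, a `searchTree` names one or two sort attributes, a `dictionary` takes no options) are inferred from the examples in this document rather than mandated by it.

```typescript
interface IndexRecord {
  db: string;
  indexType: string;
  partition?: string;
  options?: { sortBy?: string[] };
}

// Assumptions for this sketch: only the `entitydb` model is accepted, a
// `searchTree` must name one or two sort attributes, and a `dictionary`
// takes no options.
function isPlausibleIndexRecord(record: IndexRecord): boolean {
  if (record.db !== "entitydb") {
    return false;
  }
  if (record.indexType === "dictionary") {
    return record.options === undefined;
  }
  if (record.indexType === "searchTree") {
    const sortBy = record.options && record.options.sortBy;
    return Array.isArray(sortBy) && sortBy.length >= 1 && sortBy.length <= 2;
  }
  return false;
}
```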
131 | 
132 | ### Index Abstract Data Structures
133 | 
134 | All concrete index types implement an indexing abstract data structure, which specifies the interface, semantics, and gas costs for read operations against that index.
135 | 
136 | The concrete types (i.e., `K` and `V` shown below), as well as the implicit comparator function to determine sort order, are specified by each concrete index type.
137 | 
138 | #### Dictionary
139 | ##### Type
140 | `Dictionary<K,V>`
141 | 
142 | ##### Operations
143 | | Op  | Signature | Description | Gas Cost |
144 | | --- | --------- | ----------- | -------- |
145 | | get | `(key: K) => V` | Retrieves a value by its key. | `opCostDictionaryGet` (set via governance) |
146 | 
147 | #### Search Tree
148 | 
149 | ##### Type
150 | `SearchTree<K,V>`
151 | 
152 | ##### Operations
153 | | Op  | Signature           | Description  | Gas Cost |
154 | | --- | ------------------- | ------------ | -------- |
155 | | find | `(predicate: FilterPredicate, options?: { gte?: K, lt?: K } ) => V` | Retrieves the first value for which the `FilterPredicate` returns true, searching in ascending order. If specified, it only takes values whose sort keys are between the range parameters `gte` (inclusive) and `lt` (exclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N` where `N` is the number of iterations taken to find the value, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
156 | | findLast | `(predicate: FilterPredicate, options?: { gt?: K, lte?: K } ) => V` | Retrieves the first value for which the `FilterPredicate` returns true, searching in descending order. If specified, it only takes values whose sort keys are between the range parameters `gt` (exclusive) and `lte` (inclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N` where `N` is the number of iterations taken to find the value, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
157 | | get | `(key: K) => V` | Retrieves a value by its sort key. If multiple values share the same sort key, it will retrieve the first value inserted with the sort key. | `opCostSearchTreeGetPerH * H` where `H` is the height of a binary search tree, and `opCostSearchTreeGetPerH` is set via governance. |
158 | | take | `(count: Number, options?: { skip?: Number, gte?: K, lt?: K}) => [V]` | Retrieves the first N values, specified by `count`, from the index in ascending order, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gte` (inclusive) and `lt` (exclusive).| `opCostSearchTreeStep * N` where `N` is the number of iterations taken including skipped values. |
159 | | takeUntil | `(predicate: FilterPredicate, options?:{ skip?: Number, gte?: K, lt?: K}) => [V]` | Retrieves values from the index in ascending order until `FilterPredicate` returns true, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gte` (inclusive) and `lt` (exclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N + opCostSearchTreeStep * S` where `N` is the number of iterations taken not including skipped values, `S` is the number of values skipped over, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
160 | | takeWhile | `(predicate: FilterPredicate, options?:{ skip?: Number, gte?: K, lt?: K}) => [V]` | Retrieves values from the index in ascending order, while `FilterPredicate` returns true, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gte` (inclusive) and `lt` (exclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N + opCostSearchTreeStep * S` where `N` is the number of iterations taken not including skipped values, `S` is the number of values skipped over, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
161 | | takeLast | `(count: Number, options?: { skip?: Number, gt?: K, lte?: K }) => [V]` | Retrieves the last N values, specified by `count`, from the index in descending order, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gt` (exclusive) and `lte` (inclusive). | `opCostSearchTreeStep * N` where `N` is the number of iterations taken including skipped values. |
162 | | takeLastUntil | `(predicate: FilterPredicate, options?:{ skip?: Number, gt?: K, lte?: K}) => [V]` | Retrieves values from the index in descending order until `FilterPredicate` returns true, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gt` (exclusive) and `lte` (inclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N + opCostSearchTreeStep * S` where `N` is the number of iterations taken not including skipped values, `S` is the number of values skipped over, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
163 | | takeLastWhile | `(predicate: FilterPredicate, options?:{ skip?: Number, gt?: K, lte?: K}) => [V]` | Retrieves values from the index in descending order, while `FilterPredicate` returns true, optionally skipping the number of values specified by `skip`. If specified, it only takes values whose sort keys are between the range parameters `gt` (exclusive) and `lte` (inclusive). | `(opCostSearchTreeStep + opCostFilterPredicate) * N + opCostSearchTreeStep * S` where `N` is the number of iterations taken not including skipped values, `S` is the number of values skipped over, `opCostSearchTreeStep` is set via governance, and `opCostFilterPredicate` is calculated for the specific filter predicate provided. |
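
To make the gas accounting in the table more concrete, here is a minimal TypeScript sketch of `takeWhile` over an in-memory list that is already sorted by key. It is not a tree and not a reference implementation. The cost value, the numeric key type, and the function-valued predicate are all simplifications chosen for illustration. The sketch also assumes that the iteration on which the predicate first returns false counts toward `N`, and that entries below `gte` are not charged, neither of which is stated in the table.

```typescript
interface Entry<V> { key: number; value: V; }
interface RangeOptions { skip?: number; gte?: number; lt?: number; }
interface ReadResult<V> { values: V[]; gas: number; }

// Hypothetical value; the real constant is set via governance.
const opCostSearchTreeStep = 2;

function takeWhile<V>(
  sorted: Entry<V>[],                 // entries already in ascending key order
  predicate: (value: V) => boolean,   // stands in for a FilterPredicate
  opCostFilterPredicate: number,      // precomputed cost of that predicate
  options: RangeOptions = {}
): ReadResult<V> {
  const skip = options.skip === undefined ? 0 : options.skip;
  const values: V[] = [];
  let skipped = 0;   // S in the gas formula
  let evaluated = 0; // N in the gas formula
  for (const entry of sorted) {
    if (options.gte !== undefined && entry.key < options.gte) continue;
    if (options.lt !== undefined && entry.key >= options.lt) break;
    if (skipped < skip) { skipped++; continue; }
    evaluated++;
    if (!predicate(entry.value)) break;
    values.push(entry.value);
  }
  const gas =
    (opCostSearchTreeStep + opCostFilterPredicate) * evaluated +
    opCostSearchTreeStep * skipped;
  return { values, gas };
}

const sampleEntries: Entry<number>[] = [
  { key: 1, value: 10 }, { key: 2, value: 20 }, { key: 3, value: 30 },
  { key: 4, value: 40 }, { key: 5, value: 50 },
];
// Skip one value, then take values while they are below 40.
const sampleResult = takeWhile(sampleEntries, (v) => v < 40, 3, { skip: 1, gte: 1, lt: 10 });
```

With these inputs one value is skipped and three are evaluated (the third fails the predicate), so the gas is `(2 + 3) * 3 + 2 * 1`.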
164 | 
165 | ### Filter Predicates
166 | Filter predicates allow for declaratively asserting whether a value meets certain criteria. Filter predicates are expressed as objects that can be passed into several low-level index read operations, such as `takeWhile` and `find`.
167 | 
168 | #### Structure
169 | 
170 | Filter predicates are expressed through a simple DSL:
171 | ```typescript
172 | type FilterPredicate = FilterPredicateAnd | FilterPredicateOr | FilterPredicateLeaf
173 | 
174 | interface FilterPredicateAnd {
175 |   and: [FilterPredicate];
176 | }
177 | 
178 | interface FilterPredicateOr {
179 |   or: [FilterPredicate];
180 | }
181 | 
182 | type FilterPredicateLeaf = StringFilter | NumberFilter | BooleanFilter
183 | 
184 | interface BaseFilter {
185 |   // The field the predicate will be applied to. Nested fields may be
186 |   // specified by concatenating field names with a "."
187 |   // If no field is specified, the predicate will be applied to the value. This
188 |   // is only supported if the value is a primitive type.
189 |   field?: String;
190 | }
191 | 
192 | // If multiple filter clauses are supplied, they will be treated as a logical AND.
193 | interface StringFilter extends BaseFilter {
194 |   equals?: String;
195 |   notEquals?: String;
196 |   // Contains string
197 |   contains?: String;
198 |   // Does not contain string
199 |   notContains?: String;
200 |   startsWith?: String;
201 |   notStartsWith?: String;
202 |   endsWith?: String;
203 |   notEndsWith?: String;
204 |   // Less than
205 |   lt?: String;
206 |   // Less than or equal to
207 |   lte?: String;
208 |   // Greater than
209 |   gt?: String;
210 |   // Greater than or equal to
211 |   gte?: String;
212 |   // Contained in list
213 |   in?: [String];
214 |   // Not contained in list
215 |   notIn?: [String];
216 | }
217 | 
218 | 
219 | // If multiple filter clauses are supplied, they will be treated as a logical AND.
220 | interface NumberFilter extends BaseFilter {
221 |   equals?: Number;
222 |   notEquals?: Number;
223 |   // Less than
224 |   lt?: Number;
225 |   // Less than or equal to
226 |   lte?: Number;
227 |   // Greater than
228 |   gt?: Number;
229 |   // Greater than or equal to
230 |   gte?: Number;
231 |   // Contained in list
232 |   in?: [Number];
233 |   // Not contained in list
234 |   notIn?: [Number];
235 | }
236 | 
237 | // If multiple filter clauses are supplied, they will be treated as a logical AND.
238 | interface BooleanFilter extends BaseFilter {
239 |   equals?: Boolean;
240 |   notEquals?: Boolean;
241 | }
242 | ```
243 | 
244 | ##### Example - Simple Value Filter Predicate
245 | ```js
246 | {
247 |   equals: 12
248 | }
249 | ```
250 | 
251 | ##### Example - Object Filter Predicate
252 | ```js
253 | {
254 |   field: "fullName",
255 |   contains: "Vitalik"
256 | }
257 | ```
258 | 
259 | ##### Example - Filter Predicate with Boolean Operators and Nested Fields
260 | ```js
261 | {
262 |  and: [
263 |    {
264 |      field: "name.first",
265 |      equals: "Vitalik"
266 |    },
267 |    {
268 |      field: "name.last",
269 |      equals: "Buterin"
270 |    }
271 |  ]
272 | }
273 | ```
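
The evaluation rules described above can be sketched as a small recursive function. The TypeScript below is a simplified illustration, not the protocol's evaluator: it supports only a subset of the clauses (`equals`, `notEquals`, `contains`, `startsWith`, `lt`, `gte`, `in`), and the names `Predicate`, `resolveField`, `compareScalar`, and `matches` are hypothetical. Treating a type mismatch in a range clause as "no match" is an assumption of this sketch.

```typescript
type Primitive = string | number | boolean;

interface Leaf {
  field?: string;
  equals?: Primitive;
  notEquals?: Primitive;
  contains?: string;
  startsWith?: string;
  lt?: string | number;
  gte?: string | number;
  in?: Primitive[];
}
interface And { and: Predicate[]; }
interface Or { or: Predicate[]; }
type Predicate = And | Or | Leaf;

// Resolves a possibly nested field such as "name.first". With no field,
// the predicate applies to the value itself.
function resolveField(value: unknown, field?: string): unknown {
  if (field === undefined) return value;
  let current: unknown = value;
  for (const part of field.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as { [k: string]: unknown })[part];
  }
  return current;
}

// Returns a negative, zero, or positive number, or undefined on a type mismatch.
function compareScalar(a: unknown, b: string | number): number | undefined {
  if (typeof a === "number" && typeof b === "number") return a - b;
  if (typeof a === "string" && typeof b === "string") return a < b ? -1 : a > b ? 1 : 0;
  return undefined;
}

function matches(p: Predicate, value: unknown): boolean {
  if ("and" in p) return p.and.every((q) => matches(q, value));
  if ("or" in p) return p.or.some((q) => matches(q, value));
  const v = resolveField(value, p.field);
  // Multiple clauses on one leaf are combined with a logical AND.
  if (p.equals !== undefined && v !== p.equals) return false;
  if (p.notEquals !== undefined && v === p.notEquals) return false;
  if (p.contains !== undefined && !(typeof v === "string" && v.indexOf(p.contains) !== -1)) return false;
  if (p.startsWith !== undefined && !(typeof v === "string" && v.indexOf(p.startsWith) === 0)) return false;
  if (p.lt !== undefined) {
    const c = compareScalar(v, p.lt);
    if (c === undefined || c >= 0) return false;
  }
  if (p.gte !== undefined) {
    const c = compareScalar(v, p.gte);
    if (c === undefined || c < 0) return false;
  }
  if (p.in !== undefined && p.in.indexOf(v as Primitive) === -1) return false;
  return true;
}
```

Applied to the three examples above, `matches({ equals: 12 }, 12)` holds, and the `and` predicate holds for a value such as `{ name: { first: "Vitalik", last: "Buterin" } }`.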
274 | 
275 | #### Gas Cost
276 | The clauses in the filter predicate DSL can be grouped into several buckets of operation types, which share equivalent gas cost calculations:
277 | 
278 | | Operation Type | Description | Gas Cost |
279 | | --------- | ----------- | -------- |
280 | | Number Comparison | Includes `lt`, `lte`, `gt`, `gte`, `equals` and `notEquals` clauses on Number types. | `opCostByteCompare * B` where `B` is the number of bytes in the number type, and `opCostByteCompare` is set via governance. |
281 | | String Comparison | Includes `lt`, `lte`, `gt`, `gte`, `startsWith`, `notStartsWith`, `endsWith`, `notEndsWith`, `equals` and `notEquals` clauses on String types. | `opCostCharCompare * N` where `N` is the number of characters compared in order to complete the operation, and `opCostCharCompare` is set via governance. |
282 | | Bit Comparison | Includes `equals` and `notEquals` clauses on Boolean types. Also used for combining two filter predicate clauses via the Boolean operators `or` and `and` (including the implicit `and` described above). | `opCostBitCompare` where `opCostBitCompare` is set via governance. |
283 | | String Match | Used for `contains` and `notContains` clauses on String types. | `opCostStringSearch * (M + N)` where `N` is the number of characters in the pattern being matched, and `M` is the number of characters in the string being searched. `opCostStringSearch` is set via governance. |
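
The formulas in the table translate directly into code. The TypeScript sketch below uses made-up constant values, since the real ones are set via governance, and the function names are hypothetical. String lengths are measured with JavaScript's `length` (UTF-16 code units), which is an assumption about what "characters" means here.

```typescript
// Hypothetical values; the real constants are set via governance.
const gasConstants = {
  opCostByteCompare: 3,
  opCostCharCompare: 1,
  opCostBitCompare: 1,
  opCostStringSearch: 2,
};

// Number Comparison: opCostByteCompare * B, where B is the width in bytes.
function numberComparisonGas(byteWidth: number): number {
  return gasConstants.opCostByteCompare * byteWidth;
}

// String Comparison: opCostCharCompare * N, where N is the number of
// characters that had to be compared to complete the operation.
function stringComparisonGas(charsCompared: number): number {
  return gasConstants.opCostCharCompare * charsCompared;
}

// String Match: opCostStringSearch * (M + N), where N is the pattern length
// and M is the length of the string being searched.
function stringMatchGas(pattern: string, subject: string): number {
  return gasConstants.opCostStringSearch * (subject.length + pattern.length);
}

// Cost of a `contains: "Vitalik"` clause applied to one value: 2 * (15 + 7).
const containsGas = stringMatchGas("Vitalik", "Vitalik Buterin");
```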
284 | 
285 | 
286 | ### Database Models
287 | The semantics of reading from an Indexing Node are determined by the database model that the index being read from implements, such as [key-value (KV)](https://en.wikipedia.org/wiki/Key-value_database), [entity-attribute-value (EAV)](https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model) and the [relational model](https://en.wikipedia.org/wiki/Relational_model). Index types are prefixed with a short label indicating the database model the index implements:
288 | - `entitydb` - An [entity database model](#entity-database-model).
289 | - `rdb` -  Relational database model. Not supported in this version of the protocol.
290 | - `eav` -  Entity-attribute-value database model. Not supported in this version of the protocol.
291 | 
292 | The database model also defines the available partitions and index types for use in read operations.
293 | 
294 | In the v1 protocol, we only support the entity database model.
295 | 
296 | #### Entity Database Model
297 | In the protocol's entity database model, entities are stored as key-value pairs, where the key is a concatenation of the entity type and the entity ID, and the value is an entity object.
298 | 
299 | This database model is referenced as `entitydb` in Index Records.
300 | 
301 | 
302 | ##### Example - Entities Stored as Key-Value Pairs
303 | | Key | Value |
304 | | --- | ----- |
305 | | `user:1` | `{ id: "user:1", user: "Alice", age: 17 }` |
306 | | `user:2` | `{ id: "user:2", user: "Bob", age: 47 }` |
307 | 
308 | #### Partitions
309 | Partitions define the subset of the data that is covered by the index.
310 | 
311 | Possible values of `<partition>` in an Index Record:
312 |  - none - Includes all entities in the dataset. This is the default partition; the `partition` key should be omitted from the IndexRecord.
313 |  - `"<entityType>"` - Includes entities of the type specified by `entityType`. The entity type name is case-sensitive.
314 |  - `"<interface>"` - Includes entities that implement the provided interface. The interface name is case-sensitive.
315 | 
316 | ### Index Types
317 | 
318 | #### Entity DB Indexes
319 | 
320 | ##### Dictionary
321 | The entity dictionary supports simple key-value lookups of entities by their entity type and ID, in constant time.
322 | 
323 | ###### Name
324 | `dictionary`
325 | 
326 | ###### Database Model
327 | `entitydb`
328 | 
329 | ###### Type
330 | `Dictionary<K, V>`
331 |    - `K`: `String` - The id of the entity.
332 |    - `V`: `Object` -  An entity that conforms to its type as defined in the schema of the dataset.
333 | 
334 | ###### Options
335 | None
336 | 
337 | ##### Search Tree
338 | This index supports iterating through entities, ordered by possibly nested attribute values. Supports compound indexes, where an entity is sorted first by one attribute, then by another.
339 | 
340 | ###### Name
341 | `searchTree`
342 | 
343 | ###### Database Model
344 | `entitydb`
345 | 
346 | ###### Type
347 | `SearchTree<K, V>`
348 |    - `K`: `String` | `Number` | `Object` - The value of the sortKey, which is either a primitive value in the case of single-attribute indexes, or an object containing two attribute-value pairs in the case of compound indexes.
349 |    - `V`: `Object` - An entity that conforms to its type as defined in the schema of the dataset.
350 | 
351 | ###### Options
352 | - `sortBy`: `Array`
353 |   1. `String` - The first attribute to sort by, using `.` to indicate nested attributes (e.g., `"name.first"`)
354 |   2. `String` - The second attribute to sort by, using `.` to indicate nested attributes (e.g., `"name.last"`)
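
A compound `sortBy` can be pictured as a comparator that checks each attribute in turn and falls through to the next one on a tie. The TypeScript sketch below shows the idea. The names `getPath`, `compareValues`, and `compareBy` are hypothetical, and the ordering rule (numbers numerically, everything else by string comparison) is an assumption, since this document does not define the comparator.

```typescript
// Resolves a nested attribute path such as "name.first".
function getPath(entity: unknown, path: string): unknown {
  let current: unknown = entity;
  for (const part of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as { [k: string]: unknown })[part];
  }
  return current;
}

// Assumed ordering: numbers compare numerically, anything else as strings.
function compareValues(a: unknown, b: unknown): number {
  if (typeof a === "number" && typeof b === "number") return a - b;
  const sa = String(a);
  const sb = String(b);
  return sa < sb ? -1 : sa > sb ? 1 : 0;
}

// Builds a comparator that sorts by each attribute in `sortBy`, in order.
function compareBy(sortBy: string[]): (a: unknown, b: unknown) => number {
  return (a, b) => {
    for (const path of sortBy) {
      const c = compareValues(getPath(a, path), getPath(b, path));
      if (c !== 0) return c;
    }
    return 0;
  };
}

const sampleUsers = [
  { id: "3", name: { first: "Bob", last: "Zed" } },
  { id: "1", name: { first: "Alice", last: "Young" } },
  { id: "2", name: { first: "Alice", last: "Adams" } },
];
// Sort by first name, then by last name, without mutating the input.
const sortedUsers = sampleUsers.slice().sort(compareBy(["name.first", "name.last"]));
```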
355 | 
356 | ## Footnotes
357 | - [1] https://www.jsonrpc.org/specification
358 | - [2] https://github.com/multiformats/multicodec
359 | 


--------------------------------------------------------------------------------
/papers/whitepaper/the-graph-whitepaper.tex:
--------------------------------------------------------------------------------
  1 | \documentclass[12pt]{article}
  2 | \usepackage{extsizes}
  3 | \usepackage{graphicx}
  4 | \usepackage[hidelinks]{hyperref}
  5 | \usepackage{multirow}
  6 | \usepackage{tabularx}
  7 | \usepackage{color}
  8 | \usepackage{amsmath}
  9 | \usepackage{amssymb}
 10 | \usepackage{amsfonts}
 11 | \usepackage{amsxtra}
 12 | \usepackage{wasysym}
 13 | \usepackage{isomath}
 14 | \usepackage{mathtools}
 15 | \usepackage{txfonts}
 16 | \usepackage{upgreek}
 17 | \usepackage{enumerate}
 18 | \usepackage{enumitem}
 19 | \usepackage{tensor}
 20 | \usepackage{pifont}
 21 | \usepackage[margin=1.1in]{geometry}
 22 | \definecolor{color-1}{rgb}{0.26,0.26,0.26}
 23 | \definecolor{color-2}{rgb}{0.4,0.4,0.4}
 24 | \title{The Graph: \protect\\ A Decentralized Query Protocol for Blockchains}
 25 | \usepackage{extsizes}
 26 | \usepackage{tocbibind}
 27 | \usepackage{float}
 28 | \usepackage{flafter}
 29 | \usepackage{xcolor}
 30 | \usepackage{sectsty}
 31 | \usepackage[font=small, skip=0pt]{caption}
 32 | \usepackage{setspace}
 33 | \setstretch{1.1}
 34 | 
 35 | % Select the font
 36 | \usepackage{charter}
 37 | 
 38 | % Prevent widow/orphan lines
 39 | \clubpenalty10000
 40 | \widowpenalty10000
 41 | \displaywidowpenalty=10000
 42 | 
 43 | % Define paragraph spacing and first line indentation
 44 | \setlength{\parskip}{8pt}
 45 | \setlength{\parindent}{0pt}
 46 | 
 47 | % Configure line spacing
 48 | \renewcommand{\baselinestretch}{1.1}
 49 | 
 50 | \author{Yaniv Tal, Brandon Ramirez, Jannis Pohlmann}
 51 | \date{March 21, 2018
 52 | \endgraf\bigskip Version 0.2}
 53 | 
 54 | \begin{document}
 55 | 
 56 | \maketitle
 57 | 
 58 | \begin{abstract}
 59 |   \noindent
 60 |   We introduce The Graph, a \textit{Decentralized Query Protocol} for indexing
 61 |   and caching data from blockchains and storage networks. We describe the query
 62 |   interface, the topology of the P2P network, and the economic incentives and
 63 |   mechanisms designed to keep the network running as a public utility.
 64 | \end{abstract}
 65 | 
 66 | \section{Introduction}
 67 | 
 68 | \subsection{Motivation}
 69 | A large amount of data resides in silos that are centrally controlled by a
 70 | handful of corporations. Web-era apps like Google, Facebook, YouTube, LinkedIn,
 71 | and Salesforce are built on these data monopolies. This centralization puts
 72 | tremendous power into the hands of a few and reduces economic opportunity and
 73 | self-determination for many.
 74 | 
 75 | \textit{Decentralized Applications (dApps)} put users in control of their data.
 76 | dApps are built using data that is either owned and managed by the community or
 77 | is private and controlled by the user. This way many products and services can
 78 | be built on pluggable datasets and users can freely switch between dApps. We
 79 | believe that this will create wide-scale economic opportunity as more products
 80 | are able to compete in a fair and open market and mechanisms are put in place to
 81 | incentivize people to contribute to a larger and farther-reaching public
 82 | commons.
 83 | 
 84 | To make this vision a reality, there needs to be an interoperability layer for
 85 | dApps. Applications building in the same domain need a way to coordinate and
 86 | agree on standardized names. They also need a common way to query data.
 87 | Applications use queries to find data in larger datasets. Queries generally
 88 | include operations like filtering, pagination, sorting, grouping, and joining
 89 | result sets. Executing queries requires creating and maintaining indexes,
 90 | without which running the queries would be prohibitively slow. All of this
 91 | requires economic incentives to produce a flourishing ecosystem. The Graph
 92 | provides this infrastructure layer for Web3, an emerging web application stack.
 93 | 
 94 | \setcounter{figure}{0}
 95 | \begin{figure}[H]
 96 |   \caption{\textbf{The Web3 Application Stack.}}
 97 |   \begin{center}
 98 |     \includegraphics[width=0.6\textwidth]{media/image8.png}
 99 |   \end{center}
100 | \end{figure}
101 | 
102 | An established decentralized \textit{Query Execution Layer} shown in Figure 1
103 | does not currently exist. Without a decentralized Query Execution Layer for
104 | Web3, dApp developers must resort to building custom indexing servers on an
105 | ad-hoc basis. This introduces a centralized component and requires engineering
106 | and devops resources to build and maintain. Providing a decentralized Query
107 | Execution Layer would allow dApp developers to ship more reliable dApps faster
108 | with fewer resources. It would also enable dApps to become fully decentralized.
109 | 
110 | \subsection{Decentralized Query Protocol}
111 | 
112 | \subsubsection*{Definition}
113 | 
114 | A \textit{Decentralized Query Protocol} is defined to be a collection of rules
115 | by which clients pay a decentralized network of nodes for indexing, caching, and
116 | querying data that is stored on public blockchains and decentralized storage
117 | networks such as IPFS/Swarm.
118 | 
119 | \subsubsection*{Protocol Requirements}
120 | 
121 | In order to enable a new class of data-intensive dApps, a Decentralized Query
122 | Protocol must meet the following requirements:
123 | 
124 | \begin{enumerate}
125 | \item \textit{Trust without verification}---a client should be able to trust the
126 |   results of queries without independently verifying each query or loading the
127 |   underlying raw data.
128 | \item \textit{Metering}---a client should be able to efficiently pay for each
129 |   query processed by the network, with minimal counterparty risk for either the
130 |   client or the nodes.
131 | \item \textit{Predictable performance}---the client should be able to pay for
132 |   predictable performance for queries that are run against specific data
133 |   sources.
134 | \item \textit{Data availability---}a client should be able to pay to keep the
135 |   data available for running queries against specific data sources.
136 | \item \textit{Price efficiency}---clients should be able to pay for queries,
137 |   performance, and data availability in efficient and competitive marketplaces.
138 | \item \textit{Incentive alignment}---incentives should be aligned between
139 |   clients, nodes, and dApp developers to encourage growth of the network and
140 |   positive network effects.
141 | \end{enumerate}
142 | 
143 | \section{Design}
144 | 
145 | The Graph implements a Decentralized Query Protocol, which enables users to
146 | query a network for data without having to operate any centralized
147 | infrastructure for indexing and caching. The protocol synthesizes ideas from
148 | distributed computing and cryptoeconomics\footnote{Vitalik Buterin,
149 |   ``Introduction to Cryptoeconomics,'' Vitalik Buterin's Website, March 12,
150 |   2018, https://vitalik.ca/files/intro\_cryptoeconomics.pdf} to produce a
151 | network that is self-organizing, robust, and secure.
152 | 
153 | \subsection{System Overview}
154 | 
155 | \subsubsection*{Protocol Stack}
156 | 
157 | The Graph can be divided into a stack of sub-protocols which can be treated
158 | conceptually as distinct interoperable layers, as shown in Figure 2.
159 | 
160 | \begin{figure}[H]
161 |   \vspace*{5mm}
162 |   \caption{\textbf{The Protocol Stack has the following sub-protocols:}}
163 |   \begin{center}
164 |     \includegraphics[width=.85\textwidth]{media/image7.png}
165 |   \end{center}
166 | \end{figure}
167 | 
168 | \begin{enumerate}
169 | \item \textbf{Consensus} \textbf{Layer---}responsible for smart contract
170 |   execution and payment settlement.
171 | \item \textbf{Peer-to-peer (P2P) Network}---defines how nodes locate and connect
172 |   to each other.
173 | \item \textbf{Storage Layer---}data stored on public blockchains or content
174 |   addressable networks.
175 | \item \textbf{Query Processing---}how a query is routed to a specific node for
176 |   processing.
177 | \item \textbf{Payment Channels---}facilitates fast and low-cost payments in the
178 |   system.
179 | \item \textbf{Governance---}manages schemas, data sources, and disputes.
180 | \item \textbf{Query Marketplace}---mechanism by which users pay nodes for
181 |   specific queries.
182 | \item \textbf{Indexing and Caching Marketplace---}mechanism by which users pay
183 |   nodes for indexing and caching data sources.
184 | \end{enumerate}
185 | 
186 | \subsubsection*{Token}
187 | 
188 | The Graph introduces a new token, the \textit{Graph Token}, which plays a vital
189 | role in securing and governing the network. Each use of the token is described
190 | alongside its related sub-protocol and summarized in the Token Economics section
191 | of this document.
192 | 
193 | \subsubsection*{Query Language}
194 | 
195 | The Graph will support queries written in GraphQL, a query language invented and
196 | open-sourced by Facebook. While SQL might be more familiar to back-end
197 | engineers, it is not well-suited to running queries from front-end applications.
198 | Traditional web apps handle this impedance mismatch by writing centralized API
199 | servers and data access layers in front of SQL databases, which are then exposed
200 | as REST endpoints. Since dApps must function without centralized
201 | infrastructure, it is important that dApp clients can query data directly
202 | from the front-end in a flexible way. GraphQL was designed to meet this
203 | requirement and has since seen accelerating adoption in the web and mobile
204 | communities.
205 | 
206 | \subsubsection*{Data Model}
207 | 
208 | All queries in The Graph are executed against a particular \textit{Data Source}.
209 | A Data Source is composed of a \textit{Schema} and one or more
210 | \textit{Datasets}.
211 | 
212 | The Schema is a GraphQL SDL schema, and defines the entities, values, types, and
213 | relationships which may be queried. Unlike in traditional databases, the Schema
214 | here is purely a logical definition, and doesn't dictate the structure of the
215 | data at the storage layer.
216 | 
217 | The Dataset defines data which exists on public blockchains or on decentralized
218 | storage networks that may be queried as part of a particular Data Source. It is
219 | composed of \textit{Data}, \textit{Mappings}, and an optional
220 | \textit{Update Function}.
221 | 
222 | Data identifies the raw data in the decentralized storage layer. It contains an
223 | identifier for the storage system being used, and the location of the raw data
224 | in that storage system. The location format varies by storage system: for
225 | example, a content hash on the InterPlanetary File System (IPFS) or a contract
226 | address on Ethereum.
227 | 
228 | The Mappings define how the Data maps to a particular Schema. They also include
229 | metadata about the format in which the data is stored, such as CSV, Parquet, or
230 | a custom binary format.
231 | 
232 | The \textit{Update Function} is an optional script that can be provided for
233 | mutable data such as Ethereum contract data. It can also be provided for
234 | content-addressed data that is referenced through a naming service such as
235 | IPNS rather than by its immutable content hash. The function accepts the
236 | data as input and returns an \textit{Update Event}, which consists of a CUD
237 | (Create, Update, Delete) \textit{Operation} and a \textit{Payload}. The Update
238 | Event allows indexes to be updated efficiently without fully reindexing the
239 | underlying Data.
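
As a compact summary of the definitions above, the Data Model can be written as
nested tuples. The notation is an illustrative shorthand introduced here, not
part of the protocol; square brackets mark the optional component, and the
names $\mathit{storageSystem}$ and $\mathit{location}$ are labels assumed for
the two parts of Data described above.
\[
  \mathit{DataSource} = \bigl(\mathit{Schema},\
  \{\mathit{Dataset}_1, \ldots, \mathit{Dataset}_n\}\bigr), \quad n \geq 1
\]
\[
  \mathit{Dataset} = \bigl(\mathit{Data},\ \mathit{Mappings},\
  [\mathit{UpdateFunction}]\bigr), \quad
  \mathit{Data} = (\mathit{storageSystem},\ \mathit{location})
\]
\[
  \mathit{UpdateFunction} : \mathit{Data} \rightarrow \mathit{UpdateEvent},
  \quad \mathit{UpdateEvent} = (\mathit{Operation},\ \mathit{Payload})
\]
\[
  \mathit{Operation} \in \{\mathrm{Create},\ \mathrm{Update},\ \mathrm{Delete}\}
\]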
240 | 
241 | \subsubsection*{Network Participants}
242 | 
243 | There are several types of participants in the network which are defined by
244 | their functional role in the protocol. With the exception of the dApp client,
245 | which is external to the protocol, a single node implementation may fulfill
246 | multiple functional roles.
247 | 
248 | \begin{enumerate}
249 | \item \textit{dApp Client}---a front-end application running on the End User's
250 |   machine which queries The Graph.
251 | \item \textit{Gateway Node}\textbf{---}a node which acts as an HTTP, WebSocket,
252 |   or JSON RPC endpoint for dApp clients to query The Graph.
253 | \item \textit{P2P Node}\textbf{---}a node which participates in the P2P network.
254 | \item \textit{Query Node}\textbf{---}a node which participates in query
255 |   processing.
256 | \end{enumerate}
257 | 
258 | \subsubsection*{Economic Agents}
259 | 
260 | There are several types of economic agents which we define by the common set of
261 | incentives that govern their usage of the protocol.
262 | 
263 | \begin{enumerate}
264 | \item \textit{End User}\textbf{---}seeks to get utility from an application and
265 |   pays to use the network.
266 | \item \textit{dApp Developer}\textbf{---}seeks to monetize their work building a
267 |   decentralized application for the End User.
268 | \item \textit{Node Operator}\textbf{---}operates P2P Nodes and Query Nodes to
269 |   extract fees and drive up the value of existing token holdings.
270 | \item \textit{Data Source Curator}\textbf{---}creates and curates Data Sources
271 |   in The Graph to extract interest and drive up the value of existing token
272 |   holdings.
273 | \item \textit{Validator}\textbf{---}validates query responses to earn
274 |   interest and drive up the value of existing token holdings.
275 | \end{enumerate}
276 | 
277 | \subsection{Sub-Protocols}
278 | 
279 | \subsubsection*{Consensus Layer}
280 | 
281 | The Graph has several components which require a blockchain-based consensus
282 | layer to provide guarantees that mechanisms in the protocol (payments, voting,
283 | validation, etc.) are immutable, irreversible, and can be carried out without
284 | the help of a central governing authority. We will use an existing blockchain
285 | such as Ethereum for this purpose.
286 | 
287 | \subsubsection*{Storage Integration Layer}
288 | 
289 | The Graph can support a variety of storage backends using Storage Adapters, an
290 | idea inspired by IPLD\footnote{``IPLD/specs,'' GitHub, accessed March 5, 2018.
291 |   https://github.com/ipld/specs/tree/master/ipld.}. These may include Ethereum,
292 | IPFS, other blockchains, or other forms of decentralized storage.
293 | 
294 | \subsubsection*{P2P Network}
295 | 
296 | The Graph implements a structured overlay network which builds on ideas from
297 | \textit{Content-Addressable Networks (CANs)} such as
298 | IPFS\footnote{Juan Benet, ``IPFS - Content Addressed, Versioned, P2P File System
299 |   (DRAFT 3),'' IPFS, March 5, 2018,
300 |   https://ipfs.io/ipfs/QmR7GSQM93Cx5eAg6a6yRzNde1FQv7uL6X1o4k7zrJa3LX/ipfs.draft3.pdf}
301 | and BitTorrent\footnote{Andrew Lowenstern and Arvid Norberg, ``DHT Protocol,''
302 |   BitTorrent.org, May 1, 2017, http://www.bittorrent.org/beps/bep\_0005.html}.
303 | We introduce the concept of a \textit{Service-Addressable Network} to describe
304 | our formulation; the key difference is that while CANs leverage
305 | \textit{distributed hash tables (DHTs)} to locate nodes on the network storing a
306 | specific file or object\footnote{Sylvia Ratnasamy, P. Francis, M. Handley, R.
307 |   Karp, and S. Shenker, ``A Scalable Content-Addressable Network,'' Proceedings
308 |   of ACM SIGCOMM, August 2001, http://dx.doi.org/10.1145/383059.383072. }, our
309 | P2P network is used to locate nodes capable of providing a particular service,
310 | which can be any arbitrary computational work. The design of our P2P network
311 | sub-protocol is modular with respect to the service being provided, a fact which
312 | we take advantage of in other parts of the protocol stack.
313 | 
314 | \begin{figure}[H]
315 |   \vspace*{5mm}
316 |   \caption{\textbf{Service-Addressable Network}}
317 |   \begin{center}
318 |     \includegraphics[width=1\textwidth]{media/image6.png}
319 |   \end{center}
320 | \end{figure}
321 | 
322 | \subsubsection*{Query Processing}
323 | 
324 | \textit{Query Processing} is split into five distinct stages: \textit{Query
325 |   Splitting, Service Discovery, Query Routing, Nested Query Processing} and
326 | \textit{Response Collation}.
327 | 
328 | \subsubsection*{Query Splitting}
329 | 
330 | The first step of the Query Processing sub-protocol is to split a query into
331 | disjoint top-level \textit{Query Fragments,} which may correspond to multiple
332 | Data Sources. A Query Fragment is any part of a query that can be resolved on
333 | its own. These fragments are processed separately, moving through the subsequent
334 | Query Processing stages in parallel.
335 | 
336 | \subsubsection*{Service Discovery}
337 | 
338 | In the Service Discovery stage, we use our Service-Addressable Network to
339 | locate a P2P Node with a routing table corresponding to a specific Service
340 | Group, as shown in Figure 4. In the context of this sub-protocol we define the
341 | Service Group to be a group of Query Nodes capable of processing a query for a
342 | specific Data Source.
343 | 
344 | \begin{figure}[H]
345 |   \caption{\textbf{The Service Discovery stage.}}
346 |   \includegraphics[width=1\textwidth]{media/image10.png}
347 | \end{figure}
348 | 
349 | \subsubsection*{Query Routing}
350 | 
351 | In the Query Routing stage, the Gateway Node that originated the
352 | query\footnote{From the perspective of the network, it is impossible to
353 |   distinguish whether a Gateway Node originated a query itself, or on behalf of
354 |   a dApp client.} decides which Query Node to forward a specific Query Fragment
355 | to, as shown in Figure 5. The Query Routing stage uses the \textit{Service Group
356 |   Routing Table}, which contains the location of each Node in the Service Group,
357 | as well as additional metadata.
358 | 
359 | \begin{figure}[H]
360 |   \caption{\textbf{The Query Routing stage.}}
361 |   \begin{center}
362 |     \includegraphics[width=.9\textwidth]{media/image9.png}
363 |   \end{center}
364 | \end{figure}
365 | 
366 | The metadata in the Service Group Routing Table is extensible to support modular
367 | routing logic. We take advantage of this in the Query Marketplace sub-protocol,
368 | but it could also be used to support operating the protocol in local networks or
369 | with trusted nodes, where payment requirements may be undesirable.
370 | 
371 | \subsubsection*{Nested Query Processing}
372 | 
373 | Each Query Fragment corresponds to a single entity type that is indexed by The
374 | Graph. If the user wishes to traverse nested entity relationships, these are
375 | processed as separate Query Fragments which go through the steps listed above.
376 | Queries may be arbitrarily deep, so nested fragments are processed recursively
377 | and in series until all the Query Fragments are processed.
378 | 
379 | \subsubsection*{Response Collation}
380 | 
381 | The final stage is to await execution of all the Query Fragments, both in series
382 | and in parallel, and to collate the responses in a format that meets the GraphQL
383 | specification\footnote{http://facebook.github.io/graphql/draft/} for the given
384 | query.
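
To make the interplay of parallel and serial execution concrete, consider a
simple latency model. The symbols are assumptions introduced for illustration:
$t(f)$ is the time to discover, route, and execute a single Query Fragment $f$,
and $C(f)$ is the set of nested fragments directly below $f$. Assuming that the
nested fragments of $f$ run in parallel with one another, but only after $f$
itself has resolved, the time to resolve $f$ and everything below it is
\[
  T(f) = t(f) + \max_{c \in C(f)} T(c),
\]
where the maximum over an empty set is taken to be zero. A query $Q$ with
top-level fragments $f_1, \ldots, f_k$ can then be collated after
\[
  T(Q) = \max_{1 \leq i \leq k} T(f_i),
\]
ignoring the cost of collation itself. For example, with two top-level
fragments where $t(f_1) = 30$\,ms and $t(f_2) = 50$\,ms, and where $f_1$ has a
single nested fragment $c$ with $t(c) = 40$\,ms, we get $T(f_1) = 70$\,ms,
$T(f_2) = 50$\,ms, and $T(Q) = 70$\,ms.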
385 | 
386 | \subsubsection*{Payment Channels}
387 | 
388 | In order to keep throughput high and transaction cost low, the network will use
389 | \textit{Payment Channels} for
390 | microtransactions\footnote{https://lightning.network/lightning-network-paper.pdf}.
391 | Payment Channels may be opened directly between two nodes, but for
392 | flexibility it is more appropriate to use a network of Payment Channels
393 | such as Raiden\footnote{https://raiden.network/}, which allows payments
394 | between many pairs of nodes without having to settle on-chain
395 | for each new node-to-node transaction. In this way, a single Payment Channel
396 | could be opened once by an End User to support many microtransactions for
397 | metered usage of The Graph. The Payment Channels could then be settled on-chain
398 | on a desired cadence.
399 | 
400 | \subsubsection*{Governance}
401 | 
402 | Since The Graph provides the main API endpoint for dApps to query data, deciding
403 | what data to include in query results impacts users and developers. The Graph
404 | provides a way for the community to vote and decide on what data to include and
405 | exclude. For example, there may be competing protocols for any given domain. The
406 | network could choose to include data from one protocol or multiple.
407 | 
408 | To start using The Graph, a dApp developer needs to make sure that a Schema
409 | exists for their domain. A Schema is composed of namespaces, entities, fields,
410 | and relationships between entities. If entities or fields are missing from the
411 | global Schema, a developer can propose changes by staking tokens. To ensure that
412 | deployed dApps do not break, Schema changes must be accretive and cannot include
413 | breaking changes.
414 | 
415 | Once a Schema exists for a domain, a dApp developer can propose to include a new
416 | Data Source for an entity by staking tokens. Other developers working in the
417 | same domain will want to check to make sure that the Data Source is high quality
418 | and compatible since new Data Sources would have an impact on their dApps. If
419 | quality Data Sources are added, their dApps will work better and their users
420 | will be happy. If the new Data Source produces spam or low quality content,
421 | developers will have an incentive to reject it. Participation in the governance
422 | process through staking will be rewarded with token inflation based on how much
423 | value the Data Sources drive to the network.
424 | 
425 | \subsubsection*{Query Marketplace}
426 | 
427 | The Query Marketplace lets End Users pay Query Nodes for individual queries (or
428 | Query Fragments) issued against a specific Data Source. The Query Marketplace
429 | builds on top of the extensible P2P Network and Query Routing sub-protocols. We
430 | leverage the extensible Service Group Routing Table metadata to list a
431 | \textit{Price Sheet} which Query Nodes may use to advertise the fees they will
432 | charge to process queries for a specific Data Source. The fees will be priced in
433 | terms of estimated complexity of the query, size of the query response, and
434 | latency. We route the query to a specific Query Node to achieve the desired
435 | tradeoff of cost and performance for any given query. The Gateway Node may
436 | optionally expose this logic in its query interface such that dApp developers or
437 | End Users may specify the optimal cost versus performance tradeoff for their
438 | specific use case.
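
One way to picture the cost versus performance tradeoff is as a weighted score.
This is a sketch only: the weight $\lambda$, the normalized fee $\hat{p}_n$, and
the normalized latency $\hat{\ell}_n$ are assumptions made for illustration, and
this paper does not fix a particular routing rule. For each Query Node $n$ in
the Service Group $G$, let $\hat{p}_n, \hat{\ell}_n \in [0, 1]$ be the fee
quoted on its Price Sheet and its expected latency, each scaled to the unit
interval. A Gateway Node could then route the query to
\[
  n^{*} = \arg\min_{n \in G}
  \bigl( \lambda\,\hat{p}_n + (1 - \lambda)\,\hat{\ell}_n \bigr),
  \qquad 0 \leq \lambda \leq 1.
\]
Take two nodes, $A$ with $(\hat{p}_A, \hat{\ell}_A) = (1.0, 0.2)$ and $B$ with
$(\hat{p}_B, \hat{\ell}_B) = (0.4, 1.0)$. A balanced weight $\lambda = 0.5$
gives scores of $0.6$ and $0.7$ and selects the faster node $A$, while a
cost-sensitive weight $\lambda = 0.9$ gives $0.92$ and $0.46$ and selects the
cheaper node $B$.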
439 | 
440 | Query Nodes must bond a desired number of Graph Tokens in order to participate
441 | in the marketplace. The more tokens they bond, the more likely they are to be
442 | seen as trustworthy by users of the network and be able to extract fees for
443 | queries. While most payments for query processing will occur
444 | off-chain, an End User or Gateway Node may challenge any specific query response
445 | by creating a \textit{Dispute} on-chain, which is a voting smart contract in
446 | which a set of Validators votes on the correctness of a query response in a
447 | commit and reveal process. If the challenge succeeds, then the Query Node's
448 | bonded tokens are forfeited to the challenger who created the Dispute. Validator
449 | Nodes are rewarded for securing the marketplace through token inflation.
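
The commit and reveal process can be sketched as follows. The hash function
$H$, the salt $s_v$, and the encoding of the vote are assumptions; this paper
does not specify them. In the commit phase, each Validator $v$ publishes only
\[
  c_v = H(b_v \,\|\, s_v),
\]
where $b_v$ is its vote on whether the query response is correct, $s_v$ is a
secret random salt, and $\|$ denotes concatenation. In the reveal phase, the
Validator discloses $(b_v, s_v)$, and the vote is counted only if
$H(b_v \,\|\, s_v) = c_v$. Provided that $H$ is a cryptographic hash function
and the salt is unpredictable, a commitment does not disclose the vote, so
Validators cannot copy one another's votes during the commit phase.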
450 | 
451 | \subsubsection*{Indexing and Caching Marketplace}
452 | 
453 | While the Query Marketplace incentivizes Query Nodes to respond to individual
454 | queries, it does not provide any guarantees that there are Query Nodes which are
455 | available to process the query performantly. This can be problematic, especially
456 | when bootstrapping a new Data Source which does not yet have usage. The Indexing
457 | and Caching Marketplace allows Query Nodes to be compensated for providing a
458 | specific \textit{Service-level Agreement}
459 | (\textit{SLA}), which is a promise to be
460 | available to process queries for a specific Data Source within certain latency
461 | and cost bounds. If a Query Node is found to be in violation of the SLA then its
462 | staked tokens will be forfeited to the user who paid for the SLA. The network
463 | may implement a challenge-response protocol to verify that an SLA is being met
464 | even when users are not actively querying that Data Source.
465 | 
466 | \section{Token Economics}
467 | 
468 | Graph Tokens are used to secure and govern the network and to incentivize
469 | behaviors that are critical for the network to thrive. Token mechanics are
470 | described in relevant sections of the protocol, but we provide here a
471 | consolidated list of the mechanisms that involve tokens, and how specific
472 | economic agents interact with the tokens. Graph Tokens:
473 | 
474 | \begin{itemize}
475 | \item Are bonded by Query Nodes to participate in the Query Marketplace as well
476 |   as the Indexing and Caching Marketplace.
477 | \item Are bonded by Validators to participate in voting in on-chain Disputes.
478 | \item Are staked by Challengers to create a Dispute.
479 | \item Are paid to Validators through token inflation.
480 | \item Are paid to Data Source curators through token inflation.
481 | \item Are used in decentralized governance mechanisms for a specific Data Source
482 |   (i.e. Token Curated Registries).
483 | \item May be used as fees in the Query Marketplace as well as the Indexing and
484 |   Caching Marketplace.
485 | \end{itemize}
486 | 
487 | \section{Roadmap}
488 | 
489 | Our plan is to release The Graph across three major development milestones. The
490 | first release will be a free service that any dApp developer can register to
491 | use. It will include stable interfaces for defining the Schema, registering
492 | Mappings, and querying with GraphQL. This release is slated for Q3 2018.
493 | Launching first as a centralized service will allow us to iterate on the design,
494 | implementation, and economic incentives at a faster rate. The second milestone
495 | will be the launch of the full P2P network in 2019. After this release, anyone
496 | will be able to run a Graph Node and earn Graph Tokens for participating in the
497 | network. This is the stage at which the Query Marketplace as well as the
498 | Indexing and Caching Marketplace will be opened. The third major release will
499 | include support for Private Data. Private Data is anchored on-chain but
500 | encrypted and controlled by users. This release is scheduled for 2020.
501 | 
502 | \section{Conclusion}
503 | 
504 | In this paper we presented the requirements for a Decentralized Query
505 | Protocol, a set of rules by which clients pay a decentralized network
506 | of nodes for indexing, caching, and querying data that is stored on public
507 | blockchains and decentralized storage networks. We provided a high-level
508 | overview of The Graph, our formulation of a Decentralized Query Protocol. We
509 | also proposed our definition for what constitutes a dApp---a dApp must put users
510 | in control of their data. dApps are built using data that is either owned and
511 | managed by the community or is private and controlled by the user. This way many
512 | products and services can be built on interchangeable Datasets and users can
513 | freely switch between dApps. The Graph acts as an interoperability layer that
514 | will enable these flourishing ecosystems of interoperable dApps to thrive and
515 | replace centrally controlled data monopolies.
516 | 
517 | \end{document}
518 | 


--------------------------------------------------------------------------------