├── CONTRIBUTING.md
├── COPYRIGHT
├── LICENSE-APACHE
├── LICENSE-MIT
├── README.md
├── SNARK
│   ├── SNARK.md
│   ├── finality-labs-bellman-gpu-report.pdf
│   ├── perf-dizk-bellman.pdf
│   ├── proof-batching-report.pdf
│   └── trusted-setup-sonic-performance.pdf
├── calculators.md
├── open-problems.md
├── porep
│   ├── encoding-optimization.pdf
│   └── porep.md
├── problems-glossary.md
├── research-notes
│   ├── 2018-05-jury-trial-data-availability.md
│   ├── 2018-09-consistent-hashing-to-avoid-monopoly-broken.md
│   ├── 2018-09-storage-limitation-notes.md
│   ├── 2018-10-splitzy-attack-on-vdf.md
│   └── zigzag-modern.pdf
└── research-roadmap-diagram.png
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | Contributing to Filecoin Research
2 | =====
3 |
4 | Thanks for your interest in contributing to Filecoin research endeavours!
5 |
6 | **Disclaimer:** While we work hard to document our work as it progresses, research progress may not be fully reflected here for some time, or may be worked out out-of-band.
7 |
8 | In general, Filecoin Research will favor focus over breadth of engagement, but we welcome any contributions. We've found that discussion and engagement with the underpinnings of Filecoin have helped shape these research efforts.
9 |
10 | Here's how you can engage:
11 |
12 | ## Protocol Questions?
13 |
14 | You may want to check out these sources:
15 | - [The Filecoin spec](https://github.com/filecoin-project/specs)
16 | - [Filecoin Slack](https://filecoinproject.slack.com)
17 | - [Filecoin's Discussion Forum](https://discuss.filecoin.io)
18 |
19 | Still haven't found an answer to your question? Open an issue, and tag it with `question`.
20 |
21 | ## Problems with Filecoin?
22 |
23 | Have you found a problem with the Protocol?
24 |
25 | Open an issue and tag it with `bug`. We would love to engage with you on it, and potentially start working together to solve it.
26 |
27 | ## Ideas for our `unsolved-problems`?
28 |
29 | How exciting! Engage on the relevant issue and let's take our collaboration forward from there!
30 |
31 | ## New Ideas?
32 |
33 | Share them with the community by opening up a new issue and tagging it with `idea/brainstorm`.
34 |
--------------------------------------------------------------------------------
/COPYRIGHT:
--------------------------------------------------------------------------------
1 | This library is dual-licensed under Apache 2.0 and MIT terms.
2 |
--------------------------------------------------------------------------------
/LICENSE-APACHE:
--------------------------------------------------------------------------------
1 | Copyright 2019 by the Filecoin contributors.
2 |
3 | Licensed under the Apache License, Version 2.0 (the "License");
4 | you may not use this file except in compliance with the License.
5 | You may obtain a copy of the License at
6 |
7 | http://www.apache.org/licenses/LICENSE-2.0
8 |
9 | Unless required by applicable law or agreed to in writing, software
10 | distributed under the License is distributed on an "AS IS" BASIS,
11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | See the License for the specific language governing permissions and
13 | limitations under the License.
14 |
--------------------------------------------------------------------------------
/LICENSE-MIT:
--------------------------------------------------------------------------------
1 | Permission is hereby granted, free of charge, to any
2 | person obtaining a copy of this software and associated
3 | documentation files (the "Software"), to deal in the
4 | Software without restriction, including without
5 | limitation the rights to use, copy, modify, merge,
6 | publish, distribute, sublicense, and/or sell copies of
7 | the Software, and to permit persons to whom the Software
8 | is furnished to do so, subject to the following
9 | conditions:
10 |
11 | The above copyright notice and this permission notice
12 | shall be included in all copies or substantial portions
13 | of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF
16 | ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
17 | TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
18 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
19 | SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
20 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR
22 | IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
23 | DEALINGS IN THE SOFTWARE.
24 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Filecoin Research
2 |
3 |
4 |
5 | ---
6 |
7 | This repository is the main hub leading to the various efforts in Filecoin Research and should provide you with the means to engage in this work.
8 |
9 | **Disclaimer:** While we work hard to document our work as it progresses, research progress may not be fully reflected here for some time, or may be worked out out-of-band.
10 |
11 | ## Table of Contents
12 |
13 | - [What is Filecoin Research?](#what-is-filecoin-research)
14 | - [Filecoin Research Endeavours](#filecoin-research-endeavours)
15 | - [Overview](#overview)
16 | - [Area: Consensus](#area-consensus)
17 | - [Area: Filecoin Protocol Improvements](#area-filecoin-protocol-improvements)
18 | - [Area: Generic Blockchain Infrastructure](#area-generic-blockchain-infrastructure)
19 | - [Area: Primitives](#area-primitives)
20 | - [Area: Practical zk-SNARKs](SNARK/SNARK.md)
21 | - [Area: Practical PoRep](porep/porep.md)
22 | - [Key Open Problems](#key-open-problems)
23 | - [Contributing](#contributing)
24 | - [Community](#community)
25 | - [Useful Docs](#useful-docs)
26 | - [License](#license)
27 |
28 | ## What is Filecoin Research?
29 |
30 | - **Make breakthroughs in the Filecoin protocol**
31 | - **Support devs to develop Filecoin**
32 |
33 | The purpose of Filecoin Research is to design or build the predicates enabling Filecoin: a decentralized storage network. We work to prove Filecoin constructions correct, or to improve them. The work here should provide some motivations for decisions about how Filecoin works; its output is [the Filecoin spec](https://github.com/filecoin-project/specs), from which a Filecoin network can be implemented.
34 |
35 | ## Filecoin Research Endeavours
36 |
37 | Filecoin Research work is conducted by area of focus, with current efforts ongoing in:
38 | - [Specs](https://github.com/filecoin-project/specs): The Filecoin Spec is the main interface between research and Filecoin development. Research work is only complete once it finds its way into the spec.
39 | - General Research (you're already here): This repo gathers the unsolved problems in Filecoin Research and helps organize our research work.
40 | - [Proofs](https://github.com/filecoin-project/rust-proofs): Dedicated to shaping and building out the Filecoin Proving Subsystem (FPS), whose API can be called by a Filecoin node to Seal disk sectors, or generate PoSTs for instance.
41 | - [Consensus](https://github.com/filecoin-project/consensus): Dedicated to finalizing the construction and proving the security of Filecoin's Consensus protocol, through which leaders are elected to mine new blocks and extend the Filecoin blockchain.
42 |
43 | #### Overview
44 |
45 | Here is a list of the Filecoin project's research endeavours. We split them by project for discoverability and further highlight select problems [below](#key-open-problems). We also specify scope and priority in these tables; you can find exact definitions for these [here](problems-glossary.md).
46 |
47 | - [Area: Consensus](#area-consensus)
48 | - [Area: Filecoin Protocol Improvements](#area-filecoin-protocol-improvements)
49 | - [Area: Generic Blockchain Infrastructure](#area-generic-blockchain-infrastructure)
50 | - [Area: Primitives](#area-primitives)
51 |
52 | #### Area: Consensus
53 |
54 | This endeavour deals with the Filecoin consensus layer broadly. It encompasses projects dealing with precise constructions Filecoin uses or could use (like Expected Consensus or Secret Single Leader Election) as well as the broader classification of Storage-Power-based Consensus in the field, for instance in relation to PoW and PoS. See the [consensus repo](https://github.com/filecoin-project/consensus) for more.
55 |
56 |
57 |
58 |
59 |
60 |
61 |
62 | Project |
63 | Description |
64 | Problems |
65 | Status |
66 |
67 |
68 | Expected Consensus (EC) |
69 | Expected Consensus is a consensus protocol that includes a block proposer and a way to achieve agreement (PoS Nakamoto consensus) on a particular block. It yields one secret leader per round on expectation, but may yield 0 or multiple. |
70 | Short-term/Ongoing: - Formal analysis of EC Security - Heuristic Security and attack simulations |
71 | Working on/Collaboration |
72 |
73 |
74 | Secret Single Leader Election (SSLE) |
75 | SSLE is a leader election protocol that guarantees that at each round only a single leader is elected (as opposed to one on expectation) and its identity remains secret until announced. |
76 | Short-term: - A practical SSLE Construction
Medium-term: - A consensus protocol that uses SSLE as leader election (and adaptation into Filecoin) |
77 | Collaboration/RFP |
78 |
79 |
80 | Storage Power Consensus (SPC) |
81 | Storage Power Consensus is the intermediate layer of consensus in the Filecoin system, bridging the gap between a storage network and Proof of Stake consensus to elect leaders based on storage committed to the network. |
82 | Short-term: - Committing power to a particular fork (e.g. through reseal)
Medium-term: - Efficient 51% block signing via all-to-all communications - Proof-of-Space before SEAL
Long-term: - Formally defining the EC/SPC interface |
83 | Working on/Collaboration/RFP |
84 |
85 |
86 | Power Fault Tolerance (PFT) |
87 | PFT is abstracted in terms of influence over the protocol rather than machines |
88 | Medium-term: - Formal framework for PFT in third gen blockchains |
89 | Working on/Collaboration |
90 |
91 |
92 |
93 | #### Area: Filecoin Protocol Improvements
94 |
95 | This area deals with the transaction layer of the Filecoin protocol and encompasses endeavours across the various parts that come together to make up Filecoin. We quickly present our research interests here.
96 |
97 |
98 |
99 |
100 |
101 |
102 |
103 | Endeavour |
104 | Description |
105 | Problems |
106 | Status |
107 |
108 |
109 | Mining |
110 | Mining here refers to the work of storage miners in the Filecoin network, who use proofs to store files on a client's behalf. This section deals with the ways miners interact with proofs as part of storage and retrieval mining in Filecoin. |
111 | Short-term/Ongoing: - PoST difficulty adjustment - Tolerating faults for honest miners
Medium-term: - PoSpace after Pledge (before SEAL) - Mining Pools in Filecoin |
112 | Working on/Collaboration (Curious for medium-term) |
113 |
114 |
115 | Repair |
116 | Repair miners ensure that as storage miners go offline, clients' orders remain secure. They are verifiers of the chain, catching faults and ensuring orders are re-assigned when storage miners fail. |
117 | Medium-term: - Repair proofs - Scaling Repair
Long-term: - Watchtowers for storage repair |
118 | RFP/Curious |
119 |
120 |
121 | Securing Filecoin |
122 | Filecoin protocol security (to be distinguished from the security of an implementation) is a major endeavour of Filecoin research. It touches our consensus layer, primitives, and transaction layer more broadly. We break this work down into three broad categories: economic security (or incentive compatibility), formal security (provable guarantees), and heuristic security (attack analysis and parameter setting). We detail a few of our endeavours in the space here. |
123 | Short-term: - Cryptoeconomic simulator - PoST security proofs - Formally analyzing Filecoin through the lens of PoS and PoW - Filecoin DoS analysis - Formal analysis of Filecoin finality
Medium-term: - Filecoin checkpointing |
124 | Working on/Collaboration |
125 |
126 |
127 | Storage Market |
128 | Filecoin's storage market refers to market dynamics around miners' monetization of disk space. Storage market questions concern the tools Filecoin provides miners to manage their disk in filling orders on the network. |
129 | Short-term: - Multiple Sector sizes
Medium-term: - PoST alternatives for proving storage |
130 | Working on/Collaboration, Curious for long-term |
131 |
132 |
133 |
134 |
135 | #### Area: Generic Blockchain Infrastructure
136 |
137 | We plan to improve the state of the art of generic blockchain constructions. As part of developing Filecoin, we've uncovered open problems that may interest the community at large.
138 |
139 |
140 |
141 |
142 |
143 |
144 |
145 | Endeavour |
146 | Description |
147 | Problems |
148 | Status |
149 |
150 |
151 | Chain Scalability and Throughput |
152 | Blockchain design is constrained by limitations of what data can be stored on-chain. |
153 | Short-term/Ongoing: - Signature Aggregation
Medium-term: - Dedicated Nodes for transaction batching - Accumulator-based chain state
Long-term: - Snarking the chain |
154 | Working on/Curious (for long-term) |
155 |
156 |
157 | Blockchain VMs |
158 | Filecoin will integrate smart contract functionality through a Filecoin VM. As we look towards this, we are interested in better models for VM execution in the context of a blockchain. |
159 | Medium-term: - WASM
Long-term: - Privacy-supporting smart contracts - Efficient VM execution model |
160 | Working on/Curious |
161 |
162 |
163 | Other Projects of Interest |
164 | This endeavour gathers other insights or problems we've uncovered as part of our work on Filecoin that are likely relevant to other architects and developers working on blockchain-based systems. |
165 | Short-term: - Investigating the necessity of block delay for blockchains - Trustless network joining/node bootstrapping
Medium-term: - Formal treatment of the impact of cryptoeconomics on protocol security - Exploration of transfer freezing for public keys as an alternative to slashing - Off-chain random beacons
|
166 | Working on (short-term), Collaboration/RFP (medium-term), Curious (long-term) |
167 |
168 |
169 |
170 |
171 | #### Area: Primitives
172 |
173 | Filecoin itself relies on the performance and security of cryptographic primitives. We quickly discuss the main open problems we are thinking about with regards to Filecoin primitives below.
174 |
175 |
176 |
177 |
178 |
179 |
180 |
181 | Endeavour |
182 | Description |
183 | Problems |
184 | Status |
185 |
186 |
187 | Proof of Replication (PoRep)/Proof of SpaceTime(PoST) |
188 | Proof-of-Replication is a key component of Filecoin as a storage-based marketplace. PoReps are assembled into Proofs of SpaceTime to ensure that miners are indeed storing client data. |
189 | Short-term: - Reducing hardware costs for PoRep/PoST - ASIC-resistant PoRep hash function - PoST Aggregation
Medium-term: - Updateable PoRep - Better PoRep: fast replication & verification, small & fast proof, flexible sector size - PoRep/PoST without timing assumptions - Vertical PoRep/PoST proving system |
190 | Working on/Collaboration/RFP |
191 |
192 |
193 | SEALSTACK |
194 | SEALSTACK refers to a specific attack that breaks PoRep by allowing a miner to cheat on space. The Filecoin team has a few candidate mitigations for SEALSTACK but is working to identify the optimal solution. |
195 | Short-term: - SEALSTACK: Symmetric PoRep construction - SEALSTACK: Asymmetric PoRep
Medium-term: - SEALSTACK with fast decode |
196 | Working on/RFP |
197 |
198 |
199 | SNARKS and other key primitives |
200 | Filecoin relies on SNARKS to aggregate PoSTs (proof compression of data going to the chain). We are looking into improvements in this space as well as alternative primitives we could use. |
201 | Short-term: - Practical Circuits - Faster SNARKS on GPU and off-the-shelf hardware
Medium-term: - Sector Aggregation (with aggregation nodes) - ZK-cryptographic compression (with easy set membership verification for random variables) - Succinct file inclusion proofs - SNARK-friendly accumulators |
202 | Collaboration/RFP |
203 |
204 |
205 | VDFs |
206 | One of Filecoin's PoST candidate constructions uses VDFs in order to ensure appropriate delay between various challenges to the miner. Thus, the security of a candidate construction for Filecoin (as well as Filecoin consensus) relies on certain guarantees provided by VDFs. As such, we are interested in advancements in the field, as well as alternative constructions. |
207 | Short-term: - Fastest Hash/VDF Function - Removing VDFs from Filecoin - Using a hash function that re-uses the same VDF hardware - VDF pools |
208 | Collaboration/RFP/Curious |
209 |
210 |
211 | Other Primitives of Interest |
212 | Other key primitives would prove highly useful to Filecoin's development. We detail a few here. |
213 | Medium-term: - Proof of data delivery to assure constrain assured price - Better Big Number Library - Weighted Threshold Signatures
Long-term: - Proof of Location |
214 | Collaboration/RFP/Curious |
215 |
216 |
217 |
218 |
219 |
220 | ## Key Open Problems
221 |
222 | We break down a few open problems of high priority for the team below. Some have open [RFPs](https://github.com/protocol/research-RFPs). For all, we welcome any collaborations (potentially leading to new constructions, discoveries and publications). Please reach out at [filecoin-research@protocol.ai](mailto:filecoin-research@protocol.ai).
223 |
224 | [See our open problems](./open-problems.md).
225 |
226 | You can also check out slides from talks given about Research Problems in Filecoin:
227 | - [Filecoin: Open Problems building storage-based consensus](https://drive.google.com/a/protocol.ai/file/d/1TeoRVRTDzMvPfYbty0WZ_V75zIyHinke/view?usp=sharing) given by Henri Stern at EPFL Crypto WinterSchool in February 2019.
228 | - [Filecoin Research Problems](https://drive.google.com/a/protocol.ai/file/d/16-74tC09jJeMdgXgKukmTHtmyTGHFkI3/view?usp=sharing) given by Nicola Greco at IC3 Winter Retreat in February 2019.
229 |
230 | ## Contributing
231 |
232 | The purpose of this repo is for Filecoin Research questions to be open to researchers around the world who may be interested in working on them.
233 |
234 | If you want to dive into these topics, please see [CONTRIBUTING.md](CONTRIBUTING.md).
235 |
236 | ## Community
237 |
238 | - Github over Google Docs!
239 | - Join the [Public Filecoin Slack](https://github.com/filecoin-project/community#chat):
240 | - General Filecoin Research: #fil-research
241 | - Proofs development: #fil-proofs
242 | - General Filecoin: #general
243 | - Our [Discussion Forum](https://discuss.filecoin.io)
244 |
245 | ## Useful docs
246 |
247 | - [**Research Notes**](https://github.com/filecoin-project/research/tree/master/research-notes) that have not yet found a home
248 | - [**Calculators**](https://github.com/filecoin-project/research/tree/master/calculators.md) used to estimate some results ahead of spending cycles on a construction
249 | - [**Spec/Design docs**](https://github.com/filecoin-project/specs) for the Filecoin protocol
250 |
251 | ## License
252 |
253 | The Filecoin Project is dual-licensed under Apache 2.0 and MIT terms:
254 |
255 | - Apache License, Version 2.0, ([LICENSE-APACHE](https://github.com/filecoin-project/research/blob/master/LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
256 | - MIT license ([LICENSE-MIT](https://github.com/filecoin-project/research/blob/master/LICENSE-MIT) or http://opensource.org/licenses/MIT/)
257 |
--------------------------------------------------------------------------------
/SNARK/SNARK.md:
--------------------------------------------------------------------------------
1 | # zk-SNARKs
2 |
3 | This is a top-level page to gather artifacts from completed research into and collaborations surrounding practical use of zk-SNARKS.
4 |
5 | ## Completed Work
6 | + [Performance evaluation of DIZK and Bellman for Blake2s](perf-dizk-bellman.pdf) (March 2019)
7 | + [Trusted Setup & Sonic Performance Report](trusted-setup-sonic-performance.pdf) (April 2019)
8 | + [Finality Labs Bellman GPU Report](finality-labs-bellman-gpu-report.pdf) (May 2019)
9 | + [Proof Batching Report](proof-batching-report.pdf) (May 2019)
10 |
--------------------------------------------------------------------------------
/SNARK/finality-labs-bellman-gpu-report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/SNARK/finality-labs-bellman-gpu-report.pdf
--------------------------------------------------------------------------------
/SNARK/perf-dizk-bellman.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/SNARK/perf-dizk-bellman.pdf
--------------------------------------------------------------------------------
/SNARK/proof-batching-report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/SNARK/proof-batching-report.pdf
--------------------------------------------------------------------------------
/SNARK/trusted-setup-sonic-performance.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/SNARK/trusted-setup-sonic-performance.pdf
--------------------------------------------------------------------------------
/calculators.md:
--------------------------------------------------------------------------------
1 | # Calculators
2 |
3 | - [WIP: Filecoin Throughput calculator (@whyrusleeping)](https://beta.observablehq.com/d/37ff2d55942d1354)
4 |
5 | ## Proofs
6 | - [WIP: ZigZag: Repl+Circuit (@nicola)](https://beta.observablehq.com/d/51563fc39810b60d): Aims at simulating sizes and times for replication and snarks
7 | - [WIP: Proof of Spacetime (@nicola)](https://beta.observablehq.com/d/39fb84f240fb9e65): Aims at simulating sizes and times for PoSt
8 | - [WIP: Proof of Spacetime Attacks (@whyrusleeping)](https://beta.observablehq.com/d/2c5e782ba61cc522)
9 | - [WIP: Proofs params: Security Analysis (@nicola)](https://beta.observablehq.com/d/bbabac1947b79011): Aims at calculating the smallest safest params
10 |
--------------------------------------------------------------------------------
/open-problems.md:
--------------------------------------------------------------------------------
1 | Key Open Problems
2 | =====
3 |
4 | We break down a few open problems of high priority for the team below. Some have open [RFPs](https://github.com/protocol/research-RFPs). For all, we welcome any collaborations (potentially leading to new constructions, discoveries and publications). Please reach out at [filecoin-research@protocol.ai](mailto:filecoin-research@protocol.ai).
5 |
6 | **Disclaimer:** While we work hard to document our work as it progresses, research progress may not be fully reflected here for some time, or may be worked out out-of-band.
7 |
8 | You can read more about what we mean by status or priority [here](./problems-glossary.md).
9 |
10 | ## Table of Contents
11 |
12 | - [Area: Consensus](#area-consensus)
13 | - [EC Security](#ec-security)
14 | - [SSLE](#ssle)
15 | - [Committing power to a particular fork](#committing-power-to-a-particular-fork)
16 | - [Area: Primitives](#area-primitives)
17 | - [ASIC-resistant PoRep hash functions](#asic-resistant-porep-hash-functions)
18 | - [PoST aggregation](#post-aggregation)
19 | - [Practical PoRep without timing assumptions](#practical-porep-without-timing-assumptions)
20 | - [Vertical PoRep/PoST proving system](#vertical-poreppost-proving-system)
21 | - [Faster SNARKS on GPU and off-the-shelf hardware](#faster-snarks-on-gpu-and-off-the-shelf-hardware)
22 | - [Sector Aggregation](#sector-aggregation)
23 | - [SNARK-friendly accumulators](#snark-friendly-accumulators)
24 | - [Weighted Threshold Signatures](#weighted-threshold-signatures)
25 |
26 |
27 |
28 | ## Area: Consensus
29 |
30 | ### EC Security
31 |
32 | **What:** *Expected Consensus* is a consensus protocol that includes a block proposer and a way to achieve agreement (*PoS Nakamoto consensus*) on a particular block. It is both *Secret Leader Election* and a set of chain selection rules that ensure convergence. It guarantees a leader will eventually be elected through the protocol, to be revealed only as they publish a block to the network (thereby preventing *DoS*ing). On expectation, one leader will be elected at every round.
33 |
34 | *EC* takes in randomness from the chain along with a list of miners and their respective powers from the Filecoin power table. Miners are elected in proportion with the storage they have committed to the network. While we believe that *EC* security degrades to that of Snow White, Filecoin research is working toward a formal treatment of *EC* security.
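
The "one leader per round on expectation" property can be illustrated with a small simulation. This is only a sketch: it assumes each miner wins a round independently with probability equal to its share of total power, which is a simplification chosen for illustration and not the election rule from the spec. The miner names, power values and the `elect_leaders` helper are hypothetical.

```python
import random

def elect_leaders(power_table, rng):
    """Return the miners elected in one round.

    Assumption: each miner wins independently with probability
    power / total_power, so the expected number of leaders is 1.
    """
    total = sum(power_table.values())
    return [m for m, p in power_table.items() if rng.random() < p / total]

def leader_count_histogram(power_table, rounds, seed=0):
    """Count how many rounds produced 0, 1, 2, ... leaders."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(rounds):
        k = len(elect_leaders(power_table, rng))
        counts[k] = counts.get(k, 0) + 1
    return counts

# Hypothetical power table (arbitrary units of committed storage).
power_table = {"miner-a": 50, "miner-b": 30, "miner-c": 20}
rounds = 10_000
hist = leader_count_histogram(power_table, rounds)
mean_leaders = sum(k * n for k, n in hist.items()) / rounds
print(hist, mean_leaders)
```

Under this assumption the average number of leaders stays close to one, while individual rounds regularly produce zero or several leaders, which is the behaviour the chain selection rules have to absorb.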
35 |
36 | In parallel, in integrating *EC* within a live system, we are running attack analyses and simulations to tune our parameters and ensure that *EC* provides Filecoin with incentive compatibility.
37 |
38 | Current research focuses on the following attacks on Filecoin consensus (across *EC* and *SPC*):
39 |
40 | - Block Withholding
41 | - Fork Grinding
42 | - Undetectable Nothing at Stake
43 | - Exponential Forking
44 | - Posterior Corruption
45 | - VDF Delay Attack
46 | - SPC Flooding Attack
47 |
48 | **Why:** At a high level, much of Filecoin’s motivation lies in building useful *Proof-of-Work*, replacing electricity consumption with file storage as the main mechanism for participating in network transactions, leader election and ultimately earning economic rewards. *Expected Consensus* is a *Proof-of-Stake*-like protocol which sits atop *Storage Power Consensus* and helps realize this vision. Securing it is crucial to Filecoin's security as a persistent state machine on which people can reliably store data.
49 |
50 | **Status:** Working on/Collaboration
51 |
52 | **Priority:** Ongoing/Short-Term
53 |
54 | **References:**
55 |
56 | - [Filecoin Research Consensus Repo](https://github.com/filecoin-project/consensus)
57 | - [EC Simulations](https://github.com/filecoin-project/consensus/tree/master/code)
58 | - [Filecoin Spec](https://github.com/filecoin-project/specs/)
59 | - [Filecoin Whitepaper](https://filecoin.io/filecoin.pdf)
60 |
61 | ### SSLE
62 |
63 | **What**: *Secret Single Leader Election* is a leader election mechanism that can be used in Filecoin consensus (and other consensus) protocols.
64 |
65 | We are looking for a construction of SSLE with the following properties:
66 |
67 | - *Fair* - Each miner’s chance of becoming the canonical block leader should be proportional to their power.
68 |
69 | - *Secret* - Only the canonical block leader at a round `r` can know that they are the leader until they broadcast a new block to the other miners.
70 |
71 | - *Unpredictable* - No observer or collection of observers should be able to predict block leaders with any advantage greater than `eps`.
72 |
73 | - *Verifiable* - All miners should be able to verify the canonical block leader non-interactively.
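
To make the *Fair* and *Verifiable* properties concrete, here is a toy selection rule that picks exactly one leader per round from a public seed, with probability proportional to power. It is a sketch for intuition only: the `single_leader` helper, the seed format and the power table are hypothetical, and the rule deliberately fails the *Secret* and *Unpredictable* properties, since anyone who knows the seed can compute the leader. Closing that gap is what makes the problem hard.

```python
import hashlib

def single_leader(seed, power_table):
    """Pick exactly one leader, weighted by power, from a public seed.

    Anyone can recompute the result (verifiable), but nothing is hidden,
    so this is not a secret election.
    """
    total = sum(power_table.values())
    digest = hashlib.sha256(seed.encode()).digest()
    # Map the hash to a ticket in [0, total); the modulo bias is negligible here.
    ticket = int.from_bytes(digest, "big") % total
    running = 0
    for miner in sorted(power_table):
        running += power_table[miner]
        if ticket < running:
            return miner

# Hypothetical power table and round seeds.
power_table = {"miner-a": 50, "miner-b": 30, "miner-c": 20}
rounds = 10_000
wins = {m: 0 for m in power_table}
for r in range(rounds):
    wins[single_leader(f"round-{r}", power_table)] += 1
print(wins)
```

Each round yields one and only one leader, and the win frequencies track each miner's share of power.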
74 |
75 | **Why**: Such a construction would have a great impact in improving Filecoin’s design. Unlike the *Expected Consensus* sortition, which secretly elects one leader at every round *on expectation*, *Secret Single Leader Election* elects *at most one* leader, thereby significantly reducing the number of forks in the chain and greatly simplifying the underlying system.
76 |
77 | We are actively working on this problem as well as [soliciting proposals](https://github.com/protocol/research-RFPs/blob/master/RFPs/rfp-6-SSLE.md) for such a construction. We believe *SSLE* will:
78 |
79 | - Lead to faster convergence in the Filecoin network
80 | - Minimize fork grinding as a potential attack vector for Filecoin (as is the case with *EC*)
81 | - Yield a simpler Filecoin protocol (removing allowances Filecoin makes for multiple winners in leader election)
82 |
83 | **Status:** Ongoing Collaboration/RFP
84 |
85 | **Priority**: Short/Mid-term
86 |
87 | **References:**
88 |
89 | - [Filecoin Research Consensus Repo](https://github.com/filecoin-project/consensus)
90 | - [SSLE RFP](https://github.com/protocol/research-RFPs/blob/master/RFPs/rfp-6-SSLE.md)
91 |
92 | ### Committing Power to a Particular Fork
93 |
94 | **What:** Like other *Proof-of-Stake* protocols, *Expected Consensus* is subject to *undetectable nothing-at-stake* attacks ([Formal Barriers](https://arxiv.org/pdf/1809.06528.pdf) by Brown-Cohen, Narayanan, et al.). In the context of *Storage Power Consensus*, this means that storage miners can mine multiple forks with the same underlying storage across forks. Recall that the computation needed to generate a *Proof-of-Spacetime* is not the limiting factor in *SPC*.
95 |
96 | There are mitigations:
97 |
98 | - Using a lookback parameter for randomness sampling leads to the same leader election outcomes for all forks descended from the head at which the seed is sampled, thereby reducing chain branching.
99 | - *SEAL*ing data forces a miner to commit to a given branch from the *SEAL* onwards, as all future *PoST*s refer back to the initial *SEAL*. Thus, re-*SEAL*ing data would effectively force miners to commit their power to a particular branch.
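
The lookback mitigation can be sketched as follows. The code assumes a chain is a list of block ids, that the election seed is simply the id of the block `lookback` positions behind the head, and that a miner wins when a hash of the seed and its name falls below its power share. All of these, including the `election_seed` and `is_leader` helpers, are hypothetical stand-ins for illustration and not the spec's construction.

```python
import hashlib

def block_id(parent_id, payload):
    """Derive a block id from its parent and contents."""
    return hashlib.sha256(f"{parent_id}|{payload}".encode()).hexdigest()

def extend(chain, payload):
    """Return a new chain with one more block appended."""
    return chain + [block_id(chain[-1], payload)]

def election_seed(chain, lookback):
    """Seed for the next round: the block `lookback` positions behind the head."""
    index = max(0, len(chain) - 1 - lookback)
    return chain[index]

def is_leader(seed, miner, power, total_power):
    """Toy election check: hash(seed, miner) mapped to [0, 1) against power share."""
    digest = hashlib.sha256(f"{seed}|{miner}".encode()).digest()
    draw = int.from_bytes(digest, "big") / 2**256
    return draw < power / total_power

# Two forks that share a prefix and diverge at the last block.
common = ["genesis"]
for payload in ["b1", "b2", "b3"]:
    common = extend(common, payload)
fork_a = extend(common, "a4")
fork_b = extend(common, "b4")

same_seed = election_seed(fork_a, 2) == election_seed(fork_b, 2)
head_seed_differs = election_seed(fork_a, 0) != election_seed(fork_b, 0)
print(same_seed, head_seed_differs)
```

With a lookback of 2 both forks sample the seed from a shared ancestor, so every miner gets the same election outcome on either fork. With no lookback the seeds differ, and each new fork gives a miner a fresh draw.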
100 |
101 | We are looking for the optimal strategy in having miners commit power to a particular fork without negatively affecting their potential earnings (i.e. with a high probability of finality).
102 |
103 | **Why:** This is a key issue for *SPC* using *EC*. *SPC* approximates Proof-of-Work by using another limited resource (storage) to force consensus using a *Proof-of-Stake* consensus protocol. However, this ability to mine across forks at low cost is a precise way in which *SPC* poorly approximates *PoW* hardness. Proper mitigation to this issue will improve Filecoin as a useful *Proof-of-Work*.
104 |
105 | **Status:** Working On/Collaboration
106 |
107 | **Priority:** Medium-Term
108 |
109 | **References:**
110 |
111 | - [Filecoin Research Consensus Repo](https://github.com/filecoin-project/consensus)
112 | - [SEAL and PoST security hardness issue](https://github.com/filecoin-project/consensus/issues/30)
113 |
114 | ## Area: Primitives
115 |
116 | ### ASIC-Resistant PoRep Hash Functions
117 |
118 | **What:** TODO
119 |
120 | **Why:**
121 |
122 | **Status:** Future RFP
123 |
124 | **Priority:** Short-Term
125 |
126 | **References:**
127 |
128 |
129 |
130 | ### PoST Aggregation
131 |
132 | **What:** TODO
133 |
134 | **Why:**
135 |
136 | **Status:** Working On/Collaboration
137 |
138 | **Priority:** Short-Term
139 |
140 | **References:**
141 |
142 |
143 |
144 | ### Practical PoRep Without Timing Assumptions
145 |
146 | **What:** A *Proof-of-Replication* is both a *Proof-of-Space* and a *Proof-of-Retrievability*. It is an interactive proof system in which a prover is able to demonstrate that they are dedicating unique resources to storing one or more retrievable replicas of some data.
147 |
148 | While PoReps may unconditionally demonstrate possession of data, they cannot guarantee that the data is stored redundantly. Indeed, in order to make it impossible for a storage provider to delete part of the replicas they are pretending to store and re-derive them on the fly upon request, current PoRep constructions rely on timing assumptions (under which the prover is assumed not to be able to generate a proper response in time).
149 |
150 | The Damgård et al. construction of [Proof of Replication without timing assumptions](https://eprint.iacr.org/2018/654) has expensive communication complexity in settings that would tolerate generation attacks: the prover has to sample a set of users and run a multi-party protocol where each user participates in the encoding. We are looking into improvements to this construction and new approaches to the problem considering location and network delays.
151 |
152 | **Why:** Timing assumptions create a need for specialized hardware (to give provers a level playing field). Getting rid of these timing assumptions would represent an elegant further step in the development of Proofs of Replication.
153 |
154 | Filecoin makes direct use of PoReps through its use of *Proofs-of-Spacetime* to ensure that a file is stored over time by a storage provider.
155 |
156 | **Status:** Collaboration/Future RFP
157 |
158 | **Priority:** Medium-Term
159 |
160 | **References:**
161 |
162 | - [Scaling Proof-of-Replication for Filecoin Mining](https://web.stanford.edu/~bfisch/porep_short.pdf)
163 |
164 | - [PoReps: Proofs of Space on Useful Data](https://eprint.iacr.org/2018/678.pdf)
165 |
166 | - [Tight Proofs of Space and Replication](https://eprint.iacr.org/2018/702.pdf)
167 |
168 | - [Proofs of Replicated Storage Without Timing Assumptions](https://eprint.iacr.org/2018/654.pdf)
169 |
170 |
171 |
172 | ### Vertical PoRep/PoST proving system
173 |
174 | **What:**
175 |
176 | **Why:**
177 |
178 | **Status:** Curious
179 |
180 | **Priority:** Medium-Term
181 |
182 | **References:**
183 |
184 |
185 |
186 | ### Faster SNARKS on GPU and off-the-shelf hardware
187 |
188 | **What:**
189 |
190 | **Why:**
191 |
192 | **Status:** Future RFP
193 |
194 | **Priority:** Short-Term
195 |
196 | **References:**
197 |
198 |
199 |
200 | ### Sector Aggregation
201 |
202 | **What:**
203 |
204 | A `sector` is a unit of storage over which a miner performs a *Proof-of-Replication* to convince the network and their clients that they are dedicating physical storage to the data they promised to store.
205 |
206 | When the miner fills the sector with data from the storage market, they *seal* the sector. The miner then posts to the chain the cryptographic commitments (1) of the original data, `commD`, and (2) of the sealed data, `commR`. In addition, the miner submits a convincing *proof* that the data behind `commR` is a correct encoding of the data behind `commD`. This process is called `Sector Commitment`.
207 |
208 | Having each miner post commitments and proofs for each of their storage units results in a large on-chain footprint.
209 |
210 | **Why:** We want to reduce this footprint so as to: (1) increase the storage throughput that Filecoin can on-board (note that for each sector sealed, the miner must submit a minimum of ~300 bytes, comprising the proof, `commD` and `commR`), and (2) avoid having miners pay transaction fees for each storage submission (which could penalize particular storage patterns).
211 |
212 | Can we batch storage submissions from the same miner into a single short proof? Can we batch storage submissions from multiple miners?
213 |
214 | **Status:** Working On/Collaboration
215 |
216 | **Priority:** Medium-Term
217 |
218 | **References:** https://github.com/filecoin-project/specs/pull/125
219 |
220 |
221 |
222 | ### SNARK-friendly accumulators
223 |
224 | **What:**
225 |
226 | **Why:**
227 |
228 | **Status:** Future RFP/Curious
229 |
230 | **Priority:** Medium-Term
231 |
232 | **References:**
233 |
234 | ### Weighted Threshold Signatures
235 |
236 | **What:** A `(t, n)`-Weighted Threshold Signature allows a group of participants with different signing power to produce a valid signature when a subset of participants whose signing power sums up to at least `t` (out of a total signing power of `n`) cooperates in the signing procedure.
237 |
238 | Note that `(t, n)`-Threshold Signatures already exist; there, a valid signature is produced if any subgroup of `t` (out of `n`) participants cooperates in the signing procedure. In this sense, a `(t, n)`-Threshold Signature is a special case of a `(t, n)`-Weighted Threshold Signature where all the participants have weight one. Note that the weight of each participant can be related to the collateral that a particular user is committing to.
239 |
240 | **Why:** The straightforward way to get a `(t, n)`-Weighted Threshold Signature is to provide each participant with a number of keys equal to their weight. Namely, if participant `P` has weight `k`, they receive `k` different keys and can produce `k` valid signatures. Nevertheless, this strawman is highly inefficient, both in terms of key size and signature size.
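
The strawman can be sketched with Shamir secret sharing standing in for key distribution: a participant of weight `k` simply holds `k` shares. This is an illustration under assumptions, not a scheme taken from the text: the field modulus `P`, the helper names, and the use of secret reconstruction in place of an actual signing procedure are all hypothetical. It mainly shows that key material grows linearly with weight.

```python
import random

P = 2**127 - 1  # prime field modulus (illustrative choice)

def make_shares(secret, t, n, rng):
    """Split `secret` into n Shamir shares; any t of them reconstruct it."""
    coeffs = [secret] + [rng.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation of the shared polynomial at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

def deal_weighted(secret, t, weights, rng):
    """Strawman: a participant of weight k receives k distinct shares."""
    shares = make_shares(secret, t, sum(weights.values()), rng)
    dealt, pos = {}, 0
    for name, k in weights.items():
        dealt[name] = shares[pos:pos + k]
        pos += k
    return dealt

rng = random.Random(0)
SECRET = 123456789
weights = {"alice": 3, "bob": 2, "carol": 1}   # total signing power n = 6
dealt = deal_weighted(SECRET, t=4, weights=weights, rng=rng)

# alice (3) + carol (1) reach the threshold t = 4; bob (2) + carol (1) do not.
recovered = reconstruct(dealt["alice"] + dealt["carol"])
below_threshold = reconstruct(dealt["bob"] + dealt["carol"])
```

Here `alice` stores three times as much key material as `carol`, which is exactly the inefficiency the open problem asks to remove.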
241 |
242 | We would like key sizes independent of the weight and signature sizes independent of the threshold `t`.
243 |
244 | Promising directions include:
245 |
246 | - Design an ad-hoc secret sharing scheme for key generation.
247 | - Aggregate `k` weight-`1` signatures into a single signature of the same size with weight `k`.
248 |
249 | Such a cryptographic tool would be of great use to both PoS consensus protocols and any power-based voting systems.
250 |
251 | **Status:** Curious
252 |
253 | **Priority:** Medium-Term
254 |
--------------------------------------------------------------------------------
/porep/encoding-optimization.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/porep/encoding-optimization.pdf
--------------------------------------------------------------------------------
/porep/porep.md:
--------------------------------------------------------------------------------
1 | # Proof of Replication
2 |
3 | This is a top-level page to gather artifacts from completed research into, and collaborations surrounding, the practical implementation of proof of replication.
4 |
5 | ## Completed Work
6 | + [Encoding Optimization](encoding-optimization.pdf) (September 2019)
7 |
8 |
--------------------------------------------------------------------------------
/problems-glossary.md:
--------------------------------------------------------------------------------
1 | Filecoin Research Open Problems
2 | =====
3 |
4 | The scope of Filecoin Research is confined to projects and open-problems related to the development and improvements that have an impact on the overall Filecoin protocol, either directly (open problems needed to be solved in order for Filecoin to operate at scale) or indirectly (breakthrough results that would serve our work). It does not include PL Researchers’ work moving research results into the Filecoin spec, responding to feedback/needs from developers, or producing technical write-ups, research papers or other materials to be presented to the research community at large.
5 |
6 | The problems listed in this document are present and future work to be tackled by the Protocol Labs research team, research collaborators, grantees and other interested researchers. This is not an exhaustive list.
7 |
8 | Problems can be of very different scales, not least based on our progress in tackling them. Some represent a few days of research, others a few months; as they are tackled, they may turn into projects and spark other open problems, or be resolved and lead to a spec change.
9 |
10 | ## Priority/Complexity
11 |
12 | The research problems are categorized as:
13 |
14 | **Short-term:**
15 |
16 | - Priority: problem is high priority for Filecoin development, will be tackled in the next one to two quarters
17 | - Complexity: The problem space for these is generally small
18 |
19 | **Medium-term:**
20 |
21 | - Priority: problem is relevant for the deployment of Filecoin, should be tackled in the next two to three quarters
22 | - Complexity: These are complex problems that will typically require multiple iterations of research.
23 |
24 | **Long-term:**
25 |
26 | - Priority: problem is lower priority for the deployment of Filecoin, but important for the improvement of the protocol
27 | - Complexity: typically open problems that require substantial amount of research
28 |
29 | ## Status
30 |
31 | Related to the above priorities, the problems we are interested in have more or less direct impact on, or applicability to, Filecoin development. Therefore, some are more or less likely to be tackled by the Protocol Labs team in the near future. Some guidance on how we currently view these problems:
32 |
33 | **Working On:**
34 |
35 | Filecoin Research intends to tackle this problem as part of our main roadmap.
36 |
37 | **Collaborators:**
38 |
39 | This is research on which we would love to collaborate with external researchers in the field. We’d be happy to work toward a joint publication on the topic, or support a team doing this work, helping motivate its usefulness for a real-world system like Filecoin.
40 |
41 | **RFP:**
42 |
43 | This is work we would possibly look to put a bounty on, funding external groups to pursue this line of enquiry through Request for Proposals or other forms of grant-making.
44 |
45 | **Curious:**
46 |
47 | We are curious about this area of research and interested in following this line of enquiry, as new constructions or discoveries in the space may have applicability in Filecoin.
--------------------------------------------------------------------------------
/research-notes/2018-05-jury-trial-data-availability.md:
--------------------------------------------------------------------------------
1 | # Idea: Proof of Delivery and Guaranteed Retrieval Price
2 |
3 | - Author: Nicola Greco
4 |
5 | - Historical note: Proposal written in 2017Q4 based on some work that Nicola did at MIT on Jury Trials/Probabilistic trust. The main purpose of this document is to try to answer the question "how do we guarantee retrieval price in Filecoin?". Just for reference, we call Jury Trial any protocol that samples from a population of 51% honesty.
6 |
7 | ---
8 |
9 | The plan:
10 | - In order to have a guaranteed retrieval price in Filecoin, we need to be able to write "file contracts" that mention an agreed-upon price and a penalization in case the storage provider is not serving the file.
11 | - In order to penalize miners, we need to have a proof that they have not delivered a file to a client that is requesting it (or a proof that the file was delivered).
12 | - So, we will explain in order:
13 | - the problem of guaranteed delivery and how we work around it in Filecoin
14 | - how to extend guaranteed delivery to guaranteed delivery with penalties
15 | - how to solve guaranteed delivery with a new primitive "fair delivery" which requires a trusted third party
16 | - how to remove the trusted third party with a blockchain and then with a set of randomly sampled validators
17 | - how you can write contracts once you have this primitive
18 |
19 | ---
20 |
21 | # Data Availability Problem (or *Guaranteed Delivery Problem*)
22 |
23 | **Data Availability Problem**: There is some data that is available on request.
24 |
25 | **Guaranteed Delivery Problem**: A client stores some data with a storage provider (or more) and they want to have a guarantee that the data can be retrieved on request.
26 |
27 | **Guaranteed Retrieval Price Problem**: A client stores some data with a storage provider and they want to have a guarantee that the data can be retrieved on request and at a pre-established price.
28 |
29 | ## Naive solution: 1-of-m honest (or rational) provider assumption
30 |
31 | A way to solve the data availability problem is to rely on the following assumption: a client will distribute the data to a sufficiently large set of `m` providers, such that at least one of the `m` providers is honest (or rational) and willing (or incentivized) to serve the data.
32 |
33 | **Problems**
34 | - **Large m**: To guarantee availability, the client must store the data with multiple providers (at least `m`), and this might have a high cost.
35 | - **Unknown retrieval cost**: In a rational setting, there is no guarantee on the cost that will incentivize the rational provider.
36 | - **Monopoly attack**: a miner can have a monopoly over a file and perform an [extortion attack](https://github.com/filecoin-project/aq/issues/67)!
37 |
38 | **Remark**: This is the current way the data availability problem is solved in Filecoin.
39 |
40 | ## New Proposal: Fair Delivery
41 |
42 | ### Data Availability with penalties
43 |
44 | We now need to provide a different notion of data availability:
45 |
46 | A client stores the data with a storage provider (or more) and they want to have a guarantee that **either the data is available (can be retrieved at the time of request), or the storage provider is penalized**.
47 |
48 | ### Fair Delivery
49 | A delivery of a file between a client and a provider is fair if there are two valid outcomes of the protocol:
50 | - Output 1: the client receives the file
51 | - Output 2: the client does not receive the file and the provider is penalized
52 | - (Output 3: the client aborts, the client does not receive the file and the provider is not penalized)
53 |
54 | **Problem with fair delivery!**:
55 | This problem reduces to the fair exchange problem! There is no way we can achieve a fair delivery as described, without a trusted third party! Note: the fair delivery protocol can only be initiated by the client.
56 |
57 | ### Fair Delivery with Trusted Third Party (TTP)
58 |
59 | We describe a "Fair Delivery" protocol with a trusted third party (TTP) in an optimistic setting (this means that the client or the provider invokes the trusted third party only in case of conflicts).
60 |
61 | **Protocol**:
62 | - Setup:
63 | - Client sends a file to Provider
64 | - Client sends a hash of the file to TTP
65 | - Provider deposits collateral with TTP
66 | - Honest Delivery:
67 | - Client sends a file request to Provider
68 | - Provider sends the file to Client
69 | - Client verifies the file
70 | - **Output 1: Client has the file, provider is not penalized (great!)**
71 | - Delivery with conflicts:
72 | - Client sends a file request
73 | - Conflict!
74 | - either the client did not receive the file, or
75 | - the provider did not receive the request, or
76 | - one of them is lying!
77 | - Conflict resolution:
78 | - client asks TTP to check if there are conflicts
79 | - TTP requests the file from provider with a timeout
80 | - if Provider sends the file to TTP before the timeout
81 | - TTP sends file to client
82 | - **Output 1: Client has the file, provider is not penalized (great!)**
83 | - if Provider doesn't send the file before timeout:
84 | - TTP penalizes the collateral of the provider
85 | - **Output 2: Client doesn't have the file, provider is penalized**
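
The conflict-resolution branch above can be sketched as a small Python model. This is a toy under stated assumptions: the class and method names are hypothetical, a timeout is modelled by the provider callback returning `None`, and treating a file whose hash does not match as a failed delivery is an added assumption (the protocol text only mentions the timeout).

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass
class TrustedThirdParty:
    file_hash: str    # sent by the client during setup
    collateral: int   # deposited by the provider during setup

    def resolve(
        self, request_file: Callable[[], Optional[bytes]]
    ) -> Tuple[str, Optional[bytes]]:
        """Conflict resolution: ask the provider for the file.

        `request_file` returns the file bytes, or None to model a timeout.
        """
        data = request_file()
        if data is not None and hashlib.sha256(data).hexdigest() == self.file_hash:
            # Output 1: the TTP forwards the file, provider keeps collateral.
            return "output 1", data
        # Output 2: no (valid) file before the timeout, provider is penalized.
        self.collateral = 0
        return "output 2", None

file_bytes = b"some client data"
digest = hashlib.sha256(file_bytes).hexdigest()

ttp_ok = TrustedThirdParty(digest, collateral=100)
served = ttp_ok.resolve(lambda: file_bytes)

ttp_bad = TrustedThirdParty(digest, collateral=100)
timed_out = ttp_bad.resolve(lambda: None)
```

The honest path never touches the TTP at all, which is what makes the protocol optimistic; only `resolve` is ever executed by the third party.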
86 |
87 | #### Going around TTP using the blockchain
88 |
89 | The way we have gone around Trusted Third Parties in decentralized systems is by replacing them with a distributed network and a consensus algorithm - in other words, we replace the TTP with a smart contract on a blockchain. Every time the client or the provider would interact with the TTP, they now post a transaction to the smart contract. Instead of trusting a single TTP, we now trust that the majority of the consensus participants is honest.
90 |
91 | **Problem!** In our setting, in case of conflict, the provider has to post the entire file on chain(!). Can we avoid posting the entire data on chain?
92 |
93 | #### Jury Trial: Going around TTP by sampling validators
94 |
95 | Instead of relying on the majority of the consensus to be the TTP, we can have a smaller set of users which we call "validators". Validators are sampled via a sampling strategy, and together they play the role of the TTP.
96 |
97 | **Sampling validators for the jury trial**: If we assume that the majority of the consensus is honest, then we can sample a random set of users from the miners (proportionally to their power in the consensus) - e.g. using cryptographic sortition. If we assume that the majority of "money at stake for becoming a validator" is honest, then we can do the same here.
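
A minimal sketch of power-proportional sampling follows. It is only a stand-in for cryptographic sortition: a real scheme would typically let each miner prove their own selection privately, whereas this toy derives a public pseudo-random draw from a shared seed. The function name, the seed value, and the power table are assumptions for illustration.

```python
import hashlib
import random

def sample_validators(power, seed: bytes, k: int):
    """Draw k jury seats; each seat goes to a miner with probability
    proportional to that miner's power. Seats are drawn with replacement,
    so a large miner may hold several seats."""
    seed_int = int.from_bytes(hashlib.sha256(seed).digest(), "big")
    rng = random.Random(seed_int)
    names = sorted(power)                     # fixed order for determinism
    weights = [power[name] for name in names]
    return rng.choices(names, weights=weights, k=k)

power = {"m1": 60, "m2": 30, "m3": 10}        # hypothetical power table
jury = sample_validators(power, seed=b"some-block-hash", k=1001)
share_m1 = jury.count("m1") / len(jury)
```

With these weights `m1` should receive roughly 60% of the seats, so an honest majority of power translates into an honest majority of the jury with high probability once the jury is large enough.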
98 |
99 | ### Fair Delivery with Jury Trial on a blockchain
100 |
101 | **Protocol**:
102 | - Setup:
103 | - Client sends a file to Provider
104 | - Client creates a contract for a special file hash
105 | - Provider deposits collateral in the contract (officially commits to serve the file once)
106 | - Honest Delivery:
107 | - Client sends a file request to Provider
108 | - Provider sends the file to Client
109 | - Client verifies the file
110 | - **Output 1: Client has the file, provider is not penalized (great!)**
111 | - Delivery with conflicts:
112 | - Client sends a file request
113 | - Conflict!
114 | - either the client did not receive the file, or
115 | - the provider did not receive the request, or
116 | - one of them is lying!
117 | - Conflict resolution:
118 | - client asks a random sample of validators to help solve the conflict
119 | - Validator set requests the file from provider with a timeout
120 | - if Provider sends the file to validators before the timeout
121 | - Validators send the file to client
122 | - **Output 1: Client has the file, provider is not penalized (great!)**
123 | - if Provider doesn't send the file before timeout:
124 | - Validators sign a penalization transaction and submit it to the smart contract, which penalizes provider
125 | - **Output 2: Client doesn't have the file, provider is penalized**
126 |
127 | ## From Fair Delivery to File Contracts with guaranteed retrieval price
128 |
129 | Once we have the fair delivery primitive, we can have more expressive smart contracts which enforce retrieval of a file up to X times at an agreed price for some amount of time.
130 |
131 | A contract would look more or less like this:
132 |
133 |
134 | ### RetrievalContract()
135 |
136 | - `RetrievalContract.Setup(collateral, hash, clientAddrs, minerAddr, times, expiry)`
137 | - miner deposits a collateral for serving a hash at most `times`
138 | - when block time `expiry` is reached, collateral is given back
139 | - `minerAddr` is the retrieval miner and `clientAddrs` is the set of clients allowed to retrieve the file
140 | - `RetrievalContract.Penalize(signatures)`
141 | - `signatures` from a valid set of validators prove that the file was not given, collateral is now lost!
142 | - `RetrievalContract.Close(tickets)`
143 | - miner posts tickets signed by the `clientAddrs` proving that they have done the work correctly
144 | - miner gets their collateral back
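
The three entry points above could be modelled roughly as follows. This is a sketch under several assumptions that the note does not specify: signatures and tickets are represented by the signer's identifier (no cryptography), the `validators` set and `quorum` parameter are hypothetical, `close` accepts between one and `times` tickets, and `expire` is an added helper for the "collateral is given back at `expiry`" rule.

```python
from dataclasses import dataclass

@dataclass
class RetrievalContract:
    collateral: int
    file_hash: str
    client_addrs: frozenset
    miner_addr: str
    times: int
    expiry: int
    validators: frozenset   # hypothetical: ids allowed to sign penalties
    quorum: int             # hypothetical: signatures needed to penalize
    state: str = "open"

    def _release(self, new_state: str) -> int:
        """Empty the collateral and move to a terminal state."""
        amount, self.collateral, self.state = self.collateral, 0, new_state
        return amount

    def penalize(self, signatures) -> bool:
        """Enough validators attest that the file was not delivered."""
        valid = set(signatures) & self.validators
        if self.state == "open" and len(valid) >= self.quorum:
            self._release("penalized")   # collateral is lost
            return True
        return False

    def close(self, tickets) -> int:
        """Miner posts client-signed tickets; returns the refunded amount."""
        ok = 0 < len(tickets) <= self.times and set(tickets) <= self.client_addrs
        return self._release("closed") if self.state == "open" and ok else 0

    def expire(self, block_time: int) -> int:
        """After `expiry`, the untouched collateral goes back to the miner."""
        if self.state == "open" and block_time >= self.expiry:
            return self._release("expired")
        return 0

def new_contract():
    return RetrievalContract(
        collateral=50, file_hash="abc", client_addrs=frozenset({"c1", "c2"}),
        miner_addr="miner", times=3, expiry=1000,
        validators=frozenset({"v1", "v2", "v3"}), quorum=2,
    )

served = new_contract()
refund = served.close(["c1", "c2"])

withheld = new_contract()
penalized = withheld.penalize(["v1", "v3", "stranger"])
```

The open questions listed below (collateral size, reputation, validator incentives) are exactly the parameters this sketch leaves as plain constructor arguments.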
145 |
146 | Open questions:
147 | - how much collateral should a retrieval miner be required to put down? (e.g. if the collateral is too low, then they can do a Miner Monopoly Attack (where a miner is the only miner storing a particular file) again!)
148 | - can we use "reputation" instead of collateral?
149 | - how do we incentivize validators?
150 |
--------------------------------------------------------------------------------
/research-notes/2018-09-consistent-hashing-to-avoid-monopoly-broken.md:
--------------------------------------------------------------------------------
1 | # Idea: use consistent hashing to avoid miner-monopoly and greedy-miner attacks (broken)
2 |
3 | - Author: Nicola Greco
4 |
5 | ---
6 |
7 | This is an intuition of how we could go about solving the two following attacks using ideas from consistent hashing.
8 |
9 | This solution is actually broken. I am posting it here since:
10 | - This is something that we should investigate more (later!), since this could get us two birds with one stone!
11 | - This is also a greaaaat interview question!
12 |
13 | We discussed this solution about a year and a half ago, but didn't make progress on this.
14 |
15 | ### (Storage) Miner Monopoly Attack
16 | A miner `M*` stores all the copies of a file `f`, such that they are the only storage miner and they have monopoly for `f`. This is #15
17 |
18 | Bad things about miner monopoly attacks:
19 | - `M*` can do *data withholding attacks*: never serving data on get requests
20 | - `M*` can do *data extortion attacks*: ask for an insanely high price to return the data
21 |
22 | ### (Storage) Greedy Miner Attack
23 | A *greedy* miner `M*` is a miner that doesn't care about storage rewards; it only cares about the block reward. For this reason, they hire themselves to store their own data.
24 |
25 |
26 | ### Proposed solution
27 | The proposed solution here is to use consistent hashing (similarly to how this is used in Chord (I think Kademlia as well)). The intuition is:
28 | - to put all miners in a circle
29 | - every miner gets assigned a range in the hash namespaces for files to store. (e.g. say that there are three miners A, B, C, A stores all the data from hash 000000..-999999.., B from 999999..-GGGGGG.., C from GGGG..-ZZZZ...).
30 | - miners can only take the orders if hash of the file in the order is in their range
31 | - note: for each redundant copy of a file, we consider the hash of the file to be `H(file || number of the copy)`, such that each copy is assigned to a different miner
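
The assignment rule above can be sketched in a few lines of Python. The choices here are assumptions for illustration: SHA-256 as the hash, a 4-byte big-endian copy number, and ranges sized proportionally to each miner's power (the note's example only lists three fixed ranges).

```python
import hashlib
from bisect import bisect_right

SPACE = 2**256  # size of the SHA-256 output space

def build_ring(power):
    """Give each miner a contiguous hash range proportional to its power.
    Returns miner names and the exclusive upper bound of each range."""
    total = sum(power.values())
    names, bounds, acc = [], [], 0
    for name, p in power.items():
        acc += p
        names.append(name)
        bounds.append(SPACE * acc // total)
    return names, bounds

def assign(ring, file_bytes: bytes, copy: int) -> str:
    """Miner responsible for H(file || number of the copy)."""
    names, bounds = ring
    digest = hashlib.sha256(file_bytes + copy.to_bytes(4, "big")).digest()
    point = int.from_bytes(digest, "big")
    return names[bisect_right(bounds, point)]

ring = build_ring({"A": 50, "B": 30, "C": 20})   # hypothetical power table
owners = [assign(ring, b"some file", copy) for copy in range(2000)]
share_a = owners.count("A") / len(owners)
```

With half of the power, miner `A` ends up with about half of the copies, which matches the 1/2 probability quoted below for a miner holding 50% of the storage.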
32 |
33 | Ideally, this would solve:
34 | - miner monopoly attacks: `M*` can only store the copies in their range (say the miner has 50% of storage, for each copy, there is 1/2 of probability for `M*` to store it)
35 | - greedy miner attacks: (assuming that we have a way to break a file in pieces from a single ask order) `M*` can only store some of their data, since some others will be assigned to other miners
36 |
37 | #### Problems with this solution
38 | - Grinding: a greedy miner can generate some random data and split it in such a way that they are always selected. This is actually trivial: get the data and hash it; if it is assigned to you, great; if not, add a nonce and try again. In expectation, they find a suitable nonce after `total power/M* power` attempts.
39 | - Miners now cannot really participate in the market since there will be some orders that match their prices, which they can't get, since the hash of the file in the order is not in their range - this screws up the market (not sure how much of a problem this is)
40 | - In expectation, miners can only get a proportion of the orders in the market and no more than that proportion (while today miners can really store as much as they want - there is nothing preventing them)
41 | - We don't have proofs for storage that is not in use, so a miner can just lie about having A LOT of fake empty storage, just so that their key space is large enough. A fix would be to not make the key space proportional to the storage (but then we can have a lot of sybils spreading around the circle)
42 | - I am sure there are other problems but can't recall
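
The grinding problem is easy to reproduce. The sketch below assumes the miner's range is the first `miner_power / total_power` fraction of the SHA-256 output space and that a nonce is appended as 8 big-endian bytes; both are illustrative choices, not part of the proposal.

```python
import hashlib

SPACE = 2**256

def grind(data: bytes, miner_power: int, total_power: int, max_tries: int = 10_000) -> int:
    """Try nonces until H(data || nonce) lands in the miner's own range.
    Returns the number of attempts that were needed."""
    limit = SPACE * miner_power // total_power   # miner owns [0, limit)
    for attempt in range(1, max_tries + 1):
        digest = hashlib.sha256(data + attempt.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < limit:
            return attempt
    raise RuntimeError("no suitable nonce found")

# A miner with a quarter of the power grinds 2000 different files.
tries = [grind(f"file-{i}".encode(), miner_power=1, total_power=4) for i in range(2000)]
average = sum(tries) / len(tries)
```

The average number of attempts comes out close to `total power / M* power = 4`, so self-dealing costs the greedy miner only a handful of hashes per order.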
43 |
--------------------------------------------------------------------------------
/research-notes/2018-09-storage-limitation-notes.md:
--------------------------------------------------------------------------------
1 | # Filecoin Storage Limitations
2 |
3 | - Author: Brian Vohaska
4 | - Comments: [#20](https://github.com/filecoin-project/research/issues/20)
5 |
6 |
7 | As @whyrusleeping has shown in his analysis ([1], [2], [3], [4]), we currently have a limitation on the total storage that FIL can support. In part this limitation is due to the number of signatures and other associated data posted to the blockchain [2]; our choice of a ~400KB block; because we have well-defined blocks; and because we are storing deals, et al. on the FIL blockchain.
8 |
9 | ### What is a storage limitation
10 |
11 | FIL is a blockchain, which means that when an epoch `E` has passed, a block is generated and *committed* to the blockchain. We can think of an epoch as the amount of time it takes for a reasonable set of FIL participants to agree that a set of transactions (or block) is (1) valid and (2) well distributed to the network. When that reasonable set of FIL participants has agreed during the epoch, we say that the block was *committed*. In other words, we agree that this block is now a shared truth. This process is called *consensus*.
12 |
13 | Storage limitations come into play because we have chosen (1) a fixed consensus model and (2) a fixed block size (~400KB). This means that only a certain number of transactions can be included in a block. The number of transactions in a block determines how many storage transactions we can perform per block [1] and, as a result, how much storage we can maintain in FIL with economic incentives. Note that the network could store more data but there would be no economic incentive to do so.
14 |
15 | There are some reasons that we have chosen this size [4]. We strongly welcome arguments for/against.
16 |
17 | In fact, there are many disjoint and related reasons why we have a total FIL storage limitation. This issue is meant to point out and explore how we can (1) increase FIL total storage or remove the storage limit completely and (2) understand all components that lead to a storage limitation AND how those components relate to each other. I propose that by understanding how each component relates to the others we can perform a logical principal component analysis ([PCA](https://en.wikipedia.org/wiki/Principal_component_analysis)) and better solve FIL's storage limitation issue.
18 |
19 | ### Why do we care?
20 |
21 | FIL is a global storage and communication system with the mission of providing secure, reliable, and affordable storage to any party. As a result we need to be able to scale FIL globally AND at a global growth rate. Any fixed limitations we encounter may lead to us not meeting our vision in the future.
22 |
23 | ### Let's look at some examples
24 |
25 | 1. (Large transaction size) Suppose transactions were each 400KB. This would mean that we can only include one transaction per block and would require that consensus happen very quickly (if we are to include every participant's transaction). We know that consensus requires an epoch to occur, which means a reasonable set of participants needs to agree that this transaction is valid, and so forth. We also know that this means there's a lot of communication to many participants... which is typically pretty slow.
26 |
27 | 2. (Small transaction size) Suppose our transactions were 0KB (not realistic but good for argument). This would mean that we can include every transaction that will ever be generated BUT consensus wouldn't occur until we decided that some condition has been met and we want an epoch to end. If we waited too long for an epoch to end, this might lead to an unstable economic system where so many transactions have occurred that a transaction history becomes unverifiable (read as: cheaters could sneak in and perform attacks) and there may no longer be strong economic incentives to return data to a client. Nevertheless, being able to choose an epoch based on consensus speed and not block size could be ideal. Note that having 0KB transactions is equivalent to having infinite block sizes (where communication is free and physics is kind-of broken).
28 |
29 | 3. (Maybe realistic transaction size) Now, suppose our transactions were each on the order of 2KB. This would mean that we can include 200 transactions per block. Consensus will still need to happen and we would still need to do all of the normal work to achieve consensus BUT we would be able to ensure that (1) waiting-too-long attacks don't happen, (2) economic incentives to return data efficiently still exist, (3) we can accommodate many transactions <-- we still aren't sure what a reasonable amount is yet.
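
The arithmetic behind the three cases is just the block size divided by the transaction size. A short sketch, assuming the ~400KB block quoted above and ignoring headers and other per-block overhead (the function name is hypothetical):

```python
BLOCK_SIZE_KB = 400  # approximate block size used throughout this note

def transactions_per_block(tx_size_kb: float) -> float:
    """Upper bound on transactions per block, ignoring block overhead."""
    if tx_size_kb == 0:
        return float("inf")   # case 2: every transaction ever generated fits
    return BLOCK_SIZE_KB // tx_size_kb

large = transactions_per_block(400)   # case 1
tiny = transactions_per_block(0)      # case 2
realistic = transactions_per_block(2) # case 3
```

Halving the transaction size doubles the bound, which is why shrinking transactions is the lever the rest of this note focuses on.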
30 |
31 | Note that we have not looked at improving consensus in the above cases. This is because consensus is currently an open question, though we will note it as an area of research for the storage limitation problem. In each of these cases, we notice that for some fixed consensus model, we would like to increase the number of transactions. Currently, decreasing the size of the transactions is the only means of doing so.
32 |
33 | ### So what can we do about the limitation
34 |
35 | In one sentence, increase the number of transactions. PLEASE CORRECT THIS IF IT IS WRONG. THIS IS A BIG ASSUMPTION. Given the above explanation, we might be able to:
36 |
37 | - Increase consensus convergence? Can we modify expected consensus (EC) to help us increase the number of transactions?
38 |
39 | - Change what a transaction means / off-chain actions? Maybe we can perform some of the transactions off-chain?
40 |
41 | - Decrease transaction sizes? Maybe there exists some means by which to decrease the size of transactions? Compression?
42 |
43 | - Decrease the number of headers or header size? If headers exist, maybe we can compress them or aggregate them?
44 |
45 | ### Current components contributing to the FIL storage limitation
46 |
47 | - Signatures ([detail spec](please fill me in with the spec address))
48 |
49 | - Bids/Deals ([detail spec](https://github.com/filecoin-project/specs/blob/master/drafts/storage-market.md))
50 |
51 | - **please edit/add/remove**
52 |
53 | ### Current Issues:
54 |
55 | - [Signature aggregations](https://github.com/filecoin-project/research/issues/19)
56 |
57 | ---
58 |
59 | References:
60 |
61 | [1 - observable/visual analysis of storage limitations](https://beta.observablehq.com/d/37ff2d55942d1354)
62 | [2 - specs/storage market draft](https://github.com/filecoin-project/specs/blob/master/drafts/storage-market.md)
63 | [3 - specs/thoughts on aggregation](https://github.com/filecoin-project/specs/pull/93)
64 | [4 - aq/issue on size constraints](https://github.com/filecoin-project/aq/issues/113)
65 |
--------------------------------------------------------------------------------
/research-notes/2018-10-splitzy-attack-on-vdf.md:
--------------------------------------------------------------------------------
1 | # VDF Attack 1: SPLITZY ATTACK
2 |
3 | Author: Brian Vohaska
4 |
5 | ----
6 |
7 | As discussed at Team Week, here is the outline of a pretty simple possible cryptographic attack on VDF-RSA. This attack is for a _very very_ special case that we expect to see in practice with probability very close to cryptographic-zero. Nevertheless, this attack should be looked at to ensure that it can't be generalized.
8 |
9 | Note to those reading this: sorry about the errors. No LaTeX here so I'm winging it with the markdown. Feel free to correct variable/index issues.
10 |
11 | ## SPLITZY Attack
12 |
13 | *ExecuSummary: We make exponentiation faster and compute VDF faster than others. [Time-Space Trade-Offs](https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff).*
14 |
15 | This attack assumes modular exponentiation on an RSA group with a slowly changing modulus. The goal of this attack is to compute a series of modular exponentiations faster than an honest actor, where the only optimization an honest actor implements is [square and multiply](https://en.wikipedia.org/wiki/Exponentiation_by_squaring). The following are the assumptions we make about the VDF system:
16 |
17 | 1. `N = p*q` where `p`,`q` are primes
18 | 2. `j` is a _m-bit smooth integer_ chosen at random
19 | 3. `T` is a very large integer and is publicly known
20 |
21 | VDF is of the form,
22 |
23 | `j^(2^T) (mod N)`
24 |
25 | ### One-time Step
26 |
27 | #### Step 1
28 |
29 | For all m-bit primes `I : prime_i in Z/N` compute,
30 |
31 | `c = i^2^T (mod N)`
32 |
33 | and store the result in a table we call `C`. This results in a storage requirement of about `(2^m * log(N))/m` bits. For a more precise calculation use the prime density theorem and associated lemmata.
34 |
35 | #### Step 2
36 |
37 | Given that `j` is chosen from `Z/N`, calculate the prime factorization for `y` m-bit smooth integers,
38 |
39 | `y = {r_0, r_1, ... , r_y-1}`
40 |
41 | `E = {e_m} <-- Factor(r_w) = p_0^e_0 * p_1^e_1 * ... * p_m^e_m`
42 |
43 | for all `0 <= w < y`
44 |
45 | Using a chosen average, calculate the most likely set `E` for a given m-bit smooth integer in `Z/N`
46 |
47 | For all `e_i in E` and `c in C` calculate and store
48 |
49 | `u = c^(e_i - x)`
50 |
51 | where `x` is an integer chosen based on prime density theorem and ensures that `u` will be close to a desired factorization in Step 2 of this attack. The storage requirement is on the order of `|C|`.
52 |
53 | (optionally) store for a set `u = c^(e_i - x)` for various choices of `x`.
54 |
55 | ### Every-time Step
56 |
57 | Factor `j`,
58 |
59 | `j = p_0^e_0 * p_1^e_1 * ... * p_m^e_m` where p_i is m-bit smooth
60 |
61 | Recall our VDF,
62 |
63 | `j^(2^T) (mod N)`
64 |
65 | `(p_0^e_0 * p_1^e_1 * ... * p_m^e_m)^(2^T) (mod N)`
66 |
67 | `(p_0^e_0)^(2^T) * (p_1^e_1)^(2^T) * ... * (p_m^e_m)^(2^T) (mod N)`
68 |
69 | Note that we have stored `p0_2T = (p_0^e_0)^(2^T) (mod N)` as well as integers close to `p0_2T`. Since reduction `mod N` commutes with multiplication we have,
70 |
71 | `((p_0^e_0)^(2^T) (mod N)) * ((p_1^e_1)^(2^T) (mod N)) * ... * ((p_m^e_m)^(2^T) (mod N)) (mod N)`
72 |
73 | `(p0_2T) * (p1_2T) * ... * (pm_2T) (mod N)`
74 |
75 | Suppose `d` divides `m` for some integer `d`, say `d = 4`, then,
76 |
77 | `P1 = (p0_2T) * (p1_2T) * ... * (p(m/4)_2T) (mod N)`
78 |
79 | `...`
80 |
81 | `P4 = (p(3m/4)_2T) * ... * (pm_2T) (mod N)`
82 |
83 | Each part can be computed in parallel and combined later,
84 |
85 | `VDF = P1 * ... * P4 (mod N)`
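
The every-time step can be sketched as follows. This is a toy illustration under stated assumptions: `table` maps each small prime `p` to its precomputed `p^(2^T) mod N`, the names `factor_smooth` and `eval_vdf_from_table` are hypothetical, trial division stands in for whatever factoring method an attacker would really use, and the partial products are computed sequentially here even though each could run on its own core.

```python
from math import prod


def factor_smooth(j, primes):
    """Trial-divide j over the given primes; return {prime: exponent}."""
    exps = {}
    for p in primes:
        while j % p == 0:
            j //= p
            exps[p] = exps.get(p, 0) + 1
    if j != 1:
        raise ValueError("j is not smooth over the given primes")
    return exps


def eval_vdf_from_table(j, table, N, parts=4):
    """Compute j^(2^T) mod N from precomputed p^(2^T) mod N values."""
    exps = factor_smooth(j, sorted(table))
    # (p^e)^(2^T) = (p^(2^T))^e, so each factor needs only about log2(e) squarings.
    terms = [pow(table[p], e, N) for p, e in exps.items()]
    # Split the terms into `parts` groups (P1 ... P4 in the text).
    chunks = [terms[k::parts] for k in range(parts)]
    partials = [prod(chunk) % N for chunk in chunks]  # independent, parallelisable
    return prod(partials) % N


# Toy parameters (far too small for real use).
N, T = 61 * 53, 10
table = {p: pow(p, 2 ** T, N) for p in (2, 3, 5, 7, 11, 13)}
j = 2 ** 5 * 3 ** 2 * 13
print(eval_vdf_from_table(j, table, N) == pow(j, 2 ** T, N))
```

The final comparison checks the table-based result against a direct evaluation with Python's built-in three-argument `pow`.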
86 |
87 | ### Trivial Examples
88 |
89 | #### Suppose `j = (2^18236213)`, log(N) = 2048
90 |
91 | It happens that `log2(18236213)` rounds up to 25, thus `j` is 25-bit smooth. Furthermore, _for reasons_ all choices of `j` will be 26-bit smooth. Therefore we need to store about `(2^26)*2048/26` bits = 630 MB in our pre-compute table, or about 1 GB if we want to allow for some wiggle room for our choices of `c`.
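
The storage figure can be checked with a few lines of arithmetic. This sketch only re-evaluates the expression `(2^26)*2048/26` from the paragraph above; the variable names are illustrative.

```python
m, log_n = 26, 2048           # smoothness bound in bits, modulus size in bits
bits = (2 ** m) * log_n / m   # about 2^m / m primes, log_n bits each
mib = bits / 8 / 2 ** 20      # bits -> bytes -> MiB
mb = bits / 8 / 10 ** 6       # bits -> bytes -> decimal MB
print(round(mib), round(mb))  # 630 661
```

So the 630 figure corresponds to binary megabytes (MiB); in decimal megabytes the same table is roughly 661 MB.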
92 |
93 | Now we need to compute the VDF,
94 |
95 | `VDF = j^(2^T) = (2^18236213)^(2^T) (mod N)`
96 |
97 | `= (2^(2^T))^18236213 = (2^(2^T))^(e_2 - x) * (2^(2^T))^x (mod N)`
98 |
99 | where `e_2 = 18236213`, `c_2 = (2^(2^T))^(e_2 - x)` is the stored value, `p2_2T = 2^(2^T)`, and `x << T` is set by our choices of `c`,
100 |
101 | `= (c_2) * (p2_2T)^x (mod N)`
102 |
103 | We calculate `(p2_2T)^x`, which is very easy since we have precomputed `p2_2T` and `x << T`.
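
The final correction step can be sketched as follows, assuming the one-time step stored `u = c^(e - x) mod N` for `c = 2^(2^T) mod N`. The function name `finish_from_offset` and the toy parameters are hypothetical.

```python
def finish_from_offset(u, c, x, N):
    """Given u = c^(e - x) mod N and c, return c^e mod N.

    Only about log2(x) squarings are needed, independent of T.
    """
    return u * pow(c, x, N) % N


# Toy parameters (far too small for real use).
N, T = 61 * 53, 10
c = pow(2, 2 ** T, N)    # precomputed p2_2T
e, x = 18236213, 5       # exponent from the example, small offset x
u = pow(c, e - x, N)     # stored during the one-time step
print(finish_from_offset(u, c, x, N) == pow(c, e, N))
```

The cost of the online work depends on `x`, not on `T`, which is the source of the attacker's speed-up.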
104 |
105 | #### Suppose `j = (2^2134347) * (13^2138097)`
106 |
107 | Again, _for reasons_ all choices of `j` will be 26-bit smooth. Thus, we need about 1 GB to store our precompute table, as in the previous example.
108 |
109 | Now we need to compute the VDF,
110 |
111 | `VDF = j^(2^T) = ((2^2134347) * (13^2138097))^(2^T) (mod N)`
112 |
113 | `= (2^(2^T))^2134347 * (13^(2^T))^2138097 (mod N)`
114 |
115 | `= (2^(2^T))^(e_2 - x + x) * (13^(2^T))^(e_13 - x + x) (mod N)`
116 |
117 | `= (2^(2^T))^(e_2 - x) * (2^(2^T))^x * (13^(2^T))^(e_13 - x) * (13^(2^T))^x (mod N)`
118 |
119 | `= c_2 * (p2_2T)^x * c_13 * (p13_2T)^x (mod N)`
120 |
121 | where `e_2 = 2134347`, `e_13 = 2138097`, `c_2` and `c_13` are the stored values `(2^(2^T))^(e_2 - x)` and `(13^(2^T))^(e_13 - x)`, and `x << T` is set by our choices of `c`.
122 |
123 | We calculate `(p2_2T)^x` and `(p13_2T)^x`, which is very easy since we have precomputed `p2_2T`, `p13_2T`, and `x << T`.
124 |
125 | ### Mitigations
126 |
127 | Probably not needed at this point since this attack relies on unlikely events. However, it might be prudent to consider other VDF constructions if the attack can be generalized.
128 |
--------------------------------------------------------------------------------
/research-notes/zigzag-modern.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/research-notes/zigzag-modern.pdf
--------------------------------------------------------------------------------
/research-roadmap-diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/filecoin-project/research/3b36004e592ddc3a35c17efb9a4ec6f6e963ca80/research-roadmap-diagram.png
--------------------------------------------------------------------------------