└── README.md


/README.md:
--------------------------------------------------------------------------------
  1 | # Peer-to-Peer Frequently Asked Questions
  2 | 
  3 | **N.B.** This FAQ focuses on the very nebulous term "p2p system". There's not a
  4 | single answer that maps exactly to *all* peer-to-peer systems; this FAQ does its
  5 | best to provide a general answer when possible, and provide concrete examples
  6 | where it makes sense.
  7 | 
  8 | This FAQ also provides multiple answers per question, from various authors.
  9 | There is no single objective perspective, so more viewpoints are invited: file a
 10 | pull request!
 11 | 
 12 | ## 1. Sounds great, but will it scale?
 13 | 
 14 | *From [@staltz][staltz]:*
 15 | 
 16 | Yes. It is rare to find a p2p service that does not scale. They are distributed
 17 | systems by design, and most distributed systems are meant to scale. You could
 18 | say, then, that many distributed systems take cues from p2p systems in order to
 19 | scale properly. As a good example, Skype was built by the same engineers who
 20 | built Kazaa, and Skype internally used p2p distribution in order to alleviate
 21 | the load from any single node, and to save costs. BitTorrent also thrives in
 22 | situations where there is a high number of peers.
 23 | 
 24 | As in centralized systems, performance will suffer if the load is not
 25 | distributed. A torrent file with only one seed and thousands of leechers would
 26 | struggle to initially share to the first wave of peers. Unlike a centralized
 27 | system though, once that first wave of peers downloads a copy, the bandwidth for
 28 | that torrent data to be served grows exponentially.
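
To make that growth concrete, here is a back-of-the-envelope sketch in Python. It assumes an idealized swarm, which is not how BitTorrent actually schedules pieces: time is divided into rounds, and in each round every peer that already holds a full copy uploads one full copy to a peer that lacks it. The function names are made up for this illustration.

```python
def swarm_rounds(total_peers: int) -> int:
    """Rounds until all peers hold a copy, starting from one seed.

    Idealized assumption: each holder uploads one full copy per round,
    so the number of holders doubles every round.
    """
    holders, rounds = 1, 0
    while holders < total_peers:
        holders = min(total_peers, holders * 2)
        rounds += 1
    return rounds


def single_server_rounds(total_peers: int) -> int:
    """Rounds needed when only the original seed ever uploads."""
    return max(0, total_peers - 1)


if __name__ == "__main__":
    for n in (10, 1_000, 1_000_000):
        print(n, swarm_rounds(n), single_server_rounds(n))
```

Under these assumptions a swarm of 1,000 peers is fully served in 10 rounds, where a lone server would need 999.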
 29 | 
 30 | ## 2. If websites are hosted on p2p, what happens when no peers are online?
 31 | 
 32 | *From [@noffle][noffle]:*
 33 | 
 34 | The same result as when a centralized website goes down: it isn't available.
 35 | 
 36 | The difference is that peer-to-peer networks distribute *the power to host*. I
 37 | could run a peer serving my website on a server. Instantly I have the same
 38 | website availability as a traditional centralized website. The difference is
 39 | that there may be many peers in the swarm that are also hosting my website, so
 40 | if my server goes down, the site will continue to be accessible through those
 41 | seeding peers.
 42 | 
 43 | *From [@retrohacker][retrohacker]:*
 44 | 
 45 | Many p2p systems, e.g. BitTorrent, are optimized for sharing popular content.
 46 | The more popular a piece of content, the more available the content becomes. The
 47 | less popular content is, the less available the content becomes. Popularity in
 48 | this case is the number of peers actively consuming and sharing a piece of
 49 | content. The ability to access any piece of content on a p2p network is limited
 50 | by the availability of peers: no peers, no content.
 51 | 
 52 | If you share content on a p2p network that you have a vested interest in being
 53 | always-available, you must invest in maintaining your own highly available peers
 54 | that share this content. This is not dissimilar to a centralized network, in
 55 | that you must build highly available infrastructure to share your content.
 56 | However, unlike a centralized network, your infrastructure is no longer a
 57 | single point of failure since you have the benefit of a p2p network supporting
 58 | you.
 59 | 
 60 | In the p2p model, you are not solely responsible for your uptime or performance.
 61 | If your system falls over, consumers of your content can still fetch from
 62 | another peer. If there is a spike in popularity of your content, peers will
 63 | share content amongst each other, reducing the burden on your infrastructure. In
 64 | many cases this can provide a better overall experience for consumers of your
 65 | content.
 66 | 
 67 | ## 3. What about security? Somebody could share a hacked version of a p2p website?
 68 | 
 69 | *From [@noffle][noffle]:*
 70 | 
 71 | It depends on the security model of the system hosting the website. There
 72 | are two common tools I know of for ensuring that a copy of data you've
 73 | received from a potentially untrusted source is authentic:
 74 | 
 75 | 1. You used the [hash](https://wikipedia.org/Hash_function) of the data to
 76 |    request from the p2p network. If so, the data you receive from a peer can be
 77 |    hashed, and that hash compared against the one used to make the request for
 78 |    the data. [IPFS](https://ipfs.io) and [Secure
 79 |    Scuttlebutt](https://scuttlebutt.nz) do this. A caveat is that the data is
 80 |    static: the hash never changes and thus neither can the data. A benefit is
 81 |    that content-addressable data can be safely cached indefinitely.
 82 | 
 83 | 2. You used the [public key](https://wikipedia.org/Public_key_cryptography) of
 84 |    the author to request the data from the p2p network. The idea is that every
 85 |    version of the data is cryptographically signed by the data's author, so that
 86 |    any data you download will have a signature that can be checked against the
 87 |    public key used to request the data. This guarantees that the data came from
 88 |    the author you expected, and also permits changes to that data, unlike the
 89 |    content-addressed approach above. [Dat](https://dat-project.org),
 90 |    [IPFS](https://ipfs.io) and [SSB](https://scuttlebutt.nz) all use this
 91 |    approach for dynamic data.
 92 | 
 93 | *From [@matthiasbeyer][matthiasbeyer]:*
 94 | 
 95 | If cryptographic hashes or signatures come into play, this is not possible.
 96 | 
 97 | Consider a content-addressed system. In such systems, content is addressed via
 98 | a cryptographic hash which represents the content. For example, a file
 99 | containing "Hello World" gets a hash "648a6a6ffff".
100 | If a peer now tries to fetch content from the network, it does so by asking
101 | for the content of "648a6a6ffff".
102 | If it gets sent this content, it can then verify with that same hash,
103 | whether the content it got is the actual content it requested.
104 | 
105 | An attacker would be able to host malicious nodes in the network, but the node
106 | which _requests_ the content (your node) can verify that it got what it
107 | expected, and discard anything whose hash does not match.
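
The check described above can be sketched in a few lines of Python. This is an illustration only, not the code of IPFS or any other system mentioned here: it assumes SHA-256 as the hash, models each peer as a plain function from address to bytes, and the names `content_address` and `fetch_verified` are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional

# Hypothetical peer model: a function that returns bytes for an address, or None.
Peer = Callable[[str], Optional[bytes]]


def content_address(data: bytes) -> str:
    """Address content by the hex SHA-256 digest of its bytes."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(address: str, peers: Iterable[Peer]) -> bytes:
    """Ask peers in turn and return the first reply whose hash matches."""
    for peer in peers:
        data = peer(address)
        if data is not None and content_address(data) == address:
            return data
        # Missing or tampered reply: ignore it and try the next peer.
    raise LookupError("no peer returned content matching " + address)


if __name__ == "__main__":
    original = b"Hello World"
    address = content_address(original)

    def honest(addr: str) -> Optional[bytes]:
        return original if addr == address else None

    def malicious(addr: str) -> Optional[bytes]:
        return b"hacked version"

    print(fetch_verified(address, [malicious, honest]))
```

The malicious peer answers first, but its reply hashes to a different address and is skipped, so the requester still ends up with the original bytes.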
108 | 
109 | ## 4. What about privacy? Everybody in the p2p network can see what I am looking at.
110 | 
111 | *Help contribute an answer to this question!*
112 | 
113 | ## 5. P2P is great, but sometimes you need a single authoritative source of truth
114 | 
115 | *From [@matthiasbeyer][matthiasbeyer]:*
116 | 
117 | This is not true. Consider git: Each branch could be considered a source of
118 | truth (or rather "point of truth").
119 | Branches may depend on each other, branches may be merged. Branches may _not_
120 | depend on each other (git can have multiple "orphan" branches) and may _not_ be
121 | merged. Still, they are points of truth.
122 | With p2p systems in a decentralized environment, this is true as well.
123 | There might never be the _one_ version which is currently the point of truth,
124 | but as long as versions of the system can be merged, this is not a problem.
125 | 
126 | Events in such a system can even be ordered causally (which happened before which) via
127 | [vector clocks](https://en.wikipedia.org/wiki/Vector_clock)
128 | where each key is the unique peer hash.
129 | 
130 | There exists a technology which brings data types to the table which can exist
131 | in a p2p system without ever needing a single source of truth. These types are
132 | named [CRDT](https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type)s.
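
As a rough sketch of both ideas, the Python below models a vector clock as a dictionary from peer id to event count. The function names are invented for this example, and the peer ids are plain strings standing in for the peer hashes mentioned above. The same element-wise-max `merge` is also how a grow-only counter, one of the simplest CRDTs, combines replicas without any coordinator.

```python
from typing import Dict

Clock = Dict[str, int]  # peer id -> number of events seen from that peer


def tick(clock: Clock, peer: str) -> Clock:
    """Return a copy of the clock with one more local event for `peer`."""
    updated = dict(clock)
    updated[peer] = updated.get(peer, 0) + 1
    return updated


def merge(a: Clock, b: Clock) -> Clock:
    """Element-wise maximum: commutative, associative and idempotent."""
    return {p: max(a.get(p, 0), b.get(p, 0)) for p in a.keys() | b.keys()}


def happened_before(a: Clock, b: Clock) -> bool:
    """True if a is causally earlier than b (a <= b everywhere, < somewhere)."""
    peers = a.keys() | b.keys()
    no_greater = all(a.get(p, 0) <= b.get(p, 0) for p in peers)
    strictly_less = any(a.get(p, 0) < b.get(p, 0) for p in peers)
    return no_greater and strictly_less


def concurrent(a: Clock, b: Clock) -> bool:
    """True if each clock has seen something the other has not."""
    peers = a.keys() | b.keys()
    a_ahead = any(a.get(p, 0) > b.get(p, 0) for p in peers)
    b_ahead = any(b.get(p, 0) > a.get(p, 0) for p in peers)
    return a_ahead and b_ahead


def counter_value(clock: Clock) -> int:
    """Read the same structure as a grow-only counter: the sum of all counts."""
    return sum(clock.values())


if __name__ == "__main__":
    alice = tick({}, "alice")
    bob = tick({}, "bob")
    merged = merge(alice, bob)
    print(concurrent(alice, bob), happened_before(alice, merged), counter_value(merged))
```

Two peers that each record an event offline end up with concurrent clocks; once they exchange state, both compute the same merged value regardless of the order in which the merges happen.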
133 | 
134 | *From [@noffle][noffle]:*
135 | 
136 | If you are cryptographically signing the data you create (see #3), users can
137 | request your content by your public key. In this way you are able to control
138 | what data appears in this feed of data, but rely on potentially untrusted peers
139 | to distribute that data.
140 | 
141 | By adding a monotonically increasing sequence number to each new entry in the
142 | signed feed, peers can detect gaps left by suppressed or censored messages.
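
A minimal Python sketch of that gap check follows. It assumes that sequence numbers start at 1 and that each entry's signature has already been verified elsewhere; `missing_sequences` is a hypothetical helper, not part of any of the systems named in this FAQ.

```python
from typing import Iterable, List


def missing_sequences(received: Iterable[int]) -> List[int]:
    """Return the sequence numbers absent between 1 and the highest one seen.

    Assumption for this sketch: the feed numbers its entries 1, 2, 3, ...
    """
    seen = set(received)
    if not seen:
        return []
    return [n for n in range(1, max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # A peer forwarded entries 1, 2, 4 and 5 of a feed but withheld entry 3.
    print(missing_sequences([1, 2, 4, 5]))
```

A non-empty result tells the receiver to ask other peers for the missing entries. Note that this only reveals holes below the highest number received; a peer that withholds the newest entries cannot be caught this way.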
143 | 
144 | ## 6. What if p2p technology is used by "bad actors"?
145 | 
146 | *From [@staltz][staltz]:*
147 | 
148 | A very common concern with P2P technologies is that they aid crime, piracy,
149 | pedophilia, and other bad activities. The upside of not having an authority is
150 | also its unfortunate downside. That said, this aspect of information systems is
151 | overestimated when compared to other technologies like cars, weapons, hard
152 | drives, and kitchen cutlery. Terrorist attacks are often carried out with
153 | cars and common knives, yet it seems absurd to common sense that there would be
154 | global realtime surveillance of all cars and kitchen knives in order to prevent
155 | crimes. On the other hand, information systems by themselves cannot directly
156 | cause any physical harm. The absurdity of censoring cars and cutlery should
157 | extend also to information systems, or at least the discourse around security
158 | and crime prevention should get the priorities right and first address the root
159 | causes, the supporting incentives, the real weapons, and the tradeoffs involved.
160 | 
161 | Another topic to consider is the meaning of "bad", and how *only bad actions*
162 | could be prevented without preventing good ones. How could
163 | technology-for-freedom empower good actors *without* empowering bad actors?
164 | Conversely, how could technology-for-control enable those in power to arrest
165 | bad actors without enabling them to arrest good actors?
166 | 
167 | It's problematic to have this good/bad debate, because it's about morality.
168 | Morality is culturally bound; it's relative to the beliefs of a group. Morality
169 | in a global tech platform (the Internet) is toxic because it pushes one
170 | worldview and chokes pluralism. Our focus therefore should not be on the
171 | discussion around morality; it should be on freedom versus control, and how
172 | they affect a tech system deployed globally.
173 | 
174 | More about this:
175 | https://theintercept.com/2015/11/17/u-s-mass-surveillance-has-no-record-of-thwarting-large-terror-attacks-regardless-of-snowden-leaks/
176 | 
177 | ## 7. What areas do modern p2p apps still struggle with?
178 | 
179 | *From [@noffle][noffle]:*
180 | 
181 | Apps still seem to have a hard time managing resources, like CPU and network
182 | bandwidth. If an app naively tries to download and replicate ALL of the data it
183 | sees, it's easy for it to overwhelm the machine it's running on. Many apps still
184 | have a ways to go in offering good controls for CPU and bandwidth use.
185 | 
186 | 
187 | [noffle]: http://git.scuttlebot.io/@C3iYh/12sO1uvKq1KcZXLFxSySzxOkHxXN8rtNB5MGA=.ed25519
188 | [staltz]: https://github.com/staltz
189 | [retrohacker]: https://github.com/retrohacker
190 | [matthiasbeyer]: https://github.com/matthiasbeyer
191 | 
192 | 


--------------------------------------------------------------------------------