├── LICENSE
└── README.md

/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 Dylan Sprague

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Resources on designing and operating distributed systems.

## General Concepts
- https://blog.pragmaticengineer.com/operating-a-high-scale-distributed-system/
- https://blog.pragmaticengineer.com/distributed-architecture-concepts-i-have-learned-while-building-payments-systems/
- https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/acrobat-17.pdf (Hints for Computer System Design, Butler Lampson)
- https://engineering.salesforce.com/runway-intro-dc0d9578e248 - Runway design tool for distributed systems
- https://runway.systems/
- https://blog.acolyer.org/2016/10/31/designing-software-for-ease-of-extension-and-contraction/
- https://blog.acolyer.org/2016/09/12/on-designing-and-deploying-internet-scale-services/ (also see https://gist.github.com/acolyer/95ef23802803cb8b4eb5 as a checklist for large-scale services)
- https://blog.acolyer.org/2019/11/20/local-first-software/
- [System Design Primer](https://github.com/donnemartin/system-design-primer)
- [Amazon Builder's Library](https://aws.amazon.com/builders-library/)
- https://aws.amazon.com/builders-library/challenges-with-distributed-systems/


## Books
- Designing Data-Intensive Applications: https://dataintensive.net/


## Event Sourcing/CQRS (Command-Query Responsibility Segregation)/Event Logging
- https://www.ahri.net/2019/07/practical-event-driven-and-sourced-programs-in-haskell/
- https://engineering.salesforce.com/mirrormaker-performance-tuning-63afaed12c21 (Kafka replication performance tuning)
- https://github.com/message-db/message-db (Implementation of messaging/events in Postgres)
- https://arkwright.github.io/event-sourcing.html


## Databases
- https://brandur.org/cloud-databases (compares Aurora, Spanner, Citus, others)
- https://engineering.salesforce.com/the-architecture-files-ep-2-the-crystal-shard-a6025bd9f968 (Salesforce scaling by sharding)
- https://developer.salesforce.com/page/Multi_Tenant_Architecture (Details on Salesforce's multitenant architecture)
- https://engineering.salesforce.com/inside-and-out-transactions-b9535faa5924 (transactions in Salesforce)
- https://netflixtechblog.com/dblog-a-generic-change-data-capture-framework-69351fb9099b (propagating changes from a database to other data stores)


## Logging/Monitoring/Alerting
- https://stripe.com/blog/canonical-log-lines
- https://www.honeycomb.io/
  - Discussion: https://news.ycombinator.com/item?id=20569813
- https://opensource.googleblog.com/2019/08/opencensus-web-unlocking-full-end-to.html
- https://engineering.salesforce.com/implementation-of-a-monitoring-strategy-for-products-based-on-microservices-24ad24c4c3e5
- https://linuxczar.net/sysadmin/philosophy-on-alerting/
- https://www.transposit.com/blog/2019.11.14-what-makes-a-good-runbook/
- https://github.com/danielfm/prometheus-for-developers
- https://blog.acolyer.org/2020/02/26/meaningful-availability/


## Microservices
- https://philcalcado.com/2017/06/11/calcados_microservices_prerequisites.html
- Communication between microservices:
  - Using message queues for request/response - https://www.quora.com/Why-use-message-queues-for-a-request-response-pattern-which-is-synchronous-when-queues-are-asynchronous
- https://servicemesh.io/


## Idempotency
- https://stripe.com/blog/idempotency
- https://brandur.org/idempotency-keys
- https://brandur.org/http-transactions


## Background Job Processing
- https://brandur.org/job-drain


## Serverless/FaaS
- https://www.owasp.org/index.php/OWASP_Serverless_Top_10_Project (security)
- Lambda tuning - https://github.com/alexcasalboni/aws-lambda-power-tuning


## AWS Cost Analysis
- https://segment.com/blog/the-10m-engineering-problem/
- https://www.lastweekinaws.com/blog/an-aws-bill-analysis-changelogs-md/


## Simultaneous Work (operational transforms, conflict-free replicated data types, etc.)
- http://archagon.net/blog/2018/03/24/data-laced-with-history/
- https://blog.acolyer.org/2019/11/25/mergeable-replicated-data-types-part-i/
- https://blog.acolyer.org/2014/11/27/swiftcloud-fault-tolerant-geo-replication-integrated-all-the-way-to-the-client/
- https://www.tiny.cloud/blog/real-time-collaboration-ot-vs-crdt/
  - Also see discussion: https://news.ycombinator.com/item?id=22039950
- https://replicache.dev/ - Offline-first app development
- https://crdt.tech/ - CRDT reference


## Chaos Engineering
- https://www.gremlin.com/


## Formal Methods
- https://www.hillelwayne.com/post/business-case-formal-methods/

--------------------------------------------------------------------------------