├── README.md ├── airline-checkin.md ├── blogging-platform.md ├── counting-impressions.md ├── distributed-cache.md ├── faster-superfast-kv.md ├── file-sync.md ├── flash-sale.md ├── hashtag-service.md ├── image-service.md ├── live-commentary.md ├── load-balancer.md ├── near-me.md ├── newly-unread-indicator.md ├── onepic.md ├── online-offline-indicator.md ├── queue-consumers.md ├── realtime-claps.md ├── realtime-db.md ├── recent-searches.md ├── s3.md ├── scripts ├── dd.md ├── footer.md ├── footer.py ├── high-level-requirements.md ├── micro-requirements.md ├── new.sh ├── req.md ├── template.md └── toc.sh ├── sql-broker.md ├── sql-kv.md ├── superfast-kv.md ├── tagging-photos-with-people.md ├── task-scheduler.md ├── text-search-engine.md ├── user-affinity.md ├── video-pipeline.md └── word-dictionary.md /README.md: -------------------------------------------------------------------------------- 1 | System Design Questions 2 | === 3 | 4 | The repository contains a set of problem statements around Software Architecture and System Design as conducted by [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass). 
5 | 6 | 7 | # Questions 8 | 9 | - [Design a Blogging Platform](blogging-platform.md) 10 | - [Design Online Offline Indicator](online-offline-indicator.md) 11 | - [Design Airline Check-in](airline-checkin.md) 12 | - [Design SQL backed KV Store](sql-kv.md) 13 | - Design Slack's Realtime Communication - NEW 14 | - [Design a Load Balancer](load-balancer.md) 15 | - [Design Synchronized Queue Consumers](queue-consumers.md) 16 | - [Design an Image Service](image-service.md) 17 | - [Design a HashTag Service](hashtag-service.md) 18 | - [Design OnePic](onepic.md) 19 | - [Design Photo Tagging](tagging-photos-with-people.md) 20 | - [Design User Affinity](user-affinity.md) 21 | - [Design Newly Unread Message Indicator](newly-unread-indicator.md) 22 | - [Design a Distributed Cache](distributed-cache.md) 23 | - [Design a Word Dictionary](word-dictionary.md) 24 | - [Design a Superfast KV Store](superfast-kv.md) 25 | - [Design S3](s3.md) 26 | - [Design a Faster Superfast KV Store](faster-superfast-kv.md) 27 | - [Design a Video Processing Pipeline for Streaming Service](video-pipeline.md) 28 | - [Design a Text-based Search Engine](text-search-engine.md) 29 | - [Design a service that serves Recent Searches for a user](recent-searches.md) 30 | - [Design a Text-based Cricket Commentary Service](live-commentary.md) 31 | - [Design a SQL backed Message Broker](sql-broker.md) 32 | - [Design a Distributed Task Scheduler](task-scheduler.md) 33 | - [Design Flash Sale](flash-sale.md) 34 | - [Design Counting Impressions at Scale](counting-impressions.md) 35 | - [Designing a Remote File Sync Service](file-sync.md) 36 | - [Designing a "who's near me" Service](near-me.md) 37 | 38 | --- 39 | 40 | # Questions that I do not cover anymore 41 | 42 | - [Designing a Realtime DB](realtime-db.md) 43 | 44 | 45 | # Arpit's System Design Masterclass 46 | 47 | > A masterclass that helps you become great at designing _scalable_, _fault-tolerant_, and _highly available_ systems. 
48 | 49 | ## The Program 50 | 51 | This is a prime and intermediate-level cohort-based course aimed at providing an exclusive and crisp learning experience. The program will cover most of the topics under System Design and Software Architecture including but not limited to - _Architecting Social Networks_, _Building Storage Engines_ and, _Designing High Throughput Systems_. 52 | 53 | The program will have a blend of _Live Classes happening on Weekends 4 to 6:30 pm IST_, _1:1 Mentorship sessions happening on weekdays_, and _assignments_. The program is designed to be intense and crisp to accelerate learning. 54 | 55 | 56 | ## Highlights 57 | 58 | - The course has been taken up by __200+__ people, spanning __7__ countries. 59 | - The NPS of the course is __89__. 60 | - People from companies like Tesla, Amazon, Microsoft, Google, Yelp, Github, Flipkart, Practo, Grab, PayPal, and many more, have taken up this course. 61 | 62 | 63 | ## Hi, I'm Arpit Bhayani 👋 64 | 65 | 66 | 67 | In my last **~9** years of experience, I have worked at **D. E. Shaw**, **Practo**, **Amazon**, and **Unacademy**; and have built systems, services, and platforms that scaled to billions. 68 | 69 | Post my masters in CSE from **IIIT Hyderabad** I joined D. E. Shaw for a short stint of 2 months, before moving to Practo and working there as a **Platform Engineer**, building and owning close to 8 different microservices. Post Practo I worked at Amazon on their primary mission-critical E-Commerce Database and built **Data Pipelines** that cold tiered the stale data. 70 | 71 | After quitting Amazon in 2018, I joined Unacademy as their first **Technical Architect** and there I designed, built, managed, and scaled services like _Search_, _Notification_, _Logging_, _Deployment Engine_, and many more. I have now transitioned into the role of a Sr. Engineering Manager, leading the Site Reliability vertical. 
72 | 73 | In January 2020, I started my [newsletter](https://arpitbhayani.me/newsletter) where I write and share essays about programming language internals, deep dives on some super-clever algorithms, and a few tips on building scalable distributed systems. The newsletter currently has **2000+** subscribers. 74 | 75 | Recently, I have started building [Revine](https://revine.arpitbhayani.me) - a programming language for kids that helps them develop logic through **animations** and sparks their creativity through **artwork**. 76 | 77 |
78 | 79 | 80 | 81 |
82 | -------------------------------------------------------------------------------- /airline-checkin.md: -------------------------------------------------------------------------------- 1 | Design Airline Check-in System 2 | === 3 | 4 | 5 | * [Design Airline Check-in System](#design-airline-check-in-system) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | When you book your tickets with an airline you are required to complete the payment and confirm your reservation. Once the reservation is complete, you can either do an optional web check-in and confirm your seats, or do a physical check-in at the airport just before your departure. 24 | 25 | In this problem statement, let's design this web check-in system, where the passenger logs in to the system with the PNR, performs the seat selection, and then gets the boarding pass. If the passenger tries to book a seat that is already booked and assigned to another passenger, show an error message requesting the passenger to re-select the seat. 
26 | 27 | ![Relog Airline Check-in System](https://user-images.githubusercontent.com/4745789/138721841-3fc02879-7075-491a-9dcf-74011dba11e6.png) 28 | 29 | # Requirements 30 | 31 | 32 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 33 | 34 | 35 | ## Core Requirements 36 | 37 | - **one seat** can be assigned to only one passenger and once assigned the seat cannot be transferred 38 | - assume all **100 people** boarding the plane are trying to make a selection of their seat at the same time 39 | - the check-in should be as **fast** as possible 40 | - when one passenger is booking a seat it should **not** lead to other passengers waiting 41 | 42 | ## High Level Requirements 43 | 44 | - make your high-level components operate with **high availability** 45 | - ensure that the data in your system is **durable**, no matter what happens 46 | - define how your system would behave while **scaling-up** and **scaling-down** 47 | - make your system **cost-effective** and provide a justification for the same 48 | - describe how **capacity planning** helped you make a good design decision 49 | - think about how other services will interact with your service 50 | 51 | 52 | ## Micro Requirements 53 | 54 | - ensure the data in your system **never** goes into an inconsistent state 55 | - ensure your system is **free of deadlocks** (if applicable) 56 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how 57 | 58 | 59 | # Output 60 | 61 | ## Design Document 62 | 63 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 64 | 65 | Do **not** create unnecessary components just to make the design look complicated. 
A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 66 | 67 | 68 | ## Prototype 69 | 70 | To understand the nuances and internals of this system, build a prototype that 71 | 72 | - designs a database schema for the airline check-in system 73 | - builds a simple interface allowing the passenger to 74 | - view available seats 75 | - view unavailable seats 76 | - select a seat of their liking 77 | - upon successful booking, print their boarding pass 78 | - simulates multiple passengers trying to book the same seats and handles the concurrency 79 | 80 | ### Recommended Tech Stack 81 | 82 | This is a recommended tech-stack for building this prototype 83 | 84 | |Which|Options| 85 | |-----|-----| 86 | |Language|Golang, Java, C++| 87 | |Database|Relational Database - MySQL, PostgreSQL| 88 | 89 | ### Keep in mind 90 | 91 | These are the common pitfalls that you should keep in mind while you are building this prototype 92 | 93 | - have a primary key on your tables, otherwise the entire table might get locked 94 | 95 | # Outcome 96 | 97 | ## You'll learn 98 | 99 | - database locking 100 | - database schema design 101 | 102 | 103 | # Share and shoutout 104 | 105 | If you find this assignment helpful, please 106 | - share this assignment with your friends and peers 107 | - star this repository and help it reach a wider audience 108 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 109 | 110 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
111 | -------------------------------------------------------------------------------- /blogging-platform.md: -------------------------------------------------------------------------------- 1 | Design a Blogging Platform 2 | === 3 | 4 | 5 | * [Design a Blogging Platform](#design-a-blogging-platform) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a simple multi-user publishing/blogging platform, allowing writers to publish and manage the blogs under their personal publication and readers to read them. 
24 | 25 | # Requirements 26 | 27 | 28 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 29 | 30 | 31 | ## Core Requirements 32 | 33 | - writers should be able to **publish** a blog under their personal publication 34 | - readers should be able to **read** the blog 35 | - a user can be both - a reader as well as a writer 36 | - the author of the blog should be able to **delete** the blog 37 | - a blog may **contain images**, but will not contain any video 38 | - time to access the blog should be **as low as possible** 39 | - we have to render "**number of blogs**" written by every user on his/her profile 40 | - users should be able to **search** for a particular blog 41 | - the platform should be scaled for **5 million** daily active readers 42 | - the platform should be scaled for **10,000** daily active writers 43 | 44 | ## High Level Requirements 45 | 46 | - make your high-level components operate with **high availability** 47 | - ensure that the data in your system is **durable**, no matter what happens 48 | - define how your system would behave while **scaling-up** and **scaling-down** 49 | - make your system **cost-effective** and provide a justification for the same 50 | - describe how **capacity planning** helped you make a good design decision 51 | - think about how other services will interact with your service 52 | 53 | 54 | ## Micro Requirements 55 | 56 | - ensure the data in your system **never** goes into an inconsistent state 57 | - ensure your system is **free of deadlocks** (if applicable) 58 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how 59 | 60 | 61 | # Output 62 | 63 | ## Design Document 64 | 65 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. 
Also specify how your system behaves at scale, and what will eventually become a chokepoint. 66 | 67 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 68 | 69 | 70 | ## Prototype 71 | 72 | To understand the nuances and internals of this system, build a prototype that 73 | 74 | - has a relational database with a schema able to handle all the core requirements 75 | - has an interface for writers to 76 | - publish the blog 77 | - manage the blog 78 | - has an interface for readers to 79 | - browse all the publications and read the blogs 80 | - search a blog or a publication 81 | 82 | ### Recommended Tech Stack 83 | 84 | This is a recommended tech-stack for building this prototype 85 | 86 | |Which|Options| 87 | |-----|-----| 88 | |Language|Golang, Java, NodeJS| 89 | |Database|Relational Database - MySQL, PostgreSQL| 90 | 91 | ### Keep in mind 92 | 93 | These are the common pitfalls that you should keep in mind while you are building this prototype 94 | 95 | - data should not be redundant in your schema 96 | 97 | # Outcome 98 | 99 | ## You'll learn 100 | 101 | - database schema design 102 | - building web application - a simple CRUD 103 | 104 | 105 | # Share and shoutout 106 | 107 | If you find this assignment helpful, please 108 | - share this assignment with your friends and peers 109 | - star this repository and help it reach a wider audience 110 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 
111 | 112 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 113 | -------------------------------------------------------------------------------- /counting-impressions.md: -------------------------------------------------------------------------------- 1 | Design Counting Impressions at Scale 2 | === 3 | 4 | 5 | * [Design Counting Impressions at Scale](#design-counting-impressions-at-scale) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Outcome](#outcome) 14 | * [You'll learn](#youll-learn) 15 | * [Share and shoutout](#share-and-shoutout) 16 | 17 | 18 | # Problem Statement 19 | 20 | Whenever an ad is displayed to you on any website it is counted as an impression. In simple terms, anytime something is shown to you it is treated as an impression in the backend. We have to build a system that counts the impressions an ad gets at scale. 21 | 22 | Counting impressions is not limited to the ad business; it also finds application in social media (the number of views a post gets), video streaming (the number of views a video gets), and search engines (the number of times a page is shown in search results). 23 | 24 | We have to build a system that helps us answer this query efficiently 25 | 26 | > The number of unique visitors in last `n` units of time 27 | 28 | Here, `n` will be given as an input in every single query fired on the system. 
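
One possible way to think about the query above is to keep a small cardinality sketch per fine-grained time bucket and merge the buckets that fall inside the requested window. The following is a rough Python sketch of that idea using a simplified HyperLogLog. It is an illustration under stated assumptions, not the expected answer: the class names, the one-minute bucket size, the register count (`p=12`), and the use of SHA-1 as the hash are all choices made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog with 2**p one-byte registers."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = bytearray(self.m)

    def add(self, item):
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)                # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)   # remaining bits
        # rank = position of the leftmost 1-bit in the remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def merge(self, other):
        # The union of two sketches is the register-wise maximum.
        for i, r in enumerate(other.registers):
            if r > self.registers[i]:
                self.registers[i] = r

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        est = alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if est <= 2.5 * self.m and zeros:
            est = self.m * math.log(self.m / zeros)  # small-range correction
        return int(round(est))


class WindowedUniqueCounter:
    """Hypothetical wrapper: one sketch per time bucket, merged at query time."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self.buckets = {}  # bucket number -> HyperLogLog

    def record(self, visitor_id, ts):
        key = int(ts // self.bucket_seconds)
        self.buckets.setdefault(key, HyperLogLog()).add(visitor_id)

    def unique_visitors(self, now, last_seconds):
        merged = HyperLogLog()
        first = int((now - last_seconds) // self.bucket_seconds)
        last = int(now // self.bucket_seconds)
        for key in range(first, last + 1):
            if key in self.buckets:
                merged.merge(self.buckets[key])
        return merged.count()
```

Note the trade-offs this sketch makes: answers are approximate, and the window is rounded to whole buckets, so the bucket size bounds how precisely `n` can be honoured. Whether that is acceptable, and how old buckets are expired or persisted, is left to your design.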
29 | 30 | # Requirements 31 | 32 | 33 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 34 | 35 | 36 | ## Core Requirements 37 | 38 | - need realtime or near-realtime answers to the question - unique visitors in the time range 39 | - the values should not be aggregated (hourly, daily, weekly) 40 | - time range will be given on the fly 41 | - the count can be a close approximation 42 | - the complete computation should finish within a few seconds 43 | - the design should be storage efficient 44 | 45 | ## High Level Requirements 46 | 47 | - make your high-level components operate with **high availability** 48 | - ensure that the data in your system is **durable**, no matter what happens 49 | - define how your system would behave while **scaling-up** and **scaling-down** 50 | - make your system **cost-effective** and provide a justification for the same 51 | - describe how **capacity planning** helped you make a good design decision 52 | - think about how other services will interact with your service 53 | 54 | 55 | ## Micro Requirements 56 | 57 | - ensure the data in your system **never** goes into an inconsistent state 58 | - ensure your system is **free of deadlocks** (if applicable) 59 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how 60 | 61 | 62 | # Output 63 | 64 | ## Design Document 65 | 66 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 67 | 68 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 69 | 70 | 71 | # Outcome 72 | 73 | ## You'll learn 74 | 75 | - approximation algorithms for counting impressions 76 | - counting efficiently, optimizing time and space 77 | 78 | 79 | # Share and shoutout 80 | 81 | If you find this assignment helpful, please 82 | - share this assignment with your friends and peers 83 | - star this repository and help it reach a wider audience 84 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 85 | 86 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 87 | -------------------------------------------------------------------------------- /distributed-cache.md: -------------------------------------------------------------------------------- 1 | Design a Distributed Cache 2 | === 3 | 4 | 5 | * [Design a Distributed Cache](#design-a-distributed-cache) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a single node cache and then scale it out to be distributed. 
We keep this cache simple and hence it should support operations as simple as 24 | 25 | - `GET`: To get a key from the cache 26 | - `PUT`: To put a key in the cache 27 | - `DEL`: To delete a key from the cache 28 | - `TTL`: To set an expiry for a key 29 | 30 | While designing the cache it is very important to note that the cache should be highly available and scalable. Given that a cache is a high-throughput and highly concurrent system, scaling the cache up and down should not have a major impact on the data or the performance. 31 | 32 | ![Relog Design a Distributed Cache](https://user-images.githubusercontent.com/4745789/141650924-943da5ba-c3a0-4d86-b3f2-a300be7bea9d.png) 33 | 34 | # Requirements 35 | 36 | 37 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 38 | 39 | 40 | ## Core Requirements 41 | 42 | - cache should support `GET`, `PUT`, `DEL`, and `TTL` 43 | - the throughput of the cache should be optimal 44 | - the cache should be lock-free 45 | - measure the cache-hit and cache-miss ratio 46 | - cache should not pause for peripheral sub-systems like monitoring 47 | - every component should be fault tolerant 48 | - pluggable cache eviction and its implication on performance 49 | - data is too big to be stored in one node 50 | 51 | ## High Level Requirements 52 | 53 | - make your high-level components operate with **high availability** 54 | - ensure that the data in your system is **durable**, no matter what happens 55 | - define how your system would behave while **scaling-up** and **scaling-down** 56 | - make your system **cost-effective** and provide a justification for the same 57 | - describe how **capacity planning** helped you make a good design decision 58 | - think about how other services will interact with your service 59 | 60 | 61 | ## Micro Requirements 62 | 63 | - ensure the data in your system **never** goes into an inconsistent 
state 64 | - ensure your system is **free of deadlocks** (if applicable) 65 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how 66 | 67 | 68 | # Output 69 | 70 | ## Design Document 71 | 72 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 73 | 74 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 75 | 76 | 77 | ## Prototype 78 | 79 | To understand the nuances and internals of this system, build a prototype that 80 | 81 | - has a basic single node cache with support for lock-free concurrency 82 | - runs a load test and benchmarks the latencies 83 | 84 | ### Recommended Tech Stack 85 | 86 | This is a recommended tech-stack for building this prototype 87 | 88 | |Which|Options| 89 | |-----|-----| 90 | |Language|Golang, Java, C++| 91 | 92 | ### Keep in mind 93 | 94 | These are the common pitfalls that you should keep in mind while you are building this prototype 95 | 96 | - pessimistic locking will hamper the throughput 97 | 98 | # Outcome 99 | 100 | ## You'll learn 101 | 102 | - implementing in-memory cache 103 | - lock-free implementations 104 | - consistent hashing 105 | 106 | 107 | # Share and shoutout 108 | 109 | If you find this assignment helpful, please 110 | - share this assignment with your friends and peers 111 | - star this repository and help it reach a wider audience 112 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 
113 | 114 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 115 | -------------------------------------------------------------------------------- /faster-superfast-kv.md: -------------------------------------------------------------------------------- 1 | Design a faster Superfast KV Store 2 | === 3 | 4 | 5 | * [Design a faster Superfast KV Store](#design-a-faster-superfast-kv-store) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | We designed a [Superfast KV Store](https://github.com/relogX/system-design-questions/blob/master/superfast-kv.md), but can we go faster than this? Let's try to model something that is faster than this superfast DB. 24 | 25 | Design a single-node persistent KV Store that supports `GET`, `PUT` and `DEL` operations and utilizes hardware (disk, RAM) optimally. The response time for all the 3 operations should be as low as possible and the complexity of operations should be `O(1)`. It is okay for this KV store to not support an infinite number of keys given it is bound to a single node, but make sure you maximize the number of keys a single node can hold. 26 | 27 | > Note: It is okay if your storage engine cannot support a very large number of keys. 
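
As a starting point for thinking about `O(1)` operations on a persistent single-node store, here is a minimal Python sketch of one well-known layout: an append-only log on disk plus an in-memory hash index that maps each key to the location of its latest value. This is just one candidate design, not the intended solution. The class name `LogKV`, the record format, and the tombstone marker are assumptions made for this example.

```python
import os
import struct


class LogKV:
    """Append-only log on disk with an in-memory index: key -> (offset, length)."""

    HEADER = struct.Struct(">II")  # key length, value length (big-endian uint32)
    TOMBSTONE = 0xFFFFFFFF         # value-length marker meaning "key deleted"

    def __init__(self, path):
        self.index = {}
        self.f = open(path, "a+b")  # append mode: every write lands at the end
        self._rebuild_index()

    def _rebuild_index(self):
        # Replay the log once at startup; later records override earlier ones.
        self.f.seek(0)
        while True:
            header = self.f.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                break
            klen, vlen = self.HEADER.unpack(header)
            key = self.f.read(klen)
            if vlen == self.TOMBSTONE:
                self.index.pop(key, None)
            else:
                self.index[key] = (self.f.tell(), vlen)
                self.f.seek(vlen, os.SEEK_CUR)

    def _append(self, record):
        end = self.f.seek(0, os.SEEK_END)
        self.f.write(record)
        self.f.flush()
        os.fsync(self.f.fileno())  # durability; costs latency on every write
        return end

    def put(self, key, value):
        start = self._append(self.HEADER.pack(len(key), len(value)) + key + value)
        self.index[key] = (start + self.HEADER.size + len(key), len(value))

    def get(self, key):
        loc = self.index.get(key)
        if loc is None:
            return None
        offset, length = loc
        self.f.seek(offset)        # one seek + one read per lookup
        return self.f.read(length)

    def delete(self, key):
        if key in self.index:
            self._append(self.HEADER.pack(len(key), self.TOMBSTONE) + key)
            del self.index[key]

    def close(self):
        self.f.close()
```

The sketch keeps every key in RAM, which is why the number of keys is bounded by memory, and it never reclaims space from overwritten or deleted records. Compaction, crash recovery of a partially written record, and concurrent access are deliberately left out.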
28 | 29 | # Requirements 30 | 31 | 32 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 33 | 34 | 35 | ## Core Requirements 36 | 37 | - should be able to `GET`, `PUT`, `DEL` on a key 38 | - all operations should happen as fast as possible with a complexity of `O(1)` 39 | - this KV store is not distributed and will run on just a single node 40 | 41 | ## High Level Requirements 42 | 43 | - make your high-level components operate with **high availability** 44 | - ensure that the data in your system is **durable**, no matter what happens 45 | - define how your system would behave while **scaling-up** and **scaling-down** 46 | - make your system **cost-effective** and provide a justification for the same 47 | - describe how **capacity planning** helped you make a good design decision 48 | - think about how other services will interact with your service 49 | 50 | 51 | ## Micro Requirements 52 | 53 | - ensure the data in your system **never** goes into an inconsistent state 54 | - ensure your system is **free of deadlocks** (if applicable) 55 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how 56 | 57 | 58 | # Output 59 | 60 | ## Design Document 61 | 62 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 63 | 64 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
65 | 66 | 67 | ## Prototype 68 | 69 | To understand the nuances and internals of this system, build a prototype that 70 | 71 | - implement your design and measure the `GET`, `PUT`, `DEL` performance 72 | 73 | ### Recommended Tech Stack 74 | 75 | This is a recommended tech-stack for building this prototype 76 | 77 | |Which|Options| 78 | |-----|-----| 79 | |Language|Golang, Java, C++, Python| 80 | 81 | ### Keep in mind 82 | 83 | These are the common pitfalls that you should keep in mind while you are building this prototype 84 | 85 | - your storage engine will always be bound to single node 86 | - it is okay for your engine to not support very large number of keys 87 | 88 | # Outcome 89 | 90 | ## You'll learn 91 | 92 | - designing storage engine 93 | - utilizing every ounce of your hardware 94 | 95 | 96 | # Share and shoutout 97 | 98 | If you find this assignment helpful, please 99 | - share this assignment with your friends and peers 100 | - star this repository and help it reach a wider audience 101 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 102 | 103 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
104 | -------------------------------------------------------------------------------- /file-sync.md: -------------------------------------------------------------------------------- 1 | Design a Remote File Sync Service 2 | === 3 | 4 | 5 | * [Design a Remote File Sync Service](#design-a-remote-file-sync-service) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a file sync service in which a user can upload a file to the cloud and it gets synced across all of his/her devices. 24 | 25 | > The core of this system finds its application in messaging apps as well. 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - Sync all the files across all the devices of the user 36 | - A file can be at most 4GB in size 37 | - The uploads and downloads should be efficient and resumable 38 | - User's bandwidth is critical, so be efficient. 
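
To make the "efficient and resumable" requirements above more concrete, here is a small Python sketch of one common approach: split the file into fixed-size chunks, identify each chunk by its SHA-256 digest, and transfer only the chunks the remote side does not already have. It is a sketch under assumptions, not a prescribed design. `ChunkStore` is a hypothetical in-memory stand-in for the remote storage, the 4 MiB chunk size is arbitrary, and `fail_after` exists only to simulate a dropped connection.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # assumption: 4 MiB chunks


def chunk_manifest(data, chunk_size=CHUNK_SIZE):
    """Return the ordered list of SHA-256 digests, one per chunk of data."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


class ChunkStore:
    """Hypothetical remote chunk storage, keyed by content digest."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        return [d for d in digests if d not in self.chunks]

    def put(self, digest, payload):
        # Verify integrity before accepting the chunk.
        if hashlib.sha256(payload).hexdigest() != digest:
            raise ValueError("chunk digest mismatch")
        self.chunks[digest] = payload


def sync_upload(data, store, chunk_size=CHUNK_SIZE, fail_after=None):
    """Upload only the chunks the store lacks; return (manifest, chunks sent)."""
    manifest = chunk_manifest(data, chunk_size)
    needed = set(store.missing(manifest))
    sent = 0
    for index, digest in enumerate(manifest):
        if digest not in needed:
            continue  # already on the server: skip, saving bandwidth
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated network drop")
        store.put(digest, data[index * chunk_size:(index + 1) * chunk_size])
        needed.discard(digest)  # identical chunks are sent only once
        sent += 1
    return manifest, sent
```

Resuming after a failure is simply calling `sync_upload` again: chunks that already arrived are skipped. The same comparison also covers edits, since changing one region of a file changes only the digests of the chunks it touches. For brevity the sketch holds the whole file in memory as `bytes`; a real client handling 4GB files would stream chunks from disk.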
39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system **never** ends up in an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it?
64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - does resumable upload and download 71 | - efficiently identifies and shares changes with other devices 72 | 73 | ### Recommended Tech Stack 74 | 75 | This is a recommended tech-stack for building this prototype 76 | 77 | |Which|Options| 78 | |-----|-----| 79 | |Language|Golang, Java, C++| 80 | 81 | ### Keep in mind 82 | 83 | These are the common pitfalls that you should keep in mind while you are building this prototype 84 | 85 | - transferring a 4GB file in one call is prone to disruptions 86 | 87 | # Outcome 88 | 89 | ## You'll learn 90 | 91 | - designing resumable uploads and downloads 92 | - efficiently communicating changes and updates across clients 93 | 94 | 95 | # Share and shoutout 96 | 97 | If you find this assignment helpful, please 98 | - share this assignment with your friends and peers 99 | - star this repository and help it reach a wider audience 100 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 101 | 102 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
103 | -------------------------------------------------------------------------------- /flash-sale.md: -------------------------------------------------------------------------------- 1 | Design Flash Sale 2 | === 3 | 4 | 5 | * [Design Flash Sale](#design-flash-sale) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a flash sale system that supports the sale of a fixed inventory in a very short amount of time. 24 | 25 | The flow of the sale to be supported by the system is 26 | 27 | - flash sale announced for XPhone 28 | - the flash sale aims to sell 1000 XPhones [fixed inventory] 29 | - user can buy only one XPhone during the sale 30 | - user waits for the flash sale to start 31 | - flash sale starts 32 | - the first 1000 users are allowed to add XPhones to their cart 33 | - the user is given a time of 5 minutes to make the payment 34 | - if user completes the payment within 5 minutes, the XPhone is sold to that user 35 | - if the payment fails, the XPhone is allowed to be purchased by other users 36 | 37 | Note: Flash Sale and Ticket booking systems have the exact same flow; the only thing that varies is the expected throughput and scale.
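The flow above can be sketched as a small in-memory state machine: a unit is held when a user adds it to the cart, sold when payment completes, and returned to the pool when the payment fails or the 5-minute window lapses. This is only an illustration under assumptions, not the intended solution: the class name `FlashSale`, its method names, and the single-process lock are hypothetical, and the clock is passed in as `now` (seconds) so the behaviour is easy to test. A real system would keep this state in a database or cache.

```python
import threading


class FlashSale:
    """In-memory sketch of fixed-inventory holds with a payment window."""

    def __init__(self, inventory: int, hold_seconds: float = 300.0):
        self._lock = threading.Lock()
        self._available = inventory
        self._hold_seconds = hold_seconds
        self._holds: dict[str, float] = {}  # user_id -> time at which the hold expires
        self._sold: set[str] = set()        # users who completed payment

    def _release_expired(self, now: float) -> None:
        # Return units whose payment window has lapsed to the pool.
        expired = [u for u, deadline in self._holds.items() if deadline <= now]
        for user in expired:
            del self._holds[user]
            self._available += 1

    def add_to_cart(self, user_id: str, now: float) -> bool:
        with self._lock:
            self._release_expired(now)
            if user_id in self._holds or user_id in self._sold:
                return False  # only one unit per user
            if self._available == 0:
                return False  # every unit is held or sold
            self._available -= 1
            self._holds[user_id] = now + self._hold_seconds
            return True

    def pay(self, user_id: str, now: float) -> bool:
        with self._lock:
            self._release_expired(now)
            if user_id not in self._holds:
                return False  # never held a unit, or the hold expired
            del self._holds[user_id]
            self._sold.add(user_id)
            return True

    def payment_failed(self, user_id: str) -> None:
        with self._lock:
            if self._holds.pop(user_id, None) is not None:
                self._available += 1  # make the unit purchasable again
```

Note that the lock is held only for short in-memory updates; the payment call itself happens outside it, which mirrors the pitfall of not locking rows for too long.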
38 | 39 | # Requirements 40 | 41 | 42 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 43 | 44 | 45 | ## Core Requirements 46 | 47 | - given that the inventory is fixed, ensure that only `N` people can add items to the cart 48 | - if the payment is unsuccessful or was not made, make the item available for other users to purchase 49 | - throughput of the database should not be affected 50 | 51 | ## High Level Requirements 52 | 53 | - make your high-level components operate with **high availability** 54 | - ensure that the data in your system is **durable**, no matter what happens 55 | - define how your system would behave while **scaling-up** and **scaling-down** 56 | - make your system **cost-effective** and provide a justification for the same 57 | - describe how **capacity planning** helped you make a good design decision 58 | - think about how other services will interact with your service 59 | 60 | 61 | ## Micro Requirements 62 | 63 | - ensure the data in your system **never** ends up in an inconsistent state 64 | - ensure your system is **free of deadlocks** (if applicable) 65 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 66 | 67 | 68 | # Output 69 | 70 | ## Design Document 71 | 72 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 73 | 74 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it?
75 | 76 | 77 | ## Prototype 78 | 79 | To understand the nuances and internals of this system, build a prototype that 80 | 81 | - allows a fixed number of people to add an item to their cart 82 | - handles error cases with external payment flows by simulating network calls 83 | 84 | ### Recommended Tech Stack 85 | 86 | This is a recommended tech-stack for building this prototype 87 | 88 | |Which|Options| 89 | |-----|-----| 90 | |Language|Golang, Java, C++| 91 | |Database|MySQL| 92 | 93 | ### Keep in mind 94 | 95 | These are the common pitfalls that you should keep in mind while you are building this prototype 96 | 97 | - distributed transactions are costly 98 | - locking rows for too long will hamper performance of the database 99 | 100 | # Outcome 101 | 102 | ## You'll learn 103 | 104 | - database locking 105 | - breaking a daunting task into manageable sub-tasks and tackling them one by one 106 | - building resilient workflows 107 | 108 | 109 | # Share and shoutout 110 | 111 | If you find this assignment helpful, please 112 | - share this assignment with your friends and peers 113 | - star this repository and help it reach a wider audience 114 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 115 | 116 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
117 | -------------------------------------------------------------------------------- /hashtag-service.md: -------------------------------------------------------------------------------- 1 | Design the HashTag Service 2 | === 3 | 4 | 5 | * [Design the HashTag Service](#design-the-hashtag-service) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Say you own a social network in which people upload photos and with each upload people can provide a list of HashTags as part of the caption. Build a service that manages the hashtags and helps us render a HashTag page that shows 24 | 25 | - hashtag 26 | - total number of photos posted with that HashTag 27 | - top 50 photos with that HashTag 28 | 29 | This service has to handle 5 million photo uploads every hour and each photo has ~8 hashtags.
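To get a feel for the numbers above, the sketch below does the capacity arithmetic (uploads and hashtag events per second) and shows one way to extract hashtags and aggregate counts per batch rather than issuing one write per hashtag. It is a rough illustration, assuming hashtags are `#` followed by word characters and are case-insensitive; the names `extract_hashtags` and `batch_counts` are hypothetical.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")  # assumed hashtag syntax: '#' followed by word characters


def extract_hashtags(caption: str) -> set[str]:
    """Return the distinct, lower-cased hashtags found in a caption."""
    return {tag.lower() for tag in HASHTAG_RE.findall(caption)}


def batch_counts(captions: list[str]) -> Counter:
    """Aggregate per-hashtag photo counts for a batch of uploads."""
    counts: Counter = Counter()
    for caption in captions:
        counts.update(extract_hashtags(caption))  # each photo counts once per hashtag
    return counts


# Capacity arithmetic from the problem statement.
UPLOADS_PER_HOUR = 5_000_000
HASHTAGS_PER_PHOTO = 8
uploads_per_second = UPLOADS_PER_HOUR / 3600                          # about 1,389
hashtag_events_per_second = uploads_per_second * HASHTAGS_PER_PHOTO  # about 11,111
```

Aggregating a batch in memory and then applying one increment per hashtag is one way to keep the write volume well below the raw event rate.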
30 | 31 | ![Relog The HashTag Service](https://user-images.githubusercontent.com/4745789/139570503-5b213da5-3a74-4187-9843-c3f718abe0e4.png) 32 | 33 | # Requirements 34 | 35 | 36 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 37 | 38 | 39 | ## Core Requirements 40 | 41 | - **extract** and **manage** HashTags from all the uploaded photos 42 | - **5 million** photos uploaded every hour 43 | - efficiently drive the HashTag page that shows 44 | - the hashtag 45 | - the number of photos with that hashtag 46 | - top 50 photos for that hashtag 47 | 48 | ## High Level Requirements 49 | 50 | - make your high-level components operate with **high availability** 51 | - ensure that the data in your system is **durable**, no matter what happens 52 | - define how your system would behave while **scaling-up** and **scaling-down** 53 | - make your system **cost-effective** and provide a justification for the same 54 | - describe how **capacity planning** helped you make a good design decision 55 | - think about how other services will interact with your service 56 | 57 | 58 | ## Micro Requirements 59 | 60 | - ensure the data in your system **never** ends up in an inconsistent state 61 | - ensure your system is **free of deadlocks** (if applicable) 62 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 63 | 64 | 65 | # Output 66 | 67 | ## Design Document 68 | 69 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 70 | 71 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**.
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 72 | 73 | 74 | ## Prototype 75 | 76 | To understand the nuances and internals of this system, build a prototype that 77 | 78 | - accepts a photo upload 79 | - extracts the hashtags 80 | - stores them in a database 81 | - updates the count of photos for each hashtag 82 | 83 | ### Recommended Tech Stack 84 | 85 | This is a recommended tech-stack for building this prototype 86 | 87 | |Which|Options| 88 | |-----|-----| 89 | |Language|Golang, Java, C++| 90 | |Database|pick your favourite| 91 | 92 | ### Keep in mind 93 | 94 | These are the common pitfalls that you should keep in mind while you are building this prototype 95 | 96 | - the number of writes on the database would explode at scale 97 | - count++ is not atomic by default 98 | 99 | # Outcome 100 | 101 | ## You'll learn 102 | 103 | - managing counters at scale 104 | - designing loosely coupled architecture 105 | - designing an efficient service while keeping in mind User Experience 106 | 107 | 108 | # Share and shoutout 109 | 110 | If you find this assignment helpful, please 111 | - share this assignment with your friends and peers 112 | - star this repository and help it reach a wider audience 113 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 114 | 115 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
116 | -------------------------------------------------------------------------------- /image-service.md: -------------------------------------------------------------------------------- 1 | Design an Image Service 2 | === 3 | 4 | 5 | * [Design an Image Service](#design-an-image-service) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design an image service that takes care of uploading, serving and optimizing images at a scale of 5 million image uploads every hour. The image optimization will be specific to the device requesting it.
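As one possible reading of "optimization specific to the device", the service could pre-generate a few image widths and serve the smallest one that still covers the requesting device. The sketch below is only an assumption-laden illustration: the width list `VARIANT_WIDTHS`, the function name `pick_variant`, and the use of a device pixel ratio are hypothetical and not taken from the assignment.

```python
import bisect
import math

VARIANT_WIDTHS = [320, 640, 1080, 1920]  # assumed pre-generated widths in pixels, sorted ascending


def pick_variant(device_width_px: int, pixel_ratio: float = 1.0) -> int:
    """Return the smallest pre-generated width that covers the device, else the largest."""
    needed = math.ceil(device_width_px * pixel_ratio)
    index = bisect.bisect_left(VARIANT_WIDTHS, needed)
    return VARIANT_WIDTHS[min(index, len(VARIANT_WIDTHS) - 1)]
```

Serving a bounded set of variants keeps the responses cacheable, while still avoiding sending a 1920px image to a 320px screen.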
24 | 25 | ![Relog Image Service](https://user-images.githubusercontent.com/4745789/139569887-2247a841-f78d-4546-a331-ec4d891f453a.png) 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - upload **5 million images** every hour from various clients and devices 36 | - serve images **efficiently** to the rendering devices 37 | - provide **analytics** around how images are requested from the systems 38 | - bandwidth consumption should be **near-optimal** 39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system **never** ends up in an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**.
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - uploads an image and stores it in one of your local folders 71 | - generates a public URL for the image through which the image can be plugged into an `img` tag 72 | - records metrics every time an image is requested 73 | 74 | ### Recommended Tech Stack 75 | 76 | This is a recommended tech-stack for building this prototype 77 | 78 | |Which|Options| 79 | |-----|-----| 80 | |Language|Golang, Java, C++| 81 | 82 | ### Keep in mind 83 | 84 | These are the common pitfalls that you should keep in mind while you are building this prototype 85 | 86 | - serving images from your custom API server is very simple 87 | 88 | # Outcome 89 | 90 | ## You'll learn 91 | 92 | - serving static files 93 | - image uploading at scale 94 | - using a CDN to cache and handle load from different geographies 95 | 96 | 97 | # Share and shoutout 98 | 99 | If you find this assignment helpful, please 100 | - share this assignment with your friends and peers 101 | - star this repository and help it reach a wider audience 102 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 103 | 104 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
105 | -------------------------------------------------------------------------------- /live-commentary.md: -------------------------------------------------------------------------------- 1 | Design Text-based Live Commentary 2 | === 3 | 4 | 5 | * [Design Text-based Live Commentary](#design-text-based-live-commentary) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a service that serves text-based commentary to a live Cricket Match. The commentary has to be a ball-by-ball which is written by a professional commentator and the service has to serve the commentary through the website to anyone who wants to read it. 
24 | 25 | The key elements in this design would be 26 | 27 | - the flow of the data to give a great UX 28 | - making design decisions for database, cache 29 | - deciding on two workflows - one for reader and one for commentator 30 | 31 | # Requirements 32 | 33 | 34 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 35 | 36 | 37 | ## Core Requirements 38 | 39 | - commentary will be updated by a professional commentator 40 | - 5 million people would be reading the commentary at any given point in time 41 | - the time to serve the commentary should be as low as possible 42 | 43 | ## High Level Requirements 44 | 45 | - make your high-level components operate with **high availability** 46 | - ensure that the data in your system is **durable**, no matter what happens 47 | - define how your system would behave while **scaling-up** and **scaling-down** 48 | - make your system **cost-effective** and provide a justification for the same 49 | - describe how **capacity planning** helped you make a good design decision 50 | - think about how other services will interact with your service 51 | 52 | 53 | ## Micro Requirements 54 | 55 | - ensure the data in your system **never** ends up in an inconsistent state 56 | - ensure your system is **free of deadlocks** (if applicable) 57 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 58 | 59 | 60 | # Output 61 | 62 | ## Design Document 63 | 64 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 65 | 66 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**.
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 67 | 68 | 69 | ## Prototype 70 | 71 | To understand the nuances and internals of this system, build a prototype that 72 | 73 | - implements the entire commentary workflow locally 74 | 75 | ### Recommended Tech Stack 76 | 77 | This is a recommended tech-stack for building this prototype 78 | 79 | |Which|Options| 80 | |-----|-----| 81 | |Language|Golang, Java, C++| 82 | |Database|MySQL| 83 | |Cache|Redis| 84 | 85 | ### Keep in mind 86 | 87 | These are the common pitfalls that you should keep in mind while you are building this prototype 88 | 89 | - every resource you decide to use should be utilized very efficiently 90 | - do not over-engineer the solution 91 | 92 | # Outcome 93 | 94 | ## You'll learn 95 | 96 | - making good database design and decisions 97 | - getting optimal performance out of your architecture 98 | - how not to over-engineer 99 | 100 | 101 | # Share and shoutout 102 | 103 | If you find this assignment helpful, please 104 | - share this assignment with your friends and peers 105 | - star this repository and help it reach a wider audience 106 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 107 | 108 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
109 | -------------------------------------------------------------------------------- /load-balancer.md: -------------------------------------------------------------------------------- 1 | Design a Load Balancer 2 | === 3 | 4 | 5 | * [Design a Load Balancer](#design-a-load-balancer) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a load balancer that acts as a [Reverse Proxy](https://en.wikipedia.org/wiki/Reverse_proxy) and balances the load across multiple configured backend servers. 
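The balancing part of the problem statement can be illustrated independently of any networking. The sketch below shows round-robin selection over the healthy subset of configured backends, with servers added, removed, and marked unhealthy at runtime. It is a minimal sketch, not the expected design: the class name `RoundRobinPool` and its methods are hypothetical, round robin is just one possible strategy, and no TCP handling is shown.

```python
import threading
from typing import Optional


class RoundRobinPool:
    """Round-robin selection over healthy backends; safe to call from many threads."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._backends: list[str] = []   # configured backend addresses, in insertion order
        self._healthy: set[str] = set()  # subset currently passing health checks
        self._next = 0                   # index of the next backend to try

    def add(self, addr: str) -> None:
        with self._lock:
            if addr not in self._backends:
                self._backends.append(addr)
                self._healthy.add(addr)

    def remove(self, addr: str) -> None:
        with self._lock:
            if addr in self._backends:
                self._backends.remove(addr)
                self._healthy.discard(addr)

    def mark_health(self, addr: str, healthy: bool) -> None:
        with self._lock:
            if addr in self._backends:
                if healthy:
                    self._healthy.add(addr)
                else:
                    self._healthy.discard(addr)

    def pick(self) -> Optional[str]:
        """Return the next healthy backend, or None if there is none."""
        with self._lock:
            count = len(self._backends)
            for _ in range(count):
                addr = self._backends[self._next % count]
                self._next = (self._next + 1) % count
                if addr in self._healthy:
                    return addr
            return None
```

Swapping the body of `pick` for another policy (for example least connections) is one way to keep the strategy configurable without touching the rest of the pool.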
24 | 25 | ![Design Load Balancer](https://user-images.githubusercontent.com/4745789/138110826-1490cac9-5a02-43bd-bb14-74334742dd16.png) 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - ability to **accept incoming TCP** connections and forward them to one of the configured backend servers 36 | - ability to **add** and **remove** backend servers at will 37 | - ability to **monitor** healthy backend servers 38 | - ability to have a **configurable** load balancing strategy 39 | - ability to **measure** and monitor load balancer metrics 40 | - should scale to **millions** of concurrent TCP connections 41 | 42 | ## High Level Requirements 43 | 44 | - make your high-level components operate with **high availability** 45 | - ensure that the data in your system is **durable**, no matter what happens 46 | - define how your system would behave while **scaling-up** and **scaling-down** 47 | - make your system **cost-effective** and provide a justification for the same 48 | - describe how **capacity planning** helped you make a good design decision 49 | - think about how other services will interact with your service 50 | 51 | 52 | ## Micro Requirements 53 | 54 | - ensure the data in your system **never** ends up in an inconsistent state 55 | - ensure your system is **free of deadlocks** (if applicable) 56 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 57 | 58 | 59 | # Output 60 | 61 | ## Design Document 62 | 63 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint.
64 | 65 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 66 | 67 | 68 | ## Prototype 69 | 70 | To understand the nuances and internals of this system, build a prototype that 71 | 72 | - is a working load balancer 73 | - has an interface to 74 | - add and remove backend servers 75 | - see which of the configured backend servers are healthy 76 | - visualize load balancer metrics 77 | - change load balancing strategy on the fly 78 | - changes should not require a reboot to take effect 79 | 80 | ### Recommended Tech Stack 81 | 82 | This is a recommended tech-stack that will help you build your Load Balancer effectively 83 | 84 | |Which|Options| 85 | |-----|-----| 86 | |Language|Multi-threaded language like Golang, Java, C++| 87 | 88 | ### Keep in mind 89 | 90 | These are the common pitfalls that you should keep in mind while you are building this Load Balancer 91 | 92 | - System calls are blocking 93 | 94 | # Outcome 95 | 96 | ## You'll learn 97 | 98 | - System Calls 99 | - Internals of Load Balancer 100 | - Concurrent Execution using Multi-threading 101 | 102 | 103 | # Share and shoutout 104 | 105 | If you find this assignment helpful, please 106 | - share this assignment with your friends and peers 107 | - star this repository and help it reach a wider audience 108 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 109 | 110 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
111 | -------------------------------------------------------------------------------- /near-me.md: -------------------------------------------------------------------------------- 1 | Design Who's Near Me Service 2 | === 3 | 4 | 5 | * [Design Who's Near Me Service](#design-whos-near-me-service) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Given `k` kilometers as an input, find all the people who are within `k` kilometers from you, efficiently. 24 | 25 | > The core of this system can be used in locating stores near me, landmarks near me, nearby friends, electric vehicles near me, cars near me, etc.
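One common family of approaches to the `k` kilometers query is to bucket points into grid cells, scan only the cells overlapping a bounding box around the query point, and confirm each candidate with an exact distance check. The sketch below illustrates that idea under assumptions: the class `GridIndex`, the 0.1 degree cell size, and a spherical Earth of radius 6371 km are all hypothetical choices, the bounding box is an approximation meant for small `k` away from the poles, and wrap-around at the antimeridian and users moving between cells are not handled.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0                      # assumed spherical Earth
KM_PER_DEGREE = math.pi * EARTH_RADIUS_KM / 180


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class GridIndex:
    """Buckets users into fixed-size lat/lon cells so a query scans only nearby cells."""

    def __init__(self, cell_deg: float = 0.1):
        self.cell_deg = cell_deg
        self.cells = defaultdict(dict)  # (row, col) -> {user_id: (lat, lon)}

    def _cell(self, lat: float, lon: float) -> tuple[int, int]:
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def insert(self, user_id: str, lat: float, lon: float) -> None:
        self.cells[self._cell(lat, lon)][user_id] = (lat, lon)

    def nearby(self, lat: float, lon: float, k_km: float) -> list[str]:
        dlat = k_km / KM_PER_DEGREE
        # Widen the longitude span using the latitude farthest from the equator.
        widest_lat = min(abs(lat) + dlat, 89.0)
        dlon = k_km / (KM_PER_DEGREE * math.cos(math.radians(widest_lat)))
        row0, col0 = self._cell(lat - dlat, lon - dlon)
        row1, col1 = self._cell(lat + dlat, lon + dlon)
        found = []
        for row in range(row0, row1 + 1):
            for col in range(col0, col1 + 1):
                for user_id, (ulat, ulon) in self.cells.get((row, col), {}).items():
                    if haversine_km(lat, lon, ulat, ulon) <= k_km:
                        found.append(user_id)
        return sorted(found)
```

The cell size trades memory against work per query: smaller cells mean fewer false candidates but more cells to visit for a large `k`.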
26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - answer the nearby location query efficiently 36 | - the number of people using the app is 1 million 37 | 38 | ## High Level Requirements 39 | 40 | - make your high-level components operate with **high availability** 41 | - ensure that the data in your system is **durable**, no matter what happens 42 | - define how your system would behave while **scaling-up** and **scaling-down** 43 | - make your system **cost-effective** and provide a justification for the same 44 | - describe how **capacity planning** helped you make a good design decision 45 | - think about how other services will interact with your service 46 | 47 | 48 | ## Micro Requirements 49 | 50 | - ensure the data in your system **never** ends up in an inconsistent state 51 | - ensure your system is **free of deadlocks** (if applicable) 52 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 53 | 54 | 55 | # Output 56 | 57 | ## Design Document 58 | 59 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 60 | 61 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it?
62 | 63 | 64 | ## Prototype 65 | 66 | To understand the nuances and internals of this system, build a prototype that 67 | 68 | - implements, locally, the algorithm/approach that you come up with 69 | 70 | ### Recommended Tech Stack 71 | 72 | This is a recommended tech-stack for building this prototype 73 | 74 | |Which|Options| 75 | |-----|-----| 76 | |Language|Golang, Java, C++| 77 | 78 | ### Keep in mind 79 | 80 | These are the common pitfalls that you should keep in mind while you are building this prototype 81 | 82 | - a naive search in a 2D plane has to go through all the points, which is not efficient 83 | 84 | # Outcome 85 | 86 | ## You'll learn 87 | 88 | - playing with spatial data 89 | - algorithms that power geo location queries 90 | 91 | 92 | # Share and shoutout 93 | 94 | If you find this assignment helpful, please 95 | - share this assignment with your friends and peers 96 | - star this repository and help it reach a wider audience 97 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 98 | 99 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems.
100 | -------------------------------------------------------------------------------- /newly-unread-indicator.md: -------------------------------------------------------------------------------- 1 | Design Newly Unread Message Indicator 2 | === 3 | 4 | 5 | * [Design Newly Unread Message Indicator](#design-newly-unread-message-indicator) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | This service/feature will inform users about presence of new messages. The number in the indicator is the total number of unique users from which the user has received the message. The indicator becomes `0` as soon as user clicks the button - acknowledging that he/she knows about the message. 24 | 25 | The indicator does not tell how many unread messages are there in total, but rather it is simply indicating that some new messages have been received; and as soon as the user clicks the button he/she acknowledges the presence and thus indicator resets to 0. 26 | 27 | A user can have thousands of unread messages but the number `3` in the image below indicates that there are 3 messages from 3 different users that he/she newly received. 
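The counting rule described above (unique senders since the last acknowledgement, reset to `0` on click) can be sketched with a set per recipient. This is a minimal in-memory illustration, assuming hypothetical names (`NewlyUnreadIndicator`, `on_message`, `acknowledge`); it says nothing about where this state should live or how updates reach the client in real time.

```python
from collections import defaultdict


class NewlyUnreadIndicator:
    """Tracks, per recipient, the distinct senders seen since the last acknowledgement."""

    def __init__(self) -> None:
        self._senders = defaultdict(set)  # recipient -> senders since last acknowledgement

    def on_message(self, sender: str, recipient: str) -> int:
        """Record a new message and return the indicator value for the recipient."""
        self._senders[recipient].add(sender)
        return len(self._senders[recipient])

    def count(self, recipient: str) -> int:
        return len(self._senders.get(recipient, ()))

    def acknowledge(self, recipient: str) -> None:
        """The user clicked the button: reset the indicator to 0."""
        self._senders.pop(recipient, None)
```

Using a set rather than a plain counter is what keeps repeated messages from the same sender from inflating the number; the total count of unread messages is a separate concern.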
28 | 29 | ![Relog New Message Indicator](https://user-images.githubusercontent.com/4745789/139584929-5e00fd58-c731-4f91-aaa8-7383acd99ff4.png) 30 | 31 | # Requirements 32 | 33 | 34 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 35 | 36 | 37 | ## Core Requirements 38 | 39 | - show the **indicator** informing the user about newly unread messages 40 | - the social network has **1 million** daily active users 41 | - the response time of the service should be as **low** as possible 42 | - the update to the indicator should happen in **real-time** 43 | 44 | ## High Level Requirements 45 | 46 | - make your high-level components operate with **high availability** 47 | - ensure that the data in your system is **durable**, no matter what happens 48 | - define how your system would behave while **scaling-up** and **scaling-down** 49 | - make your system **cost-effective** and provide a justification for the same 50 | - describe how **capacity planning** helped you make a good design decision 51 | - think about how other services will interact with your service 52 | 53 | 54 | ## Micro Requirements 55 | 56 | - ensure the data in your system **never** goes into an inconsistent state 57 | - ensure your system is **free of deadlocks** (if applicable) 58 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 59 | 60 | 61 | # Output 62 | 63 | ## Design Document 64 | 65 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 66 | 67 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 68 | 69 | 70 | ## Prototype 71 | 72 | To understand the nuances and internals of this system, build a prototype that 73 | 74 | - simulates the newly unread message indicator on a local machine 75 | - updates the counter in real-time 76 | 77 | ### Recommended Tech Stack 78 | 79 | This is a recommended tech-stack for building this prototype 80 | 81 | |Which|Options| 82 | |-----|-----| 83 | |Language|Golang, Java, NodeJS, C++| 84 | |Framework|SocketIO| 85 | |Database|Pick your favourite| 86 | 87 | ### Keep in mind 88 | 89 | These are the common pitfalls that you should keep in mind while you are building this prototype 90 | 91 | - newly unread messages are different from unread messages 92 | - keep the data in faster storage to make the system work at scale 93 | 94 | # Outcome 95 | 96 | ## You'll learn 97 | 98 | - database schema design 99 | - designing a service focused on a low-latency user experience 100 | 101 | 102 | # Share and shoutout 103 | 104 | If you find this assignment helpful, please 105 | - share this assignment with your friends and peers 106 | - star this repository and help it reach a wider audience 107 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 108 | 109 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
110 | -------------------------------------------------------------------------------- /onepic.md: -------------------------------------------------------------------------------- 1 | Design OnePic 2 | === 3 | 4 | 5 | * [Design OnePic](#design-onepic) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | OnePic is a product that makes it easy to use one profile picture everywhere. The product lets you have one unique URL that, when any website uses it under an `img` tag, renders the image on the page. The product also lets you upload multiple pictures and set one of them as active. 24 | 25 | ``` 26 | 27 | ``` 28 | 29 | The OnePic URL for user `arpit` will be `https://onepic.relog.in/arpit` which when put under `img` tag renders the one active profile picture of that user. This URL will be used by all other social media to render `arpit`'s profile picture. This way when `arpit` marks another of his profile pictures as active, the change takes effect on all the websites automatically, without needing him to go and update the picture on each site individually. 
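
The indirection described above (a stable URL that resolves to whichever picture is currently active) can be sketched as a small in-memory model. Everything here is an assumption made for illustration: the `OnePic` class, its method names, and the rule that the first upload becomes active by default are not part of the assignment.

```python
class OnePic:
    """In-memory sketch of the 'one URL, one active picture' indirection."""

    def __init__(self):
        self._pictures = {}  # username -> {picture_id: image bytes}
        self._active = {}    # username -> picture_id currently marked active

    def upload(self, username, picture_id, data):
        self._pictures.setdefault(username, {})[picture_id] = data
        # Assumption: the first uploaded picture becomes active by default.
        self._active.setdefault(username, picture_id)

    def set_active(self, username, picture_id):
        if picture_id not in self._pictures.get(username, {}):
            raise KeyError(f"{username} has no picture {picture_id!r}")
        self._active[username] = picture_id

    def resolve(self, username):
        """What a GET on the user's OnePic URL would serve."""
        return self._pictures[username][self._active[username]]
```

Because websites only ever hold the per-user URL, switching the active picture is a single write on the OnePic side and needs no change on any consuming site.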
30 | 31 | ![Relog OnePic](https://user-images.githubusercontent.com/4745789/139574973-6bd4202d-4256-44a1-bbbd-271f9c3b745b.png) 32 | 33 | # Requirements 34 | 35 | 36 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 37 | 38 | 39 | ## Core Requirements 40 | 41 | - user to manage **multiple** profile pictures and mark one as **active** 42 | - one unique URL used by all the websites to **render** the user's profile picture 43 | - **50000** photo uploads per minute 44 | - read to write ratio **100000:1** 45 | 46 | ## High Level Requirements 47 | 48 | - make your high-level components operate with **high availability** 49 | - ensure that the data in your system is **durable**, no matter what happens 50 | - define how your system would behave while **scaling-up** and **scaling-down** 51 | - make your system **cost-effective** and provide a justification for the same 52 | - describe how **capacity planning** helped you make a good design decision 53 | - think about how other services will interact with your service 54 | 55 | 56 | ## Micro Requirements 57 | 58 | - ensure the data in your system **never** goes into an inconsistent state 59 | - ensure your system is **free of deadlocks** (if applicable) 60 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 61 | 62 | 63 | # Output 64 | 65 | ## Design Document 66 | 67 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 68 | 69 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 70 | 71 | 72 | ## Prototype 73 | 74 | To understand the nuances and internals of this system, build a prototype that 75 | 76 | - builds a simple static site server that serves user uploaded images 77 | - offers CRUD to manage profile pictures and mark one active 78 | - renders the active profile picture whenever the image is requested through the URL 79 | 80 | ### Recommended Tech Stack 81 | 82 | This is a recommended tech-stack for building this prototype 83 | 84 | |Which|Options| 85 | |-----|-----| 86 | |Language|Golang, Java, NodeJS, C++| 87 | |Database|Pick your favourite| 88 | 89 | ### Keep in mind 90 | 91 | These are the common pitfalls that you should keep in mind while you are building this prototype 92 | 93 | - serving images through an API server is simpler than it looks 94 | 95 | # Outcome 96 | 97 | ## You'll learn 98 | 99 | - static file server 100 | - read heavy systems serving static content 101 | - database schema design 102 | 103 | 104 | # Share and shoutout 105 | 106 | If you find this assignment helpful, please 107 | - share this assignment with your friends and peers 108 | - star this repository and help it reach a wider audience 109 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 110 | 111 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
112 | -------------------------------------------------------------------------------- /online-offline-indicator.md: -------------------------------------------------------------------------------- 1 | Design an Online Offline Indicator 2 | === 3 | 4 | 5 | * [Design an Online Offline Indicator](#design-an-online-offline-indicator) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Imagine you are building a chat application in which a user can chat with any other user, provided they are both connected. For a user to initiate the chat, it is always helpful if we show who all are online. 24 | 25 | In this assignment, let's design a system that indicates who all are online at the moment. The micro-problem statement is as simple as answering the question - Given a user, return if he/she is online or not. 
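
One common way to answer "is this user online?" is to have clients send periodic heartbeats and treat a user as online while the latest heartbeat is recent enough. The sketch below assumes that approach; the `PresenceTracker` name, the 30-second timeout, and the injectable clock are illustrative choices, not something the assignment prescribes.

```python
import time


class PresenceTracker:
    """Heartbeat-based presence sketch; timeout and names are assumptions."""

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_seen = {}  # user_id -> time of the most recent heartbeat

    def heartbeat(self, user_id):
        """Called whenever the client signals that it is still connected."""
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id):
        last = self._last_seen.get(user_id)
        return last is not None and self._clock() - last <= self._timeout

    def seconds_since_seen(self, user_id):
        """Basis for a 'was online X mins ago' label; None if never seen."""
        last = self._last_seen.get(user_id)
        return None if last is None else self._clock() - last
```

A generous timeout makes the system slow to mark users offline but tolerant of a missed heartbeat or two, which is the usual trade-off with this approach.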
26 | 27 | ![Designing Online Offline Indicator](https://user-images.githubusercontent.com/4745789/138017480-1f7c30ce-50f2-4a50-99b5-1cf7f0778caa.png) 28 | 29 | # Requirements 30 | 31 | 32 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 33 | 34 | 35 | ## Core Requirements 36 | 37 | - should update online status of a user within **10 seconds** of user coming online 38 | - can be **lenient** in marking a user offline 39 | - should scale for **5 million** active users at any given moment 40 | - a user should see accurate status of any other user, **eventually** 41 | 42 | 43 | ## High Level Requirements 44 | 45 | - make your high-level components operate with **high availability** 46 | - ensure that the data in your system is **durable**, no matter what happens 47 | - define how your system would behave while **scaling-up** and **scaling-down** 48 | - make your system **cost-effective** and provide a justification for the same 49 | - describe how **capacity planning** helped you make a good design decision 50 | - think about how other services will interact with your service 51 | 52 | 53 | ## Micro Requirements 54 | 55 | - ensure the data in your system **never** goes into an inconsistent state 56 | - ensure your system is **free of deadlocks** (if applicable) 57 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 58 | 59 | 60 | # Output 61 | 62 | ## Design Document 63 | 64 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 65 | 66 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 67 | 68 | 69 | ## Prototype 70 | 71 | To understand the nuances and internals of this system, build a prototype that 72 | 73 | - shows a list of users 74 | - shows if a particular user is online or offline 75 | - the user list is grouped such that the online users are shown first and then the offline ones 76 | - if the user is offline, also show "was online X mins ago" 77 | 78 | ### Recommended Tech Stack 79 | 80 | This is a recommended tech-stack for building this prototype 81 | 82 | |Which|Options| 83 | |-----|-----| 84 | |Language|NodeJS, or any language that supports SocketIO| 85 | |Database|MySQL, MongoDB, or any one that you like| 86 | |Library|SocketIO, for realtime status update| 87 | 88 | ### Keep in mind 89 | 90 | These are the common pitfalls that you should keep in mind while you are building this prototype 91 | 92 | - IO calls in NodeJS are asynchronous 93 | - if you are broadcasting the status update, do not do it to "all" the users 94 | 95 | 96 | # Outcome 97 | 98 | ## You'll learn 99 | 100 | - identify when a user is "online" 101 | - finding when a user goes "offline" 102 | - database schema design and deciding what data to store to make this efficient 103 | - building realtime user experiences using SocketIO 104 | 105 | 106 | # Share and shoutout 107 | 108 | If you find this assignment helpful, please 109 | - share this assignment with your friends and peers 110 | - star this repository and help it reach a wider audience 111 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 
112 | 113 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 114 | -------------------------------------------------------------------------------- /queue-consumers.md: -------------------------------------------------------------------------------- 1 | Design Synchronized Queue Consumers 2 | === 3 | 4 | 5 | * [Design Synchronized Queue Consumers](#design-synchronized-queue-consumers) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Say there exists a very blunt remote queue: whenever a consumer makes a network call to fetch the head of the queue, the queue removes the element and returns it. The queue does not give any protection for concurrent consumers, which means it is possible for two consumers to fetch the front of the queue and for the queue to return them the same element. Given this scenario, synchronize the consumers such that a message is consumed by only one consumer. 
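
Since the consumers are separate processes talking to a remote queue, an in-process mutex cannot coordinate them; one option is a shared lock service with an atomic "acquire if free" operation. The sketch below simulates that idea locally with threads standing in for consumers. `BluntQueue`, `LockService`, `consume_all`, and `run` are hypothetical names, and the `threading.Lock` inside `LockService` only models the atomicity a real lock server would provide. There is no lease expiry here, so a consumer that crashed while holding the lock would block the rest; a real design would need to handle that.

```python
import threading
import time


class BluntQueue:
    """Thread-unsafe stand-in for the remote queue: read the head, then remove it."""

    def __init__(self, items):
        self._items = list(items)

    def fetch_head(self):
        if not self._items:
            return None
        head = self._items[0]
        time.sleep(0.001)  # widen the race window between reading and removing
        self._items.pop(0)
        return head


class LockService:
    """Stand-in for a remote lock service offering an atomic acquire-if-free."""

    def __init__(self):
        self._guard = threading.Lock()  # models atomicity inside the lock server
        self._owner = None

    def try_acquire(self, consumer_id):
        with self._guard:
            if self._owner is None:
                self._owner = consumer_id
                return True
            return False

    def release(self, consumer_id):
        with self._guard:
            if self._owner == consumer_id:  # only the holder may release
                self._owner = None


def consume_all(queue, locks, consumer_id, out):
    """Fetch items one at a time, holding the lock around every fetch."""
    while True:
        while not locks.try_acquire(consumer_id):
            time.sleep(0.0005)  # wait while another consumer finishes
        try:
            item = queue.fetch_head()
        finally:
            locks.release(consumer_id)
        if item is None:
            return
        out.append(item)


def run(n_items=50, n_consumers=4):
    queue, locks, out = BluntQueue(range(n_items)), LockService(), []
    threads = [
        threading.Thread(target=consume_all, args=(queue, locks, f"c{i}", out))
        for i in range(n_consumers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Calling `run()` returns the consumed items; with the lock held around each fetch, every item should appear exactly once regardless of which consumer picked it up.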
24 | 25 | ![Relog System Design - Synchronizing consumers](https://user-images.githubusercontent.com/4745789/139275645-132ecb0a-0e39-476f-b95d-007dbac76a4e.png) 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - only **one** consumer consumes an element from the queue at any moment 36 | - if two consumers try to fetch the head of the queue, one has to **wait** while the other finishes 37 | - queue is a black box and **cannot** be altered 38 | - throughput of the system does **not** matter 39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system **never** goes into an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - creates a mock thread-unsafe implementation of a queue 71 | - simulates concurrent consumers and synchronizes them 72 | 73 | ### Recommended Tech Stack 74 | 75 | This is a recommended tech-stack for building this prototype 76 | 77 | |Which|Options| 78 | |-----|-----| 79 | |Language|Golang, Java, C++| 80 | |Queue|a simple array could be a queue| 81 | 82 | ### Keep in mind 83 | 84 | These are the common pitfalls that you should keep in mind while you are building this prototype 85 | 86 | - mutexes won't work 87 | 88 | # Outcome 89 | 90 | ## You'll learn 91 | 92 | - Remote locking and synchronization 93 | 94 | 95 | # Share and shoutout 96 | 97 | If you find this assignment helpful, please 98 | - share this assignment with your friends and peers 99 | - star this repository and help it reach a wider audience 100 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 101 | 102 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
103 | -------------------------------------------------------------------------------- /realtime-claps.md: -------------------------------------------------------------------------------- 1 | Design Realtime Claps 2 | === 3 | 4 | 5 | * [Design Realtime Claps](#design-realtime-claps) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Imagine you have a publishing platform where writers write articles and readers read them. To appreciate the quality content, the readers _clap_ for the article by clicking the clap button present next to the article. 24 | 25 | ![Designing Realtime Claps](https://user-images.githubusercontent.com/4745789/137951051-3d18a202-e719-4e9c-a430-d8da6ddebaec.png) 26 | The clap button looks as shown above and the number `156` beneath it is the total number of claps the article has received to date. When any user (reader or writer) opens the page, the total clap count is fetched from the database and rendered on the page. 27 | 28 | Design a realtime gratification system that updates the _clap_ count as soon as someone claps for the article; which means all the users who are reading the same article at the same time should, in realtime, see that someone else clapped for the article. 
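
The core of the statement above is a fan-out scoped to one article: a clap changes one counter and only the readers of that article are notified. The sketch below models this in memory with plain callbacks standing in for socket connections; `ClapHub` and its method names are hypothetical, and persistence is left out entirely.

```python
from collections import defaultdict


class ClapHub:
    """In-memory sketch of per-article clap fan-out; names are illustrative."""

    def __init__(self):
        self._counts = defaultdict(int)    # article_id -> total claps
        self._readers = defaultdict(dict)  # article_id -> {reader_id: callback}

    def open_article(self, article_id, reader_id, on_update):
        """Register a reader and return the current count for the first render."""
        self._readers[article_id][reader_id] = on_update
        return self._counts[article_id]

    def close_article(self, article_id, reader_id):
        self._readers[article_id].pop(reader_id, None)

    def clap(self, article_id):
        self._counts[article_id] += 1
        count = self._counts[article_id]
        # Fan out only to readers of this article, not to every connected user.
        for on_update in list(self._readers[article_id].values()):
            on_update(count)
        return count
```

Keying subscribers by article keeps the cost of one clap proportional to that article's audience rather than to the whole platform.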
29 | 30 | # Requirements 31 | 32 | 33 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 34 | 35 | 36 | ## Core Requirements 37 | 38 | - multiple readers reading the same article 39 | - when one reader _claps_ an article, other readers get a realtime update 40 | - the clap count on an article to update in realtime 41 | - **100,000** concurrent users on the platform 42 | - **10,000** concurrent users reading the same article (at peak) 43 | 44 | ## High Level Requirements 45 | 46 | - make your high-level components operate with **high availability** 47 | - ensure that the data in your system is **durable**, no matter what happens 48 | - define how your system would behave while **scaling-up** and **scaling-down** 49 | - make your system **cost-effective** and provide a justification for the same 50 | - describe how **capacity planning** helped you make a good design decision 51 | - think about how other services will interact with your service 52 | 53 | 54 | ## Micro Requirements 55 | 56 | - ensure the data in your system **never** goes into an inconsistent state 57 | - ensure your system is **free of deadlocks** (if applicable) 58 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 59 | 60 | 61 | # Output 62 | 63 | ## Design Document 64 | 65 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 66 | 67 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
68 | 69 | 70 | ## Prototype 71 | 72 | To understand the nuances and internals of this system, build a prototype that 73 | 74 | - build an interface allowing multiple readers read an article 75 | - place a button and a clap counter next to the article's body 76 | - when one user clicks the clap button, the count updates in the database 77 | - the event is then sent to all the readers reading the same article 78 | 79 | ### Recommended Tech Stack 80 | 81 | This is a recommended tech-stack for building this prototype 82 | 83 | |Which|Options| 84 | |-----|-----| 85 | |Language|NodeJS, any other that supports socket IO| 86 | |Database|MongoDB, MySQL, any of your liking| 87 | |Library|SocketIO| 88 | 89 | ### Keep in mind 90 | 91 | These are the common pitfalls that you should keep in mind while you are building this prototype 92 | 93 | - IO calls in NodeJS are asynchronous 94 | - if you are broadcasting the status update, do not do it to "all" the users 95 | 96 | # Outcome 97 | 98 | ## You'll learn 99 | 100 | - realtime communication through SocketIO 101 | - database schema design and deciding what data to store to make this efficient 102 | - optimizing by batching 103 | 104 | 105 | # Share and shoutout 106 | 107 | If you find this assignment helpful, please 108 | - share this assignment with your friends and peers 109 | - star this repository and help it reach a wider audience 110 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 111 | 112 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
113 | -------------------------------------------------------------------------------- /realtime-db.md: -------------------------------------------------------------------------------- 1 | Design a Realtime Database 2 | === 3 | 4 | 5 | * [Design a Realtime Database](#design-a-realtime-database) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a modern realtime KV database that sends realtime updates to all the users connected to it. Users subscribe to tables and anytime a KV is added, updated, or deleted the change is broadcasted to all the users, updating their view/application in realtime. 
24 | 25 | ![Relog - Realtime Database](https://user-images.githubusercontent.com/4745789/139183521-43e7a5c8-a629-4f85-9584-21dd873d0ade.png) 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - updates made by one user in one table are broadcast to all users subscribed to it 36 | - one table has at max **100** subscribers 37 | - there are **1 million** such tables 38 | - one user can subscribe to **1** table at one time 39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system **never** goes into an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - an interface to subscribe to one table and perform CRUD operations 71 | - use the KV store that you built as part of the [Design SQL backed KV Store](sql-kv.md) exercise 72 | - simulate multiple users subscribing to tables through different browser sessions 73 | 74 | ### Recommended Tech Stack 75 | 76 | This is a recommended tech-stack for building this prototype 77 | 78 | |Which|Options| 79 | |-----|-----| 80 | |Language|NodeJS, Golang or any that supports SocketIO| 81 | |Framework|SocketIO| 82 | |Database|KV store built by you, or any database of your liking| 83 | 84 | ### Keep in mind 85 | 86 | These are the common pitfalls that you should keep in mind while you are building this prototype 87 | 88 | - fan-outs take time 89 | - do not broadcast the update before persisting it 90 | 91 | # Outcome 92 | 93 | ## You'll learn 94 | 95 | - realtime communication using SocketIO 96 | - scaling sockets to millions of concurrent people 97 | - building realtime database with a great UX 98 | 99 | 100 | # Share and shoutout 101 | 102 | If you find this assignment helpful, please 103 | - share this assignment with your friends and peers 104 | - star this repository and help it reach a wider audience 105 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 106 | 107 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
108 | -------------------------------------------------------------------------------- /recent-searches.md: -------------------------------------------------------------------------------- 1 | Design Recent Searches 2 | === 3 | 4 | 5 | * [Design Recent Searches](#design-recent-searches) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | When a user taps on the search bar we have to show the last 10 unique recent searches made by him/her. The time for the service to respond should be as low as possible given that user upon tapping the search bar would not want to wait for recent searches to load. 24 | 25 | Design the ingestion pipeline, the core API service, caching , delegation if required and decide on the high level dataflow. 
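
The "last 10 unique recent searches" rule above implies two behaviours: a repeated query moves to the front instead of appearing twice, and the oldest query is dropped once the limit is exceeded. The sketch below shows one possible in-memory structure for that; `RecentSearches` and its methods are hypothetical names, and the choice of an ordered dictionary is an assumption rather than a recommended storage engine.

```python
from collections import OrderedDict


class RecentSearches:
    """Keeps the most recent unique queries per user; names are illustrative."""

    def __init__(self, limit=10):
        self._limit = limit
        self._by_user = {}  # user_id -> OrderedDict of queries, oldest first

    def record(self, user_id, query):
        queries = self._by_user.setdefault(user_id, OrderedDict())
        queries.pop(query, None)  # a repeated query moves to the newest slot
        queries[query] = None
        while len(queries) > self._limit:
            queries.popitem(last=False)  # evict the oldest query

    def recent(self, user_id):
        """Return the user's queries, most recent first."""
        return list(reversed(self._by_user.get(user_id, OrderedDict())))
```

Both operations touch only one user's small, bounded structure, which is the property that makes a tight response-time budget plausible once the data sits in a fast store.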
26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - fetch last 10 unique recent searches made by the user 36 | - response time of the API should be < 10ms 37 | 38 | ## High Level Requirements 39 | 40 | - make your high-level components operate with **high availability** 41 | - ensure that the data in your system is **durable**, no matter what happens 42 | - define how your system would behave while **scaling-up** and **scaling-down** 43 | - make your system **cost-effective** and provide a justification for the same 44 | - describe how **capacity planning** helped you make a good design decision 45 | - think about how other services will interact with your service 46 | 47 | 48 | ## Micro Requirements 49 | 50 | - ensure the data in your system **never** goes into an inconsistent state 51 | - ensure your system is **free of deadlocks** (if applicable) 52 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it is affected 53 | 54 | 55 | # Output 56 | 57 | ## Design Document 58 | 59 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 60 | 61 | Do **not** create unnecessary components, just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
62 | 63 | 64 | ## Prototype 65 | 66 | To understand the nuances and internals of this system, build a prototype that 67 | 68 | - prototype the entire recent searches flow on a local machine 69 | 70 | ### Recommended Tech Stack 71 | 72 | This is a recommended tech-stack for building this prototype 73 | 74 | |Which|Options| 75 | |-----|-----| 76 | |Language|Golang, Java, C++| 77 | |Database|decide| 78 | |Caching|decide| 79 | 80 | ### Keep in mind 81 | 82 | These are the common pitfalls that you should keep in mind while you are building this prototype 83 | 84 | - delegate whenever possible 85 | 86 | # Outcome 87 | 88 | ## You'll learn 89 | 90 | - importance of delegation 91 | - designing data flow 92 | - designing a service with SLA < 10ms 93 | 94 | 95 | # Share and shoutout 96 | 97 | If you find this assignment helpful, please 98 | - share this assignment with your friends and peers 99 | - star this repository and help it reach a wider audience 100 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 101 | 102 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
103 | -------------------------------------------------------------------------------- /s3.md: -------------------------------------------------------------------------------- 1 | Design S3 2 | === 3 | 4 | 5 | * [Design S3](#design-s3) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a scalable blob storage like Amazon's S3. S3 is a distributed file storage facilitating storage of blob data. In simpler terms, it could be described as the folder on the cloud. The various functions to think about while designing something as robust as S3 are: scaling API requests, scaling storage, durability, handling hot storage nodes, cost efficiency, disk utilization, data redundancy, data corruption, and permission management. 
24 | 25 | # Requirements 26 | 27 | 28 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 29 | 30 | 31 | ## Core Requirements 32 | 33 | While designing your own S3, account for the following features 34 | 35 | - seamless scaling of storage nodes 36 | - durability of data even when a node crashes 37 | - handling hot storage nodes 38 | - support for multi-tenancy 39 | - cost efficient architecture 40 | - maximal disk utilization 41 | - data redundancy across geographies 42 | - handling in-transit data corruption 43 | - file, user, access level permission management 44 | 45 | ## High Level Requirements 46 | 47 | - make your high-level components operate with **high availability** 48 | - ensure that the data in your system is **durable**, no matter what happens 49 | - define how your system would behave while **scaling-up** and **scaling-down** 50 | - make your system **cost-effective** and provide a justification for the same 51 | - describe how **capacity planning** helped you make a good design decision 52 | - think about how other services will interact with your service 53 | 54 | 55 | ## Micro Requirements 56 | 57 | - ensure the data in your system is **never** going in an inconsistent state 58 | - ensure your system is **free of deadlocks** (if applicable) 59 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 60 | 61 | 62 | # Output 63 | 64 | ## Design Document 65 | 66 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 67 | 68 | Do **not** create unnecessary components, just to make design look complicated. A good design is **always simple and elegant**. 
A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 69 | 70 | 71 | ## Prototype 72 | 73 | To understand the nuances and internals of this system, build a prototype that 74 | 75 | - build a static file server to understand the basics of S3 76 | 77 | ### Recommended Tech Stack 78 | 79 | This is a recommended tech-stack for building this prototype 80 | 81 | |Which|Options| 82 | |-----|-----| 83 | |Language|Golang, Java, C++| 84 | 85 | ### Keep in mind 86 | 87 | These are the common pitfalls that you should keep in mind while you are building this prototype 88 | 89 | - anything that could fail would fail 90 | 91 | # Outcome 92 | 93 | ## You'll learn 94 | 95 | - how raw files are served over HTTP 96 | - designing complex systems 97 | 98 | 99 | # Share and shoutout 100 | 101 | If you find this assignment helpful, please 102 | - share this assignment with your friends and peers 103 | - star this repository and help it reach a wider audience 104 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 105 | 106 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 107 | -------------------------------------------------------------------------------- /scripts/dd.md: -------------------------------------------------------------------------------- 1 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 2 | 3 | Do **not** create unnecessary components, just to make design look complicated. 
A good design is **always simple and elegant**. A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 4 | -------------------------------------------------------------------------------- /scripts/footer.md: -------------------------------------------------------------------------------- 1 | # Share and shoutout 2 | 3 | If you find this assignment helpful, please 4 | - share this assignment with your friends and peers 5 | - star this repository and help it reach a wider audience 6 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 7 | 8 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 9 | -------------------------------------------------------------------------------- /scripts/footer.py: -------------------------------------------------------------------------------- 1 | import re 2 | import sys 3 | 4 | def readfile(fname): 5 | with open(fname, "r") as f: return f.read() 6 | 7 | 8 | def writefile(fname, content): 9 | with open(fname, "w") as f: return f.write(content) 10 | 11 | 12 | patterns = [ 13 | ('', '', './scripts/footer.md'), 14 | ('', '', './scripts/high-level-requirements.md'), 15 | ('', '', './scripts/micro-requirements.md'), 16 | ('', '', './scripts/dd.md'), 17 | ('', '', './scripts/req.md'), 18 | ] 19 | 20 | for pattern in patterns: 21 | footer = "\n" + readfile(pattern[2]).strip() + "\n" 22 | content = readfile(sys.argv[1]).strip() 23 | content = re.sub(pattern[0] + '?(.*?)' + pattern[1], pattern[0] + footer + pattern[1], content, flags=re.DOTALL) 24 | writefile(sys.argv[1], content) 25 | 
-------------------------------------------------------------------------------- /scripts/high-level-requirements.md: -------------------------------------------------------------------------------- 1 | - make your high-level components operate with **high availability** 2 | - ensure that the data in your system is **durable**, no matter what happens 3 | - define how your system would behave while **scaling-up** and **scaling-down** 4 | - make your system **cost-effective** and provide a justification for the same 5 | - describe how **capacity planning** helped you make a good design decision 6 | - think about how other services will interact with your service 7 | -------------------------------------------------------------------------------- /scripts/micro-requirements.md: -------------------------------------------------------------------------------- 1 | - ensure the data in your system is **never** going in an inconsistent state 2 | - ensure your system is **free of deadlocks** (if applicable) 3 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 4 | -------------------------------------------------------------------------------- /scripts/new.sh: -------------------------------------------------------------------------------- 1 | cp ./scripts/template.md $1.md 2 | -------------------------------------------------------------------------------- /scripts/req.md: -------------------------------------------------------------------------------- 1 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* -------------------------------------------------------------------------------- /scripts/template.md: -------------------------------------------------------------------------------- 1 | Design something awesome 2 | === 3 | 4 | 5 | 6 | 7 | # Problem Statement 8 | 9 | Problem statement goes here. 
10 | 11 | Representative image goes here. 12 | 13 | # Requirements 14 | 15 | 16 | 17 | 18 | ## Core Requirements 19 | 20 | - core requirement 1 21 | - core requirement 2 22 | 23 | ## High Level Requirements 24 | 25 | 26 | 27 | ## Micro Requirements 28 | 29 | 30 | 31 | # Output 32 | 33 | ## Design Document 34 | 35 | 36 | 37 | ## Prototype 38 | 39 | To understand the nuances and internals of this system, build a prototype that 40 | 41 | - prototype requirement 1 42 | - prototype requirement 2 43 | - prototype requirement 3 44 | 45 | ### Recommended Tech Stack 46 | 47 | This is a recommended tech-stack for building this prototype 48 | 49 | |Which|Options| 50 | |-----|-----| 51 | |Language|Golang, Java, C++| 52 | 53 | ### Keep in mind 54 | 55 | These are the common pitfalls that you should keep in mind while you are building this prototype 56 | 57 | - pitfall-1 58 | - pitfall-2 59 | - pitfall-3 60 | 61 | # Outcome 62 | 63 | ## You'll learn 64 | 65 | - learning 1 66 | - learning 2 67 | - learning 3 68 | 69 | 70 | 71 | -------------------------------------------------------------------------------- /scripts/toc.sh: -------------------------------------------------------------------------------- 1 | for f in *.md 2 | do 3 | python ./scripts/footer.py $f 4 | gh-md-toc --no-backup --hide-footer $f 5 | done 6 | -------------------------------------------------------------------------------- /sql-broker.md: -------------------------------------------------------------------------------- 1 | Design a SQL backed Message Broker 2 | === 3 | 4 | 5 | * [Design a SQL backed Message Broker](#design-a-sql-backed-message-broker) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech 
Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a SQL backed Message Broker which allows clients to enqueue messages (max 4KB in size) and dequeue them. The broker should support multiple producers and multiple consumers at the same time, allowing the broker to function at high throughput. Upon processing the message the client has to explicitly delete the message from the broker. Every message may have an optional expiration time, after which the message is not allowed to be read. 24 | 25 | > We are building a broker on top of SQL because a SQL database gives us, out of the box, the necessary toolset to build a robust broker. This exercise will help us understand the core properties we need in our broker and then we mimic them on any storage of our choice. 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - multiple producers can put messages in the broker at the same time 36 | - multiple consumers can read messages from the broker at the same time 37 | - one message to be read and processed by exactly one consumer at one time 38 | - message once deleted should not be read by any other consumer 39 | - every message may have an optional expiration time 40 | - the broker should have a high throughput 41 | 42 | ## High Level Requirements 43 | 44 | - make your high-level components operate with **high availability** 45 | - ensure that the data in your system is **durable**, no matter what happens 46 | - define how your system would behave while **scaling-up** and **scaling-down** 47 | - make your system **cost-effective** and provide a justification for the same 48 | - describe how **capacity planning** 
helped you make a good design decision 49 | - think about how other services will interact with your service 50 | 51 | 52 | ## Micro Requirements 53 | 54 | - ensure the data in your system is **never** going in an inconsistent state 55 | - ensure your system is **free of deadlocks** (if applicable) 56 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 57 | 58 | 59 | # Output 60 | 61 | ## Design Document 62 | 63 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 64 | 65 | Do **not** create unnecessary components, just to make design look complicated. A good design is **always simple and elegant**. A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 
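As a starting point for the broker's core requirements (one consumer per message at a time, explicit delete, optional expiration), here is a rough sketch of a claim-then-delete flow. It is only one possible approach and rests on assumptions that are not part of the assignment: it uses Python's built-in `sqlite3` so it runs anywhere, the table and function names are hypothetical, and the lease column (`claimed_until`) is an illustrative way to make a message visible again if its consumer never deletes it. On MySQL 8.0 you might instead explore locking reads such as `SELECT ... FOR UPDATE SKIP LOCKED`.

```python
import sqlite3
import uuid

# Hypothetical schema; column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    body TEXT NOT NULL,
    expires_at REAL,      -- NULL: the message never expires
    claim_token TEXT,     -- set while a consumer holds the message
    claimed_until REAL    -- lease end; afterwards the message is visible again
)
"""


def enqueue(conn, body, expires_at=None):
    with conn:
        conn.execute(
            "INSERT INTO messages (body, expires_at) VALUES (?, ?)",
            (body, expires_at),
        )


def dequeue(conn, now, lease_seconds=30.0):
    """Claim the oldest visible, unexpired message; return (id, body, token) or None."""
    token = uuid.uuid4().hex
    with conn:
        # A single UPDATE claims at most one row, so two consumers
        # cannot both end up holding the same message.
        conn.execute(
            "UPDATE messages SET claim_token = ?, claimed_until = ? "
            "WHERE id = (SELECT id FROM messages "
            "            WHERE (claimed_until IS NULL OR claimed_until <= ?) "
            "              AND (expires_at IS NULL OR expires_at > ?) "
            "            ORDER BY id LIMIT 1)",
            (token, now + lease_seconds, now, now),
        )
        row = conn.execute(
            "SELECT id, body FROM messages WHERE claim_token = ?", (token,)
        ).fetchone()
    return (row[0], row[1], token) if row else None


def ack(conn, message_id, token):
    """Explicit delete by the consumer that currently holds the claim."""
    with conn:
        cur = conn.execute(
            "DELETE FROM messages WHERE id = ? AND claim_token = ?",
            (message_id, token),
        )
    return cur.rowcount == 1
```

Passing `now` explicitly keeps the sketch easy to test; a real prototype would also need to enforce the 4KB limit and decide how expired rows are eventually cleaned up.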
66 | 67 | 68 | ## Prototype 69 | 70 | To understand the nuances and internals of this system, build a prototype that 71 | 72 | - is a broker on a SQL database 73 | - simulate multiple producers and consumers and see how the broker behaves 74 | 75 | ### Recommended Tech Stack 76 | 77 | This is a recommended tech-stack for building this prototype 78 | 79 | |Which|Options| 80 | |-----|-----| 81 | |Language|Golang, Java, C++| 82 | |Database|MySQL| 83 | 84 | ### Keep in mind 85 | 86 | These are the common pitfalls that you should keep in mind while you are building this prototype 87 | 88 | - inefficient locking will choke the database 89 | - TTL may not be the only choice 90 | 91 | # Outcome 92 | 93 | ## You'll learn 94 | 95 | - core properties of any broker 96 | - internals of brokers 97 | - transactions and locking in relational databases 98 | 99 | 100 | # Share and shoutout 101 | 102 | If you find this assignment helpful, please 103 | - share this assignment with your friends and peers 104 | - star this repository and help it reach a wider audience 105 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 106 | 107 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
108 | -------------------------------------------------------------------------------- /sql-kv.md: -------------------------------------------------------------------------------- 1 | Design SQL backed KV Store 2 | === 3 | 4 | 5 | * [Design SQL backed KV Store](#design-sql-backed-kv-store) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a KV Store built on top of a SQL (relational) database. The store exposes APIs to `GET`, `PUT`, `DEL` keys. Along with these core functions, there should be an API to set a `TTL` on an existing key, after which the key is auto-deleted from the store. Scale this KV store to **1 million** concurrent API calls and a total storage of **5000 TB**. 
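The four operations can be sketched on a single table as shown below. This is a minimal illustration under stated assumptions, not the expected solution: it uses Python's built-in `sqlite3` rather than MySQL so that it runs without setup (`INSERT OR REPLACE` is SQLite syntax), the class and column names are hypothetical, and expiry is modelled as "hide on read, delete lazily in batches", which is only one of several ways to auto-delete keys.

```python
import sqlite3
import time


class SqlKV:
    """Hypothetical single-table KV store; a sketch, not a scaled design."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS kv ("
            " key TEXT PRIMARY KEY,"
            " value TEXT NOT NULL,"
            " expires_at REAL)"  # NULL means the key never expires
        )

    def put(self, key, value):
        # Overwriting a key also clears any TTL previously set on it.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO kv (key, value, expires_at) "
                "VALUES (?, ?, NULL)",
                (key, value),
            )

    def get(self, key, now=None):
        now = time.time() if now is None else now
        # Expired keys are hidden from readers even before they are deleted.
        row = self.conn.execute(
            "SELECT value FROM kv WHERE key = ? "
            "AND (expires_at IS NULL OR expires_at > ?)",
            (key, now),
        ).fetchone()
        return row[0] if row else None

    def delete(self, key):
        with self.conn:
            self.conn.execute("DELETE FROM kv WHERE key = ?", (key,))

    def ttl(self, key, seconds, now=None):
        """Set a TTL on an existing, unexpired key; return True if it was set."""
        now = time.time() if now is None else now
        with self.conn:
            cur = self.conn.execute(
                "UPDATE kv SET expires_at = ? WHERE key = ? "
                "AND (expires_at IS NULL OR expires_at > ?)",
                (now + seconds, key, now),
            )
        return cur.rowcount == 1

    def sweep(self, now=None, batch=1000):
        """Physically delete up to `batch` expired keys; return how many were removed."""
        now = time.time() if now is None else now
        with self.conn:
            cur = self.conn.execute(
                "DELETE FROM kv WHERE key IN ("
                " SELECT key FROM kv"
                " WHERE expires_at IS NOT NULL AND expires_at <= ? LIMIT ?)",
                (now, batch),
            )
        return cur.rowcount
```

Deleting in bounded batches keeps each transaction short; how often to sweep, and how this table is sharded to reach the stated scale, are left to your design.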
24 | 25 | ![Relog - SQL Backed KV Store](https://user-images.githubusercontent.com/4745789/138806145-0ad10712-26c4-4c21-aeed-eb2f28157959.png) 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - should be able to `GET`, `PUT`, `DEL`, `TTL` on keys 36 | - upon expiration the key should be auto-deleted 37 | - should handle **1 million** concurrent transactions 38 | - max storage of one KV store would be **5000 TB** 39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system is **never** going in an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components, just to make design look complicated. A good design is **always simple and elegant**. 
A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - expose `GET`, `PUT`, `DEL`, `TTL` over HTTP server 71 | - simulate concurrent transactions and measure the throughput 72 | 73 | ### Recommended Tech Stack 74 | 75 | This is a recommended tech-stack for building this prototype 76 | 77 | |Which|Options| 78 | |-----|-----| 79 | |Language|Golang, Java, C++| 80 | |Database|Relational Database - MySQL| 81 | 82 | ### Keep in mind 83 | 84 | These are the common pitfalls that you should keep in mind while you are building this prototype 85 | 86 | - have a primary key on your tables otherwise the entire table might get locked 87 | - auto-deletion of keys at scale is tricky, think about it well 88 | 89 | # Outcome 90 | 91 | ## You'll learn 92 | 93 | - SQL Transactions 94 | - Database Locking 95 | - Scaling by sharding 96 | 97 | 98 | # Share and shoutout 99 | 100 | If you find this assignment helpful, please 101 | - share this assignment with your friends and peers 102 | - star this repository and help it reach a wider audience 103 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 104 | 105 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
106 | -------------------------------------------------------------------------------- /superfast-kv.md: -------------------------------------------------------------------------------- 1 | Design a Superfast KV Store 2 | === 3 | 4 | 5 | * [Design a Superfast KV Store](#design-a-superfast-kv-store) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a single-node persistent KV Store that supports `GET`, `PUT` and `DEL` operations and it utilizes hardware (disk, RAM) optimally. The response time for all the 3 operations should be as low as possible and complexity of operations should be `O(1)`. It is okay for this KV store to not support infinite number of keys given it is bound to a single node. 24 | 25 | > Note: It is okay if your storage engine cannot support very large number of keys. 
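One well-known way to get `O(1)` operations with persistence on a single node is an append-only log on disk plus an in-memory hash index that remembers where each value lives. The sketch below illustrates that idea only; it is an assumption about one possible design, not the intended answer. The class name `LogKV`, the record layout, and the tombstone marker are hypothetical, and the sketch deliberately skips checksums, compaction, and recovery from partially written records.

```python
import os
import struct

HEADER = struct.Struct(">II")  # big-endian: key length, value length
TOMBSTONE = 0xFFFFFFFF         # value-length marker meaning "key deleted"


class LogKV:
    """Hypothetical append-only log with an in-memory offset index."""

    def __init__(self, path):
        self.index = {}  # key -> (value offset, value length)
        # "a+b": every write is appended, reads may seek anywhere.
        self.f = open(path, "a+b")
        self._rebuild()

    def _rebuild(self):
        # Replay the log once on startup to reconstruct the index.
        self.f.seek(0)
        while True:
            pos = self.f.tell()
            header = self.f.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            klen, vlen = HEADER.unpack(header)
            key = self.f.read(klen)
            if vlen == TOMBSTONE:
                self.index.pop(key, None)
            else:
                self.index[key] = (pos + HEADER.size + klen, vlen)
                self.f.seek(vlen, os.SEEK_CUR)  # skip over the value bytes

    def put(self, key, value):
        self.f.seek(0, os.SEEK_END)
        pos = self.f.tell()
        self.f.write(HEADER.pack(len(key), len(value)) + key + value)
        self.f.flush()
        self.index[key] = (pos + HEADER.size + len(key), len(value))

    def get(self, key):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.f.seek(offset)  # one hash lookup plus one positioned read
        return self.f.read(length)

    def delete(self, key):
        if key in self.index:
            self.f.seek(0, os.SEEK_END)
            self.f.write(HEADER.pack(len(key), TOMBSTONE) + key)
            self.f.flush()
            del self.index[key]

    def close(self):
        self.f.close()
```

Keys and values are `bytes`. Because every key lives in the in-memory index, the number of keys is bounded by RAM, which matches the note that the engine need not support a very large number of keys.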
26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - should be able to `GET`, `PUT`, `DEL` on a key 36 | - all operations should happen as fast as possible with complexity of `O(1)` 37 | - this KV store is not distributed and will run on just a single node 38 | 39 | ## High Level Requirements 40 | 41 | - make your high-level components operate with **high availability** 42 | - ensure that the data in your system is **durable**, no matter what happens 43 | - define how your system would behave while **scaling-up** and **scaling-down** 44 | - make your system **cost-effective** and provide a justification for the same 45 | - describe how **capacity planning** helped you make a good design decision 46 | - think about how other services will interact with your service 47 | 48 | 49 | ## Micro Requirements 50 | 51 | - ensure the data in your system is **never** going in an inconsistent state 52 | - ensure your system is **free of deadlocks** (if applicable) 53 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 54 | 55 | 56 | # Output 57 | 58 | ## Design Document 59 | 60 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 61 | 62 | Do **not** create unnecessary components, just to make design look complicated. A good design is **always simple and elegant**. A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 
63 | 64 | 65 | ## Prototype 66 | 67 | To understand the nuances and internals of this system, build a prototype that 68 | 69 | - implement your design and measure the `GET`, `PUT`, `DEL` performance 70 | 71 | ### Recommended Tech Stack 72 | 73 | This is a recommended tech-stack for building this prototype 74 | 75 | |Which|Options| 76 | |-----|-----| 77 | |Language|Golang, Java, C++, Python| 78 | 79 | ### Keep in mind 80 | 81 | These are the common pitfalls that you should keep in mind while you are building this prototype 82 | 83 | - your storage engine will always be bound to single node 84 | - it is okay for your engine to not support very large number of keys 85 | 86 | # Outcome 87 | 88 | ## You'll learn 89 | 90 | - designing storage engine 91 | - utilizing every ounce of your hardware 92 | 93 | 94 | # Share and shoutout 95 | 96 | If you find this assignment helpful, please 97 | - share this assignment with your friends and peers 98 | - star this repository and help it reach a wider audience 99 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 100 | 101 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
102 | -------------------------------------------------------------------------------- /tagging-photos-with-people.md: -------------------------------------------------------------------------------- 1 | Design Photo Tagging 2 | === 3 | 4 | 5 | * [Design Photo Tagging](#design-photo-tagging) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a feature that allows users to tag other users in photos they upload. Users can optionally select a rectangular region of the photo and tag it with the person. By default a small square should be picked as the region of interest. 
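Since the same photo is rendered at many resolutions, one option is to store the tagged region as fractions of the photo's width and height rather than as pixels. The sketch below assumes that representation; the `Tag` fields, the `default_tag` helper, and the 10% default side are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Hypothetical tag record; the region is stored as fractions in [0, 1]."""
    photo_id: int
    tagged_user_id: int
    x: float  # left edge / photo width
    y: float  # top edge / photo height
    w: float  # region width / photo width
    h: float  # region height / photo height


def default_tag(photo_id, user_id, tap_x, tap_y, photo_w, photo_h, side_fraction=0.1):
    """Small square (in pixels) centred on the tap point, clamped inside the photo."""
    side = side_fraction * min(photo_w, photo_h)
    left = min(max(tap_x - side / 2, 0), photo_w - side)
    top = min(max(tap_y - side / 2, 0), photo_h - side)
    return Tag(photo_id, user_id,
               left / photo_w, top / photo_h,
               side / photo_w, side / photo_h)


def to_pixels(tag, render_w, render_h):
    """Scale the stored fractions to a concrete rendition: (x, y, width, height)."""
    return (round(tag.x * render_w), round(tag.y * render_h),
            round(tag.w * render_w), round(tag.h * render_h))


if __name__ == "__main__":
    tag = default_tag(photo_id=1, user_id=42, tap_x=500, tap_y=250,
                      photo_w=1000, photo_h=500)
    print(to_pixels(tag, 200, 100))    # thumbnail rendition
    print(to_pixels(tag, 2000, 1000))  # high-resolution rendition
```

With fractions stored once, every rendition that keeps the photo's aspect ratio can compute its own pixel rectangle without touching the database schema.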
24 | 25 | ![Relog Tagging Photos with People](https://user-images.githubusercontent.com/4745789/139575791-ff4f4b01-f853-482f-9291-66731559da98.png) 26 | 27 | # Requirements 28 | 29 | ## Core Requirements 30 | 31 | - allow people to tag other people in photos 32 | - region is always rectangular and user can pick how big or small the region would be 33 | 34 | ## High Level Requirements 35 | 36 | - make your high-level components operate with **high availability** 37 | - ensure that the data in your system is **durable**, no matter what happens 38 | - define how your system would behave while **scaling-up** and **scaling-down** 39 | - make your system **cost-effective** and provide a justification for the same 40 | - describe how **capacity planning** helped you make a good design decision 41 | - think about how other services will interact with your service 42 | 43 | 44 | ## Micro Requirements 45 | 46 | - ensure the data in your system is **never** going in an inconsistent state 47 | - ensure your system is **free of deadlocks** (if applicable) 48 | - ensure that the throughput of your system is not affected by **locking**, if it does, state how it would affect 49 | 50 | 51 | # Output 52 | 53 | ## Design Document 54 | 55 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system handles at scale, and what will eventually become a chokepoint. 56 | 57 | Do **not** create unnecessary components, just to make design look complicated. A good design is **always simple and elegant**. A good way to think about it is if you were to create a separate process/machine/infra for each component and you will have to code it yourself, would you still do it? 
58 | 59 | 60 | ## Prototype 61 | 62 | To understand the nuances and internals of this system, build a prototype that 63 | 64 | - design a database schema to hold people tagged in photos 65 | - render the tagged people on different image resolutions 66 | 67 | ### Recommended Tech Stack 68 | 69 | This is a recommended tech-stack for building this prototype 70 | 71 | |Which|Options| 72 | |-----|-----| 73 | |Language|Golang, Java, NodeJS, C++| 74 | |Database|Pick your favourite| 75 | 76 | ### Keep in mind 77 | 78 | These are the common pitfalls that you should keep in mind while you are building this prototype 79 | 80 | - devices have different resolutions 81 | 82 | # Outcome 83 | 84 | ## You'll learn 85 | 86 | - data representation 87 | - database schema design 88 | 89 | 90 | # Share and shoutout 91 | 92 | If you find this assignment helpful, please 93 | - share this assignment with your friends and peers 94 | - star this repository and help it reach a wider audience 95 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 96 | 97 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
98 | -------------------------------------------------------------------------------- /task-scheduler.md: -------------------------------------------------------------------------------- 1 | Design a Distributed Task Scheduler 2 | === 3 | 4 | 5 | * [Design a Distributed Task Scheduler](#design-a-distributed-task-scheduler) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a distributed task scheduler in which the client can register a task and the time at which it should be executed. The task needs to be picked up within 10 seconds of its scheduled time of execution. The tasks can be of two types 24 | 25 | - one-time task 26 | - recurring tasks 27 | 28 | Clients can register a task with cron syntax and our scheduler needs to execute it as per the schedule. Clients can also submit a task that is one-time in nature, which means once executed it will never be picked again. 
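The difference between one-time and recurring tasks can be sketched with a small single-process scheduler. Everything here is an illustrative assumption: the names `Task` and `Scheduler` are hypothetical, recurrence is modelled as a fixed interval in seconds instead of cron syntax (parsing cron is out of scope for the sketch), and an in-memory heap stands in for whatever durable store a distributed design would use.

```python
import heapq
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Task:
    run_at: float                                   # only field used for ordering
    name: str = field(compare=False)
    interval: Optional[float] = field(default=None, compare=False)  # None: one-time


class Scheduler:
    """Hypothetical in-memory scheduler; a stand-in for a durable task store."""

    def __init__(self):
        self._heap = []

    def register(self, task):
        heapq.heappush(self._heap, task)

    def pick_due(self, now):
        """Return names of tasks due at `now`; re-register recurring ones."""
        due = []
        while self._heap and self._heap[0].run_at <= now:
            task = heapq.heappop(self._heap)
            due.append(task.name)
            if task.interval is not None:
                # The next run is derived from the previous *scheduled* time,
                # so pickup delays do not accumulate; occurrences that were
                # missed entirely are skipped rather than replayed.
                missed = int((now - task.run_at) // task.interval) + 1
                next_run = task.run_at + missed * task.interval
                heapq.heappush(self._heap, Task(next_run, task.name, task.interval))
        return due


if __name__ == "__main__":
    s = Scheduler()
    s.register(Task(5.0, "send-reminder"))                # one-time
    s.register(Task(0.0, "rotate-logs", interval=10.0))   # recurring
    for now in (4.0, 9.0, 35.0, 40.0):
        print(now, s.pick_due(now))
```

A recurring task never needs long-lived state of its own in this model: each pickup simply schedules the next occurrence, which is one way to read "recurring execution in a stateless way". Calling `pick_due` more often than every 10 seconds is what would keep pickup within the stated bound.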
29 | 30 | Potential applications: 31 | 32 | - reminders in calendar applications 33 | - distributed cron 34 | - sending scheduled notifications to users 35 | 36 | # Requirements 37 | 38 | 39 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 40 | 41 | 42 | ## Core Requirements 43 | 44 | - clients can register a task with either a scheduled time of execution or cron syntax 45 | - task is either one-time execution or recurring 46 | - task should be picked up for execution within 10 seconds of its scheduled execution 47 | 48 | ## High Level Requirements 49 | 50 | - make your high-level components operate with **high availability** 51 | - ensure that the data in your system is **durable**, no matter what happens 52 | - define how your system would behave while **scaling-up** and **scaling-down** 53 | - make your system **cost-effective** and provide a justification for the same 54 | - describe how **capacity planning** helped you make a good design decision 55 | - think about how other services will interact with your service 56 | 57 | 58 | ## Micro Requirements 59 | 60 | - ensure the data in your system **never** goes into an inconsistent state 61 | - ensure your system is **free of deadlocks** (if applicable) 62 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 63 | 64 | 65 | # Output 66 | 67 | ## Design Document 68 | 69 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 70 | 71 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. 
A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 72 | 73 | 74 | ## Prototype 75 | 76 | To understand the nuances and internals of this system, build a prototype that 77 | 78 | - schedules mock executions as per the configured schedule 79 | - simulates concurrent executions 80 | 81 | ### Recommended Tech Stack 82 | 83 | This is a recommended tech-stack for building this prototype 84 | 85 | |Which|Options| 86 | |-----|-----| 87 | |Language|Golang, Java, C++| 88 | |Database|MySQL| 89 | 90 | ### Keep in mind 91 | 92 | These are the common pitfalls that you should keep in mind while you are building this prototype 93 | 94 | - every component should have a predictable SLA to meet the overall 10-second SLA 95 | - try to separate the concerns and repeatedly 96 | - scheduling recurring tasks is easier than you think 97 | 98 | # Outcome 99 | 100 | ## You'll learn 101 | 102 | - design components with predictable SLA 103 | - separation of concerns 104 | - how to implement recurring execution in a stateless way 105 | 106 | 107 | # Share and shoutout 108 | 109 | If you find this assignment helpful, please 110 | - share this assignment with your friends and peers 111 | - star this repository and help it reach a wider audience 112 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 113 | 114 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
115 | -------------------------------------------------------------------------------- /text-search-engine.md: -------------------------------------------------------------------------------- 1 | Design a Text-based Search Engine 2 | === 3 | 4 | 5 | * [Design a Text-based Search Engine](#design-a-text-based-search-engine) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [Micro Requirements](#micro-requirements) 10 | * [Output](#output) 11 | * [Design Document](#design-document) 12 | * [Prototype](#prototype) 13 | * [Recommended Tech Stack](#recommended-tech-stack) 14 | * [Outcome](#outcome) 15 | * [You'll learn](#youll-learn) 16 | * [Share and shoutout](#share-and-shoutout) 17 | 18 | 19 | # Problem Statement 20 | 21 | Design a dead-simple text-based search engine that serves relevant results without using any tooling like ElasticSearch. The idea is to understand the internals of Search Engine and the math behind TF-IDF. Extend your search engine to support boolean expressions, typo tolerance, phonetics, and anything that you find amusing. 
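Since the problem statement leans on the math behind TF-IDF, a small worked sketch in Python may help as a starting point. It assumes one common variant, tf = count / document length and idf = ln(N / df), with no smoothing; other weightings exist, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Score every term of every tokenized document.

    Assumed variant: tf = term count / document length, idf = ln(N / df).
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores: List[Dict[str, float]] = []
    for doc in docs:
        counts = Counter(doc)
        scores.append(
            {term: (count / len(doc)) * math.log(n / df[term])
             for term, count in counts.items()}
        )
    return scores


docs = [["the", "cat"], ["the", "dog"]]
scores = tf_idf(docs)
# "the" occurs in every document, so its idf is ln(2/2) = 0 and it scores 0.0;
# "cat" occurs in one of two documents, so it scores 0.5 * ln(2).
```

Under this variant a term present in every document contributes nothing to relevance, which is why common words fall out of the ranking without a separate stop-word list.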
22 | 23 | # Requirements 24 | 25 | 26 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 27 | 28 | 29 | ## Core Requirements 30 | 31 | - build a simple text-based search engine that serves relevant results 32 | - make the search engine as robust as possible 33 | 34 | ## Micro Requirements 35 | 36 | - ensure the data in your system **never** goes into an inconsistent state 37 | - ensure your system is **free of deadlocks** (if applicable) 38 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 39 | 40 | 41 | # Output 42 | 43 | ## Design Document 44 | 45 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 46 | 47 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
48 | 49 | 50 | ## Prototype 51 | 52 | To understand the nuances and internals of this system, build a prototype that 53 | 54 | - build a search engine on top of 100MB of text data using your favourite programming language 55 | 56 | ### Recommended Tech Stack 57 | 58 | This is a recommended tech-stack for building this prototype 59 | 60 | |Which|Options| 61 | |-----|-----| 62 | |Language|Golang, Java, C++| 63 | 64 | # Outcome 65 | 66 | ## You'll learn 67 | 68 | - a simple text-based search engine 69 | - math behind tf-idf 70 | - basics of NLP - stemming, lemmatization, and phonetics 71 | 72 | 73 | # Share and shoutout 74 | 75 | If you find this assignment helpful, please 76 | - share this assignment with your friends and peers 77 | - star this repository and help it reach a wider audience 78 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 79 | 80 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
81 | -------------------------------------------------------------------------------- /user-affinity.md: -------------------------------------------------------------------------------- 1 | Design User Affinity 2 | === 3 | 4 | 5 | * [Design User Affinity](#design-user-affinity) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Every social network has some notion of friends/followers/connections that defines the affinity of one user to another. So, let's design user affinity for your social network that operates at scale. 24 | 25 | Say, our social network has a notion of _follow_ and every user can follow every other user. The service should answer "people I follow?" and "people following me?" very efficiently. 26 | 27 | ![Relog User Affinity](https://user-images.githubusercontent.com/4745789/139584187-5ae0e08e-16eb-4354-9fa3-fc286a244887.png) 28 | 29 | # Requirements 30 | 31 | 32 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 33 | 34 | 35 | ## Core Requirements 36 | 37 | - every user can **follow** every other user 38 | - there are **50 million** people on our social network 39 | - the two queries that need to be efficiently answered are 40 | - people I follow? 41 | - people following me? 
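As a starting point for the two queries above, here is a minimal sketch in Python that uses an in-memory SQLite database as a stand-in for whatever store you pick. The table `follows`, the index `idx_follows_followee`, and the helper names are hypothetical, and the sketch ignores sharding and the 50 million user scale entirely.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE follows (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL,
    PRIMARY KEY (follower_id, followee_id)   -- serves "people I follow?"
);
-- Reverse index so "people following me?" does not scan the whole table.
CREATE INDEX idx_follows_followee ON follows (followee_id, follower_id);
""")


def follow(follower: int, followee: int) -> None:
    # INSERT OR IGNORE makes a repeated follow a no-op instead of an error.
    conn.execute("INSERT OR IGNORE INTO follows VALUES (?, ?)", (follower, followee))


def following(user: int) -> List[int]:
    rows = conn.execute(
        "SELECT followee_id FROM follows WHERE follower_id = ? ORDER BY followee_id",
        (user,),
    )
    return [row[0] for row in rows]


def followers(user: int) -> List[int]:
    rows = conn.execute(
        "SELECT follower_id FROM follows WHERE followee_id = ? ORDER BY follower_id",
        (user,),
    )
    return [row[0] for row in rows]


follow(1, 2)
follow(1, 3)
follow(2, 3)
follow(1, 2)  # duplicate, ignored
```

Each query is answered from its own index here; once the data is split across machines, only one of the two directions can be local to a shard key, which is part of what the assignment asks you to reason about.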
42 | 43 | ## High Level Requirements 44 | 45 | - make your high-level components operate with **high availability** 46 | - ensure that the data in your system is **durable**, no matter what happens 47 | - define how your system would behave while **scaling-up** and **scaling-down** 48 | - make your system **cost-effective** and provide a justification for the same 49 | - describe how **capacity planning** helped you make a good design decision 50 | - think about how other services will interact with your service 51 | 52 | 53 | ## Micro Requirements 54 | 55 | - ensure the data in your system **never** goes into an inconsistent state 56 | - ensure your system is **free of deadlocks** (if applicable) 57 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 58 | 59 | 60 | # Output 61 | 62 | ## Design Document 63 | 64 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 65 | 66 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
67 | 68 | 69 | ## Prototype 70 | 71 | To understand the nuances and internals of this system, build a prototype that 72 | 73 | - design the database schema to hold this information at scale 74 | - build a small UI prototype allowing a user to follow/unfollow other users (optional) 75 | - load test the two queries for a large number of connections and measure the performance 76 | 77 | ### Recommended Tech Stack 78 | 79 | This is a recommended tech-stack for building this prototype 80 | 81 | |Which|Options| 82 | |-----|-----| 83 | |Language|Golang, Java, C++| 84 | |Database|Pick your favourite| 85 | 86 | ### Keep in mind 87 | 88 | These are the common pitfalls that you should keep in mind while you are building this prototype 89 | 90 | - there will be celebrity accounts that have millions of followers 91 | - in the worst case, the number of connections is n^2 92 | 93 | # Outcome 94 | 95 | ## You'll learn 96 | 97 | - schema design 98 | - database sharding 99 | - indexing internals 100 | 101 | 102 | # Share and shoutout 103 | 104 | If you find this assignment helpful, please 105 | - share this assignment with your friends and peers 106 | - star this repository and help it reach a wider audience 107 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 108 | 109 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 
110 | -------------------------------------------------------------------------------- /video-pipeline.md: -------------------------------------------------------------------------------- 1 | Design a Video Processing Pipeline for Streaming Service 2 | === 3 | 4 | 5 | * [Design a Video Processing Pipeline for Streaming Service](#design-a-video-processing-pipeline-for-streaming-service) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Outcome](#outcome) 15 | * [You'll learn](#youll-learn) 16 | * [Share and shoutout](#share-and-shoutout) 17 | 18 | 19 | # Problem Statement 20 | 21 | Design a video processing pipeline for a Video Streaming service, covering upload, storage, processing, post-processing, on-demand processing, and caching. The core of this problem statement is to design the system in such a way that the processing of videos after upload is massively parallelized. 
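One way to picture the parallelization asked for above is to fan a video out into independent (segment, rendition) jobs. The Python sketch below is only an illustration under assumptions: the rendition list, the segment names, and the functions `transcode` and `process_video` are hypothetical, and `transcode` is a placeholder that does no real encoding.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import List, Tuple

RENDITIONS = ["240p", "480p", "720p"]  # assumed output resolutions


def transcode(job: Tuple[str, str]) -> str:
    segment, rendition = job
    # Placeholder for real work (for example, invoking an encoder);
    # here we only build the name of the output artifact.
    return f"{segment}.{rendition}"


def process_video(segments: List[str], workers: int = 4) -> List[str]:
    # Every (segment, rendition) pair is independent, so all pairs can run concurrently.
    jobs = list(product(segments, RENDITIONS))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the order of `jobs`,
        # regardless of which job finishes first.
        return list(pool.map(transcode, jobs))
```

Threads keep the sketch short; real encoding is CPU-bound, so a design at scale would more likely spread these jobs across processes or machines behind a queue.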
22 | 23 | # Requirements 24 | 25 | 26 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 27 | 28 | 29 | ## Core Requirements 30 | 31 | - ability to process thousands of videos in parallel 32 | - design a robust video upload flow 33 | 34 | ## High Level Requirements 35 | 36 | - make your high-level components operate with **high availability** 37 | - ensure that the data in your system is **durable**, no matter what happens 38 | - define how your system would behave while **scaling-up** and **scaling-down** 39 | - make your system **cost-effective** and provide a justification for the same 40 | - describe how **capacity planning** helped you make a good design decision 41 | - think about how other services will interact with your service 42 | 43 | 44 | ## Micro Requirements 45 | 46 | - ensure the data in your system **never** goes into an inconsistent state 47 | - ensure your system is **free of deadlocks** (if applicable) 48 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 49 | 50 | 51 | # Output 52 | 53 | ## Design Document 54 | 55 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 56 | 57 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 
58 | 59 | 60 | ## Prototype 61 | 62 | To understand the nuances and internals of this system, understand 63 | 64 | - how a CDN streams video 65 | - how on-demand video optimization works 66 | - how a caching strategy is designed for a video streaming service 67 | 68 | 69 | # Outcome 70 | 71 | ## You'll learn 72 | 73 | - CDN for video streaming 74 | - designing a robust upload flow of video 75 | - parallel processing of a complex workflow 76 | 77 | 78 | # Share and shoutout 79 | 80 | If you find this assignment helpful, please 81 | - share this assignment with your friends and peers 82 | - star this repository and help it reach a wider audience 83 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 84 | 85 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 86 | -------------------------------------------------------------------------------- /word-dictionary.md: -------------------------------------------------------------------------------- 1 | Design a Word Dictionary 2 | === 3 | 4 | 5 | * [Design a Word Dictionary](#design-a-word-dictionary) 6 | * [Problem Statement](#problem-statement) 7 | * [Requirements](#requirements) 8 | * [Core Requirements](#core-requirements) 9 | * [High Level Requirements](#high-level-requirements) 10 | * [Micro Requirements](#micro-requirements) 11 | * [Output](#output) 12 | * [Design Document](#design-document) 13 | * [Prototype](#prototype) 14 | * [Recommended Tech Stack](#recommended-tech-stack) 15 | * [Keep in mind](#keep-in-mind) 16 | * [Outcome](#outcome) 17 | * [You'll learn](#youll-learn) 18 | * [Share and shoutout](#share-and-shoutout) 19 | 20 | 21 | # Problem Statement 22 | 23 | Design a service that serves an English Word Dictionary. 
The service exposes endpoints for getting the meaning of a given word. The dictionary is updated **weekly** through a changelog that lists the words and meanings that need to be updated; this changelog will contain at most 1000 words. The total size of the dictionary is **1TB** and it holds **171476** words. 24 | 25 | > Note: While building this service we are not allowed to use any traditional database like MySQL, PostgreSQL, MongoDB, etc. Be creative and use something different. 26 | 27 | # Requirements 28 | 29 | 30 | *The problem statement is something to start with, be creative and dive into the product details and add constraints and features you think would be important.* 31 | 32 | 33 | ## Core Requirements 34 | 35 | - service should handle **5 million** requests per minute 36 | - dictionary is updated **weekly** through changelog 37 | - you cannot use a traditional database, be creative here 38 | - the service should be cost-efficient and "convenient" 39 | 40 | ## High Level Requirements 41 | 42 | - make your high-level components operate with **high availability** 43 | - ensure that the data in your system is **durable**, no matter what happens 44 | - define how your system would behave while **scaling-up** and **scaling-down** 45 | - make your system **cost-effective** and provide a justification for the same 46 | - describe how **capacity planning** helped you make a good design decision 47 | - think about how other services will interact with your service 48 | 49 | 50 | ## Micro Requirements 51 | 52 | - ensure the data in your system **never** goes into an inconsistent state 53 | - ensure your system is **free of deadlocks** (if applicable) 54 | - ensure that the throughput of your system is not affected by **locking**; if it is, state how it would be affected 55 | 56 | 57 | # Output 58 | 59 | ## Design Document 60 | 61 | Create a **design document** of this system/feature stating all critical design decisions, tradeoffs, components, services, and 
communications. Also specify how your system behaves at scale, and what will eventually become a chokepoint. 62 | 63 | Do **not** create unnecessary components just to make the design look complicated. A good design is **always simple and elegant**. A good way to think about it: if you had to create a separate process/machine/infra for each component and code it yourself, would you still do it? 64 | 65 | 66 | ## Prototype 67 | 68 | To understand the nuances and internals of this system, build a prototype that 69 | 70 | - code your implementation locally and do not worry about scale 71 | 72 | ### Recommended Tech Stack 73 | 74 | This is a recommended tech-stack for building this prototype 75 | 76 | |Which|Options| 77 | |-----|-----| 78 | |Language|Golang, Java, C++, Python| 79 | 80 | ### Keep in mind 81 | 82 | These are the common pitfalls that you should keep in mind while you are building this prototype 83 | 84 | - look at the number of words and total size of the dictionary 85 | 86 | # Outcome 87 | 88 | ## You'll learn 89 | 90 | - designing storage formats 91 | - data-driven decisions 92 | 93 | 94 | # Share and shoutout 95 | 96 | If you find this assignment helpful, please 97 | - share this assignment with your friends and peers 98 | - star this repository and help it reach a wider audience 99 | - give me a shoutout on Twitter [@arpit_bhayani](https://twitter.com/@arpit_bhayani), or on LinkedIn at [@arpitbhayani](https://www.linkedin.com/in/arpitbhayani/). 100 | 101 | This assignment is part of [Arpit's System Design Masterclass](https://arpitbhayani.me/masterclass) - A masterclass that helps you become great at designing scalable, fault-tolerant, and highly available systems. 102 | --------------------------------------------------------------------------------