## System Design Basics
* Key Characteristics and Fundamentals of Distributed Systems
* Monolithic vs Microservice (Service Discovery, Resiliency)
* Vertical vs horizontal scaling [Watch1](https://www.youtube.com/watch?v=xpDnVSmNFX0&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX&index=2)
* Load Balancing / Application Delivery Controller (ADC) [Read1](https://www.citrix.com/en-in/solutions/app-delivery-and-security/load-balancing/what-is-load-balancing.html#:~:text=Load%20balancing%20is%20a%20core,responsiveness%20and%20prevent%20server%20overload.) [Read2](https://logz.io/blog/best-open-source-load-balancers/) [Watch1](https://www.youtube.com/watch?v=K0Ta65OqQkY&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX&index=3)
* Consistent Hashing [Watch1](https://www.youtube.com/watch?v=zaRkONvyGr8&list=PLMCXHnjXnTnvo6alSjVkgxV-VH6EPyvoX&index=4) [Read1](https://medium.com/system-design-blog/consistent-hashing-b9134c8a9062) [Read2](https://www.toptal.com/big-data/consistent-hashing) [Read3](https://en.wikipedia.org/wiki/Consistent_hashing)
* Throughput, Latency
* CAP theorem
* ACID vs BASE
* Redundancy and Replication
* Partitioning/Sharding
* Optimistic vs pessimistic locking
* Strong vs eventual consistency
* SQL vs NoSQL
* Types of NoSQL (Key value, Wide column, Document-based, Graph-based)
* Caching
* Data center/racks/hosts
* CPU/memory/Hard drives/Network bandwidth
* Random vs sequential reads/writes to disk
* DNS lookup
* HTTP, HTTPS, HTTP2
* HTTP
* HTTPS [Read1](https://www.thesslstore.com/blog/how-does-https-work/)
* HTTP & SSL/TLS
* Public key infrastructure and certificate authority (CA)
* Symmetric vs asymmetric encryption
* WebSockets
* Long-Polling vs WebSockets vs Server-Sent Events
* TCP/IP model
* IPv4 vs IPv6
* TCP vs UDP
* CDNs & Edges
* Data Partitioning
* Indexes
* Master-Slave, Master-Master
* Active-Passive, Active-Active
* Leader election
* Design patterns and Object-oriented design
* Virtual machines and containers
* Pub-sub architecture
* REST, GraphQL
* MapReduce
* Bloom filters and Count-Min sketch
* Paxos
* Multithreading, locks, synchronization, CAS (compare and set)
* Proxies

## Building Blocks of Any Frequently Asked System Design Question
* Authentication
* JWT
* OAUTH2
* File / Media Upload
* S3, Multiple Quality Files
* WIP...


## Tools and Technologies
* Databases [Comparison](https://db-engines.com/en/system/InfluxDB%3BMicrosoft+Azure+Cosmos+DB%3BTimescaleDB)
* Cassandra
* MongoDB/Couchbase
* Mongo: [Read1](https://www.tutorialspoint.com/mongodb/mongodb_overview.htm), [Read2](https://docs.mongodb.com/manual/tutorial/getting-started/), [Read3](https://www.dotnettricks.com/learn/mongodb/what-is-mongodb-and-why-to-use-it), [Read4](https://studio3t.com/knowledge-base/articles/mongodb-advantages-use-cases/), [Read5](https://www.objectrocket.com/blog/mongodb/top-use-cases-for-mongodb/), [Read6](https://docs.mongodb.com/manual/core/replica-set-elections/), [IQ's](https://www.interviewbit.com/mongodb-interview-questions/)
* RabbitMQ / Kafka / Pub-Sub [Comparison](https://www.cloudamqp.com/blog/when-to-use-rabbitmq-or-apache-kafka.html) [Comparison](https://engineering.3ap.ch/post/rabbitmq-vs-pubsub-part-1/)
* RabbitMQ: [Watch1](https://www.youtube.com/watch?v=deG25y_r6OY), [Watch2](https://www.youtube.com/watch?v=WmBwTtE5PTQ)
* Google PubSub: [Watch Playlist](https://www.youtube.com/watch?v=cvu53CnZmGI&list=PLIivdWyY5sqKwVLe4BLJ-vlh9r9zCdOse)
* Mysql / PostgreSQL
* Scalability in Postgres
* Redis / Memcached
* InfluxDB [Suitable for TimeSeries, IoT data]
* Zookeeper
* NGINX
* HAProxy
* Solr, Elasticsearch
* Amazon, EC2, S3
* Docker, Kubernetes
* Hadoop/Spark and HDFS
* Eureka, Hystrix
* Heroku / Azure DevOps
* Jenkins CI/CD

## System Design Problems (HLD + LLD)
* TinyURL
* Instagram | Photo hosting platform
* Timeline | Newsfeed | Twitter
* Dropbox | Google Drive
* Whatsapp | Facebook Messenger [NL](https://www.youtube.com/watch?v=L7LtmfFYjc4&t=690s) [GS](https://www.youtube.com/watch?v=vvhC64hQZMk&t=1267s) [Ref](https://medium.com/@thinkwik/web-sockets-vs-xmpp-which-is-better-for-chat-application-113e3520b327)
* MakeMyTrip | BookMyShow
* Amazon | Flipkart
* Youtube | Netflix [NL](https://medium.com/@narengowda/netflix-system-design-dbec30fede8d)
* Uber | IRCTC
* Swiggy | Zomato
* Yelp | Nearby
* Twitter Search
* Google Search
* SplitWise
* Zerodha
* API Rate Limiter
* Web Crawler
* Rate limiting system
* Distributed cache
* Typeahead Suggestion | Auto-complete system
* Recommendation System
* Design a tagging system like tags used in LinkedIn

## Low Level Design Problems (Machine Coding Round) [Reference](https://www.linkedin.com/pulse/cracking-he-low-level-design-lld-interview-shashi-bhushan-kumar/)
* Elevator system
* Snake and Ladder game
* Tic Tac Toe
* ATM machine - https://medium.com/swlh/atm-an-object-oriented-design-e3a2435a0830
* Traffic Control System
* Vehicle Parking System
* Online Coding Platform [problem-statement](https://github.com/hocyadav/leetcode-lld-flipkart-coding-blox)
* File Sharing System
* Object Oriented Design Preparation [https://www.oodesign.com/]
* SOLID Principles
* Design Patterns [https://refactoring.guru/design-patterns]
* [More Problems List](https://workat.tech/machine-coding/article/how-to-practice-for-machine-coding-kp0oj3sw2jca)
* More Good Resources:
* https://refactoring.guru/design-patterns/what-is-pattern
* http://www.cs.unibo.it/~cianca/wwwpages/ids/esempi/coffee.pdf Recommended by - [sudoCode](https://www.youtube.com/watch?v=B3zrIwz_t4M)
* https://cseweb.ucsd.edu//~wgg/CSE210/ecoop93-patterns.pdf Recommended by - [sudoCode](https://www.youtube.com/watch?v=B3zrIwz_t4M)

## Engineering Blogs [Ref](https://github.com/mrbajaj/engineering-blogs/blob/master/README.md)
[Airbnb](http://nerds.airbnb.com/)
[AirPair](https://www.airpair.com/posts)
[Artsy](http://artsy.github.io/)
[Asana](https://eng.asana.com/)
[Bandcamp](http://bandcamptech.wordpress.com/)
[BenefitFocus](http://engineering.benefitfocus.com/)
[Bitly](http://word.bitly.com/)
[Bittorrent](http://engineering.bittorrent.com/)
[Cerner](http://engineering.cerner.com/)
[Chartbeat](http://engineering.chartbeat.com/)
[Cloudera](http://blog.cloudera.com/blog/)
[Cloudflare](http://blog.cloudflare.com/)
[Docker](http://blog.docker.com/category/engineering/)
[Dropbox](https://blogs.dropbox.com/tech/)
[Ebay](http://www.ebaytechblog.com/)
[Etsy](https://codeascraft.com/)
[Eventbrite](https://engineering.eventbrite.com/)
[Facebook](https://code.facebook.com/posts/)
[Flickr](http://code.flickr.net/)
[Fiftythree](http://making.fiftythree.com/)
[Flipboard](http://engineering.flipboard.com/)
[Foursquare](http://engineering.foursquare.com/)
[Github](http://githubengineering.com/)
[Gnip](https://engineering.gnip.com/)
[GoSquared](https://engineering.gosquared.com/)
[Grouper](http://eng.joingrouper.com/)
[Groupon](https://engineering.groupon.com/)
[Harry's](http://engineering.harrys.com/)
[Heroku](http://engineering.heroku.com/)
[Honeybadger](http://blog.honeybadger.io/)
[Indeed](http://engineering.indeed.com/blog/)
[Instagram](http://instagram-engineering.tumblr.com/)
[Intent](http://engineering.intenthq.com/)
[Linkedin](https://engineering.linkedin.com/blog)
[Livechat](http://developers.livechatinc.com/blog/)
[Medallia](http://engineering.medallia.com/blog/)
[Monetate](http://engineering.monetate.com/)
[Netflix](http://techblog.netflix.com/)
[Oyster](http://tech.oyster.com/)
[Paypal](https://www.paypal-engineering.com/)
[Pinterest](http://engineering.pinterest.com/)
[Prezi](https://medium.com/prezi-engineering)
[Quora](http://engineering.quora.com/)
[Rightscale](http://eng.rightscale.com/)
[Salesforce](https://developer.salesforce.com/blogs/engineering/)
[Shopify](http://www.shopify.com/technology)
[Simple](https://www.simple.com/engineering)
[Slideshare](http://engineering.slideshare.net/)
[Songkick](http://devblog.songkick.com/)
[Soundcloud](https://developers.soundcloud.com/blog/)
[Spotify](https://labs.spotify.com/)
[Square](https://corner.squareup.com/)
[Strava](http://engineering.strava.com/)
[Tumblr](http://engineering.tumblr.com/)
[Twitter](https://blog.twitter.com/engineering)
[Twilio](https://www.twilio.com/engineering/)
[Thumbtack](https://www.thumbtack.com/engineering/)
[Wayfair](http://engineering.wayfair.com/)
[Wealthfront](http://eng.wealthfront.com/)
[Webengage](http://engineering.webengage.com/)
[Yahoo](http://yahooeng.tumblr.com/)
[Yammer](http://engineeringblog.yelp.com/)
[Yelp](http://engineeringblog.yelp.com/)
[Zenpayroll](http://engineering.zenpayroll.com/)
[Zillow](https://engineering.zillow.com/)

## Other Useful Resources:
* How to Ace a Systems Design Interview - https://www.palantir.com/2011/10/how-to-ace-a-systems-design-interview/
* HighScalability Blog - http://highscalability.com/
* Distributed Systems - http://book.mixu.net/distsys/single-page.html
* Distributed Deep Dive - https://ably.com/blog/introducing-distributed-deep-dive-interview-series-by-ably-realtime
* Architecture for microservices by Microsoft - https://docs.microsoft.com/en-us/dotnet/architecture/microservices/

## Golden Rules to Remember
1. If we are dealing with a read-heavy system, it's good to consider using a Cache.

2. If we need low latency in the system, it's good to consider using a Cache & CDN.

3. If we are dealing with a write-heavy system, it's good to use a Message Queue for async processing or append-only logs.

4. If we need the system to be ACID compliant, we should go for an RDBMS or SQL database.

5. If data is unstructured & doesn't require ACID properties, we should go for a NoSQL database.

6. If the system stores large binary data such as videos, images, files etc., we should go for Blob/Object storage.

7. If the system requires complex/heavy pre-computation like a news feed, we should use a Message Queue & Cache.

8. If the system requires searching data in high volume, we should consider using a search index, tries, or a search engine like Elasticsearch.

9. If the system needs to scale a SQL database, we should consider using Database Sharding & Partitioning.

10. If the system requires High Availability, Performance, & Throughput, we should consider using a Load Balancer.

11. If the system requires faster data delivery globally, reliability, high availability, & performance, we should consider using a CDN.

12. If the system has data with nodes, edges, and relationships like friend lists & road connections, we should consider using a Graph Database.

13. If the system needs scaling of various components like servers, databases, etc., we should consider using Horizontal Scaling.

14. If the system requires high-performing database queries, we should use Database Indexes.

15. If the system requires bulk job processing, we should consider using Batch Processing & Message Queues.

16. If the system requires reducing server load and mitigating DoS attacks, we should use a Rate Limiter.

17. If the system has microservices, we should consider using an API Gateway (Authentication, SSL Termination, Routing etc.).

18. If the system has a single point of failure, we should implement Redundancy in that component.

19. If the system needs to be fault-tolerant & durable, we should implement Data Replication (creating multiple copies of data on different servers).

20. If the system needs fast, bi-directional user-to-user communication, we should use WebSockets.

21. If the system needs the ability to detect failures in a distributed system, we should implement a Heartbeat.

22. If the system needs to ensure data integrity, we should use a checksum algorithm.

23. If the system needs to scale servers efficiently as nodes are added or removed, with no hotspots, we should implement Consistent Hashing.

24. If the system needs to transfer data between various servers in a decentralized way, we should go for a Gossip Protocol.

25. If the system needs to deal with location data like maps or nearby resources, we should consider using a Quadtree, Geohash etc.

26. Avoid using any specific technology names such as Kafka, S3, or EC2. Try to use more generic names like message queues, object storage etc.

27. If High Availability is required in the system, mention that (per the CAP theorem) it cannot also guarantee strong consistency during a network partition. Eventual Consistency is the usual choice.

28. If asked how a domain name typed into the browser is resolved to an IP address, sketch or mention DNS (Domain Name System).

29. If asked how to limit the amount of data returned by a network request (YouTube search, trending videos etc.), one way is to implement Pagination, which limits the response size.

30. If asked which policy you would use to evict entries from a Cache, the most commonly asked eviction policy is LRU (Least Recently Used). Prepare its data structure and implementation.
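Rule 30 suggests preparing an LRU implementation. The following is a minimal sketch, not a production cache: it assumes a single-threaded caller, and the class name `LRUCache` and its `get`/`put` methods are illustrative names chosen here. It leans on Python's `collections.OrderedDict`, which keeps insertion order and can move a key to the end in constant time; interviews often ask for the equivalent hand-built hash map plus doubly linked list.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key (illustrative)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: OrderedDict = OrderedDict()  # oldest entry first

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # over capacity, so "b" is evicted
```

Both `get` and `put` run in constant time on average, which is the property interviewers usually look for.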
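Rule 23 mentions Consistent Hashing. A small sketch of a hash ring with virtual nodes may make the idea concrete. Everything named here (`ConsistentHashRing`, the `vnodes` parameter, the `cache-a` style node names) is an assumption for illustration, and SHA-256 is used only as a convenient, evenly spreading hash.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring = []   # sorted ring positions
        self._owner = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            if pos in self._owner:
                continue  # position already taken (practically never happens)
            bisect.insort(self._ring, pos)
            self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        for i in range(self.vnodes):
            pos = _hash(f"{node}#{i}")
            if self._owner.get(pos) == node:
                del self._owner[pos]
                self._ring.remove(pos)

    def get_node(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._ring, _hash(key)) % len(self._ring)
        return self._owner[self._ring[idx]]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.remove_node("cache-b")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Only the keys that lived on the removed node change owner; every other key stays where it was, which is the point of the technique compared with `hash(key) % number_of_nodes`.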
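Rule 16 recommends a Rate Limiter. One common way to build one is a token bucket; the sketch below is a single-process illustration under that assumption, not the only algorithm (fixed or sliding windows are alternatives). The class name `TokenBucket` and the injectable `clock` parameter are hypothetical conveniences, the latter added so the behavior can be checked without sleeping.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity` (illustrative)."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity  # start full so short bursts are allowed
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and consume tokens if the request fits, else False."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the time that passed, never exceeding the bucket size.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A bucket created with `TokenBucket(rate=1.0, capacity=2.0)` admits a burst of two requests and then roughly one request per second. In a distributed deployment the counters would have to live in shared storage, which this sketch does not attempt.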
Credit: https://leetcode.com/discuss/interview-question/system-design/3616948/golden-rules-to-answer-in-a-system-design-interview

## System Design Interview Approach Template
### THINGS TO CONSIDER [5 min]
    (1) Features
    (2) API
    (3) Availability
    (4) Latency
    (5) Scalability
    (6) Durability
    (7) Class Diagram
    (8) Security and Privacy
    (9) Cost-effective

### FEATURE EXPECTATIONS [5 min]
    (1) Use cases
    (2) Scenarios that will not be covered
    (3) Who will use
    (4) How many will use
    (5) Usage patterns

### ESTIMATIONS [5 min]
    (1) Throughput (QPS for read and write queries)
    (2) Latency expected from the system (for read and write queries)
    (3) Read/Write ratio
    (4) Traffic estimates
            - Write (QPS, Volume of data)
            - Read  (QPS, Volume of data)
    (5) Storage estimates
    (6) Memory estimates
            - If we are using a cache, what kind of data do we want to store in it?
            - How much RAM and how many machines do we need to achieve this?
            - Amount of data you want to store on disk/SSD
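The estimation steps above can be worked through as plain arithmetic. The sketch below uses made-up inputs (10 million daily users, 2 writes per user per day, a 100:1 read/write ratio, 500-byte objects, 5 years of retention); all of these numbers are assumptions to replace with the figures agreed in the interview. The cache size follows the common rule of thumb of keeping the hottest 20% of a day's reads in memory, which is a heuristic, not a guarantee.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Hypothetical inputs, chosen only for illustration.
daily_active_users = 10_000_000
writes_per_user_per_day = 2
read_write_ratio = 100      # reads per write
avg_object_bytes = 500
retention_years = 5
cache_hot_fraction = 0.2    # heuristic: cache the hottest 20% of daily reads

# Traffic estimates
writes_per_day = daily_active_users * writes_per_user_per_day
reads_per_day = writes_per_day * read_write_ratio
write_qps = writes_per_day / SECONDS_PER_DAY   # about 231 writes/s
read_qps = reads_per_day / SECONDS_PER_DAY     # about 23,148 reads/s

# Storage estimate: every write is kept for the whole retention period.
storage_bytes = writes_per_day * avg_object_bytes * 365 * retention_years

# Memory estimate for the cache.
cache_bytes = reads_per_day * cache_hot_fraction * avg_object_bytes

print(f"write QPS ~ {write_qps:,.0f}, read QPS ~ {read_qps:,.0f}")
print(f"storage ~ {storage_bytes / 1e12:.2f} TB, cache ~ {cache_bytes / 1e9:.0f} GB")
```

With these inputs the storage comes to 18.25 TB and the cache to about 200 GB, so the cache would need to be spread over several machines if each offers, say, 64 GB of RAM.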
### DESIGN GOALS [5 min]
    (1) Latency and Throughput requirements
    (2) Consistency vs Availability  [Weak/strong/eventual => consistency | Failover/replication => availability]

### HIGH LEVEL DESIGN [5-10 min]
    (1) APIs for Read/Write scenarios for crucial components
    (2) Database schema
    (3) Basic algorithm
    (4) High level design for Read/Write scenario

### DEEP DIVE [15-20 min]
    (1) Scaling the algorithm
    (2) Scaling individual components:
            -> Availability, Consistency and Scale story for each component
            -> Consistency and availability patterns
    #### Think about the following components, how they would fit in and how they would help
            a) DNS
            b) CDN [Push vs Pull]
            c) Load Balancers [Active-Passive, Active-Active, Layer 4, Layer 7]
            d) Reverse Proxy
            e) Application layer scaling [Microservices, Service Discovery]
            f) DB [RDBMS, NoSQL]
                    > RDBMS
                        >> Master-slave, Master-master, Federation, Sharding, Denormalization, SQL Tuning
                    > NoSQL
                        >> Key-Value, Wide-Column, Graph, Document
                            Fast-lookups:
                            -------------
                                >>> RAM  [Bounded size] => Redis, Memcached
                                >>> AP [Unbounded size] => Cassandra, RIAK, Voldemort
                                >>> CP [Unbounded size] => HBase, MongoDB, Couchbase, DynamoDB
            g) Caches
                    > Client caching, CDN caching, Webserver caching, Database caching, Application caching, Cache @Query level, Cache @Object level
                    > Cache update strategies:
                            >> Cache aside
                            >> Write through
                            >> Write behind
                            >> Refresh ahead
            h) Asynchronism
                    > Message queues
                    > Task queues
                    > Back pressure
            i) Communication
                    > TCP
                    > UDP
                    > REST
                    > RPC
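Of the caching approaches listed under (g), cache aside is the one most often sketched on a whiteboard. The example below is a minimal in-memory illustration: plain dictionaries stand in for the database and the cache, and the names `CacheAsideStore`, `read`, `write`, and `db_reads` are hypothetical. The application checks the cache first, falls back to the database on a miss, and invalidates the cached entry on every write.

```python
class CacheAsideStore:
    """Cache-aside access pattern over a dict-backed 'database' (illustrative)."""

    def __init__(self, database: dict) -> None:
        self.database = database  # stand-in for the real system of record
        self.cache = {}           # stand-in for an in-memory cache
        self.db_reads = 0         # counts how often the database was hit

    def read(self, key):
        if key in self.cache:     # cache hit: no database access
            return self.cache[key]
        self.db_reads += 1        # cache miss: load from the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value  # populate the cache for later reads
        return value

    def write(self, key, value) -> None:
        self.database[key] = value
        self.cache.pop(key, None)  # invalidate so the next read reloads fresh data


store = CacheAsideStore({"user:1": "Ada"})
first = store.read("user:1")   # miss, goes to the database
second = store.read("user:1")  # hit, served from the cache
store.write("user:1", "Grace")
third = store.read("user:1")   # miss again, because the write invalidated the entry
```

Write through and write behind differ in that the cache itself sits in the write path; here the application owns both steps, which keeps the cache optional but allows a brief window of stale reads between concurrent readers and writers.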
### JUSTIFY [5 min]
    (1) Throughput of each layer
    (2) Latency caused between each layer
    (3) Overall latency justification
#### More Resources:
* 25 Interview Questions
* High-Scalability
* Hired In Tech
* workat.tech
* System Design
* SYSTEM DESIGN PREPARATION
* The System Design Primer
* Gaurav Sen Playlist
* Narendra L - Tech Dummies
* low-level-design-primer
* [System Design Interesting Reads](https://docs.google.com/document/d/1iKk6vJbWtI02AllnIEZTrKWQb4dT2QthJdRt05vq6Hw/edit)
* [Real Time Analytics on Big Data Architecture](https://docs.microsoft.com/en-us/azure/architecture/solution-ideas/articles/real-time-analytics)
* [how-i-finally-got-some-awesome-offers](https://leetcode.com/discuss/interview-experience/1172461/how-i-finally-got-some-awesome-offers)
* [Pragmatic Programming Techniques](http://horicky.blogspot.com/2010/10/scalable-system-design-patterns.html)
* [Multithreading](http://www.albahari.com/threading/)
* [Helpful list of LeetCode Posts on System Design](https://leetcode.com/discuss/interview-question/1140451/helpful-list-of-leetcode-posts-on-system-design-at-facebook-google-amazon-uber-microsoft)
* [Booking.com interview exp](https://leetcode.com/discuss/interview-experience/1184565/Booking-or-Amsterdam-or-Senior-Java-Developer-or-Apr-2021-Offer)

#### Credit:
* https://leetcode.com/discuss/career/229177/My-System-Design-Template
* https://www.youtube.com/watch?time_continue=1&v=UzLMhqg3_Wc&feature=emb_logo