├── .editorconfig
├── .travis.yml
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── algorithms
│   ├── README.md
│   ├── array.md
│   ├── bit-manipulation.md
│   ├── dynamic-programming.md
│   ├── geometry.md
│   ├── graph.md
│   ├── hash-table.md
│   ├── heap.md
│   ├── interval.md
│   ├── linked-list.md
│   ├── math.md
│   ├── matrix.md
│   ├── oop.md
│   ├── permutation.md
│   ├── queue.md
│   ├── sorting-searching.md
│   ├── stack.md
│   ├── string.md
│   ├── topics.md
│   └── tree.md
├── assets
│   └── book.svg
├── design
│   ├── README.md
│   ├── collaborative-editor.md
│   ├── news-feed.md
│   └── search-engine.md
├── domain
│   ├── async-loading
│   │   └── index.html
│   ├── databases.md
│   ├── networking.md
│   ├── pagination-sorting
│   │   ├── data.js
│   │   └── index.html
│   ├── security.md
│   ├── snake-game
│   │   └── snake-game.md
│   ├── software-engineering.md
│   └── tic-tac-toe
│       └── index.html
├── front-end
│   └── README.md
├── interviewers
│   └── basics.md
├── non-technical
│   ├── behavioral.md
│   ├── cover-letter.md
│   ├── interview-formats.md
│   ├── negotiation.md
│   ├── psychological-tricks.md
│   ├── questions-to-ask.md
│   ├── resume.md
│   └── self-introduction.md
├── preparing
│   ├── README.md
│   └── cheatsheet.md
└── utilities
    ├── javascript
    │   ├── binToInt.js
    │   ├── binarySearch.js
    │   ├── deepEqual.js
    │   ├── graphTopoSort.js
    │   ├── intToBin.js
    │   ├── intervalsIntersect.js
    │   ├── intervalsMerge.js
    │   ├── isSubsequence.js
    │   ├── matrixClone.js
    │   ├── matrixTranspose.js
    │   ├── matrixTraverse.js
    │   ├── mergeSort.js
    │   ├── treeEqual.js
    │   └── treeMirror.js
    └── python
        ├── binary_search.py
        ├── char_prime_map.py
        ├── graph_dfs.py
        ├── graph_topo_sort.py
        ├── heap.py
        ├── is_subsequence.py
        ├── linked_list.py
        ├── quick_select.py
        ├── rabin_karp_hash.py
        ├── tree_equal.py
        ├── tree_mirror.py
        ├── tree_traversal.py
        ├── trie.py
        └── union_find.py
/.editorconfig:
--------------------------------------------------------------------------------
1 | root = true
2 |
3 | [*]
4 | end_of_line = lf
5 | insert_final_newline = true
6 | trim_trailing_whitespace = true
7 |
8 | [*.{js,py}]
9 | charset = utf-8
10 | indent_style = space
11 | indent_size = 4
12 |
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
1 | install:
2 | - gem install awesome_bot
3 |
4 | script:
5 | - awesome_bot **/*.md --allow-dupe --allow-redirect --allow 429 --skip-save-results
6 |
--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | # Code of Conduct
2 |
3 | We have adopted the same Code of Conduct as Facebook that we expect project participants to adhere to. Please read [the full text](https://code.facebook.com/codeofconduct) so that you can understand what actions will and will not be tolerated.
4 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 | ## Contributing
2 |
3 | When contributing to this repository, if it is a non-trivial change, please first discuss the change you wish to make via creating an issue in this repository.
4 |
5 | As much as possible, try to follow the existing format of markdown and code. JavaScript code should adopt [Standard style](https://standardjs.com/).
6 |
7 | Please note that we have a Code of Conduct; please follow it in all your interactions with the project.
8 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2017-Present Yangshun Tay
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
13 |
14 | ## What is this?
15 |
16 | Carefully curated content to help you ace your next technical interview, with a focus on algorithms. System design questions are in-progress. Besides the usual algorithm questions, other **awesome** stuff includes:
17 |
18 | * [How to prepare](preparing) for coding interviews
19 | * [Interview Cheatsheet](preparing/cheatsheet.md) - Straight-to-the-point Do's and Don'ts 🆕
20 | * [Algorithm tips and the best practice questions](algorithms) categorized by topic
21 | * ["Front-end Job Interview Questions" answers](https://github.com/yangshun/front-end-interview-handbook)
22 | * [Interview formats](non-technical/interview-formats.md) of the top tech companies
23 | * [Behavioral questions](non-technical/behavioral.md) categorized by companies
24 | * [Good questions to ask your interviewers](non-technical/questions-to-ask.md) at the end of the interviews
25 | * [Helpful resume tips](non-technical/resume.md) to get your resume noticed and the Do's and Don'ts
26 |
27 | This handbook is pretty new and help from you in contributing content would be very much appreciated!
28 |
29 | ## Why do I want this?
30 |
31 | This repository has _practical_ content that covers all phases of a technical interview, from applying for a job to passing the interviews to offer negotiation. Technically competent candidates might still find the non-technical content helpful as well.
32 |
33 | ## Who is this for?
34 |
35 | Anybody who wants to land a job at a tech company but is new to technical interviews, seasoned engineers who have not been on the other side of the interviewing table in a while and want to get back into the game, or anyone who wants to be better at technical interviewing.
36 |
37 | ## How is this repository different?
38 |
39 | There are many awesome books, like [Cracking the Coding Interview](http://www.crackingthecodinginterview.com/), and interview-related repositories out there on GitHub. What makes this repository different? Many existing interview repositories contain mainly links to external resources, whereas this repository contains top-quality curated content directly for your consumption.
40 |
41 | Also, existing resources focus mainly on algorithm questions and lack coverage for more domain-specific and non-technical questions. This handbook aims to cover content beyond the typical algorithmic coding questions. 😎
42 |
43 | ## Looking for Front End content?
44 |
45 | Front end-related content has been extracted into a separate repository - [Front End Interview Handbook](https://github.com/yangshun/front-end-interview-handbook).
46 |
47 | ## Contents
48 |
49 | * **[Preparing for a Coding Interview](preparing)**
50 | * [Interview cheatsheet](preparing/cheatsheet.md) - Straight-to-the-point Do's and Don'ts
51 | * **[Algorithm Questions](algorithms)** - Questions categorized by topics
52 | * **[Design Questions](design)**
53 | * **[Front-end Job Interview Questions and Answers](https://github.com/yangshun/front-end-interview-handbook) 🔥⭐** - Answers to the famous "Front-end Job Interview Questions"
54 | * **[Non-Technical Tips](non-technical)** - Random non-technical tips that cover behavioral and psychological aspects, interview formats and "Do you have any questions for me?"
55 | * [Resume Tips](non-technical/resume.md)
56 | * [Behavioral Questions](non-technical/behavioral.md)
57 | * [Interview Formats](non-technical/interview-formats.md)
58 | * [Psychological Tricks](non-technical/psychological-tricks.md)
59 | * [Questions to Ask](non-technical/questions-to-ask.md)
60 | * [Negotiation Tips](non-technical/negotiation.md)
61 | * **[Utilities](utilities)** - Snippets of algorithms/code that will help in coding questions
62 | * **UPDATE** - Check out [Lago](https://github.com/yangshun/lago), which is a Data Structures and Algorithms library that contains more high-quality implementations with 100% test coverage.
63 |
64 | ## Related
65 |
66 | If you are interested in how data structures are implemented, check out [Lago](https://github.com/yangshun/lago), a Data Structures and Algorithms library for JavaScript. It is pretty much still WIP but I intend to make it into a library that is able to be used in production and also a reference resource for revising Data Structures and Algorithms.
67 |
68 | ## Contributing
69 |
70 | There are no formal contributing guidelines at the moment as things are still in flux and we might find a better approach to structure content as we go along. You are welcome to contribute whatever you think will be helpful to fellow engineers. If you would like to contribute content for different domains, feel free to create an issue or submit a pull request and we can discuss further.
71 |
72 | ## Maintainers
73 |
74 | * [Yangshun Tay](https://github.com/yangshun)
75 | * [Louie Tan](https://github.com/louietyj)
76 |
--------------------------------------------------------------------------------
/algorithms/array.md:
--------------------------------------------------------------------------------
1 | Arrays
2 | ==
3 |
4 | - In an array of arrays, e.g. given `[[], [1, 2, 3], [4, 5], [], [], [6, 7], [8], [9, 10], [], []]`, print: `1, 2, 3, 4, 5, 6, 7, 8, 9, 10`.
5 | - Implement an iterator that supports `hasNext()`, `next()` and `remove()` methods.
6 | - Given a list of item prices, find all possible combinations of items that sum a particular value `K`.
7 | - Paginate an array with constraints, such as skipping certain items.
8 | - Implement a circular buffer using an array.
9 | - Given an array of arrays, sort them in ascending order.
10 | - Given an array of integers, print out a histogram using the said array; include a base layer (all stars)
11 | - E.g. `[5, 4, 0, 3, 4, 1]`
12 |
13 | ```
14 | *
15 | ** *
16 | ** **
17 | ** **
18 | ** ***
19 | ******
20 | ```
21 |
22 | - Given an array and an index, find the product of the elements of the array except the element at that index.
23 | - Given a set of rectangles represented by a height and an interval along the y-axis, determine the size of its union.
24 | - Given 2 separate arrays, write a method to find the values that exist in both arrays and return them.
25 | - Given an array of integers find whether there is a sub-sequence that sums to 0 and return it.
26 | - E.g. `[1, 2, -3, 1]` => `[1, 2, -3]` or `[2, -3, 1]`.
27 | - Given an input array and another array that describes a new index for each element, mutate the input array so that each element ends up in their new index. Discuss the runtime of the algorithm and how you can be sure there would not be any infinite loops.
28 | - Given an array of non-negative numbers, find a continuous subarray that sums to S.
29 | - [Source](http://blog.gainlo.co/index.php/2016/06/01/subarray-with-given-sum/).
30 | - Given an array of numbers list out all triplets that sum to 0. Do so with a running time of less than O(n^3).
31 | - [Source](http://blog.gainlo.co/index.php/2016/07/19/3sum/).
32 | - Given an array of numbers list out all quadruplets that sum to 0. Do so with a running time of less than O(n^4).
33 | - Given an array of integers, move all the zeroes to the end while preserving the order of the other elements. You have to do it in-place and are not allowed to use any extra storage.
34 | - Given an array of integers, find the subarray with the largest sum. Can you do it in linear time?
35 | - Maximum subarray sum problem.
36 | - You have an array with the heights of an island (at point 1, point 2 etc) and you want to know how much water would remain on this island (without flowing away).
37 | - Trapping rain water question.
38 | - Given an array containing only digits `0-9`, add one to the number and return the array.
39 | - E.g. Given `[1, 4, 2, 1]` which represents `1421`, return `[1, 4, 2, 2]` which represents `1422`.
40 | - Find the second maximum value in an array.
41 | - Given an array, find the longest arithmetic progression.
42 | - Rotate an array by an offset of k.
43 | - Remove duplicates in an unsorted array where the duplicates are at a distance of k or less from each other.
44 | - Given an unsorted list of integers, return true if the list contains any duplicates within k indices of each element. Do it faster than O(n^2).
45 | - Given an unsorted list of integers, return true if the list contains any fuzzy duplicates within k indices of each element. A fuzzy duplicate is another integer within d of the original integer. Do it faster than O(n^2).
46 | - E.g. If d = 4, then 6 is a fuzzy duplicate of 3 but 8 is not.
47 | - Say you have an unordered list of numbers ranging from 1 to n, and one of the numbers is removed, how do you find that number? What if two numbers are removed?
48 | - Given an array of strings, find the duplicated elements.
49 | - [Source](http://blog.gainlo.co/index.php/2016/05/10/duplicate-elements-of-an-array/).
50 | - Given an array of integers, find a maximum sum of non-adjacent elements.
51 | - E.g. `[1, 0, 3, 9, 2]` should return `10 (1 + 9)`.
52 | - [Source](http://blog.gainlo.co/index.php/2016/12/02/uber-interview-question-maximum-sum-non-adjacent-elements/)
53 | - Given an array of integers, modify the array by moving all the zeros to the end (right side). The order of other elements doesn't matter.
54 | - E.g. `[1, 2, 0, 3, 0, 1, 2]`, the program can output `[1, 2, 3, 1, 2, 0, 0]`.
55 | - [Source](http://blog.gainlo.co/index.php/2016/11/18/uber-interview-question-move-zeroes/).
56 | - Given an array, return the length of the longest increasing contiguous subarray.
57 | - E.g., `[1, 3, 2, 3, 4, 8, 7, 9]`, should return `4` because the longest increasing array is `[2, 3, 4, 8]`.
58 | - [Source](http://blog.gainlo.co/index.php/2017/02/02/uber-interview-questions-longest-increasing-subarray/).
59 | - Given an array of integers where every value appears twice except one, find the single, non-repeating value. Follow up: do so with O(1) space.
60 | - E.g., `[2, 5, 3, 2, 1, 3, 4, 5, 1]` returns 4, because it is the only value that appears in the array only once.
61 |
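For the last question above (every value appears twice except one), a minimal sketch of the O(1)-space follow-up is to XOR all the values together, since `x ^ x == 0` cancels each pair. The function name `single_value` is only illustrative.

```python
from functools import reduce
from operator import xor


def single_value(nums):
    """Return the value that appears once when every other value appears twice."""
    # Pairs cancel out under XOR, so only the unpaired value survives.
    return reduce(xor, nums, 0)


print(single_value([2, 5, 3, 2, 1, 3, 4, 5, 1]))  # 4
```

This runs in O(n) time and keeps a single accumulator, but it only works when exactly one value is unpaired.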
--------------------------------------------------------------------------------
/algorithms/bit-manipulation.md:
--------------------------------------------------------------------------------
1 | Bit Manipulation
2 | ==
3 |
4 | - How do you verify if an integer is a power of 2?
5 | - Write a program to print the binary representation of an integer.
6 | - Write a program to print out the number of 1 bits in a given integer.
7 | - Write a program to determine the largest possible integer using the same number of 1 bits in a given number.
8 |
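A small sketch for the first and third questions above, assuming non-negative Python integers. Both rely on the fact that `n & (n - 1)` clears the lowest set bit of `n`; the function names are illustrative.

```python
def is_power_of_two(n: int) -> bool:
    # A power of two has exactly one set bit, so clearing it leaves zero.
    return n > 0 and (n & (n - 1)) == 0


def count_one_bits(n: int) -> int:
    # Assumes n >= 0. Each iteration removes the lowest set bit.
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count


print(is_power_of_two(16), is_power_of_two(12))  # True False
print(count_one_bits(0b1011))  # 3
```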
--------------------------------------------------------------------------------
/algorithms/dynamic-programming.md:
--------------------------------------------------------------------------------
1 | Dynamic Programming
2 | ==
3 |
4 | - Given a flight itinerary consisting of starting city, destination city, and ticket price (2D list) - find the optimal price flight path to get from start to destination. (A variation of Dynamic Programming Shortest Path)
5 | - Given some coin denominations and a target value `M`, return the coins combination with the minimum number of coins.
6 | - Time complexity: `O(MN)`, where N is the number of distinct type of coins.
7 | - Space complexity: `O(M)`.
8 | - Given a set of numbers in an array which represent a number of consecutive days of Airbnb reservation requested, as a host, pick the sequence which maximizes the number of days of occupancy, at the same time, leaving at least a 1-day gap in-between bookings for cleaning.
9 | - The problem reduces to finding the maximum sum of non-consecutive array elements.
10 | - E.g.
11 | ~~~
12 | // [5, 1, 1, 5] => 10
13 | The above array would represent an example booking period as follows -
14 | // Dec 1 - 5
15 | // Dec 5 - 6
16 | // Dec 6 - 7
17 | // Dec 7 - 12
18 |
19 | The answer would be to pick Dec 1-5 (5 days) and then pick Dec 7-12 for a total of 10 days of
20 | occupancy, at the same time, leaving at least 1-day gap for cleaning between reservations.
21 |
22 | Similarly,
23 | // [3, 6, 4] => 7
24 | // [4, 10, 3, 1, 5] => 15
25 | ~~~
26 | - Given a list of denominations (e.g., `[1, 2, 5]` means you have coins worth $1, $2, and $5) and a target number `k`, find all possible combinations, if any, of coins in the given denominations that add up to `k`. You can use coins of the same denomination more than once.
27 |
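Two of the questions above can be sketched briefly, under stated assumptions. `min_coins` is a bottom-up table over values `0..M`, which matches the `O(MN)` time and `O(M)` space noted above. `max_non_adjacent_sum` is the reduction mentioned for the Airbnb question and assumes non-negative inputs. Both function names are illustrative.

```python
def min_coins(denominations, target):
    """Return a fewest-coins combination summing to target, or None if impossible."""
    INF = float("inf")
    best = [0] + [INF] * target      # best[v]: fewest coins that sum to v
    last = [None] * (target + 1)     # last[v]: a coin used to reach best[v]
    for value in range(1, target + 1):
        for coin in denominations:
            if coin <= value and best[value - coin] + 1 < best[value]:
                best[value] = best[value - coin] + 1
                last[value] = coin
    if best[target] == INF:
        return None
    combo = []
    while target > 0:                # walk back through the recorded coins
        combo.append(last[target])
        target -= last[target]
    return combo


def max_non_adjacent_sum(nums):
    """Largest sum of elements with no two adjacent (assumes non-negative values)."""
    include, exclude = 0, 0          # best sums that do / do not use the previous element
    for n in nums:
        include, exclude = exclude + n, max(include, exclude)
    return max(include, exclude)


print(sorted(min_coins([1, 2, 5], 11)))       # [1, 5, 5]
print(max_non_adjacent_sum([5, 1, 1, 5]))     # 10
print(max_non_adjacent_sum([4, 10, 3, 1, 5])) # 15
```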
--------------------------------------------------------------------------------
/algorithms/geometry.md:
--------------------------------------------------------------------------------
1 | Geometry
2 | ==
3 |
4 | - You have a plane with lots of rectangles on it, find out how many of them intersect.
5 | - Which data structure would you use to query the k-nearest points of a set on a 2D plane?
6 | - Given many points, find k points that are closest to the origin.
7 | - How would you triangulate a polygon?
8 |
--------------------------------------------------------------------------------
/algorithms/graph.md:
--------------------------------------------------------------------------------
1 | Graph
2 | ==
3 |
4 | - Given a list of sorted words from an alien dictionary, find the order of the alphabet.
5 | - Alien Dictionary Topological Sort question.
6 | - Find if a given string matches any path in a labeled graph. A path may contain cycles.
7 | - Given a bipartite graph, separate the vertices into two sets.
8 | - You are a thief trying to sneak across a rectangular 100 x 100m field. There are alarms placed on the fields and they each have a circular sensing radius which will trigger if anyone steps into it. Each alarm has its own radius. Determine if you can get from one end of the field to the other end.
9 | - Given a graph and two nodes, determine if there exists a path between them.
10 | - Determine if a cycle exists in the graph.
11 |
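For the cycle question above, one common approach is a depth-first search with three node states. This is a sketch that assumes a directed graph given as a dictionary of adjacency lists; it is recursive, so very deep graphs would need an iterative version.

```python
def has_cycle(graph):
    """Detect a cycle in a directed graph given as {node: [neighbours]}."""
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}

    def visit(node):
        state[node] = IN_PROGRESS
        for nxt in graph.get(node, ()):
            nxt_state = state.get(nxt, UNSEEN)
            if nxt_state == IN_PROGRESS:
                return True          # back edge to a node on the current path
            if nxt_state == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state.get(n, UNSEEN) == UNSEEN and visit(n) for n in graph)


print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))       # True
print(has_cycle({"a": ["b", "c"], "b": ["c"], "c": []}))     # False
```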
--------------------------------------------------------------------------------
/algorithms/hash-table.md:
--------------------------------------------------------------------------------
1 | Hash Table
2 | ==
3 |
4 | - Describe an implementation of a least-used cache and its big-O complexity.
5 | - A question involving an API's integration with hash map where the buckets of hash map are made up of linked lists.
6 | - Implement data structure `Map` storing pairs of integers (key, value) and define following member functions in O(1) runtime: `void insert(key, value)`, `void delete(key)`, `int get(key)`, `int getRandomKey()`.
7 | - [Source](http://blog.gainlo.co/index.php/2016/08/14/uber-interview-question-map-implementation/).
8 |
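One possible shape for the `Map` question above, as a sketch rather than a reference answer: pair a dictionary with a dense list of keys so a random key can be picked by index. Deletion swaps the last key into the freed slot. The class and method names are illustrative, and the O(1) bounds are average-case, relying on Python's `dict`.

```python
import random


class RandomizedMap:
    def __init__(self):
        self._values = {}  # key -> value
        self._index = {}   # key -> position of the key in self._keys
        self._keys = []    # dense list of keys, used for random selection

    def insert(self, key, value):
        if key not in self._values:
            self._index[key] = len(self._keys)
            self._keys.append(key)
        self._values[key] = value

    def delete(self, key):
        if key not in self._values:
            return
        pos = self._index.pop(key)
        last = self._keys.pop()
        if last != key:
            # Move the last key into the hole left by the deleted key.
            self._keys[pos] = last
            self._index[last] = pos
        del self._values[key]

    def get(self, key):
        return self._values[key]

    def get_random_key(self):
        return random.choice(self._keys)
```

For example, after inserting keys 1, 2 and 3 and deleting 1, `get_random_key()` returns either 2 or 3.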
--------------------------------------------------------------------------------
/algorithms/heap.md:
--------------------------------------------------------------------------------
1 | Heap
2 | ==
3 |
4 | - Merge `K` sorted lists together into a single list.
5 | - Given a stream of integers, write an efficient function that returns the median value of the integers.
6 |
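For merging `K` sorted lists, a min-heap holding one candidate per list is a typical approach. The sketch below uses the standard-library `heapq` module; the function name is illustrative. With N total elements it takes O(N log K) time.

```python
import heapq


def merge_k_sorted(lists):
    """Merge already-sorted lists into one sorted list."""
    # Each heap entry is (value, list index, position in that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, i, j = heapq.heappop(heap)
        merged.append(value)
        if j + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][j + 1], i, j + 1))
    return merged


print(merge_k_sorted([[1, 4, 7], [2, 5], [], [3, 6, 8]]))
# [1, 2, 3, 4, 5, 6, 7, 8]
```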
--------------------------------------------------------------------------------
/algorithms/interval.md:
--------------------------------------------------------------------------------
1 | Interval
2 | ==
3 |
4 | - Given a list of schedules, provide a list of times that are available for a meeting.
5 | ```
6 | [
7 | [[4,5], [6,10], [12,14]],
8 | [[4,5], [5,9], [13,16]],
9 | [[11,14]]
10 | ]
11 |
12 | Example Output:
13 | [[0,4], [11,12], [16,23]]
14 | ```
15 | - You have a number of meetings (with their start and end times). You need to schedule them using the minimum number of rooms. Return the list of meetings in every room.
16 | - Interval ranges:
17 | - Given 2 interval ranges, create a function to tell me if these ranges intersect. Both start and end are inclusive: `[start, end]`
18 | - E.g. `[1, 4]` and `[5, 6]` => `false`
19 | - E.g. `[1, 4]` and `[3, 6]` => `true`
20 | - Given 2 interval ranges that intersect, now create a function to merge the 2 ranges into a single continuous range.
21 | - E.g. `[1, 4]` and `[3, 6]` => `[1, 6]`
22 | - Now create a function that takes a group of unsorted, unorganized intervals, merge any intervals that intersect and sort them. The result should be a group of sorted, non-intersecting intervals.
23 | - Now create a function to merge a new interval into a group of sorted, non-intersecting intervals. After the merge, all intervals should remain
24 | non-intersecting.
25 | - Given a list of meeting times, check if any of them overlap. The follow-up question is to return the minimum number of rooms required to accommodate all the meetings.
26 | - [Source](http://blog.gainlo.co/index.php/2016/07/12/meeting-room-scheduling-problem/)
27 | - If you have a list of intervals, how would you merge them?
28 | - E.g. `[1, 3], [8, 11], [2, 6]` => `[1, 6], [8, 11]`
29 |
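The interval-range questions above build on two small helpers. Here is a sketch assuming each interval is a `[start, end]` pair with inclusive bounds; the function names are illustrative.

```python
def intersects(a, b):
    """True if inclusive intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def merge_all(intervals):
    """Merge intersecting intervals and return them sorted by start."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval, so extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


print(intersects([1, 4], [5, 6]))              # False
print(intersects([1, 4], [3, 6]))              # True
print(merge_all([[1, 3], [8, 11], [2, 6]]))    # [[1, 6], [8, 11]]
```

Sorting dominates the cost, so `merge_all` runs in O(n log n).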
--------------------------------------------------------------------------------
/algorithms/linked-list.md:
--------------------------------------------------------------------------------
1 | Linked List
2 | ==
3 |
4 | - Given a linked list, in addition to the next pointer, each node has a child pointer that can point to a separate list. With the head node, flatten the list to a single-level linked list.
5 | - [Source](http://blog.gainlo.co/index.php/2016/06/12/flatten-a-linked-list/)
6 | - Reverse a singly linked list. Implement it recursively and iteratively.
7 | - Convert a binary tree to a doubly circular linked list.
8 | - Implement an LRU cache with O(1) runtime for all its operations.
9 | - Check distance between values in linked list.
10 | - A question involving an API's integration with hash map where the buckets of hash map are made up of linked lists.
11 | - Given a singly linked list (a list which can only be traversed in one direction), find the item that is located at 'k' items from the end. So if the list is a, b, c, d and k is 2 then the answer is 'c'. The solution should not search the list twice.
12 | - How can you tell if a Linked List is a Palindrome?
13 |
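For the "k items from the end" question above, a two-pointer walk visits the list once. This sketch follows the convention of the example, where k = 2 on a, b, c, d gives 'c', so k = 1 is the last item. The `Node` class and function name are assumptions for illustration.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def kth_from_end(head, k):
    """Return the value k items from the end (k = 1 is the last), or None if too short."""
    if k < 1:
        raise ValueError("k must be at least 1")
    lead = head
    for _ in range(k):               # move the lead pointer k nodes ahead
        if lead is None:
            return None
        lead = lead.next
    trail = head
    while lead is not None:          # advance both until lead falls off the end
        lead = lead.next
        trail = trail.next
    return trail.value


letters = Node("a", Node("b", Node("c", Node("d"))))
print(kth_from_end(letters, 2))  # c
```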
--------------------------------------------------------------------------------
/algorithms/math.md:
--------------------------------------------------------------------------------
1 | Math
2 | ==
3 |
4 | - Create a square root function.
5 | - Given a string such as "123" or "67", write a function to output the number represented by the string without using casting.
6 | - Make a program that can print out the text form of numbers from 1 - 1000 (ex. 20 is "twenty", 105 is "one hundred and five").
7 | - Write a function that parses Roman numerals.
8 | - E.g. `XIV` returns `14`.
9 | - Write out a given number in words.
10 | - E.g. `123` returns `one hundred and twenty three`.
11 | - Given a number `N`, find the largest number just smaller than `N` that can be formed using the same digits as `N`.
12 | - Compute the square root of `N` without using any existing functions.
13 | - Given two numbers represented as binary strings, return a string containing their sum.
14 | - E.g. `add('10010', '101')` returns `'10111'`.
15 | - Take in an integer and return its English word-format.
16 | - E.g. 1 -> "one", -10,203 -> "negative ten thousand two hundred and three".
17 | - Write a function that returns values randomly, according to their weight. Suppose we have 3 elements with their weights: A (1), B (1) and C (2). The function should return A with probability 25%, B with 25% and C with 50% based on the weights.
18 | - [Source](http://blog.gainlo.co/index.php/2016/11/11/uber-interview-question-weighted-random-numbers/)
19 | - Given a number, how can you get the next greater number with the same set of digits?
20 | - [Source](http://blog.gainlo.co/index.php/2017/01/20/arrange-given-numbers-to-form-the-biggest-number-possible/)
21 |
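The binary-string addition question above can be handled like schoolbook addition, working from the rightmost digit with a carry. This sketch avoids converting the whole string to an integer and assumes both inputs contain only `0` and `1`; the name `add_binary` is illustrative.

```python
def add_binary(a: str, b: str) -> str:
    i, j, carry = len(a) - 1, len(b) - 1, 0
    digits = []
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += 1 if a[i] == "1" else 0
            i -= 1
        if j >= 0:
            total += 1 if b[j] == "1" else 0
            j -= 1
        digits.append("1" if total % 2 else "0")
        carry = total // 2
    # Digits were produced least-significant first.
    return "".join(reversed(digits)) or "0"


print(add_binary("10010", "101"))  # 10111
```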
--------------------------------------------------------------------------------
/algorithms/matrix.md:
--------------------------------------------------------------------------------
1 | Matrix
2 | ==
3 |
4 | - You're given a 3 x 3 board of a tile puzzle, with 8 tiles numbered 1 to 8, and an empty spot. You can move any tile adjacent to the empty spot, to the empty spot, creating an empty spot where the tile originally was. The goal is to find a series of moves that will solve the board, i.e. get `[[1, 2, 3], [4, 5, 6], [7, 8, - ]]` where - is the empty tile.
5 | - Boggle implementation. Given a dictionary, and a matrix of letters, find all the words in the matrix that are in the dictionary. You can go across, down or diagonally.
6 | - The values of the matrix will represent numbers of carrots available to the rabbit in each square of the garden. If the garden does not have an exact center, the rabbit should start in the square closest to the center with the highest carrot count. On a given turn, the rabbit will eat the carrots available on the square that it is on, and then move up, down, left, or right, choosing the square that has the most carrots. If there are no carrots left on any of the adjacent squares, the rabbit will go to sleep. You may assume that the rabbit will never have to choose between two squares with the same number of carrots. Write a function which takes a garden matrix and returns the number of carrots the rabbit eats. You may assume the matrix is rectangular with at least 1 row and 1 column, and that it is populated with non-negative integers.
7 | - Example: `[[5, 7, 8, 6, 3], [0, 0, 7, 0, 4], [4, 6, 3, 4, 9], [3, 1, 0, 5, 8]]` should return `27`.
8 | - Print a matrix in a spiral fashion.
9 | - In the Game of life, calculate how to compute the next state of the board. Follow up was to do it if there were memory constraints (board represented by a 1 TB file).
10 | - Grid Illumination: Given an NxN grid with an array of lamp coordinates. Each lamp provides illumination to every square on their x axis, every square on their y axis, and every square that lies in their diagonal (think of a Queen in chess). Given an array of query coordinates, determine whether that point is illuminated or not. The catch is when checking a query all lamps adjacent to, or on, that query get turned off. The ranges for the variables/arrays were about: 10^3 < N < 10^9, 10^3 < lamps < 10^9, 10^3 < queries < 10^9.
11 | - You are given a matrix of integers. Modify the matrix such that if a row or column contains a 0, make the values in the entire row or column 0.
12 | - Given an N x N matrix filled randomly with different colors (no limit on what the colors are), find the total number of groups of each color - a group consists of adjacent cells of the same color touching each other.
13 | - You have a 4 x 4 board with characters. You need to write a function that finds if a certain word exists in the board. You can only jump to neighboring characters (including diagonally adjacent).
14 | - Count the number of islands in a binary matrix of 0's and 1's.
15 | - Check a 6 x 7 Connect 4 board for a winning condition.
16 | - Given a fully-filled Sudoku board, check whether it fulfills the Sudoku condition.
17 | - Implement a function that checks if a player has won tic-tac-toe.
18 | - Given an N x N matrix of 1's and 0's, figure out if all of the 1's are connected.
19 |
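The rabbit-and-carrots question above is a direct simulation. The sketch below is one reading of the statement: the starting square is the richest of the one to four squares nearest the center, and the rabbit stops when every neighboring square is empty. The function name `carrots_eaten` is illustrative.

```python
def carrots_eaten(garden):
    rows, cols = len(garden), len(garden[0])
    grid = [row[:] for row in garden]         # work on a copy, not the input
    # One or two middle rows and columns, depending on parity.
    mid_rows = {(rows - 1) // 2, rows // 2}
    mid_cols = {(cols - 1) // 2, cols // 2}
    r, c = max(((i, j) for i in mid_rows for j in mid_cols),
               key=lambda p: grid[p[0]][p[1]])
    eaten = 0
    while True:
        eaten += grid[r][c]
        grid[r][c] = 0
        neighbours = [(r + dr, c + dc)
                      for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                      if 0 <= r + dr < rows and 0 <= c + dc < cols]
        best = max(neighbours, key=lambda p: grid[p[0]][p[1]], default=None)
        if best is None or grid[best[0]][best[1]] == 0:
            return eaten                      # nothing left nearby: the rabbit sleeps
        r, c = best


garden = [[5, 7, 8, 6, 3], [0, 0, 7, 0, 4], [4, 6, 3, 4, 9], [3, 1, 0, 5, 8]]
print(carrots_eaten(garden))  # 27
```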
--------------------------------------------------------------------------------
/algorithms/oop.md:
--------------------------------------------------------------------------------
1 | Object-Oriented Programming
2 | ==
3 |
4 | - How would you design a chess game? What classes and objects would you use? What methods would they have?
5 | - How would you design the data structures for a book keeping system for a library?
6 | - Explain how you would design an HTTP server. Give examples of classes, methods, and interfaces. What are the challenges here?
7 | - Discuss algorithms and data structures for a garbage collector.
8 | - How would you implement an HR system to keep track of employee salaries and benefits?
9 |
--------------------------------------------------------------------------------
/algorithms/permutation.md:
--------------------------------------------------------------------------------
1 | Permutation
2 | ==
3 |
4 | - You are given a 7 digit phone number, and you should find all possible letter combinations based on the digit-to-letter mapping on numeric pad and return only the ones that have valid match against a given dictionary of words.
5 | - Give all possible letter combinations from a phone number.
6 | - Generate all subsets of a string.
7 | - Print all possible `N` pairs of balanced parentheses.
8 | - E.g. when `N` is `2`, the function should print `(())` and `()()`.
9 | - E.g. when `N` is `3`, we should get `((()))`, `(()())`, `(())()`, `()(())`, `()()()`.
10 | - [Source](http://blog.gainlo.co/index.php/2016/12/23/uber-interview-questions-permutations-parentheses/)
11 | - Given a list of arrays, return a list of arrays, where each array is a combination of one element in each given array.
12 | - E.g. If the input is `[[1, 2, 3], [4], [5, 6]]`, the output should be `[[1, 4, 5], [1, 4, 6], [2, 4, 5], [2, 4, 6], [3, 4, 5], [3, 4, 6]]`.
13 |
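The balanced-parentheses question above is commonly solved with backtracking: add `(` while any remain, and add `)` only when it closes an open one. A minimal sketch, with an illustrative function name:

```python
def balanced_parentheses(n):
    """Return every string of n balanced pairs of parentheses."""
    result = []

    def build(prefix, opened, closed):
        if len(prefix) == 2 * n:
            result.append(prefix)
            return
        if opened < n:
            build(prefix + "(", opened + 1, closed)
        if closed < opened:
            build(prefix + ")", opened, closed + 1)

    build("", 0, 0)
    return result


print(balanced_parentheses(2))  # ['(())', '()()']
print(balanced_parentheses(3))
# ['((()))', '(()())', '(())()', '()(())', '()()()']
```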
--------------------------------------------------------------------------------
/algorithms/queue.md:
--------------------------------------------------------------------------------
1 | Queue
2 | ==
3 |
4 | - Implement a Queue class from scratch with an existing bug, the bug is that it cannot take more than 5 elements.
5 | - Implement a Queue using two stacks. You may only use the standard `push()`, `pop()`, and `peek()` operations traditionally available to stacks. You do not need to implement the stack yourself (i.e. an array can be used to simulate a stack).
6 |
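For the two-stack queue above, one well-known arrangement keeps an "inbox" stack for pushes and an "outbox" stack for pops, moving items across only when the outbox is empty. This sketch uses Python lists as stacks; the class and method names are illustrative.

```python
class TwoStackQueue:
    def __init__(self):
        self._inbox = []   # receives newly enqueued items
        self._outbox = []  # holds items in dequeue order

    def _refill(self):
        # Reversing the inbox onto the outbox restores first-in, first-out order.
        if not self._outbox:
            while self._inbox:
                self._outbox.append(self._inbox.pop())

    def enqueue(self, item):
        self._inbox.append(item)

    def dequeue(self):
        self._refill()
        if not self._outbox:
            raise IndexError("dequeue from an empty queue")
        return self._outbox.pop()

    def peek(self):
        self._refill()
        if not self._outbox:
            raise IndexError("peek at an empty queue")
        return self._outbox[-1]
```

Each item is moved between stacks at most once, so `enqueue` and `dequeue` are amortized O(1).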
--------------------------------------------------------------------------------
/algorithms/sorting-searching.md:
--------------------------------------------------------------------------------
1 | Sorting and Searching
2 | ==
3 |
4 | - Sorting search results on a page given a certain set of criteria.
5 | - Sort a list of numbers in which each number is at a distance `K` from its actual position.
6 | - Given an array of integers, sort the array so that all odd indexes are greater than the even indexes.
7 | - Given users with locations in a list and a logged-in user with locations, find their travel buddies (people who shared more than half of your locations).
8 | - Search for an element in a sorted and rotated array.
9 | - [Source](http://blog.gainlo.co/index.php/2017/01/12/rotated-array-binary-search/)
10 | - Sort a list where each element is no more than k positions away from its sorted position.
11 | - Search for an item in a sorted, but rotated, array.
12 | - Merge two sorted lists together.
13 | - Give 3 distinct algorithms to find the K largest values in a list of N items.
14 | - Find the minimum element in a sorted rotated array in faster than O(n) time.
15 | - Write a function that takes a number as input and outputs the biggest number with the same set of digits.
16 | - [Source](http://blog.gainlo.co/index.php/2017/01/20/arrange-given-numbers-to-form-the-biggest-number-possible/)
17 |
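One possible approach to the "search in a sorted and rotated array" question is a modified binary search. The sketch below assumes the array holds distinct values and returns `-1` when the target is absent; the function name is illustrative.

```python
def search_rotated(nums, target):
    """Return the index of target in a sorted, rotated list of distinct values, or -1."""
    lo, hi = 0, len(nums) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if nums[mid] == target:
            return mid
        if nums[lo] <= nums[mid]:
            # The left half [lo, mid] is sorted.
            if nums[lo] <= target < nums[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:
            # The right half [mid, hi] is sorted.
            if nums[mid] < target <= nums[hi]:
                lo = mid + 1
            else:
                hi = mid - 1
    return -1
```

At every step at least one half is sorted, so the search can discard half of the range and runs in O(log n).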
--------------------------------------------------------------------------------
/algorithms/stack.md:
--------------------------------------------------------------------------------
1 | Stack
2 | ==
3 |
4 | - Implementation of an interpreter for a small language that does multiplication/addition/etc.
5 | - Design a `MinStack` data structure that supports a `min()` operation that returns the minimum value in the stack in O(1) time.
6 | - Write an algorithm to determine if all of the delimiters in an expression are matched and closed.
7 | - E.g. `{ac[bb]}`, `[dklf(df(kl))d]{}` and `{[[[]]]}` are matched. But `{3234[fd` and `{df][d}` are not.
8 | - [Source](http://blog.gainlo.co/index.php/2016/09/30/uber-interview-question-delimiter-matching/)
9 | - Sort a stack in ascending order using an additional stack.
10 |
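A short sketch of the delimiter-matching question, assuming only `()`, `[]` and `{}` count as delimiters and every other character is ignored. The function name is illustrative.

```python
def delimiters_match(expression):
    """Return True if every (), [] and {} in the expression is matched and closed."""
    pairs = {")": "(", "]": "[", "}": "{"}
    openers = set(pairs.values())
    stack = []
    for ch in expression:
        if ch in openers:
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recently opened delimiter.
            if not stack or stack.pop() != pairs[ch]:
                return False
    # Anything left on the stack was opened but never closed.
    return not stack
```

With the examples listed above, `delimiters_match('{ac[bb]}')` is `True` and `delimiters_match('{df][d}')` is `False`.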
--------------------------------------------------------------------------------
/algorithms/string.md:
--------------------------------------------------------------------------------
1 | String
2 | ==
3 |
4 | - Output list of strings representing a page of hostings given a list of CSV strings.
5 | - Given a list of words, find the word pairs that when concatenated form a palindrome.
6 | - Find the most efficient way to identify what character is out of place in a non-palindrome.
7 | - Implement a simple regex parser which, given a string and a pattern, returns a boolean indicating whether the input matches the pattern. By simple, we mean that the regex can only contain the following special characters: `*` (star), `.` (dot), `+` (plus). The star means that there will be zero or more of the previous character in that place in the pattern. The dot means any character for that position. The plus means one or more of previous character in that place in the pattern.
8 | - Find all words from a dictionary that are x edit distance away.
9 | - Given a string IP and number n, print all CIDR addresses that cover that range.
10 | - Write a function called `eval`, which takes a string and returns a boolean. This string is allowed 6 different characters: `0`, `1`, `&`, `|`, `(`, and `)`. `eval` should evaluate the string as a boolean expression, where `0` is `false`, `1` is `true`, `&` is an `and`, and `|` is an `or`.
11 | - E.g. `"(0 | (1 | 0)) & (1 & ((1 | 0) & 0))"`
12 | - Given a pattern string like `"abba"` and an input string like `"redbluebluered"`, return `true` if and only if there's a one to one mapping of letters in the pattern to substrings of the input.
13 | - E.g. `"abba"` and `"redbluebluered"` should return `true`.
14 | - E.g. `"aaaa"` and `"asdasdasdasd"` should return `true`.
15 | - E.g. `"aabb"` and `"xyzabcxzyabc"` should return `false`.
16 | - If you received a file in chunks, calculate when you have the full file. Quite an open-ended question. Can assume chunks come with start and end, or size, etc.
17 | - Given a list of names (strings) and the width of a line, design an algorithm to display them using the minimum number of lines.
18 | - Design a spell-checking algorithm.
19 | - Count and say problem.
20 | - Longest substring with `K` unique characters.
21 | - [Source](http://blog.gainlo.co/index.php/2016/04/12/find-the-longest-substring-with-k-unique-characters/)
22 | - Given a set of random strings, write a function that returns a set that groups all the anagrams together.
23 | - [Source](http://blog.gainlo.co/index.php/2016/05/06/group-anagrams/)
24 | - Given a string, find the longest substring without repeating characters. For example, for string `'abccdefgh'`, the longest substring is `'cdefgh'`.
25 | - [Source](http://blog.gainlo.co/index.php/2016/10/07/facebook-interview-longest-substring-without-repeating-characters/)
26 | - Given a string, return the string with duplicate characters removed.
27 | - Write a function that receives a regular expression (allowed chars = from `'a'` to `'z'`, `'*'`, `'.'`) and a string containing lower case english alphabet characters and return `true` or `false` whether the string matches the regex.
28 | - E.g. `'ab*a'`, `'abbbbba'` => `true`.
29 | - E.g. `'ab*b.'`, `'aba'` => `true`.
30 | - E.g. `'abc*'`, `'acccc'` => `false`.
31 | - Given a rectangular grid with letters, search if some word is in the grid.
32 | - Given two strings representing integer numbers (`'123'` , `'30'`) return a string representing the sum of the two numbers: `'153'`.
33 | - A professor wants to see if two students have cheated when writing a paper. Design a function `hasCheated(String s1, String s2, int N)` that evaluates to `true` if two strings have a common substring of length `N`.
34 | - Follow up: Assume you don't have the possibility of using `String.contains()` and `String.substring()`. How would you implement this?
35 | - Print all permutations of a given string.
36 | - Parse a string containing numbers and `'+'`, `'-'` and parentheses. Evaluate the expression. `-2+(3-5)` should return `-4`.
37 | - Output a substring with at most `K` unique characters.
38 | - E.g. `'aabc'` and `k` = 2 => `'aab'`.
39 | - Ensure that there are a minimum of `N` dashes between any two of the same characters of a string.
40 | - E.g. `n = 2, string = 'ab-bcdecca'` => `'ab--bcdec--ca'`.
41 | - Find the longest palindrome in a string.
42 | - Give the count and the number following in the series.
43 | - E.g. `1122344`, next: `21221324`, next: `12112211131214`.
44 | - Count and say problem.
45 | - Compress a string by grouping consecutive similar characters together:
46 | - E.g. `'aaabbbcc'` => `'a3b3c2'`.
47 | - You have a string consisting of open and closed parentheses, but parentheses may be imbalanced. Make the parentheses balanced and return the new string.
48 | - Given a set of strings, return the smallest subset that contains prefixes for every string.
49 | - E.g. `['foo', 'foog', 'food', 'asdf']` => `['foo', 'asdf']`.
50 | - Write a function that would return all the possible words generated when using a phone (pre-smartphone era) numpad to type.
51 | - Given a dictionary and a word, find the minimum number of deletions needed on the word in order to make it a valid word.
52 | - [Source](http://blog.gainlo.co/index.php/2016/04/29/minimum-number-of-deletions-of-a-string/)
53 | - How to check if a string contains an anagram of another string?
54 | - [Source](http://blog.gainlo.co/index.php/2016/04/08/if-a-string-contains-an-anagram-of-another-string/)
55 | - Find all k-lettered words from a string.
56 | - Given a string of open and close parentheses, find the minimum number of edits needed to balance a string of parentheses.
57 | - Run length encoding - Write a string compress function that returns `'R2G1B1'` given `'RRGB'`.
58 | - Write a function that finds all the different ways you can split up a word into a concatenation of two other words.
59 |
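As an illustration of the sliding-window technique that several of the substring questions above rely on, here is a sketch for the "longest substring without repeating characters" question. The function name is illustrative, and when there are ties it returns the first longest substring found.

```python
def longest_unique_substring(s):
    """Return the longest substring of s that has no repeated characters."""
    last_seen = {}              # character -> index of its latest occurrence
    start = 0                   # left edge of the current window
    best_start, best_len = 0, 0
    for i, ch in enumerate(s):
        # If ch already appears inside the window, move the left edge past it.
        if ch in last_seen and last_seen[ch] >= start:
            start = last_seen[ch] + 1
        last_seen[ch] = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    return s[best_start:best_start + best_len]
```

For the example in the list, `longest_unique_substring('abccdefgh')` returns `'cdefgh'`.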
--------------------------------------------------------------------------------
/algorithms/topics.md:
--------------------------------------------------------------------------------
1 | Topics
2 | ==
3 |
4 | ## Arrays
5 |
6 | ## Strings
7 |
8 | - Prefix trees (Tries)
9 | - Suffix trees
10 | - Suffix arrays
11 | - KMP
12 | - Rabin-Karp
13 | - Boyer-Moore
14 |
15 | ## Sorting
16 |
17 | - Bubble sort
18 | - Insertion sort
19 | - Merge sort
20 | - Quick sort
21 | - Selection sort
22 | - Bucket sort
23 | - Radix sort
24 | - Counting sort
25 |
26 | ## Linked Lists
27 |
28 | ## Stacks
29 |
30 | ## Queues
31 |
32 | ## Hash tables
33 |
34 | - Collision resolution algorithms
35 |
36 | ## Trees
37 |
38 | - BFS
39 | - DFS (inorder, postorder, preorder)
40 | - Height
41 |
42 | ## Binary Search Trees
43 |
44 | - Insert node
45 | - Delete a node
46 | - Find element in BST
47 | - Find min, max element in BST
48 | - Get successor element in tree
49 | - Check if a binary tree is a BST or not
50 |
51 | ## Heaps / Priority Queues
52 |
53 | - Insert
54 | - Bubble up
55 | - Extract max
56 | - Remove
57 | - Heapify
58 | - Heap sort
59 |
60 | ## Graphs
61 |
62 | - Various implementations
63 | - Adjacency matrix
64 | - Adjacency list
65 | - Adjacency map
66 | - Single-source shortest path
67 | - Dijkstra
68 | - Bellman-Ford
69 | - Topo sort
70 | - MST
71 | - Prim's algorithm
72 | - Kruskal's algorithm
73 | - Union Find Data Structure
74 | - Count connected components in a graph
75 | - List strongly connected components in a graph
76 | - Check for bipartite graph
77 |
78 | ## Dynamic Programming
79 |
80 | - Count Change
81 | - 0-1 Knapsack
82 |
83 | ## System Design
84 |
85 | - http://www.hiredintech.com/system-design/
86 | - https://www.quora.com/How-do-I-prepare-to-answer-design-questions-in-a-technical-interview?redirected_qid=1500023
87 | - http://blog.gainlo.co/index.php/2015/10/22/8-things-you-need-to-know-before-system-design-interviews/
88 | - https://github.com/donnemartin/system-design-primer
89 | - https://github.com/jwasham/coding-interview-university/blob/master/extras/cheat%20sheets/system-design.pdf
90 |
--------------------------------------------------------------------------------
/algorithms/tree.md:
--------------------------------------------------------------------------------
1 | Tree
2 | ==
3 |
4 | - Find the height of a tree.
5 | - Find the longest path from the root to leaf in a tree.
6 | - Find the deepest left leaf of a tree.
7 | - Print all paths of a binary tree.
8 | - [Source](http://blog.gainlo.co/index.php/2016/04/15/print-all-paths-of-a-binary-tree/)
9 | - Second largest element of a BST.
10 | - [Source](http://blog.gainlo.co/index.php/2016/06/03/second-largest-element-of-a-binary-search-tree/)
11 | - Given a binary tree and two nodes, how to find the common ancestor of the two nodes?
12 | - [Source](http://blog.gainlo.co/index.php/2016/07/06/lowest-common-ancestor/)
13 | - Find the lowest common ancestor of two nodes in a binary search tree.
14 | - Print the nodes in an n-ary tree level by level, one printed line per level.
15 | - Given a directory of files and folders (and relevant functions), how would you parse through it to find equivalent files?
16 | - Write a basic file system and implement the commands ls, pwd, mkdir, create, rm, cd, cat, mv.
17 | - Compute the intersection of two binary search trees.
18 | - Given a binary tree, output all the node to leaf paths of it.
19 | - Given a string of characters without spaces, is there a way to break the string into valid words without leftover characters?
20 | - Print a binary tree level by level.
21 | - Determine if a binary tree is "complete" (i.e., all leaf nodes are at either the maximum depth or max depth - 1, and are 'pressed' along the left side of the tree).
22 | - Find the longest path in a binary tree. The path may start and end at any node.
23 | - Determine if a binary tree is a BST.
24 | - Given a binary tree, serialize it into a string. Then deserialize it.
25 | - Print a binary tree by column.
26 | - Given a node, find the next element in a BST.
27 | - Find the smallest subtree that contains all the deepest nodes. The tree is not binary.
28 | - Print out the sum of each row in a binary tree.
29 | - Pretty print a JSON object.
30 | - Convert a binary tree to a doubly circular linked list.
31 | - Find the second largest number in a binary tree.
32 | - Given a tree, find the longest branch.
33 | - Convert a tree to a linked list.
34 | - Given two trees, write code to find out if tree A is a subtree of tree B.
35 | - Deepest node in a tree.
36 | - [Source](http://blog.gainlo.co/index.php/2016/04/26/deepest-node-in-a-tree/)
37 |
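Several questions above ask for level-by-level output. A breadth-first sketch for an n-ary tree follows; the `Node` class is a hypothetical minimal structure and is not the one used elsewhere in the repository.

```python
from collections import deque


class Node:
    """Hypothetical n-ary tree node used only for this sketch."""

    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def levels(root):
    """Return the node values grouped by depth, e.g. [[root], [children], ...]."""
    if root is None:
        return []
    result = []
    queue = deque([root])
    while queue:
        level = []
        # Everything in the queue at this point belongs to the same depth.
        for _ in range(len(queue)):
            node = queue.popleft()
            level.append(node.value)
            queue.extend(node.children)
        result.append(level)
    return result
```

Printing one line per level is then `for level in levels(root): print(*level)`.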
--------------------------------------------------------------------------------
/assets/book.svg:
--------------------------------------------------------------------------------
1 |
44 |
--------------------------------------------------------------------------------
/design/README.md:
--------------------------------------------------------------------------------
1 | Design Questions
2 | ==
3 |
4 | ## Guides
5 |
6 | - [Grokking the System Design Interview](https://www.educative.io/collection/5668639101419520/5649050225344512)
7 | - https://github.com/donnemartin/system-design-primer
8 | - https://github.com/checkcheckzz/system-design-interview
9 | - https://github.com/shashank88/system_design
10 | - https://gist.github.com/vasanthk/485d1c25737e8e72759f
11 | - http://www.puncsky.com/blog/2016/02/14/crack-the-system-design-interview/
12 | - https://www.palantir.com/2011/10/how-to-rock-a-systems-design-interview/
13 | - http://blog.gainlo.co/index.php/2017/04/13/system-design-interviews-part-ii-complete-guide-google-interview-preparation/
14 |
15 | ## Flow
16 |
17 | #### A. Understand the problem and scope
18 |
19 | - Define the use cases, with interviewer's help.
20 | - Suggest additional features.
21 | - Remove items that interviewer deems out of scope.
22 | - Assume high availability is required; add it as a use case.
23 |
24 | #### B. Think about constraints
25 |
26 | - Ask how many requests per month.
27 | - Ask how many requests per second (they may volunteer it or make you do the math).
28 | - Estimate reads vs. writes percentage.
29 | - Keep 80/20 rule in mind when estimating.
30 | - How much data is written per second.
31 | - Total storage required over 5 years.
32 | - How much data is read per second.
33 |
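The arithmetic behind these constraints can be sketched as below. All inputs (requests per month, write fraction, bytes per write) are hypothetical numbers you would get from the interviewer or estimate yourself, and a 30-day month is assumed.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 30-day month, an approximation


def estimate_load(requests_per_month, write_fraction, bytes_per_write, years=5):
    """Back-of-the-envelope numbers for the constraints checklist."""
    requests_per_second = requests_per_month / SECONDS_PER_MONTH
    writes_per_second = requests_per_second * write_fraction
    reads_per_second = requests_per_second - writes_per_second
    bytes_written_per_second = writes_per_second * bytes_per_write
    # Storage grows only with writes; 12 months per year.
    storage_bytes = (
        requests_per_month * write_fraction * bytes_per_write * 12 * years
    )
    return {
        "requests_per_second": requests_per_second,
        "writes_per_second": writes_per_second,
        "reads_per_second": reads_per_second,
        "bytes_written_per_second": bytes_written_per_second,
        "storage_bytes": storage_bytes,
    }
```

For instance, 2,592,000,000 requests per month with 20% writes of 1,000 bytes each works out to about 1,000 requests per second, 200 writes per second and roughly 31 TB over 5 years.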
34 | #### C. Abstract design
35 |
36 | - Layers (service, data, caching).
37 | - Infrastructure: load balancing, messaging.
38 | - Rough overview of any key algorithm that drives the service.
39 | - Consider bottlenecks and determine solutions.
40 |
41 | Source: https://github.com/jwasham/coding-interview-university#system-design-scalability-data-handling
42 |
43 | ## Grading Rubrics
44 |
45 | - Problem Solving - How systematic is your approach to solving the problem step-by-step? Break down a problem into its core components.
46 | - Communication - How well do you explain your idea and communicate it with others?
47 | - Evaluation - How do you evaluate your system? Are you aware of the trade-offs made? How can you optimize it?
48 | - Estimation - How fast does your system need to be? How much space does it need? How much load will it experience?
49 |
50 | ## Specific Topics
51 |
52 | - URL Shortener
53 | - http://stackoverflow.com/questions/742013/how-to-code-a-url-shortener
54 | - http://blog.gainlo.co/index.php/2016/03/08/system-design-interview-question-create-tinyurl-system/
55 | - https://www.interviewcake.com/question/python/url-shortener
56 | - Collaborative Editor
57 | - http://blog.gainlo.co/index.php/2016/03/22/system-design-interview-question-how-to-design-google-docs/
58 | - Photo Sharing App
59 | - http://blog.gainlo.co/index.php/2016/03/01/system-design-interview-question-create-a-photo-sharing-app/
60 | - Social Network Feed
61 | - http://blog.gainlo.co/index.php/2016/02/17/system-design-interview-question-how-to-design-twitter-part-1/
62 | - http://blog.gainlo.co/index.php/2016/02/24/system-design-interview-question-how-to-design-twitter-part-2/
63 | - http://blog.gainlo.co/index.php/2016/03/29/design-news-feed-system-part-1-system-design-interview-questions/
64 | - Trending Algorithm
65 | - http://blog.gainlo.co/index.php/2016/05/03/how-to-design-a-trending-algorithm-for-twitter/
66 | - Facebook Chat
67 | - http://blog.gainlo.co/index.php/2016/04/19/design-facebook-chat-function/
68 | - Key Value Store
69 | - http://blog.gainlo.co/index.php/2016/06/14/design-a-key-value-store-part-i/
70 | - http://blog.gainlo.co/index.php/2016/06/21/design-key-value-store-part-ii/
71 | - Recommendation System
72 | - http://blog.gainlo.co/index.php/2016/05/24/design-a-recommendation-system/
73 | - Cache System
74 | - http://blog.gainlo.co/index.php/2016/05/17/design-a-cache-system/
75 | - E-commerce Website
76 | - http://blog.gainlo.co/index.php/2016/08/22/design-ecommerce-website-part/
77 | - http://blog.gainlo.co/index.php/2016/08/28/design-ecommerce-website-part-ii/
78 | - Web Crawler
79 | - http://blog.gainlo.co/index.php/2016/06/29/build-web-crawler/
80 | - http://www.makeuseof.com/tag/how-do-search-engines-work-makeuseof-explains/
81 | - https://www.quora.com/How-can-I-build-a-web-crawler-from-scratch/answer/Chris-Heller
82 | - YouTube
83 | - http://blog.gainlo.co/index.php/2016/10/22/design-youtube-part/
84 | - http://blog.gainlo.co/index.php/2016/11/04/design-youtube-part-ii/
85 | - Hit Counter
86 | - http://blog.gainlo.co/index.php/2016/09/12/dropbox-interview-design-hit-counter/
87 | - Facebook Graph Search
88 | - Design [Lyft Line](https://www.lyft.com/line).
89 | - Design a promo code system (with same promo code, randomly generated promo code, and promo code with conditions).
90 | - Model a university.
91 | - How would you implement Pacman?
92 | - Sketch out an implementation of Asteroids.
93 | - Implement a spell checker.
94 | - Design a Rubik's Cube.
95 | - Design a high-level interface to be used for card games (e.g. poker, blackjack etc).
96 |
--------------------------------------------------------------------------------
/design/collaborative-editor.md:
--------------------------------------------------------------------------------
1 | Collaborative Document Editor
2 | ==
3 |
4 | ## Variants
5 |
6 | - Design Google Docs.
7 | - Design a collaborative code editor like Coderpad/Codepile.
8 | - Design a collaborative markdown editor.
9 |
10 | ## Requirements Gathering
11 |
12 | - What is the intended platform?
13 | - Web
14 | - What features are required?
15 | - Creating a document
16 | - Editing a document
17 | - Sharing a document
18 | - Bonus features
19 | - Document revisions and reverting
20 | - Searching
21 | - Commenting
22 | - Chatting
23 | - Executing code (in the case of code editor)
24 | - What is in a document?
25 | - Text
26 | - Images
27 | - Which metrics should we optimize for?
28 | - Loading time
29 | - Synchronization
30 | - Throughput
31 |
32 | ## Core Components
33 |
34 | - Front end
35 | - WebSockets/long polling for real-time communication between front end and back end.
36 | - Back end services behind a reverse proxy.
37 | - Reverse proxy will proxy the requests to the right server.
38 | - Split into a few services for different purposes.
39 | - The benefit of this is that each service can use the language that best suits its purpose.
40 | - API servers for non-collaborative features and endpoints.
41 | - Ruby/Rails/Django for the server that deals with CRUD operations on data models where performance is not that crucial.
42 | - WebSocket servers for handling document edits and publishing updates to listeners.
43 | - Possibly Node/Golang for WebSocket server which will need high performance as updates are frequent.
44 | - Task queue to persist document updates to the database.
45 | - ELB in front of back end servers.
46 | - MySQL database.
47 | - S3 and CDN for images.
48 |
49 | ## Data Modeling
50 |
51 | - What kind of database to use?
52 | - Data is quite structured. Would go with SQL.
53 | - Design the necessary tables, their columns and their relations.
54 | - `users`
55 | - `id`
56 | - `name`
57 | - `document`
58 | - `id`
59 | - `owner_id`
60 | - `permissions`
61 | - `id`
62 | - `name`
63 | - `document_permissions`
64 | - `id`
65 | - `document_id`
66 | - `user_id`
67 |
68 | ## Collaborative Editing - Client
69 |
70 | - Upon loading of the page and document, the client should connect to the WebSocket server over the WebSocket protocol `ws://`.
71 | - Upon connection, perform a time sync with the server, possibly via Network Time Protocol (NTP).
72 | - The most straightforward way is to send the whole updated document content to the back end, and all users currently viewing the document will receive the updated document. However, there are a few problems with this approach:
73 | - Race condition. If two users are editing the document at the same time, the last one to edit will overwrite the changes made by the previous user. One workaround is to lock the document while a user is editing it, but that will not make it real-time collaborative.
74 | - A large payload (the whole document) is being sent to servers and published to users on each change, and the user is likely to already have most of the content. A lot of redundant data is being sent.
75 | - A feasible approach would be to use operational transforms and send just the action deltas to the back end. The back end publishes the action deltas to the listeners. What is considered an action delta?
76 | - (a) Changing a character/word, (b) inserting a character/word/image, (c) deleting a character/word.
77 | - With this approach, the payload will contain only a small amount of data, such as (a) type of change, (b) character/word, (c) position in document: line/column, (d) timestamp. Why is the timestamp needed? Read on to find out.
78 | - Updates can also be throttled and batched, to avoid flooding the web server with requests. For example, if a user inserts a
79 |
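To make the action delta payload described above concrete, here is a rough sketch. The field names, the `"insert"`/`"delete"` kinds and the `apply_delta` helper are all assumptions for illustration. It applies a single delta to a document held as a list of lines and does not perform any operational transformation or conflict resolution.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class ActionDelta:
    kind: str         # "insert" or "delete" (hypothetical names)
    text: str         # characters inserted, or the characters being deleted
    line: int         # zero-based line index in the document
    column: int       # zero-based column index within the line
    timestamp: float  # client time, after syncing with the server


def apply_delta(lines, delta):
    """Return a new list of lines with one delta applied (no conflict handling)."""
    updated = list(lines)
    current = updated[delta.line]
    if delta.kind == "insert":
        updated[delta.line] = (
            current[:delta.column] + delta.text + current[delta.column:]
        )
    elif delta.kind == "delete":
        end = delta.column + len(delta.text)
        updated[delta.line] = current[:delta.column] + current[end:]
    else:
        raise ValueError(f"unknown delta kind: {delta.kind}")
    return updated


def to_payload(delta):
    """Serialize the delta as the small JSON message sent over the WebSocket."""
    return json.dumps(asdict(delta))
```

The payload for a one-word insert is a few dozen bytes, compared with resending the whole document on every change.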
80 | ## Back End
81 |
82 | The back end is split into a few portions: WebSocket server for receiving and broadcasting document updates, CRUD server for reading and writing non-document-related data, and a task queue for persistence of the document.
83 |
84 | ## WebSocket Server
85 |
86 | - Languages and frameworks that support async requests and non-blocking I/O will be suitable for the collaborative editor server. Node and Golang come to mind.
87 | - However, the WebSocket server is not stateless, so it is not that straightforward to scale horizontally. One approach would be for a Load Balancer to use Redis to maintain a map of the client to the WebSocket server instance IP, such that subsequent requests from the same client will be routed to the same server.
88 | - Each document corresponds to a room (more of a namespace). Users can subscribe to the events happening within a room.
89 | - When an action delta is received, blast it out to the listeners within the room and add it to the task queue.
90 |
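The room model above can be sketched in memory as follows. This is only an illustration of the publish/subscribe flow: `RoomHub` is a hypothetical name, listeners are plain callables standing in for WebSocket connections, and a `deque` stands in for a real task queue.

```python
from collections import defaultdict, deque


class RoomHub:
    """In-memory stand-in for the per-document rooms on a WebSocket server."""

    def __init__(self):
        self._rooms = defaultdict(list)  # document_id -> listener callables
        self.task_queue = deque()        # placeholder for a real task queue

    def subscribe(self, document_id, listener):
        self._rooms[document_id].append(listener)

    def publish(self, document_id, delta):
        # Send the delta to every listener in the document's room...
        for listener in self._rooms.get(document_id, []):
            listener(delta)
        # ...and queue it so a worker can persist it later.
        self.task_queue.append((document_id, delta))
```

Because the room membership lives in one process, this also shows why the server is stateful and needs sticky routing when scaled out.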
91 | ## CRUD Server
92 |
93 | - Provides APIs for reading and writing non-document-related data, such as users, permissions.
94 |
95 | ## Task Queue + Worker Service
96 |
97 | - Worker service retrieves messages from the task queue and writes the updated documents to the database in an async fashion.
98 | - Batch the actions together and perform one larger write that consists of multiple actions. For example, instead of persisting to the database once per addition of a word, combine these additions and write them into the database at once.
99 | - Publish the save completion event to the WebSocket server to be published to the listeners, informing them that the latest version of the document has been saved.
100 | - The benefit of using a task queue is that as the number of tasks in the queue goes up, we can scale up the number of worker services to clear the backlog of work faster.
101 |
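A small sketch of the batching idea: the worker pops several queued actions at once and groups them by document, so that each document needs one write per batch. The `(document_id, delta)` tuple shape and the function name are assumptions; the actual database write is left out.

```python
from collections import deque


def next_batch(task_queue, max_items):
    """Pop up to max_items tasks and group their deltas by document, in order."""
    grouped = {}
    for _ in range(min(max_items, len(task_queue))):
        document_id, delta = task_queue.popleft()
        grouped.setdefault(document_id, []).append(delta)
    return grouped
```

A worker loop would call `next_batch` repeatedly and perform one database write per key of the returned dictionary.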
102 | ## Document Persistence
103 |
104 | TODO
105 |
106 | ###### References
107 |
108 | - http://blog.gainlo.co/index.php/2016/03/22/system-design-interview-question-how-to-design-google-docs/
109 |
--------------------------------------------------------------------------------
/design/news-feed.md:
--------------------------------------------------------------------------------
1 | News Feed
2 | ==
3 |
4 | ## Variants
5 |
6 | - Design Facebook news feed.
7 | - Design Twitter news feed.
8 | - Design Quora feed.
9 | - Design Instagram feed.
10 |
11 | ## Requirements Gathering
12 |
13 | - What is the intended platform?
14 | - Mobile (mobile web or native)? Web? Desktop?
15 | - What features are required?
16 | - CRUD posts.
17 | - Commenting on posts.
18 | - Sharing posts.
19 | - Trending posts?
20 | - Tag people?
21 | - Hashtags?
22 | - What is in a news feed post?
23 | - Author.
24 | - Content.
25 | - Media.
26 | - Tags?
27 | - Hashtags?
28 | - Comments/Replies.
29 | - Operations:
30 | - CRUD
31 | - Commenting/replying to a post.
32 | - What is in a news feed?
33 | - Sequence of posts.
34 | - Query pattern: query for a user's ranked news feed.
35 | - Operations:
36 | - Append - Fetch more posts.
37 | - Delete - I don't want to see this.
38 | - Which metrics should we optimize for?
39 | - User retention.
40 | - Ads revenue.
41 | - Fast loading time.
42 | - Bandwidth.
43 | - Server costs.
44 |
45 | ## Core Components
46 |
47 | TODO
48 |
49 | ## Data modeling
50 |
51 | - What kind of database to use?
52 | - Data is quite structured. Would go with SQL.
53 | - Design the necessary tables, their columns and their relations.
54 | - `users`
55 | - `posts`
56 | - `likes`
57 | - `follows`
58 | - `comments`
59 |
60 | > There are two basic objects: user and feed. For user object, we can store userID, name, registration date and so on so forth. And for feed object, there are feedId, feedType, content, metadata etc., which should support images and videos as well.
61 | >
62 | > If we are using a relational database, we also need to model two relations: user-feed relation and friend relation. The former is pretty straightforward. We can create a user-feed table that stores userID and corresponding feedID. For a single user, it can contain multiple entries if he has published many feeds.
63 | >
64 | > For friend relation, adjacency list is one of the most common approaches. If we see all the users as nodes in a giant graph, edges that connect nodes denote friend relation. We can use a friend table that contains two userIDs in each entry to model the edge (friend relation). By doing this, most operations are quite convenient like fetch all friends of a user, check if two people are friends.
65 | >
66 | > The system will first get all userIDs of friends from friend table. Then it fetches all feedIDs for each friend from user-feed table. Finally, feed content is fetched based on feedID from feed table. You can see that we need to perform 3 joins, which can affect performance.
67 | >
68 | > A common optimization is to store feed content together with feedID in user-feed table so that we don't need to join the feed table any more. This approach is called denormalization, which means by adding redundant data, we can optimize the read performance (reducing the number of joins).
69 | >
70 | > The disadvantages are obvious:
71 | > - Data redundancy. We are storing redundant data, which occupies storage space (classic time-space trade-off).
72 | > - Data consistency. Whenever we update a feed, we need to update both feed table and user-feed table. Otherwise, there is data inconsistency. This increases the complexity of the system.
73 | > - Remember that there's no one approach always better than the other (normalization vs denormalization). It's a matter of whether you want to optimize for read or write.
74 |
75 | ## Feed Display
76 |
77 | - The most straightforward way is to fetch posts from all the people you follow and render them sorted by time.
78 | - There can be many posts to fetch. How many posts should you fetch?
79 | - What are the pagination approaches and the pros and cons of each approach?
80 | - Offset by page size
81 | - Offset by time
82 | - What data should the post contain when you initially fetch them?
83 | - Lazy loading approach for loading associated data: media, comments, people who liked the post.
84 | - Media
85 | - If the post contains media such as images and videos, how should they be handled? Should they be loaded on the spot?
86 | - A better way would be to fetch images only when they are about to enter the viewport.
87 | - Videos should not autoplay. Only fetch the thumbnail for the video, and only play the video when user clicks play.
88 | - If the content is being refetched, the media should be cached and not fetched over the wire again. This is especially important on mobile connections where data can be expensive.
89 | - Comments
90 | - Should you fetch all the comments for a post? For posts by celebrities, they can contain a few hundred or thousand comments.
91 | - Maybe fetch the top few comments and display them under the post, and the user is given the choice to "show all comments".
92 | - How does the user request new content?
93 | - Infinite scrolling.
94 | - User has to tap next page.
95 |
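The two pagination approaches mentioned above can be compared with a small sketch. It assumes `posts` is a list of dictionaries sorted newest first, each with a `created_at` value; both function names are illustrative.

```python
def page_by_offset(posts, page, page_size):
    """Offset by page size: skip page * page_size posts, then take one page."""
    start = page * page_size
    return posts[start:start + page_size]


def page_by_time(posts, before_timestamp, page_size):
    """Offset by time: take the next posts strictly older than the cursor."""
    older = [p for p in posts if p["created_at"] < before_timestamp]
    return older[:page_size]
```

If a new post arrives between two requests, offset pagination shifts every item by one and the next page repeats a post the user has already seen. A time cursor set to the oldest post already shown avoids that, at the cost of not being able to jump to an arbitrary page.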
96 | ## Feed Ranking
97 |
98 | - First select features/signals that are relevant and then figure out how to combine them to calculate a final score.
99 | - How do you show the relevant posts that the user is interested in?
100 | - Chronological - While a chronological approach works, it may not be the most engaging approach. For example, if a person posts 30 times within the last hour, his followers will have their news feed clogged up with his posts. Maybe set a cap on the number of times a person's posts can appear within the feed.
101 | - Popularity - How many likes and comments does the post have? Does the user usually like posts by that person?
102 | - How do you determine which are the more important posts? A user might be more interested in a few-hour old post from a good friend than a very recent post from an acquaintance.
103 | - A common strategy is to calculate a post score based on various features and rank posts by their scores.
104 | - Prior to 2013, Facebook was using the [EdgeRank](https://www.wikiwand.com/en/EdgeRank) algorithm to determine what articles should be displayed in a user's News Feed.
105 | - EdgeRank basically uses three signals: affinity score, edge weight and time decay.
106 | - Affinity score (u) - For each news feed, affinity score evaluates how close you are with this user. For instance, you are more likely to care about feed from your close friends instead of someone you just met once.
107 | - Edge weight (e) - Edge weight basically reflects importance of each edge. For instance, comments are worth more than likes.
108 | - Time decay (d) - The older the story, the less likely users find it interesting.
109 | - Affinity score
110 | - Various factors can be used to reflect how close two people are. First of all, explicit interactions like comment, like, tag, share, click etc. are strong signals we should use. Apparently, each type of interaction should have different weight. For instance, comments should be worth much more than likes.
111 | - Secondly, we should also track the time factor. Perhaps you used to interact with a friend quite a lot, but less frequent recently. In this case, we should lower the affinity score. So for each interaction, we should also put the time decay factor.
112 | - A good ranking system can improve some core metrics - user retention, ads revenue, etc.
113 |
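One way to combine the three signals above into a single score is to sum affinity × edge weight × time decay over a post's edges. The sketch below follows that idea. The weight values, the exponential decay and its 24-hour half-life are assumptions made for illustration and are not Facebook's actual parameters.

```python
# Hypothetical weights: comments count for more than likes, as noted above.
EDGE_WEIGHTS = {"comment": 4.0, "share": 3.0, "like": 1.0}


def time_decay(age_hours, half_life_hours=24.0):
    """Assumed exponential decay: the value halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


def post_score(edges):
    """Score a post from (affinity, edge_type, age_hours) tuples."""
    return sum(
        affinity * EDGE_WEIGHTS[edge_type] * time_decay(age_hours)
        for affinity, edge_type, age_hours in edges
    )
```

With these assumed numbers, a six-hour-old comment from a close friend (affinity 0.9) outranks a brand-new like from an acquaintance (affinity 0.1), which matches the intuition described in the list.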
114 | ## Feed Publishing
115 |
116 | TODO. Refer to http://blog.gainlo.co/index.php/2016/04/05/design-news-feed-system-part-2/.
117 |
118 | ## Additional Features
119 |
120 | #### Tagging feature
121 |
122 | - Have a `tags` table that stores the relation between a post and the people tagged in it.
123 |
124 | #### Sharing feature
125 |
126 | - Add a column to `posts` table called `original_post_id`.
127 | - What should happen when the original post is deleted?
128 | - The shared `posts` have to be deleted too.
129 |
130 | #### Notifications feature
131 |
132 | - When should notifications happen?
133 | - Can the user subscribe to only certain types of notifications?
134 |
135 | #### Trending feature
136 |
137 | - What constitutes trending? What signals would you look at? What weight would you give to each signal?
138 | - Most frequent hashtags over the last N hours.
139 | - Hottest search queries.
140 | - Fetch the recent most popular feeds and extract some common words or phrases.
141 |
142 | #### Search feature
143 |
144 | - How would you index the data?
145 |
146 | ## Scalability
147 |
148 | - Master-slave replication.
149 | - Write to master database and read from replica databases/in-memory data store.
150 | - Post contents are being read more than they are updated. It is acceptable to have a slight lag between a user updating a post and followers seeing the updated content. Tweets are not even editable.
151 | - Data for real-time queries should be in memory, disk is for writes only.
152 | - Pre-computation offline.
153 | - Tracking number of likes and comments.
154 | - Expensive to do a `COUNT` on the `likes` and `comments` for a post.
155 | - Use Redis/Memcached for keeping track of how many likes/comments a post has. Increment when there's new activity, decrement when someone unlikes/deletes the comment.
156 | - Load balancer in front of your API servers.
157 | - Partitioning the data.
158 |
159 | ###### References
160 |
161 | - [Design News Feed System (Part 1)](http://blog.gainlo.co/index.php/2016/03/29/design-news-feed-system-part-1-system-design-interview-questions/)
162 | - [Design News Feed System (Part 2)](http://blog.gainlo.co/index.php/2016/04/05/design-news-feed-system-part-2/)
163 | - [Etsy Activity Feeds Architecture](https://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture)
164 | - [Big Data in Real-Time at Twitter](https://www.slideshare.net/nkallen/q-con-3770885)
165 |
--------------------------------------------------------------------------------
/design/search-engine.md:
--------------------------------------------------------------------------------
1 | Search Engine
2 | ==
3 |
4 | ###### References
5 |
6 | - [How Do Search Engines Work?](http://www.makeuseof.com/tag/how-do-search-engines-work-makeuseof-explains/)
7 |
--------------------------------------------------------------------------------
/domain/async-loading/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
8 |
9 |
10 |