├── System Design ├── Key Components │ ├── Relational Databases.md │ ├── Proxies.md │ ├── Load Balancers.md │ ├── Caching.md │ ├── Duplication.md │ └── Microservices.md ├── Fundamentals │ ├── mentalmodel-HTTP2_in_Action2.png │ ├── Multiplexing.md │ ├── Client-Server Model.md │ ├── Hashing.md │ ├── Storage.md │ ├── Partitioning.md │ ├── Network Protocols.md │ └── REST API Design.md ├── Characteristics │ ├── Strong vs Eventual Consistency.md │ ├── Latency and Throughput.md │ └── Availability.md ├── Mobile │ ├── Connection Types for Continuous Data.md │ └── Mobile System Design.md └── Contents.md ├── CompSci ├── Regex.md ├── Powers of 2.md ├── Hash Functions.md ├── Character Sets.md └── Bitwise operations.md ├── Maths ├── n-choose-k.md ├── DeMorgan's Law.md └── Sets.md ├── Data Structures ├── Trees │ ├── AVL Tree.md │ ├── Red-Black Tree.md │ ├── Tree Basics.md │ ├── Breadth-First Search.md │ ├── Trie.md │ ├── Binary Heap.md │ ├── Depth-First Search.md │ └── Binary Search Tree.md ├── Stack.md ├── TreeMap.md ├── Queue.md ├── Array.md ├── Graphs │ ├── Adjacency List.md │ └── Graph Representations.md ├── Hash Table.md ├── LRU Cache.md └── Linked List.md ├── Java ├── Java Basics.md ├── General Java Questions.md └── Threading.md ├── README.md ├── Android ├── Activity Architecture.md ├── Thread Scheduling.md └── Architecture Overview.md ├── Algorithms ├── Topics │ ├── Kadane's Algorithm.md │ ├── Bubble Sort.md │ ├── Rotate an Array.md │ ├── Dijkstra.md │ ├── Binary Search.md │ ├── Merge Intervals.md │ ├── Dynamic Programming.md │ ├── Topological Sort.md │ ├── Quick Sort.md │ ├── Merge Sort.md │ ├── Range-Sum Query.md │ └── Knapsack Problem.md └── Contents.md ├── Kotlin └── General Things to Remember.md ├── Intro.md ├── Algorithmic Complexity and Big O Notation.md ├── Behavioural └── Behavioural Interview Notes.md ├── Common Strategies.md └── LICENSE /System Design/Key Components/Relational Databases.md: 
-------------------------------------------------------------------------------- 1 | # Relational Databases 2 | 3 | -------------------------------------------------------------------------------- /CompSci/Regex.md: -------------------------------------------------------------------------------- 1 | # Replacing Whitespace 2 | 3 | ```java 4 | "string".replaceAll("\\s", "") 5 | ``` -------------------------------------------------------------------------------- /Maths/n-choose-k.md: -------------------------------------------------------------------------------- 1 | # n-choose-k Problems 2 | 3 | ${\displaystyle {\binom {n}{k}}={\frac {n!}{k!(n-k)!}}.}$ -------------------------------------------------------------------------------- /Data Structures/Trees/AVL Tree.md: -------------------------------------------------------------------------------- 1 | Slower to insert or delete than a [[Red-Black Tree]], but quicker to query, so more frequently used in high-read scenarios. -------------------------------------------------------------------------------- /System Design/Fundamentals/mentalmodel-HTTP2_in_Action2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ditn/interviewprep/HEAD/System Design/Fundamentals/mentalmodel-HTTP2_in_Action2.png -------------------------------------------------------------------------------- /Maths/DeMorgan's Law.md: -------------------------------------------------------------------------------- 1 | # Law 2 | - the negation of a disjunction is the conjunction of the negations 3 | - the negation of a conjunction is the disjunction of the negations 4 | 5 | `!(P OR Q) == !P AND !Q` 6 | 7 | `!(P AND Q) == !P OR !Q` 8 | 9 | ### Notes 10 | 11 | See [Wikipedia](https://en.wikipedia.org/wiki/De_Morgan%27s_laws) -------------------------------------------------------------------------------- /Java/Java Basics.md: 
-------------------------------------------------------------------------------- 1 | # Java Basics 2 | Reminders because my Java is rusty. 3 | 4 | ### Array Syntax 5 | 6 | ```java 7 | // Basic array 8 | int[] array = { 2, 3, 6, 9 }; 9 | 10 | // 2D Array 11 | int[][] grid = { 12 | { 2, 3, 6, 9 }, 13 | { 2, 3, 6, 9 } 14 | }; 15 | 16 | int[][] emptyGrid = new int[10][20]; 17 | ``` 18 | 19 | # Helpers 20 | 21 | `Character.isLetterOrDigit('x')` -------------------------------------------------------------------------------- /Data Structures/Stack.md: -------------------------------------------------------------------------------- 1 | # Stacks 2 | 3 | Supports two operations - `push` to add an item and `pop` to remove the most recently added item. 4 | 5 | Stacks are known as Last-In, First-Out (LIFO), which is the opposite of a [[Queue]]. 6 | 7 | In Kotlin, these are also represented by a [[Linked List]]: 8 | 9 | ```kotlin 10 | val stack = LinkedList<Int>() 11 | 12 | stack.push(1) 13 | 14 | stack.pop() `should be equal to` 1 15 | ``` -------------------------------------------------------------------------------- /System Design/Fundamentals/Multiplexing.md: -------------------------------------------------------------------------------- 1 | # Multiplexing 2 | 3 | At its most basic, multiplexing allows clients to fire multiple requests at the same time on the same connection, and receive responses back in any order. 4 | 5 | Historically with HTTP/1.1, browsers have typically been limited to around 6 simultaneous connections per host. This is a lot of connections to manage and impacts both the client and the server. 
6 | 7 | ![[mentalmodel-HTTP2_in_Action2.png]] -------------------------------------------------------------------------------- /Data Structures/Trees/Red-Black Tree.md: -------------------------------------------------------------------------------- 1 | # Red-Black Tree 2 | 3 | A red-black tree is a self-balancing [[Binary Search Tree]] where each node has an extra bit - this bit is interpreted as red or black. 
These colours are then used to ensure that the tree remains balanced during insertions or deletions. 4 | 5 | This balance may not be perfect, but it's enough to ensure around $O(\log n)$ read time. Ergo, the height $h$ of a red-black tree is always $O(\log n)$. 6 | 7 | Quicker to add or delete items than an [[AVL Tree]], so often used in higher-write scenarios, or implementations of [[Hash Table]] data structures. -------------------------------------------------------------------------------- /CompSci/Powers of 2.md: -------------------------------------------------------------------------------- 1 | | Bit line | Exponent | Bit weight | Max Decimal | 2 | |-|-|-|-| 3 | | 1 | $2^0$ | 1 | 1 | 4 | | 2 | $2^1$ | 2 | 3 | 5 | | 3 | $2^2$ | 4 | 7 | 6 | | 4 | $2^3$ | 8 | 15 | 7 | | 5 | $2^4$ | 16 | 31 | 8 | | 6 | $2^5$ | 32 | 63 | 9 | | 7 | $2^6$ | 64 | 127 | 10 | | 8 | $2^7$ | 128 | 255 | 11 | | 9 | $2^8$ | 256 | 511 | 12 | | 10 | $2^9$ | 512 | 1,023 | 13 | | 11 | $2^{10}$ | 1,024 | 2,047 | 14 | | 12 | $2^{11}$ | 2,048 | 4,095 | 15 | | 13 | $2^{12}$ | 4,096 | 8,191 | 16 | | 14 | $2^{13}$ | 8,192 | 16,383 | 17 | | 15 | $2^{14}$ | 16,384 | 32,767 | 18 | | 16 | $2^{15}$ | 32,768| 65,535 | 19 | | 32 | $2^{31}$ | 2,147,483,648 | 4,294,967,295 | 20 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # interviewprep 2 | An [Obsidian.md](https://obsidian.md/) markdown vault containing all of the study notes I made whilst prepping to interview in Summer 2021. Accompanying blog post explaining the rationale behind this [here](https://adambennett.dev/2021/11/on-interviewing/). This isn't intended to be comprehensive - more of an example of how to use Obsidian effectively. If you see anything obviously incorrect though, please do open a PR. 3 | 4 | To view these files correctly, install Obsidian and import the root folder as a vault. 
View `Intro.md` for more info and a table of contents. 5 | 6 | You can of course view these files here on GitHub, but much of the formatting (LaTeX, Mermaid) isn't supported and will look weird. 7 | -------------------------------------------------------------------------------- /CompSci/Hash Functions.md: -------------------------------------------------------------------------------- 1 | # Hash Functions 2 | 3 | Typically a hash function is _locality insensitive_, which means that a small change to the input results in a very large change to the output - you can't predict an output from a previous output and use it to work out if you're close to guessing a password. 4 | 5 | ```kotlin 6 | 7 | "dog".md5() = "06d80eb0c50b49a509b49f2424e8c805" 8 | "dot".md5() = "69eb76c88557a8211cbfc9beda5fc062" 9 | 10 | ``` 11 | 12 | However, this may be undesirable. 13 | 14 | # Simhash 15 | Simhash does the opposite. Small changes in a file or input result in small changes to a hash, allowing you to compute if two files are similar. This can be used to detect copyrighted materials, cheating/plagiarising, or detecting duplicates when crawling the web. -------------------------------------------------------------------------------- /Android/Activity Architecture.md: -------------------------------------------------------------------------------- 1 | # Activity Architecture 2 | 3 | When an application starts for the first time, the `ActivityManagerService` creates a special kind of window token (see [[Architecture Overview#Binder IPC]]) called an application window token. This uniquely identifies the application's top-level container window. 4 | 5 | The `ActivityManager` gives this token to both the application and the `WindowManager`. Each time the application wants to add a new screen, it must send this token to the `WindowManager` again. 6 | 7 | This ensures that it's impossible for one application to draw over another. 
It also makes it easier for the `ActivityManager` to request to close all of a token's windows, as the `WindowManager` will be able to correctly identify all of the affected windows. 8 | 9 | -------------------------------------------------------------------------------- /Algorithms/Topics/Kadane's Algorithm.md: -------------------------------------------------------------------------------- 1 | # Kadane's Algorithm 2 | Find the largest sum of any contiguous subarray. 3 | 4 | ## Pseudocode 5 | ```javascript 6 | function max_subarray(numbers) 7 | best_sum = -inf 8 | current_sum = 0 9 | for x in numbers 10 | current_sum = max(x, current_sum + x) 11 | best_sum = max(best_sum, current_sum) 12 | return best_sum 13 | ``` 14 | 15 | ## Implementation 16 | ```kotlin 17 | fun maxSubArray(nums: IntArray): Int { 18 | var currentSubArray = nums.first() 19 | var maxSubArray = nums.first() 20 | 21 | for (index in 1 until nums.size) { 22 | val number = nums[index] 23 | // If current subarray sum is negative, discard it, otherwise add to it 24 | currentSubArray = Math.max(number, currentSubArray + number) 25 | maxSubArray = Math.max(maxSubArray, currentSubArray) 26 | } 27 | 28 | return maxSubArray 29 | } 30 | ``` -------------------------------------------------------------------------------- /CompSci/Character Sets.md: -------------------------------------------------------------------------------- 1 | # ASCII 2 | 128 characters, one byte per character. 3 | 4 | # Extended ASCII 5 | 256 characters, one byte per character. 6 | 7 | # Unicode 8 | `a = 97` 9 | `b = 98` 10 | `c = 99` 11 | `...` 12 | `z = 122` 13 | 14 | Two bytes per UTF-16 code unit, which is what a Kotlin or Java `Char` holds; characters outside the Basic Multilingual Plane need two code units. 
15 | 16 | # Convert Char to number 17 | 18 | ```kotlin 19 | "1234"[3] - '0' == 4 20 | ``` 21 | 22 | # Char to Index 23 | 24 | ```kotlin 25 | "apple"[0] - 'a' == 0 26 | ``` 27 | 28 | # Number Range 29 | 30 | ```kotlin 31 | '0'.toInt() == 48 32 | '9'.toInt() == 57 33 | ``` 34 | 35 | e.g.: 36 | 37 | ```kotlin 38 | fun parseInt(string: String): Int { 39 | var num = 0 40 | 41 | string.forEach { 42 | if (it.toInt() in 48..57) { 43 | num = num * 10 + (it - 48).toInt() 44 | } else { 45 | throw NumberFormatException("Invalid character $it") 46 | } 47 | } 48 | 49 | return num 50 | } 51 | ``` -------------------------------------------------------------------------------- /System Design/Characteristics/Strong vs Eventual Consistency.md: -------------------------------------------------------------------------------- 1 | # Strong vs Eventual Consistency 2 | 3 | We aim for transactions to be ACID, that is: 4 | 5 | - **Atomicity** - transactions either succeed or fail, there should be no in-between state 6 | - **Consistency** - a transaction cannot bring a database into an invalid state 7 | - **Isolation** - execution of multiple transactions concurrently will have the same effect as if they had been executed sequentially 8 | - **Durability** - any committed transaction is written to non-volatile storage. It cannot be undone by a crash, power loss, network partition etc. 9 | 10 | ## Strong Consistency 11 | Usually refers to ACID transactions. 12 | 13 | ## Eventual Consistency 14 | A model where reads of a system might return stale data. Generally, an eventually consistent datastore will give guarantees that the datastore will eventually reflect writes within a given period. This could be 10 seconds, or it could be several minutes. 
-------------------------------------------------------------------------------- /Data Structures/TreeMap.md: -------------------------------------------------------------------------------- 1 | # TreeMap 2 | 3 | The `TreeMap` class in Java implements `Map`. The map is sorted according to the natural ordering of its keys, or by a `Comparator` supplied at construction. 4 | 5 | Because these keys are sorted, you can fetch a submap using keys. This makes it particularly useful for questions where you might need to keep track of the incidences of something but also search that list. 6 | 7 | ```kotlin 8 | fun subMap(fromKey: K, toKey: K): SortedMap<K, V> 9 | ``` 10 | 11 | In Kotlin you can call `toSortedMap()` on a `Map`. 12 | 13 | TreeMap provides a performance of $O(\log n)$ for most operations like `put()`, `remove()` and `containsKey()`. 14 | 15 | ## Uses 16 | TreeMaps are particularly useful when looking up ranges, such as time intervals. 17 | 18 | ## Implementation 19 | Internally, a TreeMap is maintained via a [[Red-Black Tree]]. 20 | 21 | Therefore, operations like `put`, `remove` and `containsKey` are all $O(\log n)$. -------------------------------------------------------------------------------- /Data Structures/Queue.md: -------------------------------------------------------------------------------- 1 | # Queues 2 | These support two operations, `enqueue` and `dequeue`. 3 | 4 | Queues are an example of First-In, First-Out or FIFO - the opposite to a [[Stack]]. 5 | 6 | # Priority Queue 7 | This is a special type of queue where `enqueue` takes both a value `T` and a priority `p`. 8 | 9 | Think of it like a queue in a hospital - those who are higher priority get seen first. A priority queue is often used in computer processors to work out what to compute next - for instance, keyboard input is first priority. 
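
The hospital analogy can be sketched in a few lines. This is a minimal illustration rather than part of the original notes: `Task` and `drainInPriorityOrder` are hypothetical names, and it assumes that a lower `priority` number means more urgent. It uses `java.util.PriorityQueue`, which is a min-heap ordered by the supplied comparator.

```kotlin
import java.util.PriorityQueue

// Hypothetical task type for illustration; lower `priority` means more urgent.
data class Task(val name: String, val priority: Int)

// Enqueues every task, then dequeues them all, returning the names in the order served.
fun drainInPriorityOrder(tasks: List<Task>): List<String> {
    // poll() always returns the element with the smallest priority value.
    val queue = PriorityQueue<Task>(compareBy<Task> { it.priority })
    tasks.forEach { queue.offer(it) }

    val served = mutableListOf<String>()
    while (queue.isNotEmpty()) {
        served.add(queue.poll().name)
    }
    return served
}
```

Items come out by priority rather than by arrival order, so a task enqueued last can still be served first.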
10 | 11 | # Kotlin 12 | In Kotlin, a Queue is normally represented with a [[Linked List]]: 13 | 14 | ```kotlin 15 | val queue = LinkedList<Int>() 16 | 17 | queue.add(1) 18 | // or 19 | queue.offer(1) 20 | 21 | val dequeued = queue.remove() 22 | // or 23 | val dequeued = queue.poll() 24 | 25 | val head = queue.peek() 26 | ``` 27 | 28 | ## Implementation 29 | Can be implemented with two stacks: 30 | 31 | - on enqueue, move all items from stack one to stack two 32 | - add item to stack one 33 | - move all items back to stack one -------------------------------------------------------------------------------- /Data Structures/Trees/Tree Basics.md: -------------------------------------------------------------------------------- 1 | # Trees 2 | 3 | Like a [[Linked List]], Tree data structures don't require items in contiguous memory. 4 | 5 | ### Basic Definitions 6 | - **Node** - an item in the tree 7 | - **Root node** - the root of the tree 8 | - **Edge** - a pointer to another node 9 | - **Leaf node** - a node with no pointers to other nodes 10 | - Every node except the root must have exactly one **parent** 11 | - Nodes that have the same parent are **siblings** 12 | - A node's **level** is the number of steps from it to the **root node** 13 | - A tree's **height** is the level from the bottom-most node to the **root node** 14 | - A **branch** is the path from the root to a **leaf node** 15 | 16 | ```kotlin 17 | class Node<T> { 18 | 19 | val value: T? = null 20 | val edges: List<Node<T>> = emptyList() 21 | } 22 | ``` 23 | 24 | ### Tree Definitions 25 | 26 | A [[Binary Search Tree]] is a tree that can be efficiently searched. Typically, this is possible because the tree is ordered, and there are strict rules regarding the number of children per node (there can only be two, i.e. binary!). 
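
To show why ordering plus the two-child rule makes searching efficient, here is a rough sketch. It is not from the original notes: `BstNode`, `insert` and `contains` are hypothetical names, and it assumes integer keys with duplicates ignored. Each comparison discards one whole subtree, so a lookup takes as many steps as the tree is tall.

```kotlin
// Hypothetical node type for illustration: smaller values go left, larger values go right.
class BstNode(val value: Int, var left: BstNode? = null, var right: BstNode? = null)

// Inserts a value and returns the (possibly new) root. Duplicates are ignored in this sketch.
fun insert(root: BstNode?, value: Int): BstNode {
    if (root == null) return BstNode(value)
    when {
        value < root.value -> root.left = insert(root.left, value)
        value > root.value -> root.right = insert(root.right, value)
    }
    return root
}

// Walks down from the root, choosing one child at each step based on the ordering.
fun contains(root: BstNode?, target: Int): Boolean {
    var current = root
    while (current != null) {
        current = when {
            target == current.value -> return true
            target < current.value -> current.left
            else -> current.right
        }
    }
    return false
}
```

Note that this sketch does no rebalancing, so inserting already-sorted input degrades it to a linked list; self-balancing variants such as the [[Red-Black Tree]] exist to avoid that.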
-------------------------------------------------------------------------------- /Data Structures/Array.md: -------------------------------------------------------------------------------- 1 | # Arrays 2 | Prefer an Array over a [[Linked List]] when: 3 | - You frequently need access to data at random positions 4 | - You need extreme performance on lookup ($O(1)$) 5 | - You know the size of the dataset in advance - remember there's a penalty for growing an Array 6 | 7 | Typically, Arrays don't automatically resize, but dynamic arrays such as Java's `ArrayList` do. A common textbook implementation doubles the capacity when the array is full (resizing factor of 2); Java's `ArrayList` grows by roughly 1.5x instead. 8 | - This resize takes $O(n)$. 9 | - It happens so infrequently that the amortized insertion time is still $O(1)$ for a single element 10 | - Inserting $n$ elements is therefore $O(n)$ overall 11 | 12 | # Big O 13 | ### Time 14 | - O(n) to add or remove an item at the start of the list 15 | - O(1) (amortized) to add or remove an item at the end of the list 16 | - O(1) to access an item via an index 17 | - O(1) to update or replace an item 18 | - O(n) to insert or remove elsewhere 19 | 20 | ### Space 21 | - Contiguous memory, and the proximity of the items helps performance 22 | - Space required = (array capacity, which is >=n) * size of item = O(n) -------------------------------------------------------------------------------- /Kotlin/General Things to Remember.md: -------------------------------------------------------------------------------- 1 | # Decrementing 2 | 3 | Don't forget that you can traverse an array backwards: 4 | 5 | ```kotlin 6 | val list = listOf(1, 2, 3, 4, 5) 7 | 8 | for (index in list.size - 1 downTo 0) { 9 | // Do something starting from the right-hand side 10 | } 11 | ``` 12 | 13 | # Init'ing Lists 14 | ```kotlin 15 | val list = MutableList(5) { index -> index * 2 } 16 | ``` 17 | 18 | # Largest Value in Map 19 | 20 | ```kotlin 21 | val map = mapOf("one" to 
1, "two" to 2, "three" to 3) 22 | 23 | val max = map.maxByOrNull { it.value }?.key // "three" 24 | ``` 25 | 26 | # Sorting Lists 27 | ```kotlin 28 | val randomList = mutableListOf(3, 5, 1, -10, 89, 3) 29 | 30 | // Create a new list - O(n) space, O(n log n) complexity 31 | val newList = randomList.sorted() 32 | 33 | // Sort in-place (requires a MutableList) - O(n log n) complexity 34 | randomList.sort() 35 | ``` 36 | 37 | # Indexes 38 | ```kotlin 39 | val lastIndex = "123".length - 1 40 | // or 41 | val lastIndex = "123".lastIndex 42 | 43 | for (index in 0 until list.size) { 44 | // Operation 45 | } 46 | // or 47 | for (index in list.indices) { 48 | // Operation 49 | } 50 | ``` 
-------------------------------------------------------------------------------- /Android/Thread Scheduling.md: -------------------------------------------------------------------------------- 1 | # Thread Scheduling 2 | 3 | ## Nice Values 4 | 5 | Nice Values in Android are a measure of a thread's priority. Threads with a _higher_ nice value are _lower_ priority - they are being _nice_ to other threads. 6 | 7 | The two most important priorities are `default` and `background`. The UI thread is given `default` priority. 8 | 9 | In theory this would be enough but in practice this system is somewhat lacking - 20 background threads with one thread driving the UI would end up impacting the performance of the UI considerably, resulting in lag. 10 | 11 | ## Cgroups 12 | To fix this issue, Android adopts foreground vs background scheduling from Linux using `cgroups` (control groups). 13 | 14 | Threads with `background` priority are moved into a background `cgroup`, where they are limited to a small percentage of available CPU horsepower _if_ threads in other groups are busy. This lets background threads make _some_ progress without starving the UI, for instance. 15 | 16 | Android also moves all threads belonging to applications that aren't currently in the foreground to the background - ergo, the currently focused application will always have priority. 
-------------------------------------------------------------------------------- /System Design/Characteristics/Latency and Throughput.md: -------------------------------------------------------------------------------- 1 | # Latency and Throughput 2 | 3 | ## Latency 4 | Latency is a measure of how long it takes for data to go from one part of a system to another - from client to server, from disk to processor, from distributed DB to some other component etc. 5 | 6 | There are tradeoffs to be made here - speed vs cost vs consistency vs persistence etc. 7 | 8 | ## Comparisions 9 | Reading 1 MB from: 10 | 11 | | Transaction | Latency | 12 | |-|-| 13 | | RAM | $0.25 ms$ | 14 | | SSD | $1 ms$ | 15 | | Network (1 Gbps, no geographical distance) | $10 ms$ | 16 | | HDD | $20 ms$ | 17 | | Intercontinental round trip for a packet | $150 ms$ | 18 | 19 | ## Throughput 20 | Throughput is a measure of how much work a machine can handle in a given timeframe - measured in Gbps/Mbps/Kbps or in Requests Per Second (RPS). 21 | 22 | Typically we increase the throughput of a system by adding more servers, but this has a tradeoff. 23 | 24 | When designing a system, you have to consider where your throughput bottlenecks will be - there's no point having a server which can handle thousands of RPS, only to be constrained by a single database which can only handle a few tens of RPS. 25 | -------------------------------------------------------------------------------- /Algorithms/Topics/Bubble Sort.md: -------------------------------------------------------------------------------- 1 | # Bubble Sort 2 | 3 | ### Explanation 4 | ```javascript 5 | [8, 5, 2, 9, 5, 6, 3] 6 | ``` 7 | 8 | Given an unsorted array, we iterate through the array and compare each item to the next item. If we find that the first item is larger than the second item, we swap them. 9 | 10 | This causes the largest number in the set to "bubble up" to the end of the array. 
This also means on each pass, we _know_ that the last number is the largest, so we can decrement a pointer from the end of the array and slowly shrink the comparison set. 11 | 12 | We repeat this over and over until the array is sorted. 
13 | 14 | ### Big O 15 | 16 | Best case with a sorted array - $O(n)$ (this needs an early exit when a pass makes no swaps, which the implementation in this note omits) 17 | Worst case - $O(n^2)$ 18 | 19 | ### Implementation 20 | 21 | ```kotlin 22 | fun bubbleSort(array: MutableList<Int>): List<Int> { 23 | var endPointer = array.size - 1 24 | 25 | while (endPointer > 0) { 26 | for (index in 0 until endPointer) { 27 | val firstItem = array[index] 28 | val secondItem = array[index + 1] 29 | if (firstItem > secondItem) { 30 | array[index] = secondItem 31 | array[index + 1] = firstItem 32 | } 33 | } 34 | 35 | endPointer-- 36 | } 37 | 38 | return array 39 | } 40 | ``` -------------------------------------------------------------------------------- /System Design/Fundamentals/Client-Server Model.md: -------------------------------------------------------------------------------- 1 | # The Client-Server Model 2 | 3 | The client-server model is the dominant paradigm for the modern internet. 4 | 5 | ### Clients 6 | - A machine or process that sends data to, or requests data from, a server. Typically this is a personal computer, browser, phone or app, but it can also be another server. 7 | 8 | ### Servers 9 | - A machine or process that provides data or some kind of service to clients, typically by listening for network calls. A server can also be a client for another server. 10 | 11 | ### DNS Queries 12 | - A client will ask a pre-determined set of servers: what's the IP address of some other server? 
13 | - An IP address is a unique identifier for a specific server 14 | - These IPs are assigned by some kind of authority - Amazon/Google Cloud 15 | 16 | ### Requests 17 | - A **source IP address** sends a series of bytes as packets 18 | - The server uses this source IP address to return another series of packets 19 | 20 | ### Ports 21 | - If an IP address is similar to a mailing address, a port is equivalent to the flat number 22 | - Machines have 65,536 ports (port numbers are 16 bits) 23 | - You can attempt to communicate with any of them, but typically the port you have to request is dictated by the protocol 24 | - `http` uses port 80 25 | - `https` uses port 443 26 | 
-------------------------------------------------------------------------------- /System Design/Key Components/Proxies.md: -------------------------------------------------------------------------------- 1 | # Proxies 2 | 3 | Proxies are a fundamental unit in system design with many uses. There are two types of proxies - Forward Proxies and Reverse Proxies. Many people refer to just "proxies" - this is almost always the forward type. 4 | 5 | ## Forward Proxies 6 | 7 | ```mermaid 8 | graph LR 9 | Client --> id[Forward Proxy] 10 | id[Forward Proxy] --> Server 11 | Server --> id[Forward Proxy] 12 | id[Forward Proxy] --> Client 13 | ``` 14 | 15 | - Sits in between clients and servers 16 | - Is a server which acts on behalf of the client 17 | - Client IP is typically masked 18 | - A good example of this is a VPN 19 | - Servers _are not aware_ of forward proxies 20 | 21 | ## Reverse Proxies 22 | 23 | ```mermaid 24 | graph LR 25 | Client --> id[Reverse Proxy] 26 | id[Reverse Proxy] --> Server 27 | Server --> id[Reverse Proxy] 28 | id[Reverse Proxy] --> Client 29 | ``` 30 | 31 | - Similar to forward proxies but acts on behalf of the server instead 32 | - Clients _are not aware_ of reverse proxies - a DNS request for google.com might hit a reverse proxy, and clients don't realise this 33 | 34 | These have multiple uses: 35 | - Request filtering 36 | - Logging/metrics 37 | - Caching 38 | - Load balancing 39 | 40 | An example of a popular proxy is Nginx, which is a webserver often used for this purpose. -------------------------------------------------------------------------------- /System Design/Fundamentals/Hashing.md: -------------------------------------------------------------------------------- 1 | # Hashing 2 | 3 | Hashing is often used to help balance traffic loads. 
This can be done by computing a hash of: 4 | - The client IP 5 | - The user ID 6 | - Or any other thing which makes sense for the problem you're solving 7 | 8 | Hash functions could naively work like the hash functions used in storing/retrieving items in [[Hash Table]] data structures. 9 | 10 | `input -> number % server count -> server index` 11 | 12 | However this would mean that spinning up new servers due to traffic, or a server going down, would result in a complete re-compute of all indexes, resulting in client traffic hitting entirely new servers and a tonne of cache misses. 13 | 14 | # Consistent Hashing 15 | This is a type of hashing that minimises the number of remapped keys when a server is added or removed. This therefore minimises the number of requests which end up hitting new servers. 16 | 17 | This is implemented in a closest-neighbour way - it can be thought of as distributing traffic across a circle or a hash ring. 18 | 19 | # Rendezvous Hashing 20 | This type of hashing is also known as "highest random weight" hashing. 21 | 22 | 1) Calculate a score for each of your servers by hashing the request key together with the server's identifier 23 | 2) Pick the highest one 24 | 1) If your server goes down, pick the second highest 25 | 2) If a new server is added, it only affects you if it scores highest for your key 26 | -------------------------------------------------------------------------------- /Algorithms/Topics/Rotate an Array.md: -------------------------------------------------------------------------------- 1 | # Rotate an Array 2 | 3 | Rotating an array by 90$^{\circ}$ is a common question which has a bit of a trick to it. 4 | 5 | The key is you can achieve this by _transposing_, then _reflecting_ the matrix. _Transposing_ means to swap each entry about the diagonal, and _reflecting_ means reversing the array from left to right. 
6 | 7 | ```kotlin 8 | fun rotate(matrix: Array<IntArray>): Unit { 9 | transpose(matrix) 10 | reflect(matrix) 11 | } 12 | 13 | private fun transpose(matrix: Array<IntArray>): Unit { 14 | val length = matrix.size 15 | 16 | for (row in 0 until length) { 17 | // Note that we start iterating through columns at position == row 18 | // This is because we are working diagonally 19 | for (column in row until length) { 20 | val temp = matrix[row][column] 21 | matrix[row][column] = matrix[column][row] 22 | matrix[column][row] = temp 23 | } 24 | } 25 | } 26 | 27 | private fun reflect(matrix: Array<IntArray>): Unit { 28 | val length = matrix.size 29 | 30 | for (row in 0 until length) { 31 | // Note that we reflect until length / 2 32 | // No reason to keep going after halfway 33 | for (column in 0 until length / 2) { 34 | val temp = matrix[row][column] 35 | matrix[row][column] = matrix[row][length - column - 1] 36 | matrix[row][length - column - 1] = temp 37 | } 38 | } 39 | } 40 | ``` -------------------------------------------------------------------------------- /Data Structures/Graphs/Adjacency List.md: -------------------------------------------------------------------------------- 1 | # Adjacency List 2 | Example input: 3 | ```javascript 4 | Input: n = 5, edges = [[0,1],[0,2],[0,3],[1,4]] 5 | Output: true 6 | ``` 7 | 8 | For a graph to be a valid tree, it must have _exactly_ `n - 1` edges, where `n` is the number of nodes. 9 | 10 | Any fewer, and it can't possibly be fully connected. Any more, and it _has_ to contain cycles. Additionally, if the graph is fully connected _and_ contains exactly `n - 1` edges, it can't _possibly_ contain a cycle, and therefore must be a tree! 
11 | 12 | ```kotlin 13 | // Where n is number of nodes 14 | fun validTree(n: Int, edges: Array<IntArray>): Boolean { 15 | if (edges.size != n - 1) return false 16 | 17 | val adjacencyList = List(n) { mutableListOf<Int>() } 18 | val seen = mutableSetOf<Int>() 19 | 20 | // Build adjacency list 21 | for (edge in edges) { 22 | adjacencyList[edge[0]].add(edge[1]) 23 | adjacencyList[edge[1]].add(edge[0]) 24 | } 25 | 26 | val queue = LinkedList<Int>() 27 | queue.offer(0) 28 | seen.add(0) 29 | 30 | while (queue.isNotEmpty()) { 31 | val node = queue.poll() 32 | 33 | // BFS is just adding all neighbours to Queue 34 | for (neighbour in adjacencyList[node]) { 35 | if (seen.contains(neighbour)) continue 36 | 37 | queue.offer(neighbour) 38 | seen.add(neighbour) 39 | } 40 | } 41 | 42 | return seen.size == n 43 | } 44 | ``` -------------------------------------------------------------------------------- /Algorithms/Contents.md: -------------------------------------------------------------------------------- 1 | ## Basic Searching/Traversal Algorithms 2 | 3 | All data structures are meant to hold information. Some structures need special ways to access this information efficiently that are more involved than simply accessing an array index or hashmap key. Data structures that hold collections of data such as Trees and Graphs use a few basic algorithms to access the information within. 4 | 5 | - [[Depth-First Search]] 6 | - [[Breadth-First Search]] 7 | - [[Binary Search]] 8 | 9 | ## Advanced Searching/Traversal Algorithms 10 | 11 | There is a ton of research in this field, but for the purposes of a tech interview, the most advanced searching algorithms you’ll likely come across in order of frequency are: 12 | 13 | - Quick Select 14 | - [[Dijkstra]] 15 | - Bellman-Ford 16 | - A-star 17 | 18 | ## Sorting Algorithms 19 | 20 | Sorting is a common tool used to increase the performance of a solution. There are many sorting algorithms, but the most popular for interviews are listed below. 
Know how to implement all of these without looking anything up. 21 | 22 | - [[Bubble Sort]] 23 | - [[Quick Sort]] 24 | - [[Merge Sort]] 25 | - [[Topological Sort]] 26 | - Counting Sort 27 | 28 | ## Other Common Algorithms 29 | - [[Rotate an Array]] 30 | - [[Kadane's Algorithm]] 31 | - [[Merge Intervals]] 32 | - [[Range-Sum Query]] 33 | - [[Dynamic Programming]] -------------------------------------------------------------------------------- /System Design/Fundamentals/Storage.md: -------------------------------------------------------------------------------- 1 | # Storage 2 | 3 | The predominant storage solution for systems design is a **database**. These come in many forms, each one optimised for different things - certain guarantees about your data, speed, consistency etc. 4 | 5 | - Databases, at the end of the day, are just servers 6 | - This means that _persistence_ is something that we have to think about 7 | - If the power goes down and you're executing transations to RAM - that data is lost forever 8 | - However, writing to disk for guaranteed persistence is extremely slow (see [[Latency and Throughput#Comparisions]]) 9 | - Writing to SSD is quicker, but much more expensive and may not be ideal for situations where lots of writes occur 10 | 11 | Remember that there are distributed databases - how are these managed? How is data sharded? 12 | 13 | ## SQL vs NoSQL 14 | 15 | 1. SQL databases are relational, NoSQL databases are non-relational. 16 | 2. SQL databases use structured query language and have a predefined schema. NoSQL databases have dynamic schemas for unstructured data. 17 | 3. SQL databases are vertically scalable, while NoSQL databases are horizontally scalable. 18 | 4. SQL databases are table-based, while NoSQL databases are document, key-value, graph, or wide-column stores. 19 | 5. SQL databases are better for multi-row transactions, while NoSQL is better for unstructured data like documents or JSON. 
--------------------------------------------------------------------------------
/Intro.md:
--------------------------------------------------------------------------------
Study plan based on https://github.com/jwasham/coding-interview-university
Good resource here https://hackernoon.com/how-to-prepare-yourself-for-data-structures-and-algorithms-interviews-at-faang-mm23316l

# Goals
* Further my computer science knowledge
* Keep my skills up to date
* Improve my problem solving skills
* To gain the skills to land a role at a top-tier company

# Format
This collection of notes was built with [Obsidian](https://obsidian.md/), a "knowledge base" system built on top of Markdown files. It supports Mermaid graphs, LaTeX formatting and more.

# Contents
* [[Algorithmic Complexity and Big O Notation]]
* [[Common Strategies]]

### Data Structures
* [[Array]]
* [[Hash Table]]
* [[TreeMap]]
* [[Linked List]]
* [[Queue]]
* [[Stack]]
* [[LRU Cache]]
* [[Binary Search Tree]]
* [[Binary Heap]]
* [[Trie]]

### Algorithms
* [[Algorithms/Contents]]

### System Design

* [[System Design/Contents]]

### Compsci
* [[Bitwise operations]]
* [[Character Sets]]
* [[Powers of 2]]
* [[Regex]]
* [[Hash Functions]]

### Android
* [[Architecture Overview]]
* [[Activity Architecture]]
* [[Thread Scheduling]]

### Maths
* [[DeMorgan's Law]]
* [[Sets]]
* [[n-choose-k]]

### Behavioural
* [[Behavioural Interview Notes]]
--------------------------------------------------------------------------------
/Algorithms/Topics/Dijkstra.md:
--------------------------------------------------------------------------------
# Dijkstra's Algorithm

Given a graph of a road network, where the weight of each edge represents the difficulty of traversing the road (smaller is better and represents a faster road), Dijkstra's algorithm starts by inspecting the fastest roads first. It does this using a priority queue.

Dijkstra's algorithm works on both directed and undirected graphs, but the edge weights cannot be negative.

```mermaid

graph LR
S---|7|A;
S---|3|C;
S---|2|B;
A---|3|B;
A---|4|D;
B---|4|D;
B---|1|H;
D---|5|F;
F---|3|H;
H---|2|G;
G---|2|E;
E---|5|K;
K---|4|I;
K---|4|J;
J---|6|I;
J---|4|L;
I---|4|L;
L---|2|C;
```

```javascript
function Dijkstra(Graph, source):

    create vertex set Q

    for each vertex v in Graph:
        dist[v] ← INFINITY
        prev[v] ← UNDEFINED
        add v to Q
    dist[source] ← 0

    while Q is not empty:
        u ← vertex in Q with min dist[u]

        remove u from Q

        // only v that are still in Q
        for each neighbor v of u:
            alt ← dist[u] + length(u, v)
            if alt < dist[v]:
                dist[v] ← alt
                prev[v] ← u

    return dist[], prev[]
```
--------------------------------------------------------------------------------
/Algorithms/Topics/Binary Search.md:
--------------------------------------------------------------------------------
# Binary Search

### Explanation

Binary search only works on a pre-sorted list.

```javascript
[2, 3, 5, 8, 10, 15, 22, 23]
 ^        ^              ^
low      mid            high
```
* If the item at the mid point is the target, return it.
* If the item at the mid point is greater than the target, move `high` to below the mid:

```javascript
[2, 3, 5, 8, 10, 15, 22, 23]
 ^     ^
low   high
```

* If the item at the mid point is less than the target, move `low` to above the mid:

```javascript
[2, 3, 5, 8, 10, 15, 22, 23]
             ^           ^
            low         high
```

Keep going until the target is found or the search space is empty. This can be done both recursively and iteratively.
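
Since the search can be written either way, here is a minimal recursive sketch of the steps above. The function name `searchRecursive` and its default `low`/`high` parameters are illustrative assumptions rather than part of the original notes.

```kotlin
/**
 * Recursive binary search over a sorted list.
 * Returns the index of target, or -1 if it is not present.
 */
fun searchRecursive(
    array: List<Int>,
    target: Int,
    low: Int = 0,
    high: Int = array.size - 1
): Int {
    // Empty search space: the target is not in the list
    if (low > high) return -1

    // Written this way to avoid overflow of (low + high)
    val mid = low + (high - low) / 2
    val guess = array[mid]

    return when {
        guess == target -> mid
        // Guess too small: discard the lower half
        guess < target -> searchRecursive(array, target, mid + 1, high)
        // Guess too large: discard the upper half
        else -> searchRecursive(array, target, low, mid - 1)
    }
}

fun main() {
    val sorted = listOf(2, 3, 5, 8, 10, 15, 22, 23)
    println(searchRecursive(sorted, 15)) // 5
    println(searchRecursive(sorted, 4))  // -1
}
```

Each call halves the remaining range, so the recursion depth stays logarithmic in the size of the list.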

### < or <=?
* If you are returning from inside the loop, use `low <= high`
* If you are reducing the search space, use `low < high` and finally return `low`

### Implementation

```kotlin
/**
 * Returns index of target item, or -1 if not found
 */
fun search(array: List<Int>, target: Int): Int {
    var low = 0
    var high = array.size - 1

    while (low <= high) {
        val mid = low + (high - low) / 2
        val guess = array[mid]

        if (guess == target) {
            return mid
        } else if (guess < target) {
            low = mid + 1
        } else {
            high = mid - 1
        }
    }

    return -1
}
```

### Big O

Runs in $O(\log n)$ time.
--------------------------------------------------------------------------------
/System Design/Key Components/Load Balancers.md:
--------------------------------------------------------------------------------
# Load Balancers

```mermaid
graph LR
    Client --> id[Load balancer]
    id[Load balancer] --> Server
    Server --> Client
```

A [[Proxies#Reverse Proxies]] that distributes traffic across multiple servers using some kind of selection strategy.

Load balancers are used in every layer, from DNS to the database layer.

With DNS load balancers, a round-robin strategy might be used - `dig www.google.com` will return _different_ IP addresses based on the load balancer. This is an extremely common pattern.

# Server Selection Strategy
- How a load balancer chooses which servers to distribute traffic to.
- Multiple popular strategies:
    - Round robin
    - Weighted round robin
    - Random selection
    - Performance-based selection
    - IP-based strategy
    - Path-based - all requests related to a specific type of request (for instance, payments) can be distributed to specific servers

Round-robin strategies and others have a shortcoming - sequential, identical requests from a client aren't guaranteed to hit the same server. This is a problem if your architecture relies heavily on caching, as it'll result in a cache miss.

One way of solving this problem is with [[Hashing]].

# Hot Spots
When distributing workloads across servers, this workload might be spread unevenly. This can happen if your sharding key or hashing function is less than optimal or if your workload is naturally skewed.

# Example
Nginx is a very popular example of a [[Proxies#Reverse Proxies]] and load balancer.

--------------------------------------------------------------------------------
/Data Structures/Hash Table.md:
--------------------------------------------------------------------------------
# Hash Table

A hash table is generally made of an [[Array]] of [[Linked List]]s and a hash code function - this hashcode function is used to compute the position in the Array via modulo.

1) First, we compute the hashcode of the provided key - note that two different keys might have the same hashcode (a collision), because there may be an infinite number of possible keys but only a finite number of hashcodes
2) Next, use the hashcode to compute the position in the Array. Typically this is something like `hash(key) % array_length`. Two different codes could of course map to the same index.
3) At this index there is a [[Linked List]] of both keys and values. Store this item and key in the [[Linked List]] - this is done because there may be collisions.
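
The three steps above can be sketched as a small separate-chaining table. This is a minimal illustration under stated assumptions, not a production implementation: `ChainedHashTable`, its `capacity` parameter, and the `indexFor` helper are hypothetical names, and the table never resizes.

```kotlin
import java.util.LinkedList

/**
 * Minimal separate-chaining hash table (hypothetical class, for illustration only).
 * Each bucket is a linked list of key/value pairs.
 */
class ChainedHashTable<K, V>(private val capacity: Int = 16) {

    private class Node<K, V>(val key: K, var value: V)

    private val buckets = List(capacity) { LinkedList<Node<K, V>>() }

    // Steps 1 and 2: compute the hashcode, then map it to an index.
    // floorMod keeps the index non-negative even for negative hashcodes.
    private fun indexFor(key: K): Int = Math.floorMod(key.hashCode(), capacity)

    fun put(key: K, value: V) {
        val bucket = buckets[indexFor(key)]
        // Step 3: overwrite if the key is already in the chain, otherwise append
        val existing = bucket.firstOrNull { it.key == key }
        if (existing != null) existing.value = value else bucket.add(Node(key, value))
    }

    // Lookup repeats the same steps, then scans the chain for the key
    fun get(key: K): V? = buckets[indexFor(key)].firstOrNull { it.key == key }?.value
}

fun main() {
    // A capacity of 1 forces every key into the same bucket (a collision)
    val table = ChainedHashTable<String, Int>(capacity = 1)
    table.put("a", 1)
    table.put("b", 2)
    table.put("a", 3)
    println(table.get("a")) // 3
    println(table.get("b")) // 2
    println(table.get("c")) // null
}
```

With a single bucket every lookup degrades to a linear scan of one chain, which is the worst case that collisions can cause.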

To reduce collisions, we typically ensure that a hash table has 50% of its space free, otherwise its performance becomes much worse. This means that a Hash Table typically takes up quite a lot of space in the form of sequential memory compared to an [[Array]].

## Big O
It is this potential collision which means that the _worst case_ lookup for an item is $O(n)$, where $n$ is the number of keys.

Best case, and the amortized case, is $O(1)$.

This is because looking up an item is the same process - compute the hashcode from the key, compute the index from the hashcode, and then search the [[Linked List]] for the key.

Alternatively, we can implement a hash table with a balanced [[Binary Search Tree]] - this reduces the worst-case time complexity to $O(\log n)$. This potentially uses less space since we no longer allocate a large array. We can also iterate through the keys in order, which is sometimes useful.

--------------------------------------------------------------------------------
/CompSci/Bitwise operations.md:
--------------------------------------------------------------------------------
[Bithacks](https://graphics.stanford.edu/~seander/bithacks.html) is a great resource.

# Left Shift
The `<<` operator in Java (`x << y`) moves all bits of `x` `y` places to the left. For example:

`1 << 5` == 32, where in binary:

`00000001 << 5` == `00100000`

### How can you quickly compute $2^x$?

Left shift by `x`: `1 << x`

### Compute `0110` + `0110`

This is equivalent to $0110 \times 2$ which is equal to `0110 << 1` = `1100`

# Negate

This is the `~` operator, e.g. `~0110`.

A number `XOR` its negated value is always a sequence of `1`s, so `1011 ^ (~1011)` = `1111`.

`~0` is a sequence of `1`s, so `~0 << 2` = `...11111100`.

`~(1 << 5)` = `~(00100000)` = `11011111`

# Tricks

`x ^ 0s` = `x`
`x & 0s` = `0`
`x | 0s` = `x`

`x ^ 1s` = `~x`
`x & 1s` = `x`
`x | 1s` = `1s`

`x ^ x` = `0`
`x & x` = `x`
`x | x` = `x`

# XOR

$a \oplus a \oplus b = 0 \oplus b = b$

# Two's Complement
Numbers are represented as positive or negative using a sign bit. A two's complement is the "opposite" number when the sign bit is flipped:

| Positive Values | Negative Values |
| - | - |
| 7 -> 0111 | -1 -> 1111 |
| 6 -> 0110 | -2 -> 1110 |
| 5 -> 0101 | -3 -> 1101 |
| 4 -> 0100 | -4 -> 1100 |
| 3 -> 0011 | -5 -> 1011 |
| 2 -> 0010 | -6 -> 1010 |
| 1 -> 0001 | -7 -> 1001 |
| 0 -> 0000 | |

Notice how the absolute values on the left and right of each row always sum to $2^3$.

The binary representation of `-K` as an N-bit number is $\text{concat}(1, 2^{N-1} - K)$.

Note that a sequence of all `1`s is equal to -1.
--------------------------------------------------------------------------------
/Algorithms/Topics/Merge Intervals.md:
--------------------------------------------------------------------------------
# Merge Intervals
- Sort the inputs by the first value - $O(n \log n)$
- Check if the start of the current meeting (for instance) is greater than the end of the previous one
- Handle appropriately
- Either update the interval or return false if attempting to work out meeting schedules


```kotlin
import java.util.LinkedList

fun merge(intervals: Array<IntArray>): Array<IntArray> {
    intervals.sortBy { it.first() }

    val merged = LinkedList<IntArray>()

    for (array in intervals) {
        // if the list of merged intervals is empty or if the current
        // interval does not overlap with the previous, simply append it.
        if (merged.isEmpty() || merged.last()[1] < array.first()) {
            merged.add(array)
        } else {
            // otherwise, there is overlap, so we merge the current and previous
            // intervals.
            merged.last()[1] = Math.max(merged.last()[1], array[1])
        }
    }

    return merged.toTypedArray()
}
```

```kotlin
fun canAttendMeetings(intervals: Array<IntArray>): Boolean {
    intervals.sortBy { it.first() }

    // After sorting, any clash shows up between neighbouring meetings
    for (i in 0 until intervals.size - 1) {
        if (intervals[i][1] > intervals[i + 1][0]) return false
    }

    return true
}
```

```kotlin
import java.util.PriorityQueue

fun minMeetingRooms(intervals: Array<IntArray>): Int {
    if (intervals.isEmpty()) return 0
    intervals.sortBy { it.first() }

    // Min-heap of end times, one entry per room in use
    val minHeap = PriorityQueue<Int> { a, b -> a - b }

    minHeap.add(intervals.first()[1])

    for (interval in intervals.drop(1)) {
        // The earliest-finishing room is free again, so reuse it
        if (interval[0] >= minHeap.peek()) {
            minHeap.poll()
        }

        minHeap.offer(interval[1])
    }

    return minHeap.size
}
```
--------------------------------------------------------------------------------
/Data Structures/Graphs/Graph Representations.md:
--------------------------------------------------------------------------------
# Representations

A graph is similar to a tree, but it has **no root node**. This distinction means it can be used to model all sorts of different types of data - social networks, for example.

This also means that there might be cycles - similar to a [[Linked List]], or disconnectivity where one node is orphaned.

Graphs _can_ be trees. For a graph to be a valid tree, it must have _exactly_ `n - 1` edges, where `n` is the number of nodes.

Any fewer, and it can't possibly be fully connected. Any more, and it _has_ to contain cycles. Additionally, if the graph is fully connected _and_ contains exactly `n - 1` edges, it can't _possibly_ contain a cycle, and therefore must be a tree!

Graphs can be represented in 3 ways:

- An [[Adjacency List]]
- A matrix
- Objects and pointers

| Operation | Adjacency Matrix | Adjacency List |
| - | - | - |
| Storage space | Because of the use of a $V \cdot V$ matrix, space = $O(V^2)$ | $O(V + E)$ |
| + vertex | Storage must increase to $(V + 1)^2$, and to achieve this we must copy the entire array. Therefore, $O(V^2)$ | $O(1)$, as we're just adding an item to a list |
| + edge | `matrix[i][j] = 1`, so $O(1)$ | Same as adding a vertex, $O(1)$ |
| - vertex | Storage must be decreased, so $O(V^2)$ | To remove a vertex, we must search for it. After this we must search for edges, so $O(V + E)$ |
| - edge | `matrix[i][j] = 0`, so $O(1)$ | Must traverse through all edges, so $O(E)$ |
| Querying | Content of the matrix must be checked - this lookup is $O(1)$ | To check for an edge, we must check for vertices adjacent to a given vertex. A vertex can have $V$ neighbours and worst-case we have to check every one. So time complexity is $O(V)$. |

--------------------------------------------------------------------------------
/Java/General Java Questions.md:
--------------------------------------------------------------------------------
# General Java Questions
### Core Java Features
- Object oriented
    - Inheritance
    - Encapsulation
    - Polymorphism
    - Abstraction
- Platform independent
- High performance
- Multi-threaded
- Pass-by-value

### Primitives
Because of these primitive types, Java is not strictly object-oriented:

- `byte`
- `short`
- `int`
- `float`
- `char`
- `double`
- `boolean`
- `long`

These often get wrapped in *Wrapper classes*, which convert primitives to reference types (objects).

### Volatile Keyword
In Java, each thread has its own stack, and a thread may work with a cached copy of a shared variable (for example in a register or CPU cache) rather than re-reading it from main memory each time.

The `volatile` keyword tells the JVM that this variable might be modified in another thread - so the thread is forced to read it directly from memory rather than using the cached value.

### Classloader
Part of the runtime environment that loads classes on demand (lazily loading) into the JVM.
1) Bootstrap classloader - loads core Java API files
2) Extension classloader - loads JAR files from the extensions folder
3) Application classloader - loads JAR files from path specified in CLASSPATH env variable

### String Pool
```java
String first = "Hello";
String second = new String("Hello");
```

The first option is more efficient, because a `String` with value `Hello` will be created in the String pool. If another string with the same value is created, it will reference the same object.

Calling `new String("Hello")` creates a `String` with value `Hello` in the String pool, and that string is then passed to the `String` constructor. This creates another `String` object which is _not_ in the String pool. Each call therefore creates a new object rather than just reusing an object from the pool.
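
The pooling behaviour can be observed from Kotlin running on the JVM by comparing references with `===`. This is only a sketch: the helper name `stringPoolChecks` is made up, and building the string from a `CharArray` is an assumed stand-in for Java's `new String("Hello")`, chosen because it also produces a fresh object outside the pool.

```kotlin
/**
 * Returns four comparisons that illustrate the JVM string pool.
 */
fun stringPoolChecks(): List<Boolean> {
    val literal = "Hello"
    val sameLiteral = "Hello"
    // Built at runtime from characters, so it is a new object outside the pool
    val constructed = String(charArrayOf('H', 'e', 'l', 'l', 'o'))

    return listOf(
        literal === sameLiteral,          // same pooled object
        literal === constructed,          // different objects
        literal == constructed,           // equal contents
        literal === constructed.intern()  // intern() returns the pooled instance
    )
}

fun main() {
    println(stringPoolChecks()) // [true, false, true, true]
}
```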

--------------------------------------------------------------------------------
/Data Structures/LRU Cache.md:
--------------------------------------------------------------------------------
# LRU Cache

A Least Recently Used Cache is typically implemented with a Doubly-Linked List and a Map:

```kotlin
class DoublyLinkedList {

    // Sentinel nodes, so inserts and removals never deal with an empty list
    private var head: Entry = Entry(0, 0)
    private var tail: Entry = Entry(0, 0)

    init {
        head.next = tail
        tail.previous = head
    }

    fun insertHead(entry: Entry) {
        entry.previous = head
        entry.next = head.next

        head.next?.previous = entry
        head.next = entry
    }

    fun remove(entry: Entry) {
        entry.previous?.next = entry.next
        entry.next?.previous = entry.previous
    }

    fun removeTail(): Int {
        val node = tail.previous!!
        val key = node.key

        remove(node)

        return key
    }
}

class Entry(
    val key: Int,
    val value: Int,
    var next: Entry? = null,
    var previous: Entry? = null
)

class LRUCache(val capacity: Int) {

    private val cache = mutableMapOf<Int, Entry>()
    private val list = DoublyLinkedList()

    operator fun get(key: Int): Int {
        if (!cache.containsKey(key)) return -1
        // Reading a key makes it the most recently used
        update(key, cache[key]!!)

        return cache[key]!!.value
    }

    fun put(key: Int, value: Int) {
        val n = Entry(key, value)

        if (cache.containsKey(key)) {
            list.remove(cache[key]!!)
        } else if (cache.size >= capacity) {
            // Evict the least recently used entry
            val k = list.removeTail()
            cache.remove(k)
        }

        list.insertHead(n)
        cache[key] = n
    }

    private fun update(key: Int, n: Entry) {
        list.remove(n)
        list.insertHead(n)
        cache[key] = n
    }
}
```
--------------------------------------------------------------------------------
/Algorithms/Topics/Dynamic Programming.md:
--------------------------------------------------------------------------------
# Dynamic Programming

### A Framework to Solve
Typically, dynamic programming problems can be solved with 3 main components:

1) First, we need a function or [[Array]] that represents the answer to the problem from a given state. This is often named `dp`.

For instance, we may have an [[Array]] `dp` where `dp[i]` represents the length of the longest increasing subsequence that ends with the $i^{th}$ element.

2) We need a way to transition between states - e.g. from `dp[5]` to `dp[7]`. This is called the **recurrence relation** and can be quite difficult to figure out.

Given that we know `dp[0]`, `dp[1]` and `dp[2]`, how can we calculate `dp[3]`?

In the case of the longest increasing subsequence, if `nums[3] > nums[2]`, we can simply take `dp[2]` and increase the length by 1.

Of course we're trying to maximise `dp[3]`, so we need to check `dp[0..2]`.

3) We need a **base case**. Typically for this, we initialize every element of `dp` to some value - perhaps `1` or `0`, sometimes `true` or `false`.

### Example - Longest Increasing Subsequence

1. Initialize an array `dp` with length `nums.length` and all elements equal to 1. `dp[i]` represents the length of the longest increasing subsequence that ends with the element at index `i`.
2. Iterate from `i = 1` to `i = nums.length - 1`. At each iteration, use a second for loop to iterate from `j = 0` to `j = i - 1` (all the elements before i). For each element before `i`, check if that element is smaller than `nums[i]`. If so, set `dp[i] = max(dp[i], dp[j] + 1)`.
3. Return the max value from `dp`.

```kotlin
fun lengthOfLIS(nums: IntArray): Int {
    val dp = IntArray(nums.size) { 1 }

    for (i in 1 until nums.size) {
        // All elements before i
        for (j in 0 until i) {
            if (nums[j] < nums[i]) {
                dp[i] = Math.max(dp[i], dp[j] + 1)
            }
        }
    }

    // An empty input has no subsequence, so fall back to 0
    return dp.maxOrNull() ?: 0
}
```
--------------------------------------------------------------------------------
/Data Structures/Linked List.md:
--------------------------------------------------------------------------------
# Linked-Lists
Prefer a Linked List over an [[Array]] when:
- You require very fast insertions or deletions
- You don't need random access to items in the list (you have to start from position 0 and iterate)
- You can't evaluate the exact size of the list (it may grow or shrink during execution)

# Big O
### Time
- $O(1)$ to add or remove an item at the start of the list
- $O(n)$ to add or remove an item at the end of the list
- $O(n)$ to find and access via an index
- $O(1)$ to update
- $O(n)$ to insert or remove elsewhere

# Singly-Linked
Has a nullable reference to the next node in the List:
```java
class Node<T> {
    T item;
    @Nullable Node<T> next;
}
```

# Doubly-Linked
Has nullable references to both the next and previous nodes in the list:
```java
class Node<T> {
    T item;
    @Nullable Node<T> next;
    @Nullable Node<T> previous;
}
```

# Delete Kth Node from End of Linked List
* Use two pointers
* Move the leading pointer `k` steps away from the head
* Move both in unison until the leading pointer reaches the end of the list
* Delete the node after the trailing pointer

```kotlin
// LinkedList here is a custom node class with `value` and `next` fields
fun removeKthNodeFromEnd(head: LinkedList, k: Int) {
    var counter = 1
    var first = head
    var current: LinkedList? = head

    while (counter <= k) {
        current = current!!.next
        counter++
    }

    // The node to remove is the head, so copy its successor into it
    if (current == null) {
        head.value = head.next!!.value
        head.next = head.next!!.next
        return
    }

    while (current!!.next != null) {
        current = current.next
        first = first.next!!
    }

    first.next = first.next!!.next
}
```

# Reverse Linked List

```kotlin
fun reverseList(head: ListNode?): ListNode? {
    var previous: ListNode? = null
    var current: ListNode? = head

    while (current != null) {
        val temp = current.next
        current.next = previous
        previous = current
        current = temp
    }

    return previous
}
```
--------------------------------------------------------------------------------
/Maths/Sets.md:
--------------------------------------------------------------------------------
A set is just a collection of items.

# Subset
A collection of objects that are contained inside another set.

# Union
Given two sets, $S_1$ and $S_2$, the items which belong to _either_ $S_1$ or $S_2$ are described as the _union_ of the two sets.

$$S_3 = S_1 \cup S_2$$

# Intersection
Given two sets, $S_1$ and $S_2$, the items which belong to both $S_1$ and $S_2$ are described as the _intersection_ of the two sets.

$$S_3 = S_1 \cap S_2$$

# Power Set
Given a collection of objects $S$, a power set is a set of all subsets of $S$.

$$P_s = \{S_1, S_2, \ldots, S_{16}\}$$

For a simple example:

> You are given a set of fragrances, `F`. Compute a set which contains all possible combinations of `F` - list all fragrances that can be made.

Computing this requires nested `for` loops.
An outer loop keeps track of the next flower to consider. The inner loop duplicates the fragrances, adding the current flower to the duplicates:

```javascript
function power_set(flowers)
    fragrances <- Set.new
    fragrances.add(Set.new)

    for each flower in flowers
        new_fragrances <- copy(fragrances)

        for each fragrance in new_fragrances
            fragrance.add(flower)
        fragrances <- fragrances + new_fragrances

    return fragrances
```

Each extra flower causes `fragrances` to double in size - $O(2^n)$ complexity.

Generating a power set is equivalent to generating a truth table - it gets out of hand quickly.

```kotlin
fun subsets(nums: IntArray): List<List<Int>> {
    val results = mutableListOf<MutableList<Int>>()
    results.add(mutableListOf())

    for (num in nums) {
        val newSubsets = mutableListOf<MutableList<Int>>()

        // Copy every existing subset and add the current number to the copy
        for (current in results) {
            val subset = current + num
            newSubsets.add(subset.toMutableList())
        }

        results.addAll(newSubsets)
    }

    return results
}
```
--------------------------------------------------------------------------------
/Algorithmic Complexity and Big O Notation.md:
--------------------------------------------------------------------------------
# Big O Notation

$O(1)$ - Constant time

$O(\log n)$ - Logarithmic time

$O(n)$ - Linear time

$O(n \log n)$ - Log linear time

$O(n^2)$ - Quadratic time

$O(n^3)$ - Cubic time

$O(2^n)$ - Exponential time

$O(n!)$ - Factorial time

# Time Complexity

| Data Structure | Access | Search | Insertion | Deletion |
|-|-|-|-|-|
| Array | $O(1)$ | $O(n)$ | $O(n)$ | $O(n)$ |
| Stack | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
| Queue | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
| Linked List | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
| Doubly-Linked List | $O(n)$ | $O(n)$ | $O(1)$ | $O(1)$ |
| Hash Table | N/A | $O(1)$ | Best: $O(1)$<br>Worst: $O(n)$ | Best: $O(1)$<br>Worst: $O(n)$ |
| Binary Search Tree | Best: $O(\log n)$<br>Worst: $O(n)$ | Best: $O(\log n)$<br>Worst: $O(n)$ | Best: $O(\log n)$<br>Worst: $O(n)$ | Best: $O(\log n)$<br>Worst: $O(n)$ |

| Algorithm | Best | Average | Worst | Space |
|-|-|-|-|-|
| Quicksort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(\log n)$ |
| Mergesort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ |
| Timsort | $O(n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ |
| Heapsort | $O(n \log n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(1)$ |
| Bubblesort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
| Insertion Sort | $O(n)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
| Selection Sort | $O(n^2)$ | $O(n^2)$ | $O(n^2)$ | $O(1)$ |
| Tree Sort | $O(n \log n)$ | $O(n \log n)$ | $O(n^2)$ | $O(n)$ |
| Shell Sort | $O(n \log n)$ | $O(n (\log n)^2)$ | $O(n (\log n)^2)$ | $O(1)$ |
| Bucket Sort | $O(n + k)$ | $O(n + k)$ | $O(n^2)$ | $O(n)$ |
| Radix Sort | $O(nk)$ | $O(nk)$ | $O(nk)$ | $O(n + k)$ |
| Counting Sort | $O(n + k)$ | $O(n + k)$ | $O(n + k)$ | $O(k)$ |
| Cubesort | $O(n)$ | $O(n \log n)$ | $O(n \log n)$ | $O(n)$ |

--------------------------------------------------------------------------------
/System Design/Characteristics/Availability.md:
--------------------------------------------------------------------------------
# Availability

How stable does the system you're designing need to be? Consider the use case:
- What is the financial cost of downtime? Missed transactions, customers getting frustrated etc
- Is it a critical service? Air Traffic Control probably has greater availability constraints than Twitter, for instance. How much downtime is acceptable in these scenarios?
- Does your service enable other services? For instance, Cloudflare going down is really really bad.
- Do different parts of your services require different availability?
A system that provides profile photos might well be allowed to be less robust than the system that takes payments

### SLA/SLO
Service Level Agreements/Service Level Objectives are guarantees that services make with customers about availability. Would your service provide an SLA?

What would be the penalty for breaking this SLA? For reference, Google Cloud provides refunds below certain levels of availability.

## Redundancy
The way that we ensure high availability is by building systems with redundancy.
- Ensure that there's no single point of failure in your system - if you only have one server and it goes down, how screwed are you?

### Passive Redundancy
Multiple components for a specific layer of your architecture. For instance, you might have several load balancers - if one goes down, the others handle the increased load just fine, but there has to be available bandwidth for them.

### Active Redundancy
This is more complex, but in essence it's having machines which agree which one or subset of them does the incoming work. If a machine fails, the other machines agree upon a way to route around the problem by taking up the work themselves or distributing it to others.

See [[Leader Election]].

## Resolving Issues
What happens if some machines _do_ go down? Does a human need to intervene?

## Nines
Over the course of a year:

| Uptime | Nines | Time Down |
|-|-|-|
| 99% | Two 9s | 87.7 hours |
| 99.9% | Three 9s | 8.8 hours |
| 99.99% | Four 9s | 52.6 minutes |
| 99.999% | Five 9s | 5.3 minutes |

--------------------------------------------------------------------------------
/Java/Threading.md:
--------------------------------------------------------------------------------
# Threading

### When might you not want to use Coroutines?
4 | Coroutines are lightweight, and you can spawn many coroutines within a thread. However, if you're doing something computationally expensive using a coroutine, you may find that it starves other coroutines on the same thread of CPU time until it next suspends, because coroutines are scheduled cooperatively. 5 | 6 | This is why coroutines in Kotlin have multiple dispatchers, so that a computationally heavy coroutine doesn't starve the UI thread, for example. 7 | 8 | ### Process vs Thread 9 | A process is a self-contained execution environment - e.g. a program or application. A thread is a single task of execution within that process. 10 | 11 | Threads share the resources of their process; separate processes don't share resources, for security reasons. 12 | 13 | ### Creating Threads 14 | - Extend the `Thread` class 15 | - Implement the `Runnable` interface and create a `Thread` from it 16 | 17 | ### Thread Pool 18 | A thread pool manages a pool of worker threads, and contains a queue that holds tasks waiting to be executed. 19 | 20 | ### Context Switching 21 | This is the process of storing and restoring CPU state so that thread execution can be resumed from the same point at a later point in time. Context switching is an essential feature for multitasking operating systems in multi-threaded environments. 22 | 23 | ### Deadlock 24 | For deadlock to occur, all 4 of these conditions must be true: 25 | 1) Mutual exclusion - only one process can access a resource at a given time 26 | 2) Hold and wait - processes already holding a resource can request additional resources, without relinquishing their current resources 27 | 3) No preemption - one process cannot forcibly remove another process' resource 28 | 4) Circular wait - two or more processes form a circular chain where each process is waiting on another resource in the chain 29 | 30 | Remove any of the above conditions to break a deadlock. Most deadlock algorithms focus on avoiding the circular wait problem.
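
A minimal sketch of avoiding circular wait by always acquiring locks in one global order. `Account`, `transfer` and the `id`-based ordering are hypothetical names made up for illustration; the locking calls are the standard `ReentrantLock` and Kotlin's `withLock`:

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

// Hypothetical account type for illustration; `id` defines a global lock order.
class Account(val id: Int, var balance: Int) {
    val lock = ReentrantLock()
}

// Always lock the account with the smaller id first, so two transfers in
// opposite directions can never end up waiting on each other in a cycle.
fun transfer(from: Account, to: Account, amount: Int) {
    val first = if (from.id < to.id) from else to
    val second = if (from.id < to.id) to else from

    first.lock.withLock {
        second.lock.withLock {
            from.balance -= amount
            to.balance += amount
        }
    }
}

fun main() {
    val a = Account(id = 1, balance = 100)
    val b = Account(id = 2, balance = 100)

    // With a naive "lock from, then lock to" order, these two threads could
    // each hold one lock while waiting for the other.
    val t1 = thread { repeat(1_000) { transfer(a, b, 1) } }
    val t2 = thread { repeat(1_000) { transfer(b, a, 1) } }
    t1.join()
    t2.join()

    println(a.balance + b.balance) // prints 200
}
```

The ordering only works if every code path that takes both locks follows the same rule, which is why this is usually enforced in one place rather than left to callers.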
31 | 32 | ### Livelock 33 | A livelock occurs when more than one process is continually attempting to get out of the way of another, which means that they never actually access the resource they want and therefore never complete. 34 | 35 | A simple analogy for this is two people meeting in a corridor and attempting to pass each other, but always stepping in the same direction. 36 | 37 | 38 | -------------------------------------------------------------------------------- /System Design/Fundamentals/Partitioning.md: -------------------------------------------------------------------------------- 1 | # Partitioning 2 | 3 | When a dataset can no longer fit in a single node, it must be partitioned. This is a general technique that can be used in a variety of circumstances, such as sharding [[Network Protocols#TCP Transmission Control Protocol]] requests across backends in a load balancer. 4 | 5 | ## Strategies 6 | When a client sends a request to a service that has been partitioned, the request needs to be routed to the datastore which holds the data. This is often done using a [[Microservices#The API Gateway]] service which can route the request and knows how to map keys to partitions, and partitions to nodes. 7 | 8 | This mapping is normally maintained in a strongly consistent configuration store such as etcd or ZooKeeper. 9 | 10 | Generally, two strategies for partitioning are used: 11 | 12 | ### Range Partitioning 13 | Data is split into partitions by key range in lexicographical order, and each partition contains a contiguous range of keys. The data can be stored in sorted order, making scans very fast. 14 | 15 | However, this doesn't work well if the distribution of keys isn't uniform - as with words in the English language. Similarly, partitioning by date rarely makes sense, as all recent data ends up in the same partition. 16 | 17 | Creating unbalanced partitions can create hotspots in your system, which is likely to cause failures.
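
A rough sketch of how a router might map keys to range partitions, and how skewed keys produce a hotspot. `RangePartitioner`, the boundary letters and the `node-N` names are all assumptions made up for this example, and it assumes lower-case keys:

```kotlin
import java.util.TreeMap

// Hypothetical router: maps the first key of each range to a partition name.
class RangePartitioner(boundaries: Map<String, String>) {
    private val ranges = TreeMap<String, String>(boundaries)

    // floorEntry returns the entry with the greatest boundary <= key,
    // which is the range that the key falls into.
    fun partitionFor(key: String): String =
        ranges.floorEntry(key)?.value
            ?: error("No partition covers key '$key'")
}

fun main() {
    val partitioner = RangePartitioner(
        mapOf("a" to "node-1", "h" to "node-2", "q" to "node-3")
    )
    val words = listOf("apple", "banana", "cat", "dog", "egg", "queen", "xylophone")

    // Count how many keys land on each partition.
    val counts = words.groupingBy { partitioner.partitionFor(it) }.eachCount()
    println(counts) // prints {node-1=5, node-3=2}
}
```

Even in this tiny sample, `node-1` takes most of the keys and `node-2` takes none, which is the imbalance described above.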
18 | 19 | ### Hash Partitioning 20 | Hashing should alleviate this problem by uniformly distributing keys across partitions. Note that this _doesn't_ eliminate hotspots if the access pattern isn't uniform! If certain keys are accessed more frequently, these can be broken down into subkeys. 21 | 22 | Modular hashing, where you take a key's hash `mod` the number of servers/datastores/load balancers etc, can be extremely problematic when you add or remove resources, because most keys then map to a different partition. Reshuffling this data is expensive and uses a lot of bandwidth, so it is best avoided. 23 | 24 | Ideally, if a partition is added, only $k/n$ keys should be shuffled around, where $k$ is the number of keys and $n$ the number of partitions. 25 | 26 | A hashing strategy that guarantees this is called **Stable or Consistent Hashing**. For more, see [[Hashing]]. 27 | 28 | One drawback of hash partitioning is you lose the sort order guarantees of range partitioning. However, the data inside each hashed partition can be sorted according to another subkey for quick lookup. -------------------------------------------------------------------------------- /Data Structures/Trees/Breadth-First Search.md: -------------------------------------------------------------------------------- 1 | # Breadth-First Search 2 | 3 | Breadth-First Search starts at the root of the tree and explores all of a node's neighbors first, before moving on to the next-level neighbors. 4 | 5 | This is implemented by level order traversal.
6 | 7 | ```mermaid 8 | graph TB 9 | 1 --> 2 & 3 & 4 10 | 2 --> 5 & 6 11 | 5 --> 9 & 10 12 | 4 --> 7 & 8 13 | 7 --> 11 & 12 14 | ``` 15 | 16 | Usecases: 17 | * Copying garbage collection 18 | * Finding the shortest path between two nodes, with the path measured by the number of edges 19 | * Testing a graph for bipartiteness 20 | * Maximum spanning tree for unweighted graph 21 | * Web crawler 22 | * Finding nodes in any connected component of a graph 23 | * Computing maximum flow in a flow network 24 | * Serialization/deserialization of a binary tree 25 | 26 | # Implementations 27 | 28 | Given a tree: 29 | 30 | ```mermaid 31 | graph TB 32 | 1 --> 2 & 3 33 | 2 --> 4 & 5 34 | ``` 35 | 36 | * Level order: 1 2 3 4 5 37 | 38 | ```javascript 39 | function level_order(tree_node) 40 | queue <- Queue.new 41 | queue.add(tree_node) 42 | 43 | while (!queue.is_empty) 44 | temp_node <- queue.poll 45 | visit(temp_node.data) 46 | 47 | if temp_node.left != null 48 | queue.add(temp_node.left) 49 | if temp_node.right != null 50 | queue.add(temp_node.right) 51 | ``` 52 | 53 | ### Implementation 54 | 55 | ```kotlin 56 | fun breadthFirstSearch(root: BinaryTree) { 57 | val queue = LinkedList<BinaryTree>() 58 | queue.offer(root) 59 | 60 | while (queue.isNotEmpty()) { 61 | val node = queue.poll() 62 | 63 | print(node.value) 64 | 65 | if (node.left != null) queue.add(node.left) 66 | if (node.right != null) queue.add(node.right) 67 | } 68 | } 69 | ``` 70 | 71 | ### Usecase: Depth of a BST 72 | ```kotlin 73 | fun maxDepth(root: TreeNode?): Int { 74 | if (root == null) return 0 75 | 76 | val queue = LinkedList<TreeNode>() 77 | queue.offer(root) 78 | 79 | var depth = 0 80 | 81 | while (queue.isNotEmpty()) { 82 | depth++ 83 | val count = queue.size 84 | 85 | for (index in 0 until count) { 86 | val current = queue.poll() 87 | if (current.left != null) queue.offer(current.left!!) 88 | if (current.right != null) queue.offer(current.right!!)
89 | } 90 | } 91 | 92 | return depth 93 | } 94 | ``` -------------------------------------------------------------------------------- /System Design/Key Components/Caching.md: -------------------------------------------------------------------------------- 1 | # Caching 2 | 3 | Caching is used to speed up systems in various ways, reducing latency by storing the result of previous requests or expensive computations. 4 | 5 | - Caching can be done in hardware or software. 6 | - Can be client-side, server-side or anywhere in between various systems including the DB 7 | - Client-side can be used to reduce network requests 8 | - Server-side can be used to reduce computational loads 9 | - A cache over the DB can reduce DB reads 10 | - Hashes of incoming requests can be stored in a key/value store like Redis, and data returned without hitting a DB. 11 | 12 | ## Consistency 13 | One pitfall with caching is that there can be two or more sources of truth! There are various ways to tackle this: 14 | 15 | **Write-Through Caching** 16 | - Edits to data (say a Facebook post) are written to the cache, then the database, then returned to the user 17 | - However, we still have to go to the DB! We haven't reduced writes or decreased latency 18 | 19 | **Write-Back Caching** 20 | - Edits to data update the cache and data is returned to the user 21 | - The cache is out of sync with the database 22 | - Later on, we asynchronously write this data to the database, grouped with other transactions 23 | - This can be done on a schedule, or when the cache is full 24 | 25 | The downside with this is that if the cache hasn't been written to the database yet and we lose the cache, we also lose all of the unwritten transactions! 26 | 27 | ## Stale Data 28 | Data can end up stale when a cache is _behind_ the source of truth or other caches. 29 | 30 | This may or may not be acceptable, depending on the use case - view counts on YouTube videos for instance are probably absolutely fine to be stale sometimes.
31 | 32 | ## Cache Hits/Misses 33 | A cache **hit** is where data is found in the cache. 34 | 35 | A cache **miss** is where data isn't found, so the request falls through to the slower source of truth. Some misses are expected (a cold cache, an evicted entry), but a sudden spike can be a sign that something has gone wrong. An example of this is a server being down, requiring a load balancer to hit a different server, likely resulting in a cache miss. 36 | 37 | ## Eviction Policies 38 | This is also critical when caching - we cannot store infinite data. 39 | 40 | - How do you get rid of old or stale data? 41 | - LRU (Least Recently Used) caches are popular 42 | - LFU (Least Frequently Used) caches are also a thing - for instance the Instagram profile of a famous person is probably quite high in an LFU cache 43 | - FIFO/LIFO caching 44 | -------------------------------------------------------------------------------- /System Design/Fundamentals/Network Protocols.md: -------------------------------------------------------------------------------- 1 | # Network Protocols 2 | 3 | A protocol is an agreed-upon set of rules for communication. Clients and servers communicate via various existing protocols. 4 | 5 | ### IP (Internet Protocol) 6 | 7 | - Yes, this is what the IP in IP Address stands for 8 | - Data sent from one machine to another is sent via IP packets at its most basic level 9 | - These packets contain a header & data: 10 | 11 | | Header | Payload | 12 | |-|-| 13 | | Source IP
destination IP
size of packet
IP version - IPv4 or IPv6 | Arbitrary data | 14 | 15 | (Ignore the scaling of the diagram - in reality, the **payload** is much larger) 16 | 17 | - These packets are $2^{16}$ bytes maximum - this isn't a lot of data! Therefore a great many packets have to be sent across the wire 18 | - IP does not guarantee ordering 19 | - IP has no way of confirming that the recipient received anything 20 | - No error correction 21 | 22 | ### TCP (Transmission Control Protocol) 23 | 24 | - To address the shortcomings of IP, TCP was created 25 | - This has a way to see if data was received and guarantees the correct ordering 26 | - Allows you to send arbitrarily long pieces of data 27 | 28 | | Header | TCP Header | Payload | 29 | |-|-|-| 30 | | Source IP
destination IP
size of packet
IP version - IPv4 or IPv6 | Info about payload
Ordering data | Arbitrary data | 31 | 32 | - For a client and a server to communicate, a TCP connection is made and this connection is held open until it times out or either the client or the server requests to end it 33 | - This connection is opened via a handshake 34 | 35 | ### HTTP (HyperText Transfer Protocol) 36 | 37 | - A higher level abstraction designed for developers which builds on top of [[#TCP Transmission Control Protocol]] 38 | - Uses a request <-> response paradigm 39 | 40 | #### Requests 41 | - **host** -> `adambennett.dev` 42 | - **port** -> typically `80` or `443` 43 | - **path** -> `/posts` 44 | - **method** -> `GET`, `POST`, `PUT`, `DELETE` 45 | - **headers** -> `content-type: application/json`, `content-length: 51` 46 | - **body** -> arbitrary data, typically JSON 47 | 48 | #### Responses 49 | - **status code** -> `200`, `404`, `500` etc 50 | - **headers** -> `content-type: application/json`, `access-control-allow-origin: 'https://adambennett.dev'` 51 | - **body** -> arbitrary data, typically JSON 52 | 53 | For more, see [[REST API Design]]. 54 | 55 | # Summary 56 | 57 | ```mermaid 58 | graph LR 59 | IP --> TCP 60 | TCP --> HTTP 61 | ``` 62 | -------------------------------------------------------------------------------- /System Design/Mobile/Connection Types for Continuous Data.md: -------------------------------------------------------------------------------- 1 | # Connection Types for Continuous Data 2 | 3 | ## Long/Short Polling 4 | Otherwise known as client pull. 5 | 6 | ### Cons 7 | - Every request is a full HTTP request - including headers which can actually represent quite a lot of data transferred. This is inefficient e.g. 15KB of headers for 5KB of data. 8 | - Maximal latency - the server responds, then cannot respond again until the next request from the client.
9 | - This means the maximal latency is actually three network requests - response, request, response 10 | - In fact, it's _at least_ three, factoring in packet-loss, retransmission etc. 11 | - This is a huge problem when switching cell networks etc. 12 | 13 | ## Websockets 14 | Websockets allow you to open an interactive session with the server, sending messages and receiving event-driven responses without having to poll for a reply. 15 | 16 | This protocol provides full-duplex communication channels over a single TCP connection (TCP sits at layer 4 of the OSI model). It is not automatically multiplexed over HTTP/2 connections because it doesn't really run on top of HTTP at all. Connections are established over HTTP ports 80 and 443 via an `Upgrade` header to change protocol from HTTP to websockets. 17 | 18 | ### Cons 19 | - Load balancing is very complicated as there can be many sockets open at the same time 20 | - You can potentially suffer from a wave of "reconnect" messages which overwhelms your server 21 | - It isn't possible to move socket connections to a different server if one experiences high load: they must be closed and re-opened. 22 | - Without WiFi, websockets require the full-duplex antenna to work almost constantly, drawing huge amounts of power. Typically, mobile devices use a low-power antenna to receive data, and full duplex in order to establish calls. 23 | 24 | ## Server-Sent Events 25 | 26 | Server-Sent Events are similar to websockets in that they happen in real-time, but they're very much one-way. A server emits these events in real-time and they're received by the client, with the client sending very few events. 27 | 28 | ### Pros 29 | - The connection stream is coming from the server and is read-only 30 | - Uses regular HTTP requests, no special protocol - so [[multiplexing]] over HTTP/2 works out of the box.
31 | - If the connection drops, the EventSource fires an error event and tries to reconnect, with a timeout 32 | - The server can send a unique ID with each message. When a client tries to reconnect after a dropped connection, it sends the last known ID 33 | - The server can then send `n` number of missed messages from a backlog -------------------------------------------------------------------------------- /Data Structures/Trees/Trie.md: -------------------------------------------------------------------------------- 1 | # Trie 2 | ## Insertion $O(m)$ 3 | We insert a key of length $m$ by starting at the root and searching for a link which corresponds to the first character: 4 | - If a link exists, move to that node 5 | - If not, add that node as a link to the current node and move to it 6 | - If we reach the end of the word, mark the last node as the end 7 | 8 | ## Search $O(m)$ 9 | Each word is represented as a path from the root to a node marked as an end. We start from the root with the first character, looking for links corresponding to that character: 10 | - If the link exists, we move to the next node in the path following this link; if it doesn't, return false 11 | - If there are no characters left in the search word and the node isn't marked as `isEnd`, return false 12 | - Otherwise return true 13 | 14 | ## Search Prefix 15 | Exactly the same as searching for a word, but we don't need to check `isEnd`. 16 | 17 | 18 | ```kotlin 19 | class Trie() { 20 | 21 | private val root = TrieNode() 22 | 23 | fun insert(word: String) { 24 | var node = root 25 | for (char in word) { 26 | if (!node.containsKey(char)) { 27 | node.put(char, TrieNode()) 28 | } 29 | 30 | node = node.get(char)!! 31 | } 32 | 33 | node.setEnd() 34 | } 35 | 36 | /** 37 | * Returns true if the entire word is in the Trie 38 | */ 39 | fun search(word: String): Boolean { 40 | var node = root 41 | for (char in word) { 42 | if (!node.containsKey(char)) return false 43 | 44 | node = node.get(char)!!
45 | } 46 | 47 | return node.isEnd() 48 | } 49 | 50 | /** 51 | * Returns true if there is any word in the trie that starts with the 52 | * given prefix 53 | */ 54 | fun startsWith(prefix: String): Boolean { 55 | var node = root 56 | for (char in prefix) { 57 | if (!node.containsKey(char)) return false 58 | 59 | node = node.get(char)!! 60 | } 61 | 62 | return true 63 | } 64 | 65 | } 66 | 67 | class TrieNode { 68 | 69 | private val NUM_CHARS = 26 70 | 71 | private val links = Array<TrieNode?>(NUM_CHARS) { null } 72 | 73 | private var isEnd = false 74 | 75 | fun containsKey(char: Char): Boolean { 76 | return links[char - 'a'] != null 77 | } 78 | 79 | fun get(char: Char): TrieNode? { 80 | return links[char - 'a'] 81 | } 82 | 83 | fun put(char: Char, node: TrieNode) { 84 | links[char - 'a'] = node 85 | } 86 | 87 | fun setEnd() { 88 | isEnd = true 89 | } 90 | 91 | fun isEnd() = isEnd 92 | } 93 | ``` -------------------------------------------------------------------------------- /Algorithms/Topics/Topological Sort.md: -------------------------------------------------------------------------------- 1 | # Topological Sort 2 | 3 | Works only on Directed Acyclic Graphs. 4 | 5 | ## Example 6 | You're given a list of jobs that need to be completed. You're also given a list of dependencies, where each dependency is a pair `[a, b]` meaning that job `a` must be completed before job `b`. 7 | 8 | Return a list of jobs in a valid order (there may be more than one). 9 | 10 | ```javascript 11 | jobs = [1, 2, 3, 4] 12 | deps = [[1, 2], [1, 3], [3, 2], [4, 2], [4, 3]] 13 | 14 | output = [1, 4, 3, 2] or [4, 1, 3, 2] 15 | ``` 16 | 17 | This can be represented as a graph: 18 | 19 | ```mermaid 20 | graph LR 21 | 1 --> 2 22 | 4 --> 2 23 | 4 --> 3 24 | 1 --> 3 25 | 3 --> 2 26 | ``` 27 | 28 | Notice how `1` and `4` have no dependencies - they can be safely added to the list first. 29 | 30 | ## Solution - $O(j + d)$, i.e. $O(v + e)$ 31 | 1.
Iterate over all the edges in the input and create an adjacency list and also a map of each node to its in-degree. 32 | 2. Initialize a queue, `Q`, to keep track of all the nodes in the graph with 0 in-degree. 33 | 3. Add all the nodes with 0 in-degree to `Q`. 34 | 4. The following steps are to be done until `Q` becomes empty. 35 | 1. Pop a node from `Q`. Let's call this node `N`. 36 | 2. For all the neighbors of this node, `N`, reduce their in-degree by 1. If any of the nodes' in-degree reaches 0, add it to `Q`. 37 | 3. Add the node `N` to the list maintaining topologically sorted order. 38 | 4. Continue from step 4.1. 39 | 40 | ```kotlin 41 | fun findOrder(numCourses: Int, prerequisites: Array<IntArray>): IntArray { 42 | 43 | val adjacencyList = mutableMapOf<Int, MutableList<Int>>() 44 | val inDegree = IntArray(numCourses) { 0 } 45 | val topologicalOrder = IntArray(numCourses) { 0 } 46 | 47 | prerequisites.forEach { prerequisite -> 48 | val destination = prerequisite[0] 49 | val source = prerequisite[1] 50 | 51 | val list = adjacencyList.getOrDefault(source, mutableListOf()) 52 | list.add(destination) 53 | adjacencyList[source] = list 54 | 55 | inDegree[destination] += 1 56 | } 57 | 58 | val queue = LinkedList<Int>() 59 | 60 | for (index in 0 until numCourses) { 61 | if (inDegree[index] == 0) { 62 | queue.add(index) 63 | } 64 | } 65 | 66 | var i = 0 67 | while (queue.isNotEmpty()) { 68 | val node = queue.remove() 69 | topologicalOrder[i++] = node 70 | 71 | if (adjacencyList.contains(node)) { 72 | adjacencyList[node]!!.forEach { neighbour -> 73 | inDegree[neighbour] -= 1 74 | 75 | if (inDegree[neighbour] == 0) { 76 | queue.add(neighbour) 77 | } 78 | } 79 | } 80 | } 81 | 82 | return when (i) { 83 | numCourses -> topologicalOrder 84 | else -> intArrayOf() 85 | } 86 | } 87 | ``` -------------------------------------------------------------------------------- /Algorithms/Topics/Quick Sort.md: -------------------------------------------------------------------------------- 1 | # Quick Sort 2 | ##
Complexity 3 | | | Best | Worst | 4 | | - | -| - | 5 | | Time | $O(n \cdot log \cdot n)$ | $O(n^2)$ | 6 | | Space | | $O(log \cdot n)$ | 7 | 8 | ## Usecase 9 | In general, prefer [[Merge Sort]] for larger arrays. However, Quick Sort tends to be faster in the real world because of its good cache locality and low constant factors, and despite the worst-case performance being $O(n^2)$, this can generally be avoided by using a pivot picked at random. 10 | 11 | That being said, avoid using Quick Sort on data sets which: 12 | - Contain many duplicates 13 | - Are largely sorted or reverse-sorted 14 | - Can't be held in memory and require many reads from disk (Quick Sort does this with many random reads, whereas [[Merge Sort]] does this sequentially). 15 | 16 | ## Basic Algorithm 17 | 18 | Quicksort == pivot 19 | 20 | A pivot is an item in the array that meets the following 3 conditions when the array is sorted: 21 | 22 | 1) The pivot is in the correct position in the final, sorted array 23 | 2) Items to the left are smaller 24 | 3) Items to the right are larger 25 | 26 | Method: 27 | - Pick a random element and partition the array so that elements to the left of the partition element are smaller, and elements to the right are larger 28 | - This occurs through a series of swaps 29 | - Repeatedly partitioning the array and its subarrays this way yields a sorted array 30 | 31 | In terms of time complexity, Quicksort can be very slow ($O(n^2)$) if the pivot is chosen poorly (e.g. the first element of each subarray).
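
A minimal sketch of the random-pivot idea mentioned above. Note that it uses a Lomuto-style partition for brevity, which is a different scheme from the middle-pivot implementation in the next section; `randomizedQuickSort` is a name made up for this example:

```kotlin
import kotlin.random.Random

// Sorts array[low..high] in place, choosing each pivot uniformly at random.
fun randomizedQuickSort(array: IntArray, low: Int = 0, high: Int = array.size - 1) {
    if (low >= high) return

    // Move a randomly chosen pivot to the end, then partition around it.
    val pivotIndex = Random.nextInt(low, high + 1)
    swap(array, pivotIndex, high)
    val pivot = array[high]

    var boundary = low // everything left of boundary is smaller than the pivot
    for (index in low until high) {
        if (array[index] < pivot) {
            swap(array, index, boundary)
            boundary++
        }
    }
    // Put the pivot into its final sorted position.
    swap(array, boundary, high)

    randomizedQuickSort(array, low, boundary - 1)
    randomizedQuickSort(array, boundary + 1, high)
}

private fun swap(array: IntArray, a: Int, b: Int) {
    val temp = array[a]
    array[a] = array[b]
    array[b] = temp
}

fun main() {
    // Nearly sorted input is the classic bad case for a fixed first-element pivot.
    val data = intArrayOf(1, 2, 3, 4, 5, 5, 0)
    randomizedQuickSort(data)
    println(data.joinToString()) // prints 0, 1, 2, 3, 4, 5, 5
}
```

A random pivot doesn't remove the $O(n^2)$ worst case, it just makes it very unlikely for any particular input ordering.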
32 | 33 | ## Implementation 34 | 35 | ```kotlin 36 | fun quickSort(array: IntArray) { 37 | quickSort(array, 0, array.size - 1) 38 | } 39 | 40 | private fun quickSort(array: IntArray, left: Int, right: Int) { 41 | val index = partition(array, left, right) 42 | 43 | if (left < index - 1) { 44 | // Sort left half 45 | quickSort(array, left, index - 1) 46 | } 47 | if (right > index) { 48 | // Sort right half 49 | quickSort(array, index, right) 50 | } 51 | } 52 | 53 | private fun partition(array: IntArray, left: Int, right: Int): Int { 54 | var leftIndex = left 55 | var rightIndex = right 56 | val middle = (left + right) / 2 57 | val pivot = array[middle] 58 | 59 | while (leftIndex <= rightIndex) { 60 | // Find element that should be on right 61 | while (array[leftIndex] < pivot) leftIndex++ 62 | // Find element that should be on left 63 | while (array[rightIndex] > pivot) rightIndex-- 64 | 65 | // Swap elements and move indices 66 | if (leftIndex <= rightIndex) { 67 | swap(array, leftIndex, rightIndex) 68 | leftIndex++ 69 | rightIndex-- 70 | } 71 | } 72 | 73 | // Return new pivot 74 | return leftIndex 75 | } 76 | 77 | private fun swap(array: IntArray, left: Int, right: Int) { 78 | val temp = array[left] 79 | array[left] = array[right] 80 | array[right] = temp 81 | } 82 | ``` -------------------------------------------------------------------------------- /System Design/Fundamentals/REST API Design.md: -------------------------------------------------------------------------------- 1 | # REST API Design 2 | - REST APIs are designed around `resources`, where any kind of object, data or service can be accessed by the client. 3 | - Each resource has an identifier, which is a URI that uniquely identifies that resource. 4 | - Clients interact with a service by exchanging _representations_ of resources - typically JSON. 5 | 6 | ## Queries 7 | - `GET` retrieves a representation of the resource at the specified URI. 
The body of the response message contains the details of the requested resource. 8 | - `POST` creates a new resource at the specified URI. The body of the request message provides the details of the new resource. Note that POST can also be used to trigger operations that don't actually create resources. 9 | - `PUT` either creates or replaces the resource at the specified URI. The body of the request message specifies the resource to be created or updated. 10 | - `PATCH` performs a partial update of a resource. The request body specifies the set of changes to apply to the resource. 11 | - `DELETE` removes the resource at the specified URI. 12 | 13 | When passing data to a server via `POST` or `PUT`, you can choose between two methods: 14 | - Content body (JSON) 15 | - Query parameters, e.g. `v1/authors?name=orwell&year=1984` 16 | 17 | Generally, content body is used for data that is to be uploaded to or downloaded from the server. Query parameters on the other hand are used to specify the exact data requested. 18 | 19 | For example, when you upload a file, you specify the name, MIME type etc in the body - but when fetching a list of files you use query parameters to filter the list by some property. 20 | 21 | In general, query parameters are a property of the query, not the data. 22 | 23 | ## POST vs PUT 24 | | PUT | POST | 25 | | - | - | 26 | | `PUT` requests store data under the supplied REST URI - i.e. update if existing, add if not
`PUT /questions/{question_id}` | `POST` requests that the origin server accept the enclosed entity as a new subordinate of the resource.
`POST /questions` | 27 | | Idempotent. Retrying a request multiple times has the same effect as making it once. | NOT idempotent. If you retry `n` times, you end up with `n` resources with `n` different URIs created on the server. | 28 | |Use `PUT` when you want to modify a singular resource which is already part of the resources collection. `PUT` replaces a resource in its entirety. Use `PATCH` if you must update a part of the resource.| Use `POST` when you want to add a child resource under the resources collection. | 29 | | Despite idempotency, its response should not be cached.| `POST` is not cacheable unless the response includes the appropriate Cache-Control or Expires headers. However you can redirect to a cacheable resource via a `303`.| 30 | | Always use `PUT` for `UPDATE` operations.| Always use `POST` for `CREATE` operations. | 31 | 32 | ## Versioning 33 | 34 | All APIs should have versioning to allow us to modify the API surface and respond to changes in requirements. This can be done via headers or via the API path, e.g. `someservice.com/1/whatever`. -------------------------------------------------------------------------------- /Android/Architecture Overview.md: -------------------------------------------------------------------------------- 1 | # Architecture Overview 2 | 3 | Android is made up of several layers: 4 | 5 | ### Applications 6 | This is where your app lives, and is the top layer in the software stack. 7 | 8 | ### Application Framework 9 | This layer provides high-level services to applications in the form of Java classes.
This includes: 10 | 11 | - **Activity Manager**, which controls application lifecycle and Activity stacks 12 | - **Content Providers**, which allow applications to publish and share data with other applications 13 | - **Resource Manager**, providing access to resources such as Strings, colors, layouts etc 14 | 15 | ### Binder IPC 16 | The [Binder Inter-Process Communication](https://www.androiddesignpatterns.com/2013/07/binders-window-tokens.html) mechanism allows the Application Framework to cross process boundaries (Android runs all apps with process isolation) and call into the Android system services code. This enables high-level code to interact with the System Server and other remote service components. At the application framework level, this communication is hidden from the developer and things "just work". 17 | 18 | `Intents`, `ContentProviders` and `Messengers` are built on top of Binder, as they enable communication across processes. When a process sends a message to another process, the kernel allocates space in the destination process's memory and copies the message data directly from the sending process. It then queues a message which tells the receiver where the message data is. 19 | 20 | Each `Binder` object has a unique reference in memory and this is used for security as it can act as an identifier token - no one is able to create another `Binder` with the same identifier. For instance, calling `PowerManagerService` to acquire a `WakeLock` requires a token - releasing that `WakeLock` requires the same token. This way, one application or process cannot pretend to be another and interfere. 21 | 22 | A good example of this is in the `View` class, where `getWindowToken()` is called extensively. This is a `Binder` token, and is used to ensure that malicious applications aren't allowed to draw on top of other applications. 23 | 24 | ### Android Runtime 25 | This is part of the third layer along with Android libraries.
ART uses AOT compilation to translate DEX bytecode down to native instructions in Executable & Linkable Format (ELF). 26 | 27 | ### Android Libraries 28 | Encompasses Java-based libraries available to Android apps. This includes packages such as: 29 | - android.app 30 | - android.content 31 | - android.database 32 | - android.os 33 | - android.net 34 | - android.text 35 | - android.view 36 | - android.widget 37 | - android.webkit 38 | 39 | These by and large are Java wrappers over C++ - not a tonne of logic here. 40 | 41 | ### HAL 42 | The Hardware Abstraction Layer sits on top of the kernel, and is responsible for: 43 | 44 | - Camera 45 | - Audio 46 | - Graphics 47 | - And more 48 | 49 | It defines a standard interface for manufacturers to implement which allows Android to be agnostic about lower-level driver implementations. 50 | 51 | ### Linux Kernel and Libraries 52 | On top of the kernel there's a set of libraries in C++ such as WebKit, SQLite, SSL etc. The kernel provides abstraction over device hardware and contains drivers, networking, memory management etc. -------------------------------------------------------------------------------- /Algorithms/Topics/Merge Sort.md: -------------------------------------------------------------------------------- 1 | # Merge Sort 2 | ## Complexity 3 | | | Best | Worst | 4 | | - | -| - | 5 | | Time | $O(n \cdot log \cdot n)$ | $O(n \cdot log \cdot n)$ | 6 | | Space | | $O(n)$ | 7 | 8 | ## Usecase 9 | Merge Sort is preferable to [[Quick Sort]] in a few scenarios. For very large datasets, say 200GB of data, you can use MapReduce to offload computation of Merge Sort across multiple machines. This cannot be done with [[Quick Sort]] because the entire dataset has to be held in memory. 10 | 11 | ## Basic Algorithm 12 | - Divide the array to be sorted in half 13 | - Sort each half 14 | - Merge these arrays back together 15 | 16 | It is this merging function which does all the heavy lifting.
17 | 18 | This version operates by copying all elements from the target array segment into a helper array, and keeping track of where the start of the left and right halves should be. 19 | 20 | We then iterate through the helper, copying the smaller element from each half into the array. Once this is done, we copy any remaining elements into the target array. 21 | 22 | Note that this algorithm works in-place, and doesn't return a new array. 23 | 24 | ## Implementation 25 | 26 | ```kotlin 27 | fun mergeSort(array: IntArray) { 28 | val helperArray = IntArray(array.size) 29 | mergeSort(array, helperArray, 0, array.size - 1) 30 | } 31 | 32 | private fun mergeSort( 33 | array: IntArray, 34 | helperArray: IntArray, 35 | low: Int, 36 | high: Int 37 | ) { 38 | if (low < high) { 39 | val middle = (low + high) / 2 40 | // Sort left half 41 | mergeSort(array, helperArray, low, middle) 42 | // Sort right half 43 | mergeSort(array, helperArray, middle + 1, high) 44 | // Merge left and right 45 | merge(array, helperArray, low, middle, high) 46 | } 47 | } 48 | 49 | private fun merge( 50 | array: IntArray, 51 | helperArray: IntArray, 52 | low: Int, 53 | middle: Int, 54 | high: Int 55 | ) { 56 | // Copy both halves into helper array 57 | for (index in low..high) { 58 | helperArray[index] = array[index] 59 | } 60 | 61 | var helperLeft = low 62 | var helperRight = middle + 1 63 | var current = low 64 | 65 | // Compare two halves, copying smaller element of left and right back 66 | // into the original array 67 | while (helperLeft <= middle && helperRight <= high) { 68 | // If left element is smaller, copy into array 69 | if (helperArray[helperLeft] <= helperArray[helperRight]) { 70 | array[current] = helperArray[helperLeft] 71 | helperLeft++ 72 | } else { 73 | // If right element is smaller, copy into array 74 | array[current] = helperArray[helperRight] 75 | helperRight++ 76 | } 77 | // Increment array 78 | current++ 79 | } 80 | // Right half doesn't need to be copied - it's already 
there! 81 | val remaining = middle - helperLeft 82 | 83 | for (index in 0..remaining) { 84 | array[current + index] = helperArray[helperLeft + index] 85 | } 86 | } 87 | ``` 88 | 89 | Consider `[1, 4, 5, || 2, 8, 9]`. Prior to merging, both the target and helper array will end in `[8, 9]`. Once we copy over four elements, `[1, 2, 4, 5]`, into the target array, the `8` and `9` will still be in place in both arrays. No need to copy. -------------------------------------------------------------------------------- /Data Structures/Trees/Binary Heap.md: -------------------------------------------------------------------------------- 1 | # Binary Heap 2 | A binary heap is a binary tree (not a [[Binary Search Tree]] - there is no left/right ordering between siblings) where we can find the smallest (min-heap) or largest (max-heap) value instantly - it's the root node. 3 | 4 | It follows two rules: 5 | - A parent node must be greater than or equal to (max-heap), or smaller than or equal to (min-heap), both child nodes - this is known as the Heap Property 6 | - Each level of a tree must be filled up completely, except the last level which must be filled from left to right 7 | 8 | Useful when you frequently work with the smallest or largest value in a set. Searching for an arbitrary value is $O(n)$, because the Heap Property says nothing about which subtree holds it.
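To make the two rules concrete, here is a minimal sketch of an array-backed min-heap. The `MinHeap` class and its method names are illustrative assumptions rather than anything from a library; in practice you would normally reach for `java.util.PriorityQueue`. Appending to the end of the list keeps the last level filled from left to right, and the sift-up/sift-down loops restore the Heap Property.

```kotlin
// Illustrative sketch only: MinHeap is a hypothetical class, not a library type.
class MinHeap {
    private val items = ArrayList<Int>()

    val size: Int get() = items.size

    // The smallest value always sits at the root (index 0)
    fun peek(): Int? = items.firstOrNull()

    fun add(value: Int) {
        // Append at the end so the last level fills from left to right
        items.add(value)
        var child = items.size - 1
        // Sift up: swap with the parent while the parent is larger
        while (child > 0) {
            val parent = (child - 1) / 2
            if (items[parent] <= items[child]) break
            swap(parent, child)
            child = parent
        }
    }

    fun poll(): Int? {
        if (items.isEmpty()) return null
        val root = items[0]
        val last = items.removeAt(items.size - 1)
        if (items.isNotEmpty()) {
            // Move the last element to the root, then sift it down
            items[0] = last
            var parent = 0
            while (true) {
                val left = 2 * parent + 1
                val right = left + 1
                var smallest = parent
                if (left < items.size && items[left] < items[smallest]) smallest = left
                if (right < items.size && items[right] < items[smallest]) smallest = right
                if (smallest == parent) break
                swap(parent, smallest)
                parent = smallest
            }
        }
        return root
    }

    private fun swap(a: Int, b: Int) {
        val tmp = items[a]
        items[a] = items[b]
        items[b] = tmp
    }
}
```

Repeatedly calling `poll()` yields the values in ascending order, which is the basis of heap sort.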
9 | 10 | # Array Representations 11 | 12 | Given a Min Heap: 13 | 14 | ```mermaid 15 | graph TB 16 | 8 --> 12 & 23 17 | 12 --> 17 & 31 18 | 17 --> 102 & 18 19 | 23 --> 30 & 44 20 | ``` 21 | 22 | This can be represented with an array: 23 | 24 | ```javascript 25 | [8, 12, 23, 17, 31, 30, 44, 102, 18] 26 | ``` 27 | 28 | To calculate the children of a current node at index $i$: 29 | 30 | child one = $2i + 1$ 31 | child two = $2i + 2$ 32 | 33 | To calculate the parent index of any given node: 34 | 35 | parent = floor$((i - 1) \div 2)$ 36 | 37 | # In Kotlin 38 | Use the `PriorityQueue` class (`java.util.PriorityQueue`): 39 | 40 | ```kotlin 41 | val minHeap = PriorityQueue<ListNode> { a, b -> 42 | a.value - b.value 43 | } 44 | 45 | val maxHeap = PriorityQueue<ListNode> { a, b -> 46 | b.value - a.value 47 | } 48 | 49 | while (minHeap.isNotEmpty()) { 50 | val node = minHeap.poll() 51 | // Do something here 52 | } 53 | ``` 54 | 55 | # Usecase 56 | Classically, these are used to merge `k` sorted [[Linked List]]s in $O(N \log k)$ time, where $N$ is the total number of nodes and `k` is the number of Linked Lists, by keeping only the current head of each list in the heap. The simpler version below adds every node up front, so it runs in $O(N \log N)$. 57 | 58 | ```kotlin 59 | fun mergeKLists(lists: Array<ListNode?>): ListNode? { 60 | val heap = PriorityQueue<ListNode> { a, b -> 61 | a.value - b.value 62 | } 63 | 64 | for (list in lists.filterNotNull()) { 65 | var current: ListNode? = list 66 | while (current != null) { 67 | heap.add(current) 68 | current = current.next 69 | } 70 | } 71 | 72 | val head = ListNode(0) 73 | var current = head 74 | 75 | while (heap.isNotEmpty()) { 76 | val node = heap.poll() 77 | node.next = null 78 | current.next = node 79 | current = current.next!! 
80 | } 81 | 82 | return head.next 83 | } 84 | ``` 85 | 86 | Heaps are also used to help find the median value from a stream in $O(\log n)$ time per insertion: 87 | - A max heap `lo` stores the lower half of the stream 88 | - A min heap `hi` stores the higher half of the stream 89 | 90 | ```kotlin 91 | class MedianFinder { 92 | 93 | // Max heap 94 | private val lo = PriorityQueue<Int> { a, b -> b - a } 95 | // Min heap 96 | private val hi = PriorityQueue<Int> { a, b -> a - b } 97 | 98 | fun addNum(num: Int) { 99 | lo.add(num) 100 | hi.add(lo.poll()) 101 | 102 | if (lo.size < hi.size) { 103 | lo.add(hi.poll()) 104 | } 105 | } 106 | 107 | fun findMedian(): Double = when { 108 | lo.size > hi.size -> lo.peek().toDouble() 109 | else -> (lo.peek() + hi.peek()) * 0.5 110 | } 111 | } 112 | ``` 113 | 114 | # Complexity 115 | - Insert/Delete $O(\log n)$ 116 | - Poll $O(\log n)$ 117 | - Peek $O(1)$ 118 | 119 | -------------------------------------------------------------------------------- /Algorithms/Topics/Range-Sum Query.md: -------------------------------------------------------------------------------- 1 | # Range-Sum Query 2 | 3 | Given a 2D matrix `matrix`, handle multiple queries of the following type: 4 | 5 | 1. Calculate the **sum** of the elements of `matrix` inside the rectangle defined by its **upper left corner** `(row1, col1)` and **lower right corner** `(row2, col2)`. 6 | 7 | Implement the NumMatrix class: 8 | 9 | - `NumMatrix(int[][] matrix)` Initializes the object with the integer matrix `matrix`. 10 | - `int sumRegion(int row1, int col1, int row2, int col2)` Returns the **sum** of the elements of `matrix` inside the rectangle defined by its **upper left corner** `(row1, col1)` and **lower right corner** `(row2, col2)`. 
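As a point of comparison, a brute-force version simply re-sums the rectangle on every query. This sketch is an addition to these notes rather than part of the problem statement, and `NaiveNumMatrix` is a made-up name. Each query costs time proportional to the area of the rectangle, which is the cost a precomputed table of sums avoids; it is also a handy reference when testing an optimised version.

```kotlin
// Hypothetical baseline: re-sums the rectangle on every call.
class NaiveNumMatrix(private val matrix: Array<IntArray>) {

    fun sumRegion(row1: Int, col1: Int, row2: Int, col2: Int): Int {
        var sum = 0
        // Both corners are inclusive, matching the problem statement
        for (row in row1..row2) {
            for (col in col1..col2) {
                sum += matrix[row][col]
            }
        }
        return sum
    }
}
```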
11 | 12 | `sums[i+1][j+1]` represents the sum of the area from `matrix[0][0]` to `matrix[i][j]` 13 | 14 | ``` 15 | +-----+-+-------+ +--------+-----+ +-----+---------+ +-----+--------+ 16 | | | | | | | | | | | | | | 17 | | | | | | | | | | | | | | 18 | +-----+-+ | +--------+ | | | | +-----+ | 19 | | | | | = | | + | | | - | | 20 | +-----+-+ | | | +-----+ | | | 21 | | | | | | | | | 22 | | | | | | | | | 23 | +---------------+ +--------------+ +---------------+ +--------------+ 24 | 25 | sums[i][j] = sums[i-1][j] + sums[i][j-1] - sums[i-1][j-1] + 26 | 27 | matrix[i-1][j-1] 28 | ``` 29 | 30 | So, we use the same idea to find the specific area's sum. 31 | 32 | ``` 33 | +---------------+ +---------+----+ +---+-----------+ +---------+----+ +---+----------+ 34 | | | | | | | | | | | | | | | 35 | | (r1,c1) | | | | | | | | | | | | | 36 | | +------+ | | | | | | | +---------+ | +---+ | 37 | | | | | = | | | - | | | - | (r1,c2) | + | (r1,c1) | 38 | | | | | | | | | | | | | | | 39 | | +------+ | +---------+ | +---+ | | | | | 40 | | (r2,c2)| | (r2,c2)| | (r2,c1) | | | | | 41 | +---------------+ +--------------+ +---------------+ +--------------+ +--------------+ 42 | ``` 43 | 44 | ```kotlin 45 | class NumMatrix(matrix: Array<IntArray>) { 46 | 47 | private val dp = Array(matrix.size + 1) { IntArray((matrix.firstOrNull()?.size ?: 0) + 1) } 48 | 49 | init { 50 | if (matrix.isNotEmpty() && matrix[0].isNotEmpty()) { 51 | 52 | for (row in 0 until matrix.size) { 53 | for (column in 0 until matrix[0].size) { 54 | dp[row + 1][column + 1] = 55 | dp[row + 1][column] + 56 | dp[row][column + 1] + 57 | matrix[row][column] - 58 | dp[row][column] 59 | } 60 | } 61 | } 62 | } 63 | 64 | fun sumRegion(row1: Int, col1: Int, row2: Int, col2: Int): Int { 65 | return dp[row2 + 1][col2 + 1] - 66 | dp[row1][col2 + 1] - 67 | dp[row2 + 1][col1] + 68 | dp[row1][col1] 69 | } 70 | } 71 | ``` -------------------------------------------------------------------------------- /Behavioural/Behavioural Interview Notes.md: 
-------------------------------------------------------------------------------- 1 | # Behavioural Interview Notes 2 | Best resource that I've found is [this YouTube video](https://www.youtube.com/watch?v=PJKYqLP6MRE). 3 | 4 | # Be Genuine 5 | Why do you want to work at the company? Where do you want to take your career? What do you want next? 6 | 7 | # Own your Strengths, Weaknesses, Successes, Failures 8 | Be introspective, don't hide your failures or faults. 9 | 10 | Equally, don't be afraid of discussing your strengths - own them. Being _specific_ helps shift these sorts of responses from arrogant to good answers. 11 | 12 | # Any Questions? 13 | Write your questions down so that you don't forget them. 14 | 15 | What are the tough questions that you want to know about? Don't shy away from these questions. You are interviewing the company back. 16 | 17 | - When people do good work, how does this company recognise this? 18 | - What is the work-life balance like? 19 | - How does the test/release process work? Do you have QA engineers? 20 | 21 | ## Signals 22 | - **Genuineness** 23 | - If they sense dishonesty, it's game over. You have to show genuinely who you are, what you value, what you get excited about and how you are authentic 24 | - _Do not_ try to tailor your answers to what you think the interviewer wants to hear 25 | - **Communication** 26 | - Don't just focus on the technical stuff - interpersonal is extremely important 27 | - Can you effectively communicate your frustrations? 28 | - Can you communicate across teams? 29 | - Have you given critical feedback? 30 | - **Collaboration** 31 | - Will you share your toys? Or are you toxic and steal all the good projects? 32 | - Do you want to foster the people around you and raise them up? 33 | - Are you able to jump into a team and immediately be successful? 34 | - If someone makes a technical decision that you disagree with, do you resent them for it? 
Or do you let it go - the world moves on - being magnanimous when the technical problem appears? 35 | - **Passion** 36 | - How much do you care about your work? Your team? Shipping things? 37 | - Or do you mostly care about status, salary, achievements? 38 | - Are you passionate over a long period? 39 | - What are you passionate about outside of work? Bring that spark to the interview 40 | 41 | It is very easy to signal clear no's - harder to signal clear yes's. It's a no list, not a yes list. An example of this is taking a misstep in your career and blaming it on the circumstances instead of learning and looking inwards. 42 | 43 | That said, these questions aren't gotchas, they're really about where you take them. 44 | 45 | ## Examples of open-ended questions 46 | 47 | What would you bring to a team? What does a team mean to you? Are you a net positive contributor? 48 | 49 | - Talk about a project you worked on 50 | - Talk about a project that you worked on that failed 51 | - Talk about a time when you didn't think you could do something 52 | - Talk about a time when the people around you disagreed about something, and how you resolved it 53 | - What are you bad at? 54 | - Biggest success? 
55 | 56 | ## Advice 57 | 58 | - Checksumming your history 59 | - Trying to understand what your progression has looked like over time 60 | - Making sense of your personal story - know your own story, your projects, your own job history 61 | - Know what you've told your interviewer through your CV 62 | - They're going to want to know the contributions that you've made or led 63 | - "We" is a dangerous word in these interviews 64 | - Don't be humble where it's not applicable 65 | - It's great to say "my team and I shipped `x`", but you _must_ say "I did these things which helped us ship `x`" 66 | - You are not getting your team the job 67 | -------------------------------------------------------------------------------- /System Design/Key Components/Duplication.md: -------------------------------------------------------------------------------- 1 | # Duplication 2 | 3 | The easiest way to add more capacity to a service is to add more instances of the service and have a way of routing, or balancing, requests to them. Creating more instances is a fast and cheap way to scale a stateless service, _as long as you have taken into account the impact on its dependencies_. 4 | 5 | For instance, adding multiple instances of a service is fine - unless all of them rely on the same datastore. This datastore will rapidly become a bottleneck, and so this will require duplication too. 6 | 7 | A critical tool in duplication is the load balancer - see [[Load Balancers]]. 8 | 9 | ## Replication 10 | When the servers involved are stateless, creating new instances and fanning out is simple to achieve. However, if there's state involved, it gets much more difficult to handle and there must be some form of coordination involved. 11 | 12 | Replication is the process of storing a copy of the same data in multiple nodes of the system. Keeping this data in sync is the tricky part. 
13 | 14 | ### Single Leader Replication 15 | This is the most common approach to the problem, where you elect a leader with multiple followers. Clients write directly to one leader, which updates its local state and then replicates this change to its followers. A common way to implement this is a consensus algorithm such as Raft, which handles both leader election and log replication. 16 | 17 | At its most basic, replication can happen either fully synchronously, fully asynchronously, or somewhere in between. 18 | 19 | #### Async 20 | - The leader receives a write request from a client 21 | - The leader asynchronously broadcasts it to the followers 22 | - The leader then responds to the client before the replication has completed 23 | 24 | This approach is fast but not very fault tolerant. If the leader crashes before the followers have all updated, and a follower without the correct state is promoted to leader - you're in trouble. This is a terrible tradeoff. 25 | 26 | There's also a lack of strong consistency here, where a successful write might not be visible to some or all replicas. 27 | 28 | #### Sync 29 | This requires leaders to wait for all replicas to finish writing before telling the client that the request has been successful. 30 | 31 | There is obviously a performance penalty here - the entire request is only as fast as the slowest replica. If a replica is down, then the entire system won't respond - and the more followers/replicas there are, the more likely it is that this will happen. 32 | 33 | As you can see, fully async or fully sync are extremes that have high tradeoffs. Most datastores use some combination of the two strategies. 34 | 35 | #### Multi-Leader Replication 36 | In this strategy, there is more than one node that can accept writes. This strategy is often used when the write throughput is too high for a single machine to handle, or when the leader needs to be highly geographically available. 37 | 38 | The replication between leaders happens asynchronously. 
This means that there can be conflicts between two leaders, as they might have differing data due to data being updated concurrently between them. To resolve these conflicts, there has to be a resolution strategy. 39 | 40 | The simplest solution is to avoid the need for multiple leaders in the first place. For example, if European requests are routed to a European data centre, that centre can have a single leader and no conflicts. The leader could go down, but you can also have a backup in the same region. 41 | 42 | If this isn't possible then conflicts are inevitable. One way to fix conflicts when updating a record is to store the conflicting writes and return them to the next client that reads the record. The client then has to resolve the conflict and update the datastore with the resolution. 43 | 44 | Alternatively, the data store could allow clients to upload a conflict resolution procedure, which can be executed by the datastore when a conflict is detected. 45 | 46 | The datastore could also leverage data structures which provide automatic conflict resolution, such as CRDTs. 47 | 48 | #### Leaderless Replication 49 | Imagine a world where any replica could accept writes from any client - no leaders, and conflict resolution is offloaded to clients. 50 | 51 | This is used in Dynamo-style datastores, and is even more complex than multi-leader replication. -------------------------------------------------------------------------------- /System Design/Key Components/Microservices.md: -------------------------------------------------------------------------------- 1 | # Microservices 2 | 3 | Microservices are a popular approach to building networked systems, in contrast to the now out-of-fashion monolithic model. A microservice architecture involves breaking down your business needs into clearly defined areas of concern, and then building small systems to deal with each specific use case. This is called _functional decomposition_. 
4 | 5 | ## Pros 6 | - With well defined boundaries, microservices provide good levels of decoupling 7 | - These decoupled units are easier to work on in teams without tripping over other people 8 | - Teams can own these services, and engineers don't have to hold so much in their heads 9 | - Both deployment and rolling back changes are far simpler and require less coordination 10 | - Each service can adopt a tech stack or data store which suits its individual needs or requirements 11 | 12 | ## Cons 13 | - Each service having its own data store means that you're more likely to adopt Eventual Consistency sooner rather than later. This isn't a bad thing _per se_, but something that you should be aware of 14 | - Testing these deployments is incredibly difficult 15 | - Can be very expensive 16 | 17 | # The API Gateway 18 | Clients should not be aware of dozens of microservices, so an API Gateway acts as a facade and provides routing, composition and translation: 19 | 20 | ## Routing 21 | API Gateways typically provide a 1:1 mapping between external paths and internal paths. This means that internal changes don't affect clients because the gateway is transparent and unknown to the client. 22 | 23 | ## Composition 24 | Gateways can query multiple services on your behalf and stitch this data back together for you into a useful representation. 25 | 26 | This reduces the number of requests that a client has to make - on mobile this can be expensive/battery draining. 27 | 28 | However, the availability of an API Gateway reduces as the number of internal requests it needs to make goes up, due to the chance of failure for each request. 29 | 30 | ## Translation 31 | A gateway can also provide translation for different types of client - a web client may request more information or more granular information, whereas mobile clients may want just one request with less data. 32 | 33 | They can also provide translation from one IPC mechanism to another - e.g. 
from a RESTful HTTP service to a gRPC one. 34 | 35 | ## Cross-cutting 36 | Because gateways act as a middleman or proxy (NGINX, for example, is a common system for this), they can implement cross-cutting concerns. Without this, many services would have to implement the same logic multiple times - for instance, authentication and authorization. 37 | 38 | Another common example is caching frequently accessed resources, improving performance and reducing the bandwidth requirements for other services. 39 | 40 | ```mermaid 41 | graph LR 42 | Client --> Gateway 43 | Gateway --> A[Auth Service] 44 | A[Auth Service] --> Gateway 45 | Gateway --> B[Service A] 46 | Gateway --> C[Service B] 47 | ``` 48 | 49 | ## Drawbacks 50 | An API Gateway is an eventual bottleneck and must also scale like any other service. 51 | 52 | There is a limit to the composition model. Querying across multiple services can be very expensive, especially when this requires large in-memory joins. 53 | 54 | The datastore might be optimised for one specific thing but not another - for example, a simple relational database would not handle geospatial or graph queries well. 55 | 56 | Equally, the datastore chosen might not scale to handle the number of reads, which is typically orders of magnitude more than the number of writes. To solve these problems, we can use CQRS. 57 | 58 | # CQRS 59 | Command Query Responsibility Segregation is a technique which decouples the read path from the write path of an API. These two paths can have different data models and different datastores backing them that fit the specific needs of reads and writes. 60 | 61 | To keep these two synchronised, the write path posts updates to the read path - this inevitably results in some replication lag/eventual consistency issues. This is also more complex to manage. 
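A toy, in-memory sketch of the idea described above may help. Every name here (`CreateOrder`, `OrderCreated`, `OrderCommandService`, `CustomerSpendView`) is a hypothetical example, and the plain list standing in for a message queue is an assumption for illustration only. The write path stores commands in its own model and emits events; the read path folds those events into a view shaped for a single query.

```kotlin
// Hypothetical CQRS sketch: all names are illustrative, nothing here is a real API.
data class CreateOrder(val orderId: String, val customer: String, val totalPence: Int)

// Event the write path publishes so the read path can catch up
data class OrderCreated(val orderId: String, val customer: String, val totalPence: Int)

// Write path: validates commands, stores the source of truth, queues events
class OrderCommandService {
    private val orders = mutableMapOf<String, CreateOrder>()
    private val outbox = mutableListOf<OrderCreated>()

    fun handle(command: CreateOrder) {
        require(command.totalPence >= 0) { "Total must not be negative" }
        orders[command.orderId] = command
        outbox.add(OrderCreated(command.orderId, command.customer, command.totalPence))
    }

    // Hands over pending events; a real system would use a queue or log instead
    fun drainEvents(): List<OrderCreated> {
        val events = outbox.toList()
        outbox.clear()
        return events
    }
}

// Read path: a denormalised view optimised for one query
class CustomerSpendView {
    private val spendByCustomer = mutableMapOf<String, Int>()

    fun apply(event: OrderCreated) {
        spendByCustomer[event.customer] =
            (spendByCustomer[event.customer] ?: 0) + event.totalPence
    }

    fun totalSpend(customer: String): Int = spendByCustomer[customer] ?: 0
}
```

Until the drained events have been applied, `totalSpend` returns a stale answer. That window is the replication lag mentioned above.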
62 | 63 | ```mermaid 64 | graph TD 65 | Client -- CRUD operations --> Gateway 66 | Gateway -- Reads --> C[Service C] 67 | Gateway -- CUD operations --> A[Service A] 68 | Gateway -- CUD operations --> B[Service B] 69 | B[Service B] -...- Replicate -.-> C[Service C] 70 | A[Service A] -...- Replicate -.-> C[Service C] 71 | A[Service A] --> db1[DB] 72 | B[Service B] --> db2[DB] 73 | C[Service C] --> db3[DB] 74 | ``` -------------------------------------------------------------------------------- /Algorithms/Topics/Knapsack Problem.md: -------------------------------------------------------------------------------- 1 | # Knapsack Problem 2 | ## Problem 3 | You're given an array of arrays, where each array represents an item. The first value in that array is the value, the second is the weight. 4 | 5 | Fit the items in your knapsack without having the sum of the weights exceed the knapsack's capacity. Your goal is to maximise the combined value. 6 | 7 | Return the maximised combined value and the indices of each item. 8 | 9 | ## Example 10 | ```javascript 11 | items = [[1, 2], [4, 3], [5, 6], [6, 7]] 12 | capacity = 10 13 | 14 | output = [10, [1, 3]] // e.g. [4, 3] and [6, 7] 15 | ``` 16 | 17 | ## Implementation 18 | We can construct a 2D array to represent the capacity of the knapsack and the list of items. Each cell therefore contains the maximum value of the current combination. 
19 | 20 | $x$ axis = capacity 21 | $y$ axis = items 22 | 23 | | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 24 | | - | - | - | - | - | - | - | - | - | - | - | - | 25 | | [] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26 | | [1, 2] | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 27 | | [4, 3] | 0 | 0 | 1 | 4 | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 28 | | [5, 6] | 0 | 0 | 1 | 4 | 4 | 5 | 5 | 5 | 6 | 9 | 9 | 29 | | [6, 7] | 0 | 0 | 1 | 4 | 4 | 5 | 5 | 6 | 6 | 9 | 10 | 30 | 31 | Remember that for dynamic programming, the base case is often `0` or an empty array, so your array should be init'd one larger than you might normally. 32 | 33 | ```kotlin 34 | fun knapsack(items: List<List<Int>>, capacity: Int): Pair<Int, List<Int>> { 35 | val array = List(items.size + 1) { MutableList(capacity + 1) { 0 } } 36 | // ... 37 | } 38 | ``` 39 | 40 | This also means that fetching any values from this array typically involves accessing `index - 1`, as each value is actually shifted due to adding the `0` and `[]` base cases. 41 | 42 | For array `values`, where `w` is the weight of the item at `i` and `v` is the value of the current item at `i`: 43 | 44 | ```javascript 45 | if (w <= j) 46 | values[i][j] = max(values[i - 1][j], values[i - 1][j - w] + v) 47 | else 48 | values[i][j] = values[i - 1][j] 49 | ``` 50 | 51 | Once the array is constructed, you'll need to work out which items were actually added to create this maximal value. To do this, you backtrack starting at the bottom right-most item. In the example above, this is `10`. 52 | 53 | ```javascript 54 | max = values[items.size][capacity] 55 | ``` 56 | 57 | Compare this with the value directly above. If the value is the same, move one step up because no item was added. 58 | 59 | If the value at your current index is *greater*, add the current position to the outputs, and then move both one row up and the weight of the current item backwards. In this example, we would move to 4 (`values[3, 3]`). 
60 | 61 | Continue repeating these steps until reaching a capacity of `0`. In this example, the steps look like: 62 | 63 | | | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 64 | | - | - | - | - | - | - | - | - | - | - | - | - | 65 | | [] | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 66 | | [1, 2] | **0** | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 67 | | [4, 3] | 0 | 0 | 1 | **4** | 4 | 5 | 5 | 5 | 5 | 5 | 5 | 68 | | [5, 6] | 0 | 0 | 1 | **4** | 4 | 5 | 5 | 5 | 6 | 9 | 9 | 69 | | [6, 7] | 0 | 0 | 1 | 4 | 4 | 5 | 5 | 6 | 6 | 9 | **10** | 70 | 71 | ```javascript 72 | values = [10, 4, 4, 0] 73 | positions = [values[4, 10], values[3, 3], values[2, 3], values[1, 0]] 74 | items = [[6, 7], [4, 3]] 75 | ``` 76 | 77 | ## Complexity 78 | $O(n \cdot c)$ time and space, where $n$ is the number of items and $c$ is the capacity of the knapsack. 79 | 80 | ## Implementation 81 | ```kotlin 82 | fun knapsackProblem(items: List<List<Int>>, capacity: Int): Pair<Int, List<Int>> { 83 | val array = List(items.size + 1) { MutableList(capacity + 1) { 0 } } 84 | 85 | for (item in 1 until items.size + 1) { 86 | val (value, weight) = items[item - 1] 87 | 88 | for (cap in 0 until capacity + 1) { 89 | val previous = array[item - 1][cap] 90 | if (weight > cap) { 91 | array[item][cap] = previous 92 | } else { 93 | val potentialMax = array[item - 1][cap - weight] + value 94 | array[item][cap] = Math.max(previous, potentialMax) 95 | } 96 | } 97 | } 98 | 99 | val maximisedValue = array[items.size][capacity] 100 | 101 | return maximisedValue to reconstruct(array, items) 102 | } 103 | 104 | private fun reconstruct( 105 | array: List<List<Int>>, 106 | items: List<List<Int>> 107 | ): List<Int> { 108 | val results = mutableListOf<Int>() 109 | 110 | var i = array.size - 1 111 | var c = array[0].size - 1 112 | 113 | while (i > 0) { 114 | if (array[i][c] == array[i - 1][c]) { 115 | i-- 116 | } else { 117 | results.add(i - 1) 118 | val (_, weight) = items[i - 1] 119 | c -= weight 120 | i-- 121 | } 122 | 123 | if (i == 0) break 124 | } 125 | 126 | return results 127 
| } 128 | ``` 129 | -------------------------------------------------------------------------------- /System Design/Contents.md: -------------------------------------------------------------------------------- 1 | # Contents 2 | [[Mobile System Design]] 3 | 4 | ## TL;DR 5 | - Write down as much as possible 6 | - Write-talk-write-talk if necessary 7 | - Concise and neat is better than quick and illegible 8 | - Engage with the interviewer 9 | - describe what you’re doing 10 | - ask questions 11 | - signpost frequently 12 | - check understanding 13 | - handle interruptions gracefully 14 | - Mention relevant tech 15 | - but don’t bluff 16 | - and don’t let it take over the interview 17 | - Track and mention considerations 18 | - like security, accessibility, and testing 19 | 20 | ## Interview Overview 21 | 22 | Systems design interviews are very different to coding interviews - there's no 100% correct or 100% wrong answer. Because of this, you must be able to defend your position, or adjust it as new data emerges - convincing your interviewer that your solution is thought through. 23 | 24 | Questions are intentionally very vague - sometimes just two words. "Design Facebook", "design YouTube" etc. Consequently, it's extremely important that you ask a tonne of clarifying questions with your interviewer to help narrow down the scope of the system and tease out which components they're expecting to see. 25 | 26 | Systems design interviews can be thought of in 4 broad sections: 27 | 28 | ### Foundational knowledge 29 | This is the core stuff, without it you really can't bullshit your way through these interviews. We're talking fundamentals, such as: 30 | - [[Client-Server Model]] 31 | - [[Network Protocols]] 32 | - [[Storage]] 33 | 34 | ### Key Characteristics 35 | You need to decide what the key characteristics of your system are going to be, as this informs the technical choices you make later and it makes it clear to the interviewer that you've thought about the system. 
36 | 37 | Examples include: 38 | - [[Latency and Throughput]] 39 | - [[Availability]] 40 | - [[Hashing]] 41 | - [[Strong vs Eventual Consistency]] 42 | - Leader election 43 | - Rate limiting 44 | 45 | ### Key Components 46 | Once you've decided what your system's key characteristics are, which subsystems are you going to use, in what combinations, to fulfill these requirements? 47 | 48 | Examples might be: 49 | - [[Caching]] 50 | - [[Proxies]] 51 | - [[Load Balancers]] 52 | - [[Relational Databases]] 53 | - [[Microservices]] 54 | - [[Microservices#The API Gateway]] 55 | - [[Partitioning]] 56 | - [[Duplication]] 57 | 58 | ### Actual Tech 59 | This is where you get to show that you actually have experience with this stuff. What existing tech out there fulfills your requirements? 60 | 61 | Tools include: 62 | - Amazon S3 63 | - Google Cloud Storage 64 | - Redis 65 | - Nginx 66 | - MapReduce 67 | - Zookeeper 68 | - Kubernetes 69 | 70 | # Strategies 71 | See [here](https://cternus.net/blog/2018/01/26/tackling-the-system-design-interview/) for more information. 72 | 73 | 1) Immediately write down the question in the top corner of the board so that you don't forget and can go back to it. At this stage, write down any constraints that have been mentioned. 74 | 2) Ask clarifying questions - write down any revealed constraints in bullet point form 75 | 3) Think through the broad strokes of the solution and write this on the board. "I'm going to draw out a basic solution here as a starting point". 76 | 4) Drill down into individual sections, changing things as you go. Describe how it works and the tradeoffs that you're making, as well as the failure modes. 77 | 1) Keep a list of "considerations" on one side of the board. Add to these as you remember facets that haven't been tackled yet, and cross these out as you sort them. 78 | 5) Write down a timeline of an interaction with the system, if appropriate 79 | 6) Towards the end, summarise your system. 
Are there any outstanding issues that you would have tackled if you had more time? 80 | 81 | ### Write. Down. Everything. 82 | You cannot write down too much. Interviewers won't be paying 100% attention and may miss things, so write as much as you can. 83 | 84 | ### Interact with your interviewer! 85 | 86 | Think of yourself as a tour guide walking the interviewer through your solution. 87 | 88 | - Does that seem reasonable? 89 | - Should I discuss X or Y next? 90 | - What would you like to see more of? 91 | - Does it seem like I’m missing anything? 92 | 93 | # Considerations 94 | - Security 95 | - Who has access? How much access? Are there bad actors? How might the system be attacked? How does authentication work? 96 | - Compliance 97 | - GDPR, medical data, payment data 98 | - Accessibility 99 | - Physical and mental disabilities, cultural/internationalisation 100 | - Backups 101 | - How will backups occur? How much data could you afford to lose? 102 | - Testing, monitoring, alerting 103 | - How will you verify that your solution is correct? How will you know that it's working? If the system goes down, who fixes it? 104 | -------------------------------------------------------------------------------- /System Design/Mobile/Mobile System Design.md: -------------------------------------------------------------------------------- 1 | # Mobile System Design Interviews 2 | 3 | - What are we being asked to design? 4 | - Who is the user, and how will they use our system? 5 | - What's the initial number of users? And the expected growth? 6 | - Are we being given an initial design/wireframes, or should we produce them as well? 7 | - Are we designing an MVP or final-product? 8 | - Are we building this from scratch, or can we leverage any existing components? Any existing patterns/architecture we should follow? 9 | - How big is the team who will implement and maintain our system? 
10 | - Are we expected to design just the mobile application or other parts of the overall system too (e.g. API)? 11 | - Is it iOS or Android only, or cross-platform? Shall we support smartphones, tablets, both? 12 | 13 | ## Design Requirements 14 | - Client-side only 15 | - Client-side + API 16 | - Client-side + API + Backend 17 | 18 | Don't forget to gather key **functional**, **non-functional** and **out of scope** requirements. 19 | 20 | For example, in the case of Twitter: 21 | 22 | - Functional 23 | - Users should be able to scroll through an infinite list of tweets 24 | - Users should be able to click on a tweet to see replies 25 | - Users should be able to like a tweet 26 | - Non-functional 27 | - Offline support 28 | - Real-time notifications 29 | - Optimal bandwidth/CPU usages 30 | - Out of Scope 31 | - Login/Authentication 32 | - Sending Tweets 33 | - Followers/Retweets 34 | 35 | ## Identify Technical Requirements 36 | - How many users do we expect? 37 | - How big is the engineering team? 38 | 39 | ### Networking 40 | - REST API? How is it provided and what does it look like? 41 | - Any live updates? Websockets, polling, push notifications - tradeoffs 42 | 43 | ### Security 44 | - Authentication - how will your design verify who the user of your app is? 45 | - Storing sensitive data - will you save credentials? Access tokens? Refresh tokens? PII? Keychain/KeyStore 46 | - Secure communications - cert pinning, TLS 47 | 48 | ### Availability 49 | - Will the app support offline mode? If so, what solution will you use? How will you handle going online again? 50 | - Caching for images or other media - what's the cleanup policy (LRU) 51 | 52 | ### Scalability 53 | How will you build an app that can be worked on by dozens of engineers? 54 | 55 | - Modularisation for features 56 | - Breaking UI into standard components/design systems 57 | 58 | ### Performance 59 | - Any UI-heavy operations such as animations? How do you ensure no jank? 
- How do we load heavy resources like images asynchronously? What are the bottlenecks and challenges of your approach?

### Testing
- Explain your testing strategy - pyramid of tests
- Highlight strength of architecture - how easy is it to test an individual component?
- Use of DI

### Monitoring
- Crash reporting
- Analytics
- Performance monitoring
- Breadcrumbs

### Deployment
How do you foresee the app going live?
- CI/CD pipeline with automated releases
- Leveraging feature flags

# High-Level Solution
- Draw the main screens as boxes, describing the main content
- Go over the flow, adding arrows to represent the user journey
- Add more details to discuss each screen's composition - main UI elements, reusable components etc

# Principal Systems
(If required)

- Mobile clients
- API services
- Backend apps
- Databases
- Notification services

# Define Basic Data Entities
- User object
- Posts/Stories/Whatever

# Describe Primary Endpoints
- POST `v1/auth` etc - don't forget versioning!
- Idempotency keys - can be a header
    - Cached by server, so subsequent requests with the same key get the same result (or error)
    - Makes sense for `POST` & `PUT` & `DELETE` & `PATCH` - `GET` is idempotent anyway
    - See https://miro.medium.com/max/700/1*GhDYXaU9DSNEHXtR0x7GmA.png
- Describe input parameters, expected outputs

# Describe Client Architecture
- Presentation layer
    - MVVM, Coordinator, Activities + Fragments
- Domain layer
    - Use cases to combine data from the user and repositories
- Data layer
    - Repositories to coordinate fetching data from the network, caching, disk etc
- Helper services
    - Notifications
    - Networking
    - Session service for user info
    - Credentials store

# Quality of Service
Assign a QoS to each of your network requests:
* **User-critical** - should be dispatched as fast as possible. Fetching the next page of data in a feed, for instance.
* **UI-critical** - dispatched after user-critical. Fetching thumbnails while scrolling, but could be cancelled if the user scrolls past the target tweet. May be delayed in the case of fast scrolling.
* **UI-non-critical** - dispatched after UI-critical. High-res images for the feed. Cancelled if the user scrolls past the target tweet.
* **Background** - should be dispatched after all of the above have finished. This can include posting "likes" and analytics.

Use a priority queue for scheduling network requests and dispatch in the order of their priority. Suspend low-priority requests if the maximum number of concurrent operations is reached and a high-priority request is scheduled.

# Deep-Dive
- Choose the most interesting screen and draw its architecture.
  Cover all layers, from UI components to ViewModels, repositories, endpoints, network layer, local store etc
- Trace the dependencies from the caller
- Walk over the flow - what does the user see at every step? Don't forget view states! Loading, Error, Data, No Data
- Think about the most challenging parts
    - Real-time updates
    - Image caching
    - Reusing data
    - Buffering requests

--------------------------------------------------------------------------------
/Data Structures/Trees/Depth-First Search.md:
--------------------------------------------------------------------------------
# Depth-First Search
In a Depth-First Search, an algorithm starts at the root of the tree and then explores as far along each branch as possible before backtracking.

This can be implemented using pre-order, in-order and post-order traversal.

```mermaid
graph TB
    1 --> 2 & 7 & 8
    2 --> 3 & 6
    3 --> 4 & 5
    8 --> 9 & 12
    9 --> 10 & 11
```

### Usecases
* Finding connected components in a graph
* Topological sorting in a DAG
* Finding 2/3 (edge or vertex) connected components
* Finding the bridges of a graph
* Finding strongly connected components
* Solving puzzles with only one solution such as mazes
* Finding biconnectivity in graphs

## In-Order
Given a tree:

```mermaid
graph TB
    1 --> 2 & 3
    2 --> 4 & 5
```

* Left -> root -> right
* 4 2 5 1 3
* In a [[Binary Search Tree]], the left subtree holds the smaller values, so visiting it first yields the values in ascending order (i.e. it's in-order!)

### Implementation

#### Recursive

```javascript
// visit() stands in for whatever you want to do with each value
function in_order(tree_node) {
    if (tree_node === null) return;

    in_order(tree_node.left);
    visit(tree_node.data);
    in_order(tree_node.right);
}
```


#### Iterative

1) Create an empty stack.
2) Initialize current node as root
3) Push the current node to the stack and set current = current -> left until current is NULL
4) If current is NULL and stack is not empty then
    a) Pop the top item from stack.
    b) Print the popped item, set current = popped_item -> right
    c) Go to step 3.
5) If current is NULL and stack is empty then we are done.

```kotlin
import java.util.LinkedList

fun inOrderTraverse(tree: BST) {
    val stack = LinkedList<BST>()
    var current: BST? = tree

    while (stack.isNotEmpty() || current != null) {
        while (current != null) {
            // Place a pointer to the tree node on the stack
            // before traversing left subtree
            stack.push(current)
            current = current.left
        }

        current = stack.pop()

        // Do something with the value
        val value = current.value

        // We've visited the node and its left subtree
        // Time to visit the right instead!
        current = current.right
    }
}
```

Note that if you want to print the tree in **reverse** order, you simply swap `left` for `right`. This is useful in questions where you have to find the `k`th largest value.

## Preorder
Given a tree:

```mermaid
graph TB
    1 --> 2 & 3
    2 --> 4 & 5
```

* Root -> left -> right
* 1 2 4 5 3
* Used to make a copy of a tree
* Root is visited first
* Implement with a Stack

### Implementation

#### Recursive

```javascript
// visit() stands in for whatever you want to do with each value
function pre_order(tree_node) {
    if (tree_node === null) return;

    visit(tree_node.data);
    pre_order(tree_node.left);
    pre_order(tree_node.right);
}
```

#### Iterative
1) Create an empty stack and push root node to stack.
2) Do the following while the stack is not empty.
    a) Pop an item from the stack and print it.
    b) Push right child of the popped item to stack
    c) Push left child of the popped item to stack

The right child is pushed before the left child to make sure that the left subtree is processed first.

```kotlin
import java.util.LinkedList

fun preOrderTraverse(tree: BST) {
    val stack = LinkedList<BST>()
    stack.push(tree)

    while (stack.isNotEmpty()) {
        val node = stack.pop()

        // Do something with the value
        val value = node.value

        // Right first, so that left ends up on top of the stack
        if (node.right != null) stack.push(node.right)
        if (node.left != null) stack.push(node.left)
    }
}
```

## Postorder
Given a tree:

```mermaid
graph TB
    1 --> 2 & 3
    2 --> 4 & 5
```

* Left -> right -> root
* 4 5 2 3 1
* Used for tree deletion
* Root is visited last

### Implementation

#### Recursive

```javascript
// visit() stands in for whatever you want to do with each value
function post_order(tree_node) {
    if (tree_node === null) return;

    post_order(tree_node.left);
    post_order(tree_node.right);
    visit(tree_node.data);
}
```

#### Iterative
1) Push root to first stack.
2) Loop while first stack is not empty
    a) Pop a node from first stack and push it to second stack
    b) Push left and right children of the popped node to first stack
3) Print contents of second stack

```kotlin
import java.util.LinkedList

fun postOrderTraverse(tree: BST) {
    val stack1 = LinkedList<BST>()
    val stack2 = LinkedList<BST>()
    stack1.push(tree)

    while (stack1.isNotEmpty()) {
        // Pop from first stack, push to second stack
        val node = stack1.pop()
        stack2.push(node)

        if (node.left != null) stack1.push(node.left)
        if (node.right != null) stack1.push(node.right)
    }

    while (stack2.isNotEmpty()) {
        val node = stack2.pop()

        // Do something with the value
        val value = node.value
    }
}
```

# Tips
Don't forget that [[Depth-First Search]] can be used on matrices. When doing so, you may have to iterate through offsets - i.e. access the row above/below and the column left/right:

```kotlin
private val RowOffsets = intArrayOf(0, 1, 0, -1)
private val ColumnOffsets = intArrayOf(1, 0, -1, 0)

for (offset in 0 until 4) {
    val ret = backtrack(
        board,
        target,
        row + RowOffsets[offset],
        column + ColumnOffsets[offset],
        index + 1
    )

    if (ret) break
}
```

--------------------------------------------------------------------------------
/Data Structures/Trees/Binary Search Tree.md:
--------------------------------------------------------------------------------
# Binary Search Trees

A binary search tree is a Tree data structure where:

- Each node has at most two children
- Child nodes to the left of the parent must be smaller in value
- Child nodes to the right of the parent must be larger in value

```mermaid
graph TB
    10 --> 6 & 18
    6 --> 4 & 8
    18 --> 15 & 21
```

# Time Complexity
| Operation | Average | Worst |
|-|-|-|
| Space | $O(n)$ | $O(n)$ |
| Search | $O(\log n)$ | $O(n)$ |
| Insert | $O(\log n)$ | $O(n)$ |
| Delete | $O(\log n)$ | $O(n)$ |

### Basic search
Due to these rules, it's very easy to find a particular key/value within a tree:

```javascript
function find_node(binary_tree, value) {
    let node = binary_tree.root_node;

    while (node) {
        if (node.value === value) return node;
        // Larger values live to the right, smaller to the left
        node = value > node.value ? node.right : node.left;
    }

    return null;
}
```

### Insertion
To insert an item, we search the binary tree for the value we want to insert. We take the last node that we explored in the search, and make its right or left pointer point to the new node:

```javascript
function insert_node(binary_tree, new_node) {
    // An empty tree has no last node to attach to
    if (!binary_tree.root_node) {
        binary_tree.root_node = new_node;
        return;
    }

    let node = binary_tree.root_node;
    let last_node = null;

    while (node) {
        last_node = node;
        node = new_node.value > node.value ? node.right : node.left;
    }

    if (new_node.value > last_node.value) {
        last_node.right = new_node;
    } else {
        last_node.left = new_node;
    }
}
```

### Deletion
There are lots of edge cases here:

- Search the tree as you normally would by searching left or right - but keep track of the parent
- Once the value is found:
    - If the node has two children:
        - Set the value of the node to the smallest value in its right subtree
        - Call remove on the right subtree for that value
    - If the parent node is null
        - It's the root!
        - If the left child isn't null
            - Set the value of the current node to that of the left child
            - Set the right node to the left child's right node
            - Set the left node to the left child's left node
        - If the right child isn't null
            - Set the value of the current node to that of the right child
            - Set the left node to the right child's left node
            - Set the right node to the right child's right node
        - If both children are null
            - Null out the value - you're done
    - If the current node is equal to the parent's left node
        - Set the parent node's left node to the current left ?: right node
    - If the current node is equal to the parent's right node
        - Set the parent node's right node to the current left ?: right node

### Balancing
Inserting nodes into a BST in an unlucky order (already sorted input, for instance) will result in a tree that's too tall, where many nodes have only one child - we can end up with something that looks more like a [[Linked List]]. This is inefficient - the taller the tree, the longer the average path between nodes in the tree.

A perfectly balanced tree has the minimum possible height.

```javascript
// Assumes `nodes` is sorted by value; BinaryTree is a placeholder node type
function build_balanced(nodes) {
    if (nodes.length === 0) return null;

    const middle = Math.floor(nodes.length / 2);
    const left = nodes.slice(0, middle);      // end index is exclusive
    const right = nodes.slice(middle + 1);

    const balanced = new BinaryTree(nodes[middle]);
    balanced.left = build_balanced(left);
    balanced.right = build_balanced(right);

    return balanced;
}
```

* Maximum height - i.e. a [[Linked List]] - is $n$
* Minimum height is $\log n$
* Complexity of searching a tree is proportional to its height
* In the worst case, searching a tree requires searching all the way down to the deepest nodes, therefore searching is an $O(\log n)$ operation in a balanced tree and an $O(n)$ operation in a degenerate one.
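
To make the two height extremes concrete, here is a minimal sketch. The `Node`, `insert`, `height` and `buildByInsertion` names are hypothetical and exist only for this illustration; heights are counted in edges, so an empty tree has height `-1`.

```kotlin
// Hypothetical minimal node type, for this sketch only
class Node(val value: Int) {
    var left: Node? = null
    var right: Node? = null
}

// Plain BST insertion with no re-balancing
fun insert(root: Node?, value: Int): Node {
    if (root == null) return Node(value)
    if (value > root.value) {
        root.right = insert(root.right, value)
    } else {
        root.left = insert(root.left, value)
    }
    return root
}

// Height in edges: empty tree is -1, a single node is 0
fun height(node: Node?): Int =
    if (node == null) -1 else 1 + maxOf(height(node.left), height(node.right))

fun buildByInsertion(values: List<Int>): Node? {
    var root: Node? = null
    for (value in values) root = insert(root, value)
    return root
}

fun main() {
    // Sorted input: every node only gets a right child, like a linked list
    println(height(buildByInsertion((1..15).toList()))) // 14

    // Same values, middle-first: a perfectly balanced tree
    val middleFirst = listOf(8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15)
    println(height(buildByInsertion(middleFirst))) // 3
}
```

The same 15 values give a height of 14 or 3 depending purely on insertion order, which is why the worst case for an unbalanced tree is linear.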

Tree re-balancing is an expensive operation - re-balancing a tree after every insertion or deletion would rapidly get out of hand.

To handle the scenario where writes happen frequently, **self balancing trees** were invented - specifically the [[Red-Black Tree]] and [[AVL Tree]].

A B-Tree is often used for magnetic data storage - here we want more than two children per node so that we can efficiently work with data in large chunks.

### Constructing a balanced tree
If the input array is sorted, take the middle element and make it the root.

Slice the array into two - left and right sides. Take the middle element of each and make them the left and right children, respectively.

Carry on doing this until you've run out of values.

```kotlin
class BST(var value: Int) {
    var left: BST? = null
    var right: BST? = null
}

// Assumes a non-empty, sorted array
fun minHeightBst(array: List<Int>): BST {
    return construct(array, 0, array.size - 1)!!
}

fun construct(array: List<Int>, start: Int, end: Int): BST? {
    if (end < start) return null

    val mid = (start + end) / 2
    val bst = BST(array[mid])
    bst.left = construct(array, start, mid - 1)
    bst.right = construct(array, mid + 1, end)

    return bst
}
```

### Compute If Balanced
A [[Binary Search Tree]] is balanced if, for every node, the difference in height between the left and right subtree is no greater than 1. Note that the height of a `null` tree is generally `-1`.

```kotlin
fun isBalanced(root: BST?): Boolean {
    return checkHeight(root) != Int.MIN_VALUE
}

// Returns the height of the tree, or Int.MIN_VALUE if it is unbalanced
private fun checkHeight(root: BST?): Int {
    if (root == null) return -1

    val leftHeight = checkHeight(root.left)
    if (leftHeight == Int.MIN_VALUE) return Int.MIN_VALUE

    val rightHeight = checkHeight(root.right)
    if (rightHeight == Int.MIN_VALUE) return Int.MIN_VALUE

    val diff = leftHeight - rightHeight
    if (Math.abs(diff) > 1) {
        return Int.MIN_VALUE
    } else {
        return Math.max(leftHeight, rightHeight) + 1
    }
}
```

--------------------------------------------------------------------------------
/Common Strategies.md:
--------------------------------------------------------------------------------
# Common Strategies
https://hackernoon.com/14-patterns-to-ace-any-coding-interview-question-c5bb3357f6ed

## Cheat Sheet
>1. IF sorted THEN (binary search OR two pointer)
>2. IF all permutations/subsets THEN backtracking
>3. IF tree THEN (recursion OR two pointer OR obvious recursion below)
>4. IF graph THEN dfs/bfs
>5. IF linkedlist $O(1)$ space THEN two pointer
>6. IF obvious recursion problem but recursion banned THEN stack
>7. IF options (+1 or +2) OR min/max + previously made choices THEN DP
>8. IF k items THEN heap
>9. IF common strings THEN (map OR trie)
>10. ELSE (map/set for $O(n)$ time $O(n)$ space or sort for $O(n \log n)$ time $O(1)$ space)

## Multiple pointers
When iterating through arrays, strings or Linked Lists, having two pointers (a fast one and a slow one, or one high and one low) is often part of the solution.

## Iterate in reverse
Many solutions involve either traversing an array from the right-hand side or adding elements to an array from the end. This is especially common with `String` related questions.
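
As an illustration of iterating in reverse, here is a minimal sketch that adds two non-negative integers given as decimal strings, walking both from the right so that the carry flows naturally. The `addStrings` name is hypothetical, and the sketch assumes both inputs are non-empty and contain only the digits `0` to `9`.

```kotlin
// Assumes a and b are non-empty strings of decimal digits
fun addStrings(a: String, b: String): String {
    val result = StringBuilder()
    var i = a.length - 1
    var j = b.length - 1
    var carry = 0

    // Walk both strings from the least significant digit
    while (i >= 0 || j >= 0 || carry > 0) {
        val digitA = if (i >= 0) a[i] - '0' else 0
        val digitB = if (j >= 0) b[j] - '0' else 0
        val sum = digitA + digitB + carry

        result.append(sum % 10)
        carry = sum / 10
        i--
        j--
    }

    // Digits were appended least significant first
    return result.reverse().toString()
}

fun main() {
    println(addStrings("999", "1")) // 1000
}
```

Appending to the end and reversing once keeps each step constant-time, rather than inserting at the front of the string on every iteration.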

## Sorting
Could the problem be made any simpler by pre-sorting the inputs? This tends to enable [[Binary Search]] - which can be used on more than just numbers!

## Backtracking
This strategy is used when the solution is a series of choices, and each choice constrains subsequent choices. These types of problems are known as Constraint Satisfaction Problems (CSPs), and are generally solved with recursion.

Backtracking problems can be thought of as trees where decisions are made at each node. When we evaluate a new node and decide it's not for us, we backtrack to the parent and investigate the next node. In doing so, we are _pruning_ the recursion tree - rather than enumerating all possible choices, we're eagerly rejecting invalid decisions.

```mermaid
graph TD
    A --> N
    A ==> I
    N --> T
    N --> D
    I ==> M
    I --> R

    T --> id1(ANT)
    D --> id2(AND)
    M ==> id3(AIM)
    R --> id4(AIR)
```

### General Algorithm
```javascript
// find_solution, output, is_valid, place and remove are problem-specific placeholders
function backtrack(candidate) {
    if (find_solution(candidate)) {
        output(candidate);
        return;
    }

    // iterate all possible candidates.
    for (const next_candidate of list_of_candidates) {
        if (is_valid(next_candidate)) {
            // try this partial candidate solution
            place(next_candidate);
            // given the candidate, explore further.
            backtrack(next_candidate);
            // backtrack
            remove(next_candidate);
        }
    }
}
```

### Example
Find all valid combinations of `k` numbers that sum up to `n` such that the following conditions are true:

- Only numbers `1` through `9` are used.
- Each number is used **at most once**.

Return _a list of all possible valid combinations_. The list must not contain the same combination twice, and the combinations may be returned in any order.

```kotlin
import java.util.LinkedList

fun combinationSum3(k: Int, target: Int): List<List<Int>> {
    val results = mutableListOf<List<Int>>()
    backtrack(target, k, 0, results, LinkedList())
    return results
}

private fun backtrack(
    remain: Int,
    k: Int,
    nextStart: Int,
    results: MutableList<List<Int>>,
    combinations: LinkedList<Int>
) {
    // Base case where nothing remains having chosen k integers
    if (remain == 0 && combinations.size == k) {
        results.add(combinations.toList())
        return
    // Base case where we missed the target with k integers
    } else if (remain < 0 || combinations.size == k) {
        return
    }

    for (index in nextStart until 9) {
        combinations.add(index + 1)
        backtrack(remain - index - 1, k, index + 1, results, combinations)
        combinations.removeLast()
    }
}
```

## Greedy Algorithms
This is the opposite of backtracking - optimize for a certain thing (for instance, value of items in a knapsack) and never backtrack. You cannot guarantee that this method will produce the optimal solution, but it will likely be _good enough_ for some types of problems, and much faster.

We don't investigate whether or not the previous choice makes a difference to future choices.

## Divide and conquer
Divide problems into smaller pieces with optimal substructure and compute from there.

For instance: merge sort, $O(n \log n)$

```javascript
// merge() is assumed to combine two sorted lists into one sorted list
function merge_sort(list) {
    if (list.length <= 1) return list;

    const middle = Math.floor(list.length / 2);
    const left = list.slice(0, middle);
    const right = list.slice(middle);

    return merge(
        merge_sort(left),
        merge_sort(right)
    );
}
```

## Dynamic Programming
Identifying identical/overlapping sub-problems so that we don't have to compute them more than once - see [[Dynamic Programming]].
Consider Fibonacci:

```javascript
function fib(n) {
    if (n <= 2) return 1;

    return fib(n - 1) + fib(n - 2);
}
```

In the recursive solution, `fib(3)` would be calculated multiple times. We can store these values so that we don't have to calculate them repeatedly:

```javascript
// Seed the cache with the same base cases as fib(): fib(1) = fib(2) = 1
const M = new Map([[1, 1], [2, 1]]);

function d_fib(n) {
    if (!M.has(n)) {
        M.set(n, d_fib(n - 1) + d_fib(n - 2));
    }
    return M.get(n);
}
```

Often these solutions involve matrices and summing the surrounding values - could these be stored in temporary variables only, reducing the space complexity from $O(n \cdot m)$ to $O(1)$?

Remember that for dynamic programming, the base case is often `0`, so many arrays will be built as such:

```kotlin
fun knapsackProblem(items: List<List<Int>>, capacity: Int): Pair<Int, List<Int>> {
    // One extra row and column for the "no items" / "zero capacity" base cases
    val array = List(items.size + 1) { MutableList(capacity + 1) { 0 } }

    // ...
}
```

See also [[Knapsack Problem]].

There are two main approaches in dynamic programming:

### Memoization
Memoization is where we add caching to a function - typically this is used on recursive functions for a **top-down** solution that starts with the initial problem and then recursively calls itself to solve smaller problems.

### Tabulation
Tabulation uses a table to keep track of sub-problem results and works in a **bottom-up** manner: solving the smallest sub-problems in an iterative manner before solving the larger ones.

## Branch and Bound
Many problems involve trying to find maximum profits, shortest paths etc - these are called _optimization problems_. When the solution is a series of choices, we can use a strategy called Branch and Bound.
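
A minimal sketch of the idea, applied to the 0/1 knapsack: each item is a branch (take it or skip it), and a branch is abandoned as soon as an optimistic bound shows it cannot beat the best answer found so far. The `Item` type and `knapsackBranchAndBound` name are hypothetical, and the bound used here (current value plus the value of every remaining item, ignoring weight) is deliberately simple; tighter bounds prune more.

```kotlin
// Hypothetical item type, for this sketch only
data class Item(val value: Int, val weight: Int)

fun knapsackBranchAndBound(items: List<Item>, capacity: Int): Int {
    // suffixValue[i] = total value of items[i..], an upper bound on what is still attainable
    val suffixValue = IntArray(items.size + 1)
    for (i in items.indices.reversed()) {
        suffixValue[i] = suffixValue[i + 1] + items[i].value
    }

    var best = 0

    fun explore(index: Int, value: Int, remaining: Int) {
        if (value > best) best = value
        if (index == items.size) return

        // Bound: even taking every remaining item cannot beat the best so far
        if (value + suffixValue[index] <= best) return

        val item = items[index]
        // Branch 1: take the item, if it fits
        if (item.weight <= remaining) {
            explore(index + 1, value + item.value, remaining - item.weight)
        }
        // Branch 2: skip the item
        explore(index + 1, value, remaining)
    }

    explore(0, 0, capacity)
    return best
}

fun main() {
    val items = listOf(Item(1, 2), Item(4, 3), Item(5, 6), Item(6, 7))
    println(knapsackBranchAndBound(items, 10)) // 10
}
```

Unlike a greedy approach, this still explores alternatives, so the result is optimal; unlike plain backtracking, it uses the best-known answer to cut off branches early.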

## Sliding Window
Given an array:
```javascript
[a b c d e f g h]
```

A sliding window of size 3 would look like:

```javascript
[a b c]
  [b c d]
    [c d e]
      [d e f]
        [e f g]
          [f g h]
```

Often we'll be summing these slices - we could end up doing a lot of unnecessary work:

```javascript
[4,6,3],8,3
4,[6,3,8],3
```
For the second slice, rather than sum every value, we can take the sum of the previous slice, subtract the `4` that left the window and add the `8` that entered it. See [this](https://stackoverflow.com/a/64111403/3245482) answer on StackOverflow for a visual representation.

### Example
Given the height of a staircase and the max number of steps you can take at any one time, calculate the number of ways you can climb a staircase.

```javascript
height = 4
maxSteps = 2
output = 5
// 1, 1, 1, 1
// 1, 1, 2
// 1, 2, 1
// 2, 1, 1
// 2, 2
```

```kotlin
fun staircaseTraversal(height: Int, maxSteps: Int): Int {
    var currentNumberOfWays = 0
    // waysToTop[h] = number of ways to reach height h
    val waysToTop = mutableListOf(1)

    // height + 1 because we must account for the 0th step, which can be
    // reached in exactly 1 way
    for (currentHeight in 1 until height + 1) {
        // Window size is maxSteps
        val startOfWindow = currentHeight - maxSteps - 1
        val endOfWindow = currentHeight - 1
        // Remove the value from the start of the window
        if (startOfWindow >= 0) currentNumberOfWays -= waysToTop[startOfWindow]
        // Add the value at the end of the window
        currentNumberOfWays += waysToTop[endOfWindow]

        waysToTop.add(currentNumberOfWays)
    }

    return waysToTop[height]
}
```

--------------------------------------------------------------------------------
/LICENSE:
-------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 
39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
      In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format.
      We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.

--------------------------------------------------------------------------------