├── 10-XXX Machine Learning ├── 10-601 Introduction to Machine Learning.md ├── 10-605 Machine Learning with Large Datasets.md ├── 10-617 Intermediate Deep Learning.md └── 10-725 Convex Optimization.md ├── 11-XXX Language Technologies ├── 11-642 Search Engines.md └── 11-711 Advanced NLP.md ├── 14-XXX Information Networking └── 14-736 Distributed System.md ├── 15-XXX Computer Systems ├── 15-513 Introduction to Computer Systems.md ├── 15-619 Cloud Computing.md ├── 15-641 Computer Network.md └── 15-645 Database Systems.md ├── 18-XXX Electrical and Computer Engineering └── 18-746 Storage Systems.md ├── LICENSE └── README.md /10-XXX Machine Learning/10-601 Introduction to Machine Learning.md: -------------------------------------------------------------------------------- 1 | [Lecture 1: Supervised and Unsupervised Learning](#Supervised-and-Unsupervised-Learning)
2 | [Lecture 2: Machine Learning as Functional Approximation](#Machine-Learning-as-Functional-Approximation)
3 | 4 | ## Supervised and Unsupervised Learning 5 | ### Supervised models:
6 | · Decision Trees
7 | · KNN
8 | · Naïve Bayes
9 | · Perceptron
10 | · Logistic Regression
11 | · Linear Regression
12 | · Neural Networks
13 | 14 | ### Definition of Machine Learning 15 | Three components: Task T, Performance metric P, and Experience E. A program is said to **learn** if its performance at T, as measured by P, improves with Experience E. 16 | 17 | ### Definitions: Machine Learning Classifier 18 | A **classifier** is a function that takes feature values as input and outputs a label.
19 | A **test dataset** is used to evaluate a classifier’s predictions.
20 | The **error rate** is the proportion of data points on which the prediction is wrong.
21 | 22 | ### Types of Machine Learning Classifiers 23 | **Majority vote classifier**: always predicts the most common label in the **training** dataset.
24 | **Memorizer**: if a feature vector appears in the training dataset, predict its corresponding label; otherwise, predict the majority-vote label.
25 | **Decision Stump**: based on a single feature, $x_d$, predict the most common label in the **training** dataset among all data points that have the same value for $x_d$.
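As a rough sketch of the last two classifiers (illustrative Python only, not the course's reference code; `train_decision_stump` and the toy weather data are made-up):

```python
from collections import Counter

def train_decision_stump(X, y, d):
    """For each value v of feature d, store the most common training label
    among points with x[d] == v; fall back to the overall majority vote."""
    by_value = {}
    for x, label in zip(X, y):
        by_value.setdefault(x[d], []).append(label)
    stump = {v: Counter(labels).most_common(1)[0][0]
             for v, labels in by_value.items()}
    majority = Counter(y).most_common(1)[0][0]   # majority-vote fallback
    return stump, majority

def predict(stump, majority, x, d):
    # Unseen feature values fall back to the majority-vote label
    return stump.get(x[d], majority)

X = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "cool")]
y = ["no", "no", "yes"]
stump, majority = train_decision_stump(X, y, d=0)
print(predict(stump, majority, ("rainy", "hot"), d=0))  # yes
```

Note that `majority` on its own is exactly the majority vote classifier.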
26 | 27 | ### A typical Machine Learning Routine 28 | · Step 1 – **Training**
29 | Input: a labeled training dataset
30 | Output: a classifier
31 | · Step 2 – **Testing**
32 | Inputs: a classifier, a test dataset
33 | Output: predictions for each test data point
34 | · Step 3 – **Evaluation**
35 | Inputs: predictions from step 2, test dataset labels
36 | Output: some measure of how good the predictions are;
37 | usually (but not always) error rate
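The three steps can be sketched end to end with a majority vote classifier (illustrative Python; the spam/ham data is made up):

```python
def error_rate(predictions, labels):
    """Proportion of data points on which the prediction is wrong."""
    wrong = sum(p != t for p, t in zip(predictions, labels))
    return wrong / len(labels)

# Step 1 - Training: labeled training dataset -> a classifier
train_labels = ["spam", "ham", "spam", "spam"]
majority = max(set(train_labels), key=train_labels.count)
classifier = lambda x: majority            # majority vote classifier

# Step 2 - Testing: classifier + test dataset -> predictions
test_X = ["mail-1", "mail-2", "mail-3"]
test_y = ["spam", "ham", "spam"]
preds = [classifier(x) for x in test_X]

# Step 3 - Evaluation: predictions + test labels -> error rate
print(error_rate(preds, test_y))           # one of three predictions is wrong
```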
38 | 39 | ## Machine Learning as Functional Approximation 40 | ### Notations 41 | · **Feature Space**: X
42 | · **Label Space**: Y
43 | · **Unknown target function**: c\*: X -> Y
44 | · **Training Dataset**: D = {(x(1), y(1)), (x(2), y(2)), ..., (x(N), y(N))}, where y(i) = c\*(x(i))
45 | · **Hypothesis Space**: H
46 | · **Goal**: Find the classifier, h ∈ H, that best approximates c\*
47 | 48 | ## Loss Functions and Error Rates 49 | **Loss Function**: l: Y × Y -> ℝ
50 | This defines how "bad" a prediction, $\hat{y}$ = h(x), is compared to the true label, y = c\*(x).
51 | 52 | ### Common choices for loss functions: 53 | 1. Squared loss (for regression): l(y, $\hat{y}$) = $(y - \hat{y})^2$
54 | 2. Binary or 0-1 loss (for classification): l(y, $\hat{y}$) = $\mathbb{1}(y \neq \hat{y})$
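Both losses are one-liners; a quick sketch:

```python
def squared_loss(y, y_hat):
    # (y - y_hat)^2, for regression
    return (y - y_hat) ** 2

def zero_one_loss(y, y_hat):
    # 1 when the prediction is wrong, 0 when it is right
    return int(y != y_hat)

print(squared_loss(3.0, 2.5))       # 0.25
print(zero_one_loss("cat", "dog"))  # 1
```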
55 | 56 | ### Decision Stump Pseudocode: 57 | Screen Shot 2024-01-22 at 5 22 35 PM 58 | -------------------------------------------------------------------------------- /10-XXX Machine Learning/10-605 Machine Learning with Large Datasets.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /10-XXX Machine Learning/10-617 Intermediate Deep Learning.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /10-XXX Machine Learning/10-725 Convex Optimization.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /11-XXX Language Technologies/11-642 Search Engines.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /11-XXX Language Technologies/11-711 Advanced NLP.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /14-XXX Information Networking/14-736 Distributed System.md: -------------------------------------------------------------------------------- 1 | ## Intro to Distributed Systems 2 | ### Distributed Systems Overview 3 | A distributed system can be generically defined as a collection of resources on distinct physical components that appear to function as a single coherent system
4 | 5 | Examples: the Internet, embedded SoCs, mobile gaming
6 | 7 | The Benefits of having a distributed system:
8 | - Geographic distribution of users, peers, providers, etc.
9 | - The distribution of resources can match that of the population
10 | - Cost per resource (processor, memory, storage, network) at scale
11 | - Capacity (processor, memory, storage, network)
12 | - Failure model (high chance of 1 of N failure, lower chance of N of N failure)
13 | - Localized risks
14 | 15 | ## Computing Over Networks 16 | ### Challenges of Distributed Systems 17 | 18 | Fundamental distributed systems challenges:
19 | 20 | - Timing:
21 | - How do processes on different networked devices determine the ordering of events that took place? The "Who shot first?" problem
22 | - Concurrency:
23 | - How do processes on different networked devices safely access/use a commonly shared resource on yet another device?
24 | - Robustness:
25 | - How do processes on different networked devices deal with imperfections in the networks that connect them?
26 | - Consistency:
27 | - How do we guarantee the same outcome from a process regardless of where/how the process runs?
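For the timing challenge, a standard starting point is a logical clock; a minimal Lamport-clock sketch (illustrative only, not tied to any specific assignment):

```python
class LamportClock:
    """Counter that ticks on local events and fast-forwards past
    timestamps carried on incoming messages."""
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event (including a send): advance the clock
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Jump past the sender's timestamp, then tick for the receive event
        self.time = max(self.time, msg_time) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t_send = a.tick()            # a sends at logical time 1
t_recv = b.receive(t_send)   # b receives at logical time 2
print(t_send < t_recv)       # True: the send is ordered before the receive
```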
28 | 29 | ### Timing 30 | **Reliability**: The delivery of all packets in order, with correct content, without duplicates.
31 | - Reliable transport provides applications with guaranteed delivery of message segments in order, with verified content, without duplication (e.g., TCP; useful for bulk data delivery)
32 | - Reliability is traded for latency (no bounded delay guarantee) and performance overhead (potentially lots of retransmissions)
33 | 34 | Reliability != timeliness 35 | - The two are often in direct conflict 36 | - Reliability is only guaranteed on a per-session basis, not globally 37 | - In general, **time** is a major challenge to be dealt with. 38 | 39 | ### Concurrency 40 | **Coordination**: 41 | - Finding “stuff” is tricky (content, resources, …) 42 | - Maybe someone/something needs to keep track of all the things? 43 | - Coordination is a new (or at least amplified) challenge 44 | - In general, **management** is a major challenge to be dealt with 45 | 46 | ### Robustness 47 | **(Partial) Failure**: 48 | - Servers can fail 49 | - Especially if the system comprises a large number of cheap/commodity servers instead of one monolithic, high-performance, many-9s availability server 50 | - If one server fails, the entire system doesn’t fail 51 | - Partial failure usually allows for recovery (ideally invisible to the client) and hopefully doesn’t affect availability 52 | - In general, **fault tolerance and recoverability** are highly desirable 53 | 54 | Failures are difficult to handle because it is difficult for us to know which specific processing state the failure occurred.
55 | 56 | ### Consistency 57 | **Consistency**: 58 | - If many clients are interacting with the same data, how do we guarantee that everyone agrees on the same value? 59 | - Mutexes/locks are great, but how can we extend that from a single OS to an entire DS? 60 | - If we can’t, can the clients at least detect that there’s a problem? 61 | - Consistency is one of the biggest challenges in asynchronous, concurrent, distributed systems 62 | 63 | ## Coordinating Operations over a Network 64 | ### Remote Operation 65 | - I have a task (e.g., compute something, access some data), and I need another networked component to help me 66 | - This is abstracted as a function/subroutine/… -> y = f(x) 67 | - y: what is the desired outcome that I’m asking for 68 | - x: what inputs am I providing as part of my request 69 | - f: what work is being done to provide my outcome, and I don’t care where/how it’s implemented 70 | 71 | ### Local Operation 72 | - If this was done locally: 73 | - Find f and x in memory 74 | - Load x in register(s) 75 | - Branch to subroutine 76 | - Subroutine computes y 77 | - Load y in register(s) 78 | - Branch return 79 | - Store y in memory 80 | 81 | - The OS hides most of this, so the dev sees a simple **function/method call**
82 | 83 | ### Local vs. Remote Operation 84 | Apart from the local operations,
85 | - At the remote entity: 86 | - Get messages from the other party 87 | - Unpack f and x (how?) 88 | - Load x in register(s) 89 | - Subroutine, compute y 90 | - Pack into reply message, send 91 | Screen Shot 2024-01-23 at 1 28 00 PM 92 | 93 | ### Remote Procedure Calls (RPC) 94 | - RPC is a way to provide a programmer-friendly abstraction to handle the complexity of coordination required for remote services 95 | - RPC moves inter-procedure communication from stack to network 96 | - Attempts to provide an interface for the programmer that still looks like a local procedure call 97 | 98 | ### Structure of the RPC 99 | - Client-side: 100 | - Client applications only see a local “stub” (aka proxy) that presents a simple function/method/API for local use 101 | - The stub layer abstracts away most networking details 102 | - Client app calls f(x) presented by stub, stub interacts with library to perform all the details of the remote call but doesn't "do" f(x) 103 | - Nothing about transport has to change 104 | - Server-side: 105 | - Same idea on the server side, where the actual service provided doesn’t care about underlying network interactions, just "doing" f(x) 106 | - Similar server stub (aka “skeleton”) abstracts away all networking details 107 | - Stub interacts with the library to handle incoming remote requests, decoding them in a way that’s useful for the server, calls the local subroutine, and handles the reply 108 | - Nothing about transport has to change 109 | Screen Shot 2024-01-23 at 1 32 33 PM 110 | Screen Shot 2024-01-23 at 1 32 46 PM 111 | 112 | ### Benefits and Difficulties of the RPC 113 | Benefits:
114 | - All the programmer sees when using RPC is the immediate call to the stub and the corresponding return 115 | - This provides: 116 | - Ease of programming with familiar model (just call a function) 117 | - Hidden complexity and architecture of the remote component 118 | - Automation of distributed computation 119 | 120 | Difficulties:
121 | - While the abstraction and ease of programming are great, some challenges need to be addressed 122 | - Caller and callee procedures run on different machines, using data in different address spaces, perhaps in different environments/OS, … 123 | - Must convert to the local representation of data (e.g., variable type, endian conversion) 124 | - Components or networks can fail 125 | 126 | ### Marshalling 127 | - Stubs are responsible for marshaling and unmarshalling these interactions 128 | - Client stub marshals arguments into machine-independent format ⇒ sends request ⇒ waits for response ⇒ unmarshals result 129 | - Server stub unmarshals arguments and builds stack frame ⇒ calls procedure ⇒ marshals results and replies 130 | 131 | - Since marshaling relies on standards and conventions in a variety of places, rather than making some assumption about what those are, stubs rely on an **interface definition language (IDL)** 132 | 133 | - There are many such IDLs supported by various libraries and middleware, often resembling XML, JSON, TLV, etc. 134 | 135 | ### RPC Diagram 136 | Screen Shot 2024-01-23 at 1 39 03 PM
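The stub/skeleton flow above can be sketched end to end, with JSON standing in for the IDL and a direct function call standing in for the network (all names here are illustrative):

```python
import json

# Server side: the real procedure plus its skeleton
def add(x, y):
    return x + y                      # the actual f(x), only on the server

SERVICES = {"add": add}

def server_skeleton(request_bytes):
    # Unmarshal, call the local subroutine, marshal the reply
    request = json.loads(request_bytes)
    result = SERVICES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode()

# Client side: the stub that the application sees as a plain function
def client_stub_add(x, y):
    # Marshal arguments into a machine-independent format
    request_bytes = json.dumps({"proc": "add", "args": [x, y]}).encode()
    reply_bytes = server_skeleton(request_bytes)   # the "network" goes here
    return json.loads(reply_bytes)["result"]

print(client_stub_add(2, 3))  # 5
```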
137 | 138 | ### RPC By Reference 139 | - Early RPC implementations were for stateless/idempotent subroutines, so there was no notion of call-by-reference 140 | - There is no shared memory space, so what value is a pointer? 141 | 142 | - RPC is often limited to call-by-value or call-by-copy-and-restore, where values must be explicitly sent over the network 143 | - This creates some complications for complex data types and especially for custom data types 144 | - It is possible to maintain state and create generic "references", like pointers but not for memory 145 | - Different levels of integration are supported by different systems, depending on the hetero/homogeneity of the system, languages, etc. 146 | 147 | ### Challenges of RPC 148 | 1. Failures 149 | 150 | - Partial Failure: 151 | - In local computing: 152 | - If the machine fails, the application fails 153 | - There are much bigger concerns than the application at this point 154 | - In distributed computing: 155 | - If a machine fails, part of the application fails and part doesn’t 156 | - Can you tell the difference between machine failure and network failure? 157 | 158 | ## Replication 159 | 160 | - Reasons for Replication 161 | - Increased throughput/bandwidth/parallelism 162 | - Increased fault tolerance 163 | - Improve latency by storing content closer to distributed consumers, e.g., “at the edge” 164 | 165 | ### Primary and Secondary Replicas 166 | - Primary replicas 167 | - Exact copies of the original 168 | - Common for file servers, databases, etc. 169 | 170 | - Secondary replicas (eg. 
Github repo not updated) 171 | - Derivable from the original 172 | - Lower fidelity in some way 173 | - Caches (possibly stale) 174 | - Thumbnails (lower resolution, faster to transmit, smaller to store) 175 | - Compressed copies (slower to access, harder to update) 176 | 177 | - Primary takes all updates 178 | - Secondary may get occasional/periodic updates or “listen” to primary updates 179 | - Primary needs additional overhead to manage, secondary can be managed when possible 180 | - Frequency of secondary updates can be balanced for desired overhead, consistency, usefulness, etc. 181 | 182 | - Secondaries may be read-only caches 183 | - Leaving primary as the definitive and most up-to-date copy 184 | - An update of secondary affects how useful it is as a backup for primary 185 | 186 | ### Efficient Replica Management 187 | - Rather than enforce perfect replication (which has high overhead), it’s often reasonable to allow replicas to vary from each other slightly (e.g., one can be out of date 188 | w.r.t. 
others) in trade for significantly less latency/overhead 189 | 190 | - Need to be careful not to overly complicate the management of replication, or we could end up with more latency/overhead 191 | 192 | - Goal: we want to achieve one-copy semantics, meaning we can perform read and write operations on a collection of replicas with the same results as if there were one nonreplicated object 193 | 194 | ### Quorum 195 | - A group of servers trying to provide one-copy semantics 196 | 197 | - Setup of the Quorum 198 | - N primary replicas, no secondary replicas 199 | - One-copy semantics 200 | - Each "read" makes a local copy of multiple replicas, determines which is "correct" and discards others 201 | - Each "write" overwrites multiple replicas with a single local copy 202 | - Writes are not edits, they are replacements 203 | - A "read-then-write" approach is typically used 204 | 205 | - Concurrency controls are assumed 206 | - All reads and writes are "safe" 207 | 208 | ### Concurrency Control Addresses Conflict 209 | - Replication conflict 210 | - Any situation where concurrent actions break the usefulness of replicas 211 | 212 | - Writes cause conflicts 213 | - Write-write 214 | - If there are 2 replicas, A writes to one while B writes to the other, which is the newest? 
215 | - If there is 1 replica, both A and B write, the outcome is uncertain (race condition) 216 | 217 | - Write-read 218 | - If there are 2 replicas, A writes to one while B reads from the other, then B reads an old file 219 | - If there is 1 replica, A writes as B is reading, B may read a mix of old and new content 220 | 221 | - Reads don’t cause conflicts 222 | - Read-read is safe 223 | - But caution is still needed while reading… 224 | 225 | ### Read/Write of Replicated Data 226 | - If a large number of users can access and modify replicated data, we have to be careful about who is reading/writing which replica 227 | 228 | - Ex: if there are four replicas f1, f2, f3, f4 of a file, what happens if one user writes f3 and f4 while another user reads f1 and f2? 229 | 230 | ### Read and Write Quora 231 | - Read quorum: Number of servers R a reader should read from 232 | - Write quorum: Number of servers W a writer should write to 233 | - What is the relationship between these? 234 | - Can we tune them to achieve different goals?
235 | 236 | Screen Shot 2024-02-06 at 5 43 24 PM
237 | 238 | ### Quora Must Overlap 239 | - The read quorum and the write quorum must overlap 240 | - Why? Without overlap, readers could miss-write and get old value 241 | - R + W > N is a necessary condition for correctness 242 | - Requires the ability to identify the most recent version among the replicas that are read (more in a few slides)
243 | 244 | Screen Shot 2024-02-06 at 5 44 17 PM
245 | Screen Shot 2024-02-06 at 5 44 37 PM
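A toy model of overlapping quora with version numbers (an illustrative sketch only; it ignores locking and failures):

```python
N, R, W = 5, 3, 3                    # R + W > N, so read/write quora overlap
replicas = [(0, "old")] * N          # each replica holds (version, value)

def write(value):
    # Read-then-write: find the highest version in the write quorum,
    # then replace (not edit) those W replicas with version + 1
    top = max(replicas[i][0] for i in range(W))
    for i in range(W):
        replicas[i] = (top + 1, value)

def read():
    # Read any R replicas; the overlap guarantees at least one is newest
    candidates = [replicas[i] for i in range(N - R, N)]
    return max(candidates)[1]        # keep the highest-version copy

write("new")
print(read())  # new
```

Here the write quorum {0, 1, 2} and the read quorum {2, 3, 4} share only replica 2, which is exactly why R + W > N is required.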
246 | 247 | ### Tuning a Quorum 248 | - Often true that reading is more common than writing (I access files more often than I update them) 249 | 250 | - Increase W, i.e., larger write quorum 251 | - More redundancy for robustness 252 | - More local to consumers 253 | - Higher write cost (network and device throughput, contention, long tail, etc.) 254 | 255 | - Decrease R, i.e., smaller read quorum 256 | - Lower cost to read (network and device throughput, contention, long tail, etc.) 257 | - More choices of where to read (closer to the user) 258 | - More expensive writes (but that’s ok since they’re less frequent) 259 | 260 | - Larger overlap 261 | - More tolerance for failure 262 | 263 | - Common Practice: 264 | - Read-one / write-all (i.e., R=1, W=N) is the most common 265 | - Reads are more common than writes, so make them easy 266 | - Read gets most current data (requirement) 267 | - Reads can choose any replica (nearby, failure, etc.) 268 | - Reads require low bandwidth 269 | - Makes common case safe and fast 270 | 271 | ### Version Determination 272 | - If every replica has a version number (like a timestamp or sequence number), then determining which read replica is the newest is easy 273 | Simply choose the replica with the "highest" version number 274 | - Read ⟨f, t_f⟩ from R servers ⇒ ⟨f_1, t_f1⟩, ⟨f_2, t_f2⟩, ..., ⟨f_R, t_fR⟩ 275 | - Keep f_j, where j = argmax_i t_fi 276 | - Still need to ensure that the overall newest version is guaranteed to be within any set of R replicas (e.g., via quorum) 277 | 278 | ### Majority Read Quorum 279 | - Make W large enough to guarantee that for any R replicas, strictly more than R/2 of them are the newest version 280 | - the majority of any read quorum represents the most recent version of the replica 281 | - E.g., if 3 of the 5 replicas I read are the same, they are guaranteed to be the most recent version 282 | - the overlap between the write quorum and the read quorum must include the majority of the read quorum 283 | - Overlap must be a
majority of any read quorum 284 | - must write to all but a quorum minority 285 | - W > (N - R/2) 286 | - And this doesn’t even account for any read/write failures
287 | Screen Shot 2024-02-06 at 5 54 17 PM
288 | - This is very expensive 289 | - On the other hand, it needs no reference of time (no version numbers) at all, and versioning is still guaranteed! 290 | 291 | ### Write-all 292 | - If W = N (i.e., "write-all"), then it’s impossible for an old replica to be present in the system since every replica is always overwritten 293 | - Any read of any replica guarantees it is the newest overall replica 294 | - However, if one write fails, no one can read from the entire system until that server comes back up 295 | - This can lock down the whole system... 296 | - In effect, it demands a perfect system, and makes writes slow 297 | 298 | ### Locking 299 | - Uncontrolled concurrent writes can break quorum discipline (and maybe even corrupt data) 300 | - Also could break version number increment 301 | 302 | - What to lock? 303 | - Object at servers, not whole servers 304 | 305 | - How many to lock? 306 | - L ≥ R ⇐ lock quorum must cover most recent version 307 | - L ≥ W ⇐ lock quorum must cover every write to protect version and data 308 | - L ≥ max(R, W) ⇐ lock quorum must cover all reads and writes of updating object 309 | 310 | ### Failures 311 | - If the overlap between read and write Quora is strictly greater than F, then F failures can be tolerated 312 | - With version numbers: R + W - F > N 313 | - Without version numbers: W + R/2 - F > N 314 | - Still guaranteed to see an up-to-date version 315 | - If not covered by overlap, the failure model becomes important 316 | 317 | ### Alternatives to Static Quora 318 | - A static quorum is a pre-defined quorum (like our examples) 319 | - Not adaptive in terms of membership, connectivity, etc.
320 | - We have counted each replica equally (one replica = one vote), but this is not required 321 | - Alternatively, we can assign weights to different hosts based on any reasonable, observable metric; quorum rules still apply, but weighted 322 | - Examples: 323 | - More weight to more reliable hosts (more unreliable hosts for the same robustness) 324 | - Less weight to caches or secondary replicas, due to expiration and staleness 325 | 326 | ### Coda Version Vectors 327 | - CVVs are a form of vector logical timestamp (remember those?) 328 | - A CVV contains one entry for each server, which is the version number of the file on the corresponding server (ideally, all equal) 329 | - If a server doesn’t get an update to the file, its CVV entry will be lower than others 330 | 331 | - A Coda client requests a file via a three-step process: 332 | - It asks all replicas for their version number 333 | - It requests the file from the replica with the largest version number 334 | - If the servers don't agree about the file’s version, the client can detect and inform them of the conflict 335 | - A conflict exists if two CVVs are concurrent, which indicates that each server has seen some but not all changes 336 | 337 | - Ideally, a Coda client writes a file as: 338 | - The client sends the file and original CVV to all servers 339 | - Each server increments its entry in the file's CVV and sends an ACK to the client 340 | - The client merges the entries from all of the servers and sends the new CVV back to each server 341 | - If a conflict is detected, the client can inform the servers, so that it can be resolved automatically, or flagged for mitigation by the user 342 | -------------------------------------------------------------------------------- /15-XXX Computer Systems/15-513 Introduction to Computer Systems.md: -------------------------------------------------------------------------------- 1 | [Lecture 1: Bits, Bytes and Integers](#Bits,-Bytes-and-Integers)
2 | [Lecture 2: Machine Programming: Basics](#Machine-Programming-Basics)
3 | [Lecture 3: Machine Programming: Control](#Machine-Programming-Control)
4 | 5 | ## Bits, Bytes and Integers 6 | ### Binary Representations 7 | Binary representation leads to a simple binary, i.e. base-2, numbering system
8 | - 0 represents 0
9 | - 1 represents 1
10 | - Each “place” represents a power of two, exactly as each place in our usual “base 10”, 10-ary numbering system represents a power of 10
11 | 12 | ## Hexadecimal 0016 to FF16 13 | - Base 16 number representation 14 | - Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’ 15 | 16 | Consider 1A2B in Hexadecimal: 17 | - 1 \* 163 + A \* 162 + 2 \* 161 + B \* 160 18 | - 1 \* 163 + 10 \* 162 + 2 \* 161 + 11 \* 160 = 6699 (decimal) 19 | 20 | ## Bit-level Manipulations 21 | - **AND** A&B = 1 when both A=1 and B=1 22 | - **OR** A|B = 1 when either A=1 or B=1 23 | - **NOT** ~A = 1 when A=0 24 | - **Exclusive-OR (XOR)** A^B = 1 when either A=1 or B=1, but not both 25 | 26 | - Boolean algebra operates on bit vectors. 27 | - In regards to operations, the following: 28 | - & Intersection 01000001 { 0, 6 } 29 | - | Union 01111101 { 0, 2, 3, 4, 5, 6 } 30 | - ^ Symmetric difference 00111100 { 2, 3, 4, 5 } 31 | - ~ Complement 10101010 { 1, 3, 5, 7 } 32 | 33 | ### Bit Operations in C 34 | Operations &, |, ~, ^ Available in C 35 | - Apply to any “integral” data type 36 | - long, int, short, char, unsigned 37 | - View arguments as bit vectors 38 | - Arguments applied bit-wise 39 | 40 | Examples (Char data type) 41 | - ~0x41 → 0xBE 42 | - ~01000001 → 10111110 43 | - ~0x00 → 0xFF 44 | - ~00000000 → 11111111 45 | - 0x69 & 0x55 → 0x41 46 | - 01101001 & 01010101 → 01000001 47 | - 0x69 | 0x55 → 0x7D 48 | - 01101001 | 01010101 → 01111101 49 | 50 | ### Logical Operations in C 51 | **In contrast to Bit-Level Operators** 52 | - Logic Operations: &&, ||, ! 
53 | - View 0 as “False” 54 | - Anything nonzero as “True” 55 | - Always return 0 or 1 56 | - Early termination 57 | 58 | Examples (Char data type) 59 | - !0x41 → 0x00 60 | - !0x00 → 0x01 61 | - !!0x41→ 0x01 62 | - 0x69 && 0x55 → 0x01 63 | - 0x69 || 0x55 → 0x01 64 | - p && *p (avoids null pointer access) 65 | 66 | ### Shift Operations 67 | **Left Shift**: x << y 68 | - Shift bit-vector x left y positions 69 | – Throw away extra bits on the left 70 | - Fill with 0’s on the right 71 | 72 | **Right Shift**: x >> y 73 | - Shift bit-vector x right y positions 74 | - Throw away extra bits on the right 75 | - Logical shift 76 | - Fill with 0’s on the left 77 | - Arithmetic shift 78 | - Replicate the most significant bit on the left 79 | 80 | **Undefined Behavior** 81 | - Shift amount < 0 or ≥ word size 82 | Screen Shot 2024-01-22 at 6 58 21 PM 83 | 84 | ## Integers 85 | 86 | ### Overflow 87 | For a binary digit line, if two binaries add up to have the resulting binary having an extra digit, this will be considered as **overflow**
88 | For example: 111 + 001 = 1 000, the exceeded 1 is considered overflow.
89 | 90 | “Ints” means the finite set of integer numbers that we can represent on a number line enumerated by some fixed number of bits, i.e. **bit width**.
91 | - An **“unsigned” int** is any int on a number line, e.g. of a data type, that doesn’t contain any negative numbers
92 | - A **non-negative number** is a number greater than or equal to (>=) 0 on a number line, e.g. of a data type, that does contain negative numbers
93 | 94 | ### Negative Numbers 95 | We will use the leading bit as the **sign** bit
96 | - 0 means non-negative 97 | - 1 means negative 98 | - This will allow us to represent negative numbers and non-negative numbers 99 | - and make 0 represent 0 100 | - The issue here is there is a -0, which is the same as 0, except that it is different 101 | 102 | Given a non-negative number in binary, we can find its negative by flipping each bit and adding 1. 103 | For example: 104 | - 0101 is 5 105 | - 1010 is the "one's complement of 5" 106 | - 1011 is the "twos complement of 5" 107 | - 0101 + 1011 = 1 0000 = 0000 108 | - -x = ~x + 1 109 | 110 | This works because after flipping all the bits and making addition to the non-negative number, the result would be 1111, adding 1 in the end would result in a 0 due to overflow.
111 | 112 | ### Numeric Ranges 113 | - unsigned values: 114 | - Umin = 0 115 | - Umax = 2w - 1 116 | - Two's complement values: 117 | - Tmin = -2w - 1 118 | - Tmax = 2w - 1 - 1 119 | 120 | ## Conversion and Casting 121 | Large negative weight becomes Large positive weight
122 | 123 | Basic rules for casting Signed <-> Unsigned: 124 | - Bit pattern is maintained 125 | - But reinterpreted 126 | - Can have unexpected effects: adding or subtracting 2w 127 | - Expression containing signed and unsigned int 128 | - int is cast to unsigned! 129 | 130 | ### Two's Complement -> Unsigned 131 | - Ordering inversion 132 | - Negative -> Big Positive 133 | 134 | ## Expanding and Truncating 135 | 136 | ### Sign Extension 137 | - Given w-bit signed integer x 138 | - Converting it to a w+k-bit integer with the same value 139 | - This can be done by making k copies of the sign bit and extending it to the new bits. 140 | 141 | ### Truncation 142 | - Given k+w-bit signed or unsigned integer X 143 | - Converting it to w-bit integer 'X' with the same value for "small enough" X 144 | - This can be done by dropping the top k bits. 145 | 146 | ### Summary 147 | - Expanding (e.g., short int to int) 148 | - Unsigned: zeros added 149 | - Signed: sign extension 150 | - Both yield the expected result 151 | - Truncating (e.g., unsigned to unsigned short) 152 | - Unsigned/Signed: bits are truncated 153 | - Result reinterpreted 154 | - Unsigned: mod operation 155 | - Signed: similar to mod 156 | - For small (in magnitude) numbers yield expected behavior 157 | 158 | ## Addition, Negation, Multiplication and Shifting 159 | 160 | ### Addition 161 | - Unsigned Addition: 162 | - Operand: w bits 163 | - True sum: w+1 bits 164 | - Discard Carry: w bits 165 | - This should ignore the carry output 166 | 167 | For example:
168 | 1110 1001 + 1101 0101 = 1 1011 1110
169 | Discard the carry: the result is 1011 1110 -> 190 in decimal (UAdd)
170 | 171 | - Two's Complement Addition: 172 | - Operand: w bits 173 | - True sum: w+1 bits 174 | - Discard Carry: w bits 175 | - The bit-level behavior is the same with TAdd and UAdd 176 | 177 | For example:
178 | 1110 1001 + 1101 0101 = 1 1011 1110
179 | Discard the carry: the result is 1011 1110 -> -66 in decimal (TAdd)
180 | 181 | ### Visualizing Additions 182 | Screen Shot 2024-01-23 at 12 08 15 PM
183 | Screen Shot 2024-01-23 at 12 08 33 PM
184 | Screen Shot 2024-01-23 at 12 08 48 PM
185 | 186 | ### Multiplication 187 | Power-of-2 Multiply with Shift 188 | - Operation: 189 | - u << k gives u \* 2k 190 | - Both signed and unsigned 191 | - Operand: w bits 192 | - True Product: w+k bits 193 | - Discard k bits: w bits 194 | 195 | For example:
196 | u << 3 == u \* 8 197 | (u << 5) - (u << 3) == u \* 24 198 | Most machines shift and add faster than multiply since the compiler generates this code automatically. 199 | 200 | ### Division 201 | Unsigned Power-of-2 Divide with Shift
202 | - Operation: 203 | - u >> k gives floor(u / 2k) 204 | - This would require logical shift. 205 | 206 | Examples:
207 | Screen Shot 2024-01-23 at 12 17 45 PM
208 | 209 | Signed Power-of-2 Divide with Shift
210 | - Operation 211 | - x >> k gives floor(x / 2k) 212 | - This would require arithmetic shift. 213 | - Round to the left, not toward zero (Unlikely to be what is expected, introduces a bias.) 214 | 215 | Examples:
216 | Screen Shot 2024-01-23 at 12 20 10 PM
217 | 218 | ## Byte Ordering 219 | Big Engian: Sun (Oracle SPARC), PPC Mac, Internet
220 | - The least significant byte has the highest address
221 | 222 | Little Endian: x86, ARM
223 | - The least significant byte has the lowest address
224 | 225 | Byte ordering is a concern when we are communicating data over a network, via files, etc.
226 | Note:
227 | - Bits are not reversed, as the low-order bit is the reference point.
228 | - This doesn't affect chars, or strings (array of chars), as chars are only one byte.
229 | 230 | Example: Given a variable x has a 4-byte value of 0x01234567, the address given by &x is 0x100:
231 | Screen Shot 2024-01-23 at 12 26 10 PM
232 | 233 | ### Reading Byte-Reversed Listing 234 | Disassembly:
235 | - Text representation of binary machine code
236 | - Generated by the program that reads the machine code
237 | 238 | For example:
239 | - Value: 0x12ab
240 | - Pad to 32 bits: 0x000012ab
241 | - Split into bytes: 00 00 12 ab
242 | - Reverse: ab 12 00 00
243 | 244 | If we want to add this value to %ebx, then the assembly code would be:
245 | add $0x12ab, %ebx
246 | The instruction code corresponding would be:
247 | 81 c3 ab 12 00 00
248 | 249 | ## Machine Programming Basics 250 | ### Assembly Basics 251 | - **Architecture**: (also ISA: instruction set architecture) The parts of a processor design that one needs to understand for writing assembly/machine code. 252 | - Examples: instruction set specification, registers, memory model 253 | - The architecture is not the hardware; it is the interface to the hardware 254 | - **Microarchitecture**: Implementation of the architecture 255 | - Examples: cache sizes and core frequency 256 | 257 | - Code Forms: 258 | - **Machine Code**: The byte-level programs that a processor executes 259 | - **Assembly Code**: A text representation of machine code (a human-readable form of it) 260 | 261 | Example ISAs: 262 | - Intel: x86, IA32, Itanium, x86-64 263 | - ARM: Used in almost all mobile phones 264 | - RISC-V: New open-source ISA 265 | 266 | ### Assembly Code and Machine Basics 267 | - **PC: Program counter** 268 | - Address of next instruction 269 | - Called “RIP” (x86-64) 270 | - **Register file** 271 | - Heavily used program data 272 | - Can be accessed directly 273 | - **Condition codes (FLAGS)** 274 | - Store status information about the most recent arithmetic or logical operation 275 | - Used for conditional branching 276 | - **Memory** 277 | - Byte-addressable array 278 | - Code, user data, and the heap 279 | - Stack to support procedures 280 | Screen Shot 2024-01-23 at 2 17 32 PM 281 | 282 | ### Assembly Data Types 283 | - Integer: data of 1, 2, 4, or 8 bytes 284 | - Data values 285 | - Addresses (untyped pointers) 286 | - Floating point: 4, 8, or 10 bytes 287 | - SIMD vector data types of 8, 16, 32, or 64 bytes 288 | - Code: Byte sequences encoding a series of instructions 289 | - No aggregate types such as arrays or structures 290 | - Just contiguously allocated bytes in memory 291 | 292 | Example:<br>
293 | Screen Shot 2024-01-23 at 2 31 04 PM
294 | These are 64-bit registers, so we know this is a 64-bit add
295 | 296 | ### X86-64 Integer Registers 297 | Screen Shot 2024-01-23 at 2 34 59 PM
298 | 299 | ### Moving Data 300 | - Moving Data: 301 | - movq Source, Dest 302 | - Operand Types 303 | - Immediate: Constant integer data 304 | - Example: $0x400, $-533 305 | - Like C constant, but prefixed with ‘$’ 306 | - Encoded with 1, 2, or 4 bytes 307 | - Register: One of 16 integer registers 308 | - Example: %rax, %r13 309 | - But %rsp reserved for special use 310 | - Others have special uses for particular instructions 311 | - Memory: 8 consecutive bytes of memory at the address given by the register 312 | - Simplest example: (%rax) 313 | - Various other “addressing modes” 314 | 315 | Movq Operand Combinations:
316 | Screen Shot 2024-01-23 at 2 46 52 PM 317 | - The immediate cannot be a dest 318 | - Cannot do memory-memory transfer with a single instruction 319 | 320 | ### Memory Addressing Modes 321 | Normal (R) Mem[Reg[R]] 322 | - Register R specifies the memory address 323 | - Aha! Pointer dereferencing in C 324 | - movq (%rcx),%rax 325 | 326 | Displacement D(R) Mem[Reg[R]+D] 327 | - Register R specifies the start of the memory region 328 | - Constant displacement D specifies the offset 329 | - movq 8(%rbp),%rdx 330 | 331 | **Most General Form**: 332 | D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+D] 333 | - D: Constant “displacement” 1, 2, or 4 bytes 334 | - Rb: Base register: Any of 16 integer registers 335 | - Ri: Index register: Any, except for %rsp 336 | - S: Scale: 1, 2, 4, or 8 (the sizes of the basic data types) 337 | 338 | Special Cases 339 | - (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]] 340 | - D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D] 341 | - (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]] 342 | 343 | An example of a swap function:<br>
344 | Screen Shot 2024-01-23 at 2 58 33 PM
345 | 346 | An example of address computation:
347 | Screen Shot 2024-01-23 at 3 01 13 PM
348 | 349 | ### Address Computation Instruction 350 | leaq Src, Dst 351 | - Src is address mode expression 352 | - Set Dst to address denoted by the expression 353 | Uses 354 | - Computing addresses without a memory reference 355 | - E.g., translation of p = &x[i]; 356 | - Computing arithmetic expressions of the form x + k \* y 357 | - k = 1, 2, 4, or 8 358 | 359 | ### Turning C into Object Code 360 | Screen Shot 2024-01-25 at 2 02 40 PM
361 | 362 | ## Machine Programming Control 363 | ### Finding pointers 364 | - %rsp and %rip always hold pointers 365 | - Register values that are “close” to %rsp or %rip are probably also pointers 366 | - If a register is being used as a pointer… 367 | - mov (%rsi), %rsi 368 | - …Then its value is expected to be a pointer. 369 | - There might be a bug that makes its value incorrect. 370 | 371 | - Not as obvious with complicated address “modes” 372 | - (%rsi, %rbx) – One of these is a pointer, we don’t know which. 373 | - (%rsi, %rbx, 2) – %rsi is a pointer, %rbx isn’t (why?) 374 | - 0x400570(, %rbx, 2) – 0x400570 is a pointer, %rbx isn’t (why?) 375 | - lea (anything), %rax – (anything) may or may not be a pointer 376 | 377 | ### Control Flow 378 | - We would use jmp statements as GOTO statements for changing control flows 379 | Screen Shot 2024-01-25 at 2 22 37 PM
380 | - We will use condition codes to determine if we need to jump 381 | 382 | - Single-bit registers 383 | - CF Carry Flag (for unsigned) 384 | - SF Sign Flag (for signed) 385 | - ZF Zero Flag 386 | - OF Overflow Flag (for signed) 387 | 388 | - Compare Instruction 389 | - cmp a, b 390 | - Computes b − a (just like sub) 391 | - Sets condition codes based on the result, but… 392 | - Does not change **b** 393 | - Used for if (a < b) { … } whenever b − a isn’t needed for anything else 394 | 395 | - Test Instruction 396 | - test a, b 397 | - Computes b & a (just like and) 398 | - Sets condition codes (only SF and ZF) based on the result, but… 399 | - Does not change **b** 400 | - Most common use: test %rX, %rX to compare %rX to zero 401 | - Second most common use: test %rX, %rY tests if any of the 1-bits in %rY are also 1 in %rX (or vice versa) 402 | 403 | ### Conditional Branches 404 | - jX Instructions 405 | - Jump to different parts of the code depending on condition codes 406 | Screen Shot 2024-01-25 at 2 30 04 PM<br>
407 | - For example: 408 | - cmp a, b 409 | - je 1000 410 | - If a and b are equal, then jump to instruction 1000 411 | 412 | - SetX Instructions 413 | - Set the low-order byte of the destination to 0 or 1 based on combinations of condition codes 414 | - Does not alter the remaining 7 bytes
415 | Screen Shot 2024-01-25 at 2 32 20 PM
416 | 417 | - Example
418 | Screen Shot 2024-01-25 at 2 45 26 PM
419 | 420 | ### Conditional Move Instruction 421 | - Conditional moves can be much more efficient than branches on modern processors 422 | 423 | - Conditional Move Instructions 424 | - Instruction supports: 425 | - if (Test) Dest ← Src 426 | - Supported in post-1995 x86 processors 427 | - GCC tries to use them 428 | - But, only when known to be safe 429 | - Why? 430 | - Branches are very disruptive to instruction flow through the pipeline 431 | - Conditional moves do not require a control transfer 432 | Screen Shot 2024-01-25 at 2 49 02 PM 433 | 434 | ### Loops 435 | - Use a conditional branch to either continue looping or to exit the loop<br>
436 | Screen Shot 2024-01-25 at 2 52 54 PM 437 | 438 | ## Machine Programming Advanced 439 | ### Linux x86-64 Memory Layout 440 | - Stack 441 | - Runtime stack (8MB limit) 442 | - e.g., local variables 443 | - Heap 444 | - Dynamically allocated as needed 445 | - When calling malloc(), calloc(), or new 446 | - Data 447 | - Statically allocated data 448 | - e.g., global vars, static vars, string constants 449 | - Text / Shared Libraries 450 | - Executable machine instructions 451 | - Read-only<br>
452 | Screen Shot 2024-02-06 at 2 05 40 PM
453 | 454 | - Example layout: 455 | - Here, the struct is allocated on the stack, 456 | - On every function call, the return address (the address of the instruction after the call) is pushed onto the stack, 457 | - The s.a[i] line writes to memory. In lecture 1, fun(6) crashed the program. Why did writing to this location cause the process to crash? 458 | - Because the write went past the end of the struct and overwrote saved state on the stack, including the return address.<br>
459 | Screen Shot 2024-02-06 at 2 13 12 PM
460 | 461 | ### Buffer Overflow 462 | Screen Shot 2024-02-06 at 2 18 39 PM
463 | - Generally called a “buffer overflow” 464 | - When exceeding the memory size allocated for an array 465 | - Why a big deal? 466 | - It’s the #1 technical cause of security vulnerabilities 467 | - #1 overall cause is social engineering/user ignorance 468 | - Most common form 469 | - Unchecked lengths on string inputs 470 | - Particularly for bounded character arrays on the stack 471 | - sometimes referred to as stack-smashing 472 | - This is an issue for the gets() function 473 | - Similar problems with other library functions 474 | - strcpy, strcat: Copy strings of arbitrary length 475 | - scanf, fscanf, sscanf, when given a %s conversion specification<br>
476 | 477 | ### View of Buffer Overflow 478 | Screen Shot 2024-02-06 at 2 25 25 PM
479 | Screen Shot 2024-02-06 at 2 26 52 PM
480 | - This could be utilized for Stack Smashing Attacks
481 | Screen Shot 2024-02-06 at 2 30 38 PM
482 | - Overwrite the normal return address A with the address of some other code S 483 | - When Q executes ret, it will jump to the other code 484 | - x86 stores a long in little-endian byte order, so the attack string must contain the new return address with its bytes reversed. 485 | 486 | ### Code Injection Attacks<br>
487 | Screen Shot 2024-02-06 at 2 39 20 PM
488 | - Input string contains the byte representation of executable code 489 | - Overwrite the return address A with the address of buffer B 490 | - When Q executes ret, it will jump to the exploit code 491 | 492 | ### Code Injection Execution<br>
493 | Screen Shot 2024-02-06 at 2 40 42 PM
494 | 495 | ### Avoiding Buffer Overflow 496 | 1. Avoid Overflow Vulnerabilities in Code 497 | - For example, use library routines that limit string lengths 498 | - fgets instead of gets 499 | - strncpy instead of strcpy 500 | - Don’t use scanf with %s conversion specification 501 | - Use fgets to read the string 502 | - Or use %ns where n is a suitable integer 503 | 504 | 2. System-Level Protections 505 | - Randomized stack offsets 506 | - At the start of the program, allocate a random amount of space on the stack 507 | - Shifts stack addresses for the entire program 508 | - Makes it difficult for hackers to predict the beginning of inserted code
509 | Screen Shot 2024-02-06 at 2 49 34 PM
510 | 511 | - Non-executable memory 512 | - Older x86 CPUs would execute machine code from any readable address 513 | - x86-64 added a way to mark regions of memory as not executable 514 | - Immediate crash on jumping into any such region 515 | - Current Linux and Windows mark the stack this way<br>
516 | Screen Shot 2024-02-06 at 2 51 12 PM
517 | 518 | - Stack Canaries 519 | - Idea 520 | - Place a special value (“canary”) on the stack just beyond the buffer 521 | - Check for corruption before exiting function 522 | - GCC Implementation 523 | - -fstack-protector 524 | - Now the default (it was disabled in earlier versions)<br>
525 | Screen Shot 2024-02-06 at 2 52 27 PM
526 | 527 | ### Return-Oriented Programming Attacks 528 | - Challenge (for hackers) 529 | - Stack randomization makes it hard to predict buffer location 530 | - Marking the stack non-executable makes it hard to insert binary code 531 | - Alternative Strategy 532 | - Use existing code 533 | - Part of the program or the C library 534 | - String together fragments to achieve an overall desired outcome 535 | - Does not overcome stack canaries 536 | - Construct programs from gadgets 537 | - Short sequences of instructions ending in ret 538 | - ret is encoded by the single byte 0xc3 539 | - Code positions are fixed from run to run 540 | - Code is executable 541 | 542 | ### ROP Execution<br>
543 | Screen Shot 2024-02-06 at 3 06 00 PM
544 | - Trigger with ret instruction 545 | - Will start executing Gadget 1 546 | - Final ret in each gadget will start the next one 547 | - ret: pop the address from the stack and jump to that address 548 | -------------------------------------------------------------------------------- /15-XXX Computer Systems/15-619 Cloud Computing.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /15-XXX Computer Systems/15-641 Computer Network.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /15-XXX Computer Systems/15-645 Database Systems.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /18-XXX Electrical and Computer Engineering/18-746 Storage Systems.md: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2024 Jiawei (Allen) Zhu 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CMU Course Notes 2 | 3 | ## Ongoing Updates 4 | 5 | - 10-601 Machine Learning 6 | - 14-736 Distributed Systems 7 | - 15-513 Computer Systems 8 | 9 | ## TBD 10 | 11 | - 10-617 Intermediate Deep Learning 12 | - 10-623 Generative AI 13 | - 10-725 Convex Optimization 14 | - 11-742 Search Engines 15 | - 11-667 Large Language Models 16 | - 15-618 Parallel Computer Architecture and Programming 17 | - 15-619 Cloud Computing 18 | - 15-641 Computer Networks 19 | - 15-645 Database Systems 20 | - 18-746 Storage Systems 21 | 22 | ## Usage 23 | 24 | The repository is organized by course departments. Each department folder contains individual courses, where you'll find notes in PDF and Markdown formats. 25 | 26 | To view a note, simply click on the file. To download, click the 'Download' button or use the `raw` option in the GitHub interface. 27 | 28 | ## How to Contribute 29 | 30 | We welcome contributions for courses not yet listed! To contribute: 31 | 1. Fork the repository. 32 | 2. Make your changes. 33 | 3. Submit a pull request with a clear description of your improvements. 34 | 35 | ## Responsibilities 36 | 37 | Please refrain from uploading code or unauthorized material from CMU. Such pull requests will not be accepted. 
38 | 39 | ## License 40 | 41 | This project is licensed under the MIT License. This means you're free to use, modify, and distribute the notes as long as you credit the source. See [LICENSE.md](LINK) for more details. 42 | 43 | ## Acknowledgments 44 | 45 | Special thanks to the faculty and students of CMU who have contributed to and inspired this collection of notes, and to Xuezzou, who inspired me to create this project. 46 | 



47 | --------------------------------------------------------------------------------