├── 10-XXX Machine Learning
│   ├── 10-601 Introduction to Machine Learning.md
│   ├── 10-605 Machine Learning with Large Datasets.md
│   ├── 10-617 Intermediate Deep Learning.md
│   └── 10-725 Convex Optimization.md
├── 11-XXX Language Technologies
│   ├── 11-642 Search Engines.md
│   └── 11-711 Advanced NLP.md
├── 14-XXX Information Networking
│   └── 14-736 Distributed System.md
├── 15-XXX Computer Systems
│   ├── 15-513 Introduction to Computer Systems.md
│   ├── 15-619 Cloud Computing.md
│   ├── 15-641 Computer Network.md
│   └── 15-645 Database Systems.md
├── 18-XXX Electrical and Computer Engineering
│   └── 18-746 Storage Systems.md
├── LICENSE
└── README.md
/10-XXX Machine Learning/10-601 Introduction to Machine Learning.md:
--------------------------------------------------------------------------------
1 | [Lecture 1: Supervised and Unsupervised Learning](#Supervised-and-Unsupervised-Learning)
2 | [Lecture 2: Machine Learning as Functional Approximation](#Machine-Learning-as-Functional-Approximation)
3 |
4 | ## Supervised and Unsupervised Learning
5 | ### Supervised models:
6 | · Decision Trees
7 | · KNN
8 | · Naïve Bayes
9 | · Perceptron
10 | · Logistic Regression
11 | · Linear Regression
12 | · Neural Networks
13 |
14 | ### Definition of Machine Learning
15 | Three components: Task T, Performance metric P, and Experience E.
16 |
17 | ### Definitions: Machine Learning Classifier
18 | A **classifier** is a function that takes feature values as input and outputs a label.
19 | A **test dataset** is used to evaluate a classifier’s predictions.
20 | The **error rate** is the proportion of data points on which the prediction is wrong.
21 |
22 | ### Types of Machine Learning Classifiers
23 | **Majority vote classifier**: always predicts the most common label in the **training** dataset.
24 | **Memorizer**: if a set of features exists in the training dataset, predict its corresponding label; otherwise, predict the majority vote.
25 | **Decision Stump**: based on a single feature, $x_d$, predict the most common label in the **training** dataset among all data points that share the same value for $x_d$.
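The first two classifiers above can be sketched in a few lines of Python (function names are hypothetical, not course-provided code):

```python
from collections import Counter

def majority_vote(train_labels):
    """Always predict the most common label in the training dataset."""
    return Counter(train_labels).most_common(1)[0][0]

def memorizer(train_features, train_labels, x):
    """If x was seen in training, predict its stored label;
    otherwise fall back to the majority vote."""
    lookup = dict(zip(map(tuple, train_features), train_labels))
    return lookup.get(tuple(x), majority_vote(train_labels))

X = [[1, 0], [1, 1], [0, 1]]
y = ["+", "+", "-"]
print(memorizer(X, y, [1, 1]))  # seen in training -> "+"
print(memorizer(X, y, [0, 0]))  # unseen -> majority vote "+"
```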
26 |
27 | ### A typical Machine Learning Routine
28 | · Step 1 – **Training**
29 | Input: a labeled training dataset
30 | Output: a classifier
31 | · Step 2 – **Testing**
32 | Inputs: a classifier, a test dataset
33 | Output: predictions for each test data point
34 | · Step 3 – **Evaluation**
35 | Inputs: predictions from step 2, test dataset labels
36 | Output: some measure of how good the predictions are;
37 | usually (but not always) error rate
38 |
39 | ## Machine Learning as Functional Approximation
40 | ### Notations
41 | · **Feature Space**: $\mathcal{X}$
42 | · **Label Space**: $\mathcal{Y}$
43 | · **Unknown target function**: $c^*: \mathcal{X} \to \mathcal{Y}$
44 | · **Training Dataset**: $D = \{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(N)}, y^{(N)})\}$, where $y^{(i)} = c^*(x^{(i)})$
45 | · **Hypothesis Space**: H
46 | · **Goal**: Find the classifier, h ∈ H, that best approximates c\*
47 |
48 | ### Loss Functions and Error Rates
49 | **Loss Function**: $\ell: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$
50 | This defines how "bad" the predictions, $\hat{y} = h(x)$, are compared to the true labels, $y = c^*(x)$.
51 |
52 | ### Common choices for loss functions:
53 | 1. Squared loss (for regression): $\ell(y, \hat{y}) = (y - \hat{y})^2$
54 | 2. Binary or 0-1 loss (for classification): $\ell(y, \hat{y}) = \mathbb{1}(y \neq \hat{y})$
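Both losses take one or two lines each (a sketch, not course code):

```python
def squared_loss(y, y_hat):
    # squared loss (for regression): (y - y_hat)^2
    return (y - y_hat) ** 2

def zero_one_loss(y, y_hat):
    # 0-1 loss (for classification): 1 if the prediction is wrong, else 0
    return 1 if y != y_hat else 0

print(squared_loss(3.0, 2.5))   # 0.25
print(zero_one_loss("+", "-"))  # 1
```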
55 |
56 | ### Decision Stump Pseudocode:
57 |
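A minimal sketch in Python (helper names are hypothetical; assumes categorical features and the definitions above — train on one feature $x_d$, predict the majority training label for each value of that feature):

```python
from collections import Counter

def train_stump(X, y, d):
    """For each value v of feature d, store the most common training label
    among points with x[d] == v; also store a global majority fallback."""
    by_value = {}
    for x_i, y_i in zip(X, y):
        by_value.setdefault(x_i[d], []).append(y_i)
    return {
        "d": d,
        "majority": Counter(y).most_common(1)[0][0],
        "by_value": {v: Counter(labels).most_common(1)[0][0]
                     for v, labels in by_value.items()},
    }

def predict_stump(stump, x):
    return stump["by_value"].get(x[stump["d"]], stump["majority"])

def error_rate(stump, X, y):
    wrong = sum(predict_stump(stump, x_i) != y_i for x_i, y_i in zip(X, y))
    return wrong / len(y)

X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = ["+", "+", "-", "-"]
stump = train_stump(X, y, d=0)
print(predict_stump(stump, [1, 9]))  # splits on feature 0 only -> "+"
print(error_rate(stump, X, y))       # feature 0 separates the data -> 0.0
```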
58 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-605 Machine Learning with Large Datasets.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-617 Intermediate Deep Learning.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-725 Convex Optimization.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/11-XXX Language Technologies/11-642 Search Engines.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/11-XXX Language Technologies/11-711 Advanced NLP.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/14-XXX Information Networking/14-736 Distributed System.md:
--------------------------------------------------------------------------------
1 | ## Intro to Distributed Systems
2 | ### Distributed Systems Overview
3 | A distributed system can be generically defined as a collection of resources on distinct physical components that appear to function as a single coherent system.
4 |
5 | Examples: the Internet, Embedded SoC, Mobile Gaming
6 |
7 | The Benefits of having a distributed system:
8 | - Geographic distribution of users, peers, providers, etc.
9 | - The distribution of resources can match that of the population
10 | - Cost per resource (processor, memory, storage, network) at scale
11 | - Capacity (processor, memory, storage, network)
12 | - Failure model (high chance of 1 of N failure, lower chance of N of N failure)
13 | - Localized risks
14 |
15 | ## Computing Over Networks
16 | ### Challenges of Distributed Systems
17 |
18 | Fundamental distributed systems challenges:
19 |
20 | - Timing:
21 | - How do processes on different networked devices determine the ordering of events that took place? The "Who shot first?" problem
22 | - Concurrency:
23 | - How do processes on different networked devices safely access/use a commonly shared resource on yet another device?
24 | - Robustness:
25 | - How do processes on different networked devices deal with imperfections in the networks that connect them?
26 | - Consistency:
27 | - How do we guarantee the same outcome from a process regardless of where/how the process runs?
28 |
29 | ### Timing
30 | **Reliability**: The delivery of all packets in order, with correct content, without duplicates.
31 | - Reliable transport provides applications with guaranteed delivery of message segments in order, with verified content, without duplication (e.g., TCP; useful for bulk data delivery)
32 | - Reliability is traded for latency (no bounded delay guarantee) and performance overhead (potentially lots of retransmissions)
33 |
34 | Reliability != timeliness
35 | - The two are often in direct conflict
36 | - Reliability is only guaranteed on a per-session basis, not globally
37 | - In general, **time** is a major challenge to be dealt with.
38 |
39 | ### Concurrency
40 | **Coordination**:
41 | - Finding “stuff” is tricky (content, resources, …)
42 | - Maybe someone/something needs to keep track of all the things?
43 | - Coordination is a new (or at least amplified) challenge
44 | - In general, **management** is a major challenge to be dealt with
45 |
46 | ### Robustness
47 | **(Partial) Failure**:
48 | - Servers can fail
49 | - Especially if the system comprises a large number of cheap/commodity servers instead of one monolithic, high-performance, many-9s availability server
50 | - If one server fails, the entire system doesn’t fail
51 | - Partial failure usually allows for recovery (ideally invisible to the client) and hopefully doesn’t affect availability
52 | - In general, **fault tolerance and recoverability** are highly desirable
53 |
54 | Failures are difficult to handle because it is difficult to know in which specific processing state the failure occurred.
55 |
56 | ### Consistency
57 | **Consistency**:
58 | - If many clients are interacting with the same data, how do we guarantee that everyone agrees on the same value?
59 | - Mutexes/locks are great, but how can we extend that from a single OS to an entire DS?
60 | - If we can’t, can the clients at least detect that there’s a problem?
61 | - Consistency is one of the biggest challenges in asynchronous, concurrent, distributed systems
62 |
63 | ## Coordinating Operations over a Network
64 | ### Remote Operation
65 | - I have a task (e.g., compute something, access some data), and I need another networked component to help me
66 | - This is abstracted as a function/subroutine/… -> y = f(x)
67 | - y: what is the desired outcome that I’m asking for
68 | - x: what inputs am I providing as part of my request
69 | - f: what work is being done to provide my outcome, and I don’t care where/how it’s implemented
70 |
71 | ### Local Operation
72 | - If this was done locally:
73 | - Find f and x in memory
74 | - Load x in register(s)
75 | - Branch to subroutine
76 | - Subroutine computes y
77 | - Load y in register(s)
78 | - Branch return
79 | - Store y in memory
80 |
81 | - The OS hides most of this, so the dev sees a simple **function/method call**
82 |
83 | ### Local vs. Remote Operation
84 | In addition to the steps of a local operation,
85 | - At the remote entity:
86 | - Get messages from the other party
87 | - Unpack f and x (how?)
88 | - Load x in register(s)
89 | - Subroutine, compute y
90 | - Pack into reply message, send
91 |
92 |
93 | ### Remote Procedure Calls (RPC)
94 | - RPC is a way to provide a programmer-friendly abstraction to handle the complexity of coordination required for remote services
95 | - RPC moves inter-procedure communication from stack to network
96 | - Attempts to provide an interface for the programmer that still looks like a local procedure call
97 |
98 | ### Structure of the RPC
99 | - Client-side:
100 | - Client applications only see a local “stub” (aka proxy) that presents a simple function/method/API for local use
101 | - The stub layer abstracts away most networking details
102 | - Client app calls f(x) presented by stub, stub interacts with library to perform all the details of the remote call but doesn't "do" f(x)
103 | - Nothing about transport has to change
104 | - Server-side:
105 | - Same idea on the server side, where the actual service provided doesn’t care about underlying network interactions, just "doing" f(x)
106 | - Similar server stub (aka “skeleton”) abstracts away all networking details
107 | - Stub interacts with the library to handle incoming remote requests, decoding them in a way that’s useful for the server, calls the local subroutine, and handles the reply
108 | - Nothing about transport has to change
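A toy, in-process sketch of the stub/skeleton split (all names are hypothetical; real RPC frameworks generate stubs from an IDL and send the bytes over a network): the client stub marshals the call, the server skeleton unmarshals it, "does" f(x), and marshals the reply.

```python
import json

def f(x):
    # the actual service routine; only the server-side "does" f(x)
    return x * x

def skeleton(request_bytes):
    # server-side stub: unmarshal the request, call the local subroutine,
    # marshal the reply
    request = json.loads(request_bytes.decode())
    return json.dumps({"y": f(request["x"])}).encode()

def stub_f(x):
    # client-side stub: looks like a local function, but only marshals the
    # call and hands it to the transport (simulated here by a direct call)
    request_bytes = json.dumps({"fn": "f", "x": x}).encode()
    reply_bytes = skeleton(request_bytes)  # stand-in for the network hop
    return json.loads(reply_bytes.decode())["y"]

print(stub_f(7))  # the caller never sees the marshalling -> 49
```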
109 |
110 |
111 |
112 | ### Benefits and Difficulties of the RPC
113 | Benefits:
114 | - All the programmer sees when using RPC is the immediate call to the stub and the corresponding return
115 | - This provides:
116 | - Ease of programming with familiar model (just call a function)
117 | - Hidden complexity and architecture of the remote component
118 | - Automation of distributed computation
119 |
120 | Difficulties:
121 | - While the abstraction and ease of programming are great, some challenges need to be addressed
122 | - Caller and callee procedures run on different machines, using data in different address spaces, perhaps in different environments/OS, …
123 | - Must convert to the local representation of data (e.g., variable type, endian conversion)
124 | - Components or networks can fail
125 |
126 | ### Marshalling
127 | - Stubs are responsible for marshalling and unmarshalling these interactions
128 | - Client stub marshals arguments into machine-independent format ⇒ sends request ⇒ waits for response ⇒ unmarshals result
129 | - Server stub unmarshals arguments and builds stack frame ⇒ calls procedure ⇒ marshals results and replies
130 |
131 | - Since marshaling relies on standards and conventions in a variety of places, rather than making some assumption about what those are, stubs rely on an **interface definition language (IDL)**
132 |
133 | - There are many such IDLs supported by various libraries and middleware, often resembling XML, JSON, TLV, etc.
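Marshalling into a machine-independent format can be sketched with Python's `struct` module, packing two ints in network (big-endian) byte order so hosts of either endianness decode them identically (the `"!ii"` layout is just an illustration; real stubs follow the IDL-generated layout):

```python
import struct

# Client stub: marshal the arguments (x=258, k=3) in network byte order.
wire = struct.pack("!ii", 258, 3)
print(wire.hex())  # 0000010200000003, independent of host byte order

# Server stub: unmarshal back into the local representation.
x, k = struct.unpack("!ii", wire)
print(x, k)  # 258 3
```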
134 |
135 | ### RPC Diagram
136 | 
137 |
138 | ### RPC By Reference
139 | - Early RPC implementations were for stateless/idempotent subroutines, so there was no notion of call-by-reference
140 | - There is no shared memory space, so what value is a pointer?
141 |
142 | - RPC is often limited to call-by-value or call-by-copy-and-restore, where values must be explicitly sent over the network
143 | - This creates some complications for complex data types and especially for custom data types
144 | - It is possible to maintain state and create generic "references", like pointers but not for memory
145 | - Different levels of integration are supported by different systems, depending on the hetero/homogeneity of the system, languages, etc.
146 |
147 | ### Challenges of RPC
148 | 1. Failures
149 |
150 | - Partial Failure:
151 | - In local computing:
152 | - If the machine fails, the application fails
153 | - There are much bigger concerns than the application at this point
154 | - In distributed computing:
155 | - If a machine fails, part of the application fails and part doesn’t
156 | - Can you tell the difference between machine failure and network failure?
157 |
158 | ## Replication
159 |
160 | - Reasons for Replication
161 | - Increased throughput/bandwidth/parallelism
162 | - Increased fault tolerance
163 | - Improve latency by storing content closer to distributed consumers, e.g., “at the edge”
164 |
165 | ### Primary and Secondary Replicas
166 | - Primary replicas
167 | - Exact copies of the original
168 | - Common for file servers, databases, etc.
169 |
170 | - Secondary replicas (e.g., a GitHub repo that has not pulled the latest updates)
171 | - Derivable from the original
172 | - Lower fidelity in some way
173 | - Caches (possibly stale)
174 | - Thumbnails (lower resolution, faster to transmit, smaller to store)
175 | - Compressed copies (slower to access, harder to update)
176 |
177 | - Primary takes all updates
178 | - Secondary may get occasional/periodic updates or “listen” to primary updates
179 | - Primary needs additional overhead to manage, secondary can be managed when possible
180 | - Frequency of secondary updates can be balanced for desired overhead, consistency, usefulness, etc.
181 |
182 | - Secondaries may be read-only caches
183 | - Leaving primary as the definitive and most up-to-date copy
184 | - An update of secondary affects how useful it is as a backup for primary
185 |
186 | ### Efficient Replica Management
187 | - Rather than enforce perfect replication (which has high overhead), it’s often reasonable to allow replicas to vary from each other slightly (e.g., one can be out of date
188 | w.r.t. others) in trade for significantly less latency/overhead
189 |
190 | - Need to be careful not to overly complicate the management of replication, or we could end up with more latency/overhead
191 |
192 | - Goal: we want to achieve one-copy semantics, meaning we can perform read and write operations on a collection of replicas with the same results as if there were one nonreplicated object
193 |
194 | ### Quorum
195 | - A group of servers trying to provide one-copy semantics
196 |
197 | - Setup of the Quorum
198 | - N primary replicas, no secondary replicas
199 | - One-copy semantics
200 | - Each "read" makes a local copy of multiple replicas, determines which is "correct" and discards others
201 | - Each "write" overwrites multiple replicas with a single local copy
202 | - Writes are not edits, they are replacements
203 | - A "read-then-write" approach is typically used
204 |
205 | - Concurrency controls are assumed
206 | - All reads and writes are "safe"
207 |
208 | ### Concurrency Control Addresses Conflict
209 | - Replication conflict
210 | - Any situation where concurrent actions break the usefulness of replicas
211 |
212 | - Writes cause conflicts
213 | - Write-write
214 | - If there are 2 replicas, A writes to one while B writes to the other, which is the newest?
215 | - If there is 1 replica, both A and B write, the outcome is uncertain (race condition)
216 |
217 | - Write-read
218 | - If there are 2 replicas, A writes to one while B reads from the other, then B reads an old file
219 | - If there is 1 replica, A writes as B is reading, B may read a mix of old and new content
220 |
221 | - Reads don’t cause conflicts
222 | - Read-read is safe
223 | - But caution is still needed while reading…
224 |
225 | ### Read/Write of Replicated Data
226 | - If a large number of users can access and modify replicated data, we have to be careful about who is reading/writing which replica
227 |
228 | - Ex: if there are four replicas f1, f2, f3, f4 of a file, what happens if one user writes f3 and f4 while another user reads f1 and f2?
229 |
230 | ### Read and Write Quora
231 | - Read quorum: Number of servers R a reader should read from
232 | - Write quorum: Number of servers W a writer should write to
233 | - What is the relationship between these?
234 | - Can we tune them to achieve different goals?
235 |
236 | 
237 |
238 | ### Quora Must Overlap
239 | - The read quorum and the write quorum must overlap
240 | - Why? Without overlap, a reader could miss the latest write and get an old value
241 | - R + W > N is a necessary condition for correctness
242 | - Requires the ability to identify the most recent version among the replicas that are read (more in a few slides)
243 |
244 | 
245 | 
246 |
247 | ### Tuning a Quorum
248 | - It is often true that reading is more common than writing (I access files more often than I update them)
249 |
250 | - Increase W, i.e., larger write quorum
251 | - More redundancy for robustness
252 | - More local to consumers
253 | - Higher write cost (network and device throughput, contention, long tail, etc.)
254 |
255 | - Decrease R, i.e., smaller read quorum
256 | - Lower cost to read (network and device throughput, contention, long tail, etc.)
257 | - More choices of where to read (closer to the user)
258 | - More expensive writes (but that’s ok since they’re less frequent)
259 |
260 | - Larger overlap
261 | - More tolerance for failure
262 |
263 | - Common Practice:
264 | - Read-one / write-all (i.e., R=1, W=N) is the most common
265 | - Reads are more common than writes, so make them easy
266 | - Read gets most current data (requirement)
267 | - Reads can choose any replica (nearby, failure, etc.)
268 | - Reads require low bandwidth
269 | - Makes common case safe and fast
270 |
271 | ### Version Determination
272 | - If every replica has a version number (like a timestamp or sequence number), then determining which read replica is the newest is easy:
273 |   simply choose the replica with the "highest" version number
274 | - Read〈f,tf〉from R servers ⇒〈f1,tf1〉,〈f2,tf2〉,...,〈fR,tfR〉
275 | - Keep fj, where j = argmaxi tfi
276 | - Still need to ensure that the overall newest version is guaranteed to be within any set of R replicas (e.g., via quorum)
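Under the quorum condition R + W > N, the read-then-keep-newest rule above can be sketched as follows (helper names are hypothetical; replicas are modelled as (version, value) pairs):

```python
def quorum_read(replicas, R):
    """Read any R replicas and keep the one with the highest version,
    i.e., keep f_j where j = argmax_i t_fi."""
    read_set = replicas[:R]  # any R replicas would do, given R + W > N
    return max(read_set, key=lambda rv: rv[0])

def quorum_write(replicas, W, version, value):
    """Writes are replacements, not edits: overwrite W replicas."""
    for i in range(W):
        replicas[i] = (version, value)

N, R, W = 5, 2, 4
assert R + W > N  # quora must overlap
# additionally tolerating F failed replicas needs R + W - F > N

replicas = [(1, "old")] * N
quorum_write(replicas, W, version=2, value="new")
print(quorum_read(replicas, R))  # overlap guarantees the newest -> (2, 'new')
```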
277 |
278 | ### Majority Read Quorum
279 | - Make W large enough to guarantee that for any R replicas, strictly more than R/2 of them are the newest version
280 | - the majority of any read quorum represents the most recent version of the replica
281 | - E.g., if 3 of the 5 replicas I read are the same, they are guaranteed to be the most recent version
282 | - the overlap between the write quorum and the read quorum must include the majority of the read quorum
283 | - Overlap must be a majority of any read quorum
284 | - must write to all but a quorum minority
285 | - W > (N - R/2)
286 | - And this doesn’t even account for any read/write failures
287 | 
288 | - This is very expensive
289 | - But in exchange, we don't need any reference of time (version numbers or timestamps) at all, and versioning is still guaranteed!
290 |
291 | ### Write-all
292 | - If W = N (i.e., "write-all"), then it’s impossible for an old replica to be present in the system since every replica is always overwritten
293 | - Any read of any replica guarantees it is the newest overall replica
294 | - However, if one write fails, no one can read from the entire system until that server comes back up.
295 | - This could lock down the whole system...
296 | - In effect, write-all requires the system to be perfect, which makes writes slow and fragile.
297 |
298 | ### Locking
299 | - Uncontrolled concurrent writes can break quorum discipline (and maybe even corrupt data)
300 | - Also could break version number increment
301 |
302 | - What to lock?
303 | - Object at servers, not whole servers
304 |
305 | - How many to lock?
306 | - L ≥ R ⇐ lock quorum must cover most recent version
307 | - L ≥ W ⇐ lock quorum must cover every write to protect version and data
308 | - L ≥ max(R, W) ⇐ lock quorum must cover all reads and writes of updating object
309 |
310 | ### Failures
311 | - If the overlap between read and write Quora is strictly greater than F, then F failures can be tolerated
312 | - With version numbers: R + W - F > N
313 | - Without version numbers: W + R/2 - F > N
314 | - Still guaranteed to see an up-to-date version
315 | - If not covered by overlap, the failure model becomes important
316 |
317 | ### Alternatives to Static Quora
318 | - A static quorum is a pre-defined quorum (like our examples)
319 | - Not adaptive in terms of membership, connectivity, etc.
320 | - We have counted each replica equally (one replica = one vote), but this is not required
321 | - Alternatively, we can assign weights to different hosts based on any reasonable, observable metric; quorum rules still apply, but weighted
322 | - Examples:
323 | - More weight to more reliable hosts (more unreliable hosts for the same robustness)
324 | - Less weight to caches or secondary replicas, due to expiration and staleness
325 |
326 | ### Coda Version Vectors
327 | - CVVs are a form of vector logical timestamp (remember those?)
328 | - A CVV contains one entry for each server, which is the version number of the file on the corresponding server (ideally, all equal)
329 | - If a server doesn’t get an update to the file, its CVV entry will be lower than others
330 |
331 | - A Coda client requests a file via a three-step process:
332 | - It asks all replicas for their version number
333 | - It requests the file from the replica with the largest version number
334 | - If the servers don't agree about the file’s version, the client can detect and inform them of the conflict
335 | - A conflict exists if two CVVs are concurrent, which indicates that each server has seen some but not all changes
336 |
337 | - Ideally, a Coda client writes a file as:
338 | - The client sends the file and original CVV to all servers
339 | - Each server increments its entry in the file's CVV and sends an ACK to the client
340 | - The client merges the entries from all of the servers and sends the new CVV back to each server
341 | - If a conflict is detected, the client can inform the servers, so that it can be resolved automatically, or flagged for mitigation by the user
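The CVV conflict check above is ordinary vector-timestamp comparison, sketched below (hypothetical helper names): v1 dominates v2 if it is ≥ componentwise; if neither dominates, the CVVs are concurrent and the replicas conflict.

```python
def dominates(v1, v2):
    """True if v1 has seen every update v2 has (componentwise >=)."""
    return all(a >= b for a, b in zip(v1, v2))

def conflict(v1, v2):
    """Concurrent CVVs: each server has seen some but not all changes."""
    return not dominates(v1, v2) and not dominates(v2, v1)

print(conflict([2, 2, 1], [1, 2, 1]))  # False: first CVV strictly dominates
print(conflict([2, 1, 1], [1, 2, 1]))  # True: updates went to different servers
```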
342 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-513 Introduction to Computer Systems.md:
--------------------------------------------------------------------------------
1 | [Lecture 1: Bits, Bytes and Integers](#Bits,-Bytes-and-Integers)
2 | [Lecture 2: Machine Programming: Basics](#Machine-Programming-Basics)
3 | [Lecture 3: Machine Programming: Control](#Machine-Programming-Control)
4 |
5 | ## Bits, Bytes and Integers
6 | ### Binary Representations
7 | Binary representation leads to a simple binary, i.e. base-2, numbering system
8 | - 0 represents 0
9 | - 1 represents 1
10 | - Each “place” represents a power of two, exactly as each place in our usual “base 10”, 10-ary numbering system represents a power of 10
11 |
12 | ## Hexadecimal: 0x00 to 0xFF
13 | - Base 16 number representation
14 | - Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
15 |
16 | Consider 1A2B in hexadecimal:
17 | - 1 \* 16^3 + A \* 16^2 + 2 \* 16^1 + B \* 16^0
18 | - 1 \* 4096 + 10 \* 256 + 2 \* 16 + 11 \* 1 = 6699 (decimal)
19 |
20 | ## Bit-level Manipulations
21 | - **AND** A&B = 1 when both A=1 and B=1
22 | - **OR** A|B = 1 when either A=1 or B=1
23 | - **NOT** ~A = 1 when A=0
24 | - **Exclusive-OR (XOR)** A^B = 1 when either A=1 or B=1, but not both
25 |
26 | - Boolean algebra operates on bit vectors.
27 | - E.g., for the bit vectors 01101001 { 0, 3, 5, 6 } and 01010101 { 0, 2, 4, 6 }:
28 | - & Intersection 01000001 { 0, 6 }
29 | - | Union 01111101 { 0, 2, 3, 4, 5, 6 }
30 | - ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
31 | - ~ Complement (of 01010101) 10101010 { 1, 3, 5, 7 }
32 |
33 | ### Bit Operations in C
34 | The operations &, |, ~, ^ are available in C
35 | - Apply to any “integral” data type
36 | - long, int, short, char, unsigned
37 | - View arguments as bit vectors
38 | - Arguments applied bit-wise
39 |
40 | Examples (Char data type)
41 | - ~0x41 → 0xBE
42 | - ~01000001 → 10111110
43 | - ~0x00 → 0xFF
44 | - ~00000000 → 11111111
45 | - 0x69 & 0x55 → 0x41
46 | - 01101001 & 01010101 → 01000001
47 | - 0x69 | 0x55 → 0x7D
48 | - 01101001 | 01010101 → 01111101
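The same examples, checked in Python (masking with 0xFF to model C's 8-bit char, since Python ints are unbounded):

```python
MASK = 0xFF  # model an 8-bit char

assert ~0x41 & MASK == 0xBE  # ~01000001 -> 10111110
assert ~0x00 & MASK == 0xFF  # ~00000000 -> 11111111
assert 0x69 & 0x55 == 0x41   # 01101001 & 01010101 -> 01000001
assert 0x69 | 0x55 == 0x7D   # 01101001 | 01010101 -> 01111101
print("all bit-level examples check out")
```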
49 |
50 | ### Logical Operations in C
51 | **In contrast to Bit-Level Operators**
52 | - Logic Operations: &&, ||, !
53 | - View 0 as “False”
54 | - Anything nonzero as “True”
55 | - Always return 0 or 1
56 | - Early termination
57 |
58 | Examples (Char data type)
59 | - !0x41 → 0x00
60 | - !0x00 → 0x01
61 | - !!0x41→ 0x01
62 | - 0x69 && 0x55 → 0x01
63 | - 0x69 || 0x55 → 0x01
64 | - p && *p (avoids null pointer access)
65 |
66 | ### Shift Operations
67 | **Left Shift**: x << y
68 | - Shift bit-vector x left y positions
- Throw away extra bits on the left
70 | - Fill with 0’s on the right
71 |
72 | **Right Shift**: x >> y
73 | - Shift bit-vector x right y positions
74 | - Throw away extra bits on the right
75 | - Logical shift
76 | - Fill with 0’s on the left
77 | - Arithmetic shift
78 | - Replicate the most significant bit on the left
79 |
80 | **Undefined Behavior**
81 | - Shift amount < 0 or ≥ word size
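Python ints are arbitrary precision, so the logical/arithmetic distinction for an 8-bit value has to be modelled explicitly with masks (a sketch):

```python
W = 8
MASK = (1 << W) - 1

def logical_shr(x, k):
    # fill with 0's on the left
    return (x & MASK) >> k

def arithmetic_shr(x, k):
    # replicate the most significant (sign) bit on the left
    x &= MASK
    if x & (1 << (W - 1)):   # negative in two's complement
        x -= (1 << W)        # reinterpret as signed
    return (x >> k) & MASK   # Python's >> on negatives is arithmetic

x = 0b10100000
print(f"{logical_shr(x, 2):08b}")     # 00101000
print(f"{arithmetic_shr(x, 2):08b}")  # 11101000
```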
82 |
83 |
84 | ## Integers
85 |
86 | ### Overflow
87 | On a binary number line, if two numbers add up to a result that needs an extra digit, this is considered **overflow**.
88 | For example: 111 + 001 = 1 000; the extra leading 1 is the overflow.
89 |
90 | “Ints” means the finite set of integer numbers that we can represent on a number line enumerated by some fixed number of bits, i.e. **bit width**.
91 | - An **“unsigned” int** is any int on a number line, e.g. of a data type, that doesn’t contain any negative numbers
92 | - A **non-negative number** is a number greater than or equal to (>=) 0 on a number line, e.g. of a data type, that does contain negative numbers
93 |
94 | ### Negative Numbers
95 | We will use the leading bit as the **sign** bit
96 | - 0 means non-negative
97 | - 1 means negative
98 | - This will allow us to represent negative numbers and non-negative numbers
99 | - and make 0 represent 0
100 | - The issue with a naive sign bit (sign-magnitude) is that there is a -0: a second bit pattern that equals 0 yet is "different" from +0
101 |
102 | Given a non-negative number in binary, we can find its negative by flipping each bit and adding 1.
103 | For example:
104 | - 0101 is 5
105 | - 1010 is the one's complement of 5
106 | - 1011 is the two's complement of 5 (i.e., -5)
107 | - 0101 + 1011 = 1 0000 → 0000 (the carry overflows out)
108 | - -x = ~x + 1
109 |
110 | This works because x + ~x = 1111...1 (all ones); adding 1 more overflows to 0, so x + (~x + 1) = 0, i.e. ~x + 1 = -x.
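The identity -x = ~x + 1 can be checked exhaustively for a 4-bit word (a sketch, masking to model the fixed width):

```python
W = 4
MASK = (1 << W) - 1  # 0b1111

for x in range(1 << W):
    neg = (~x + 1) & MASK          # flip the bits, add 1, keep w bits
    assert (x + neg) & MASK == 0   # x + (-x) overflows to 0

print((~0b0101 + 1) & MASK == 0b1011)  # True: -5 is 1011
```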
111 |
112 | ### Numeric Ranges
113 | - Unsigned values:
114 | - UMin = 0
115 | - UMax = 2^w - 1
116 | - Two's complement values:
117 | - TMin = -2^(w-1)
118 | - TMax = 2^(w-1) - 1
119 |
120 | ## Conversion and Casting
121 | The sign bit's large negative weight (-2^(w-1)) is reinterpreted as a large positive weight (+2^(w-1))
122 |
123 | Basic rules for casting Signed <-> Unsigned:
124 | - Bit pattern is maintained
125 | - But reinterpreted
126 | - Can have unexpected effects: adding or subtracting 2^w
127 | - Expression containing signed and unsigned int
128 | - int is cast to unsigned!
129 |
130 | ### Two's Complement -> Unsigned
131 | - Ordering inversion
132 | - Negative -> Big Positive
133 |
134 | ## Expanding and Truncating
135 |
136 | ### Sign Extension
137 | - Given w-bit signed integer x
138 | - Converting it to a w+k-bit integer with the same value
139 | - This can be done by making k copies of the sign bit and extending it to the new bits.
140 |
141 | ### Truncation
142 | - Given a (k+w)-bit signed or unsigned integer X
143 | - Converting it to a w-bit integer X' with the same value for "small enough" X
144 | - This can be done by dropping the top k bits.
145 |
146 | ### Summary
147 | - Expanding (e.g., short int to int)
148 | - Unsigned: zeros added
149 | - Signed: sign extension
150 | - Both yield the expected result
151 | - Truncating (e.g., unsigned to unsigned short)
152 | - Unsigned/Signed: bits are truncated
153 | - Result reinterpreted
154 | - Unsigned: mod operation
155 | - Signed: similar to mod
156 | - For small (in magnitude) numbers, this yields the expected behavior
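Sign extension and truncation can be modelled with fixed-width masks (a sketch; helper names are not from the course):

```python
def sign_extend(x, w, k):
    """Extend a w-bit two's-complement value x to w+k bits
    by making k copies of the sign bit."""
    if x & (1 << (w - 1)):            # sign bit set
        x |= ((1 << k) - 1) << w      # k ones on the left
    return x

def truncate(x, w):
    """Drop the top bits, keeping w; i.e., x mod 2^w."""
    return x & ((1 << w) - 1)

print(f"{sign_extend(0b1010, 4, 4):08b}")  # 11111010: -6 stays -6 in 8 bits
print(f"{truncate(0b111110100, 8):08b}")   # 11110100: top bit dropped
```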
157 |
158 | ## Addition, Negation, Multiplication and Shifting
159 |
160 | ### Addition
161 | - Unsigned Addition:
162 | - Operand: w bits
163 | - True sum: w+1 bits
164 | - Discard Carry: w bits
165 | - This should ignore the carry output
166 |
167 | For example:
168 | 1110 1001 + 1101 0101 = 1 1011 1110
169 | Discard the carry: the result is 1011 1110 -> 190 in decimal (UAdd)
170 |
171 | - Two's Complement Addition:
172 | - Operand: w bits
173 | - True sum: w+1 bits
174 | - Discard Carry: w bits
175 | - The bit-level behavior is the same with TAdd and UAdd
176 |
177 | For example:
178 | 1110 1001 + 1101 0101 = 1 1011 1110
179 | Discard the carry: the result is 1011 1110 -> -66 in decimal (TAdd)
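The worked example above, checked in Python: the bit-level sum is identical for UAdd and TAdd; only the reinterpretation of the result differs.

```python
W = 8
MASK = (1 << W) - 1

s = (0b11101001 + 0b11010101) & MASK          # discard the carry out
print(f"{s:08b}")                             # 10111110

uadd = s                                      # unsigned reading
tadd = s - (1 << W) if s >> (W - 1) else s    # two's complement reading
print(uadd, tadd)                             # 190 -66
```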
180 |
181 | ### Visualizing Additions
182 | 
183 | 
184 | 
185 |
186 | ### Multiplication
187 | Power-of-2 Multiply with Shift
188 | - Operation:
189 | - u << k gives u \* 2^k
190 | - Both signed and unsigned
191 | - Operand: w bits
192 | - True Product: w+k bits
193 | - Discard k bits: w bits
194 |
195 | For example:
196 | u << 3 == u \* 8
197 | (u << 5) - (u << 3) == u \* 24
198 | Most machines shift and add faster than they multiply, so the compiler generates this code automatically.
199 |
200 | ### Division
201 | Unsigned Power-of-2 Divide with Shift
202 | - Operation:
203 | - u >> k gives floor(u / 2^k)
204 | - This would require logical shift.
205 |
206 | Examples:
207 | 
208 |
209 | Signed Power-of-2 Divide with Shift
210 | - Operation
211 | - x >> k gives floor(x / 2^k)
212 | - This would require arithmetic shift.
213 | - Rounds down (toward -∞), not toward zero, which is unlikely to be what is expected and introduces a bias for negative x.
214 |
215 | Examples:
216 | 
217 |
218 | ## Byte Ordering
219 | Big Endian: Sun (Oracle SPARC), PPC Mac, Internet
220 | - The least significant byte has the highest address
221 |
222 | Little Endian: x86, ARM
223 | - The least significant byte has the lowest address
224 |
225 | Byte ordering is a concern when we are communicating data over a network, via files, etc.
226 | Note:
227 | - Bits are not reversed, as the low-order bit is the reference point.
228 | - This doesn't affect chars, or strings (array of chars), as chars are only one byte.
229 |
230 | Example: Given a variable x has a 4-byte value of 0x01234567, the address given by &x is 0x100:
231 | 
232 |
233 | ### Reading Byte-Reversed Listing
234 | Disassembly:
235 | - Text representation of binary machine code
236 | - Generated by the program that reads the machine code
237 |
238 | For example:
239 | - Value: 0x12ab
240 | - Pad to 32 bits: 0x000012ab
241 | - Split into bytes: 00 00 12 ab
242 | - Reverse: ab 12 00 00
243 |
244 | If we want to add this value to %ebx, then the assembly code would be:
245 | add $0x12ab, %ebx
246 | The instruction code corresponding would be:
247 | 81 c3 ab 12 00 00
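The byte reversal above can be reproduced with `int.to_bytes` (a sketch):

```python
value = 0x12AB

big = value.to_bytes(4, "big")        # 00 00 12 ab: pad to 32 bits, MSB first
little = value.to_bytes(4, "little")  # ab 12 00 00: as stored on little-endian x86
print(big.hex(" "), "|", little.hex(" "))
```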
248 |
249 | ## Machine Programming Basics
250 | ### Assembly Basics
251 | - **Architecture**: (also ISA: instruction set architecture) The parts of a processor design that one needs to understand for writing assembly/machine code.
252 | - Examples: instruction set specification, registers, memory model
253 | - The architecture is not the hardware, it is the interface of the hardware
254 | - **Microarchitecture**: Implementation of the architecture
255 | - Examples: cache sizes and core frequency
256 |
257 | - Code Forms:
258 | - **Machine Code**: The byte-level programs that a processor executes
259 | - **Assembly Code**: A text representation of machine code (the human-readable form)
260 |
261 | Example ISAs:
262 | - Intel: x86, IA32, Itanium, x86-64
263 | - ARM: Used in almost all mobile phones
264 | - RISC V: New open-source ISA
265 |
266 | ### Assembly Code and Machine Basics
267 | - **PC: Program counter**
268 | - Address of next instruction
269 | - Called “RIP” (x86-64)
270 | - **Register file**
271 | - Heavily used program data
272 | - Can be accessed directly
273 | - **Condition codes (FLAGS)**
274 | - Store status information about the most recent arithmetic or logical operation
275 | - Used for conditional branching
276 | - **Memory**
277 | - Byte addressable array
278 | - Code and user data
279 | - Stack to support procedures
280 |
281 |
282 | ### Assembly Data Types
283 | - Integer: data of 1, 2, 4, or 8 bytes
284 | - Data values
285 | - Addresses (untyped pointers)
286 | - Floating points: 4, 8 or 10 bytes
287 | - SIMD vector data types of 8, 16, 32 or 64 bytes
288 | - Code: Byte sequences encoding a series of instructions
289 | - No aggregate types such as arrays or structures
290 | - Just contiguously allocated bytes in memory
291 |
292 | Example:
293 | 
294 | These are 64-bit registers, so we know this is a 64-bit add
295 |
296 | ### X86-64 Integer Registers
297 | 
298 |
299 | ### Moving Data
300 | - Moving Data:
301 | - movq Source, Dest
302 | - Operand Types
303 | - Immediate: Constant integer data
304 | - Example: $0x400, $-533
305 | - Like C constant, but prefixed with ‘$’
306 | - Encoded with 1, 2, or 4 bytes
307 | - Register: One of 16 integer registers
308 | - Example: %rax, %r13
309 | - But %rsp reserved for special use
310 | - Others have special uses for particular instructions
311 | - Memory: 8 consecutive bytes of memory at the address given by the register
312 | - Simplest example: (%rax)
313 | - Various other “addressing modes”
314 |
315 | Movq Operand Combinations:
316 |
317 | - The immediate cannot be a dest
318 | - Cannot do memory-memory transfer with a single instruction
319 |
320 | ### Memory Addressing Modes
321 | Normal (R) Mem[Reg[R]]
322 | - Register R specifies the memory address
323 | - Aha! Pointer dereferencing in C
324 | - movq (%rcx),%rax
325 |
326 | Displacement D(R) Mem[Reg[R]+D]
327 | - Register R specifies the start of the memory region
328 | - Constant displacement D specifies the offset
329 | - movq 8(%rbp),%rdx
330 |
331 | **Most General Form**:
332 | D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+ D]
333 | - D: Constant “displacement” 1, 2, or 4 bytes
334 | - Rb: Base register: Any of 16 integer registers
335 | - Ri: Index register: Any, except for %rsp
336 | - S: Scale: 1, 2, 4, or 8 (why these numbers? They match the sizes of the primitive data types)
337 |
338 | Special Cases
339 | - (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]
340 | - D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D]
341 | - (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]
342 |
343 | An example of a swap function:
344 | 
345 |
346 | An example of address computation:
347 | 
348 |
349 | ### Address Computation Instruction
350 | leaq Src, Dst
351 | - Src is address mode expression
352 | - Set Dst to address denoted by the expression
353 | Uses
354 | - Computing addresses without a memory reference
355 | - E.g., translation of p = &x[i];
356 | - Computing arithmetic expressions of the form x + k \* y
357 | - k = 1, 2, 4, or 8
358 |
359 | ### Turning C into Object Code
360 | 
361 |
362 | ## Machine Programming Control
363 | ### Finding pointers
364 | - %rsp and %rip always hold pointers
365 | - Register values that are “close” to %rsp or %rip are probably also pointers
366 | - If a register is being used as a pointer…
367 | - mov (%rsi), %rsi
368 | - …Then its value is expected to be a pointer.
369 | - There might be a bug that makes its value incorrect.
370 |
371 | - Not as obvious with complicated address “modes”
372 | - (%rsi, %rbx) – One of these is a pointer, we don’t know which.
373 | - (%rsi, %rbx, 2) – %rsi is a pointer, %rbx isn’t (why?)
374 | - 0x400570(, %rbx, 2) – 0x400570 is a pointer, %rbx isn’t (why?)
375 | - lea (anything), %rax – (anything) may or may not be a pointer
376 |
377 | ### Control Flow
378 | - jmp instructions act as GOTO statements for changing control flow
379 | 
380 | - We will use condition codes to determine if we need to jump
381 |
382 | - Single-bit registers
383 | - CF Carry Flag (for unsigned)
384 | - SF Sign Flag (for signed)
385 | - ZF Zero Flag
386 | - OF Overflow Flag (for signed)
387 |
388 | - Compare Instruction
389 | - cmp a, b
390 | - Computes b − a (just like sub)
391 | - Sets condition codes based on the result, but…
392 | - Does not change **b**
393 | - Used for if (a < b) { … } whenever b − a isn’t needed for anything else
394 |
395 | - Test Instruction
396 | - test a, b
397 | - Computes b & a (just like and)
398 | - Sets condition codes (only SF and ZF) based on the result, but…
399 | - Does not change **b**
400 | - Most common use: test %rX, %rX to compare %rX to zero
401 | - Second most common use: test %rX, %rY tests if any of the 1-bits in %rY are also 1 in %rX (or vice versa)
402 |
403 | ### Conditional Branches
404 | - jX Instructions
405 | - Jump to different parts of the code depending on condition codes
406 | 
407 | - For example:
408 | - cmp a, b
409 | - je 1000
410 | - If a and b are equal, then jump to instruction 1000
411 |
412 | - SetX Instructions
413 | - Set the low-order byte of the destination to 0 or 1 based on combinations of condition codes
414 | - Does not alter the remaining 7 bytes
415 | 
416 |
417 | - Example
418 | 
419 |
420 | ### Conditional Move Instruction
421 | - Conditional moves are often more efficient than branches on modern processors
422 |
423 | - Conditional Move Instructions
424 | - Instruction supports:
425 | - if (Test) Dest ← Src
426 | - Supported in post-1995 x86 processors
427 | - GCC tries to use them
428 | - But, only when known to be safe
429 | - Why?
430 | - Branches are very disruptive to an instruction flow through pipelines
431 | - Conditional moves do not require control transfer
432 |
433 |
434 | ### Loops
435 | - Use conditional branch to either continue looping or to exit the loop
436 |
437 |
438 | ## Machine Programming Advanced
439 | ### Linux x86-64 Memory Layout
440 | - Stack
441 | - Runtime stack (8MB limit)
442 | - e.g., local variables
443 | - Heap
444 | - Dynamically allocated as needed
445 | - e.g., when calling malloc() or calloc() in C, or new in C++
446 | - Data
447 | - Statically allocated data
448 | - e.g., global vars, static vars, string constants
449 | - Text / Shared Libraries
450 | - Executable machine instructions
451 | - Read-only
452 | 
453 |
454 | - Example layout:
455 | - Here, the struct is allocated on the stack,
456 | - On every function call, the return address is pushed onto the stack,
457 | - The s.a[i] line writes to memory. In lecture 1, fun(6) crashed the program. Why did writing to this location cause the process to crash?
458 | - Because writing past the end of the array overwrote state saved on the stack, including the return address.
459 | 
460 |
461 | ### Buffer Overflow
462 | 
463 | - Generally called a “buffer overflow”
464 | - Occurs when writing beyond the memory allocated for an array
465 | - Why a big deal?
466 | - It’s the #1 technical cause of security vulnerabilities
467 | - #1 overall cause is social engineering/user ignorance
468 | - Most common form
469 | - Unchecked lengths on string inputs
470 | - Particularly for bounded character arrays on the stack
471 | - sometimes referred to as stack-smashing
472 | - This is an issue for the gets() function
473 | - Similar problems with other library functions
474 | - strcpy, strcat: Copy strings of arbitrary length
475 | - scanf, fscanf, sscanf, when given %s conversion specification
476 |
477 | ### View of Buffer Overflow
478 | 
479 | 
480 | - This could be utilized for Stack Smashing Attacks
481 | 
482 | - Overwrite normal return address A with the address of some other code S
483 | - When Q executes ret, will jump to other code
484 | - x86 stores multi-byte values in little-endian order, so the attack string must contain the bytes of the injected address in reverse order.
485 |
486 | ### Code Injection Attacks
487 | 
488 | - Input string contains byte representation of executable code
489 | - Overwrite return address A with the address of buffer B
490 | - When Q executes ret, will jump to exploit code
491 |
492 | ### Code Injection Execution
493 | 
494 |
495 | ### Avoiding Buffer Overflow
496 | 1. Avoid Overflow Vulnerabilities in Code
497 | - For example, use library routines that limit string lengths
498 | - fgets instead of gets
499 | - strncpy instead of strcpy
500 | - Don’t use scanf with %s conversion specification
501 | - Use fgets to read the string
502 | - Or use %ns where n is a suitable integer
503 |
504 | 2. System-Level Protections
505 | - Randomized stack offsets
506 | - At the start of the program, allocate a random amount of space on the stack
507 | - Shifts stack addresses for the entire program
508 | - Makes it difficult for hackers to predict the beginning of inserted code
509 | 
510 |
511 | - Non-executable memory
512 | - Older x86 CPUs would execute machine code from any readable address
513 | - x86-64 added a way to mark regions of memory as not executable
514 | - Immediate crash on jumping into any such region
515 | - Current Linux and Windows mark the stack this way
516 | 
517 |
518 | - Stack Canaries
519 | - Idea
520 | - Place a special value (“canary”) on the stack just beyond the buffer
521 | - Check for corruption before exiting function
522 | - GCC Implementation
523 | - -fstack-protector
524 | - Now the default (disabled earlier)
525 | 
526 |
527 | ### Return-Oriented Programming Attacks
528 | - Challenge (for hackers)
529 | - Stack randomization makes it hard to predict buffer location
530 | - Marking stack non-executable makes it hard to insert binary code
531 | - Alternative Strategy
532 | - Use existing code
533 | - Part of the program or the C library
534 | - String together fragments to achieve an overall desired outcome
535 | - Does not overcome stack canaries
536 | - Construct programs from gadgets
537 | - A gadget is a short sequence of instructions ending in ret
538 | - ret is encoded by the single byte 0xc3
539 | - Code positions fixed from run to run
540 | - Code is executable
541 |
542 | ### ROP Execution
543 | 
544 | - Trigger with ret instruction
545 | - Will start executing Gadget 1
546 | - Final ret in each gadget will start the next one
547 | - ret: pop the address from the stack and jump to that address
548 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-619 Cloud Computing.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-641 Computer Network.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-645 Database Systems.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/18-XXX Electrical and Computer Engineering/18-746 Storage Systems.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Jiawei (Allen) Zhu
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CMU Course Notes
2 |
3 | ## Ongoing Updates
4 |
5 | - 10-601 Machine Learning
6 | - 14-736 Distributed Systems
7 | - 15-513 Computer Systems
8 |
9 | ## TBD
10 |
11 | - 10-617 Intermediate Deep Learning
12 | - 10-623 Generative AI
13 | - 10-725 Convex Optimization
14 | - 11-742 Search Engines
15 | - 11-667 Large Language Models
16 | - 15-618 Parallel Computer Architecture and Programming
17 | - 15-619 Cloud Computing
18 | - 15-641 Computer Networks
19 | - 15-645 Database Systems
20 | - 18-746 Storage Systems
21 |
22 | ## Usage
23 |
24 | The repository is organized by course departments. Each department folder contains individual courses, where you'll find notes in PDF and Markdown formats.
25 |
26 | To view a note, simply click on the file. To download, click the 'Download' button or use the `raw` option in the GitHub interface.
27 |
28 | ## How to Contribute
29 |
30 | We welcome contributions for courses not yet listed! To contribute:
31 | 1. Fork the repository.
32 | 2. Make your changes.
33 | 3. Submit a pull request with a clear description of your improvements.
34 |
35 | ## Responsibilities
36 |
37 | Please refrain from uploading code or other unauthorized material from CMU courses. Such pull requests will not be accepted.
38 |
39 | ## License
40 |
41 | This project is licensed under the MIT License. This means you're free to use, modify, and distribute the notes as long as you credit the source. See [LICENSE.md](LINK) for more details.
42 |
43 | ## Acknowledgments
44 |
45 | Special thanks to the faculty and students of CMU who have contributed to and inspired this collection of notes, and to Xuezzou, who inspired me to create this project.
46 |
47 |
--------------------------------------------------------------------------------