├── 10-XXX Machine Learning
│   ├── 10-601 Introduction to Machine Learning.md
│   ├── 10-605 Machine Learning with Large Datasets.md
│   ├── 10-617 Intermediate Deep Learning.md
│   └── 10-725 Convex Optimization.md
├── 11-XXX Language Technologies
│   ├── 11-642 Search Engines.md
│   └── 11-711 Advanced NLP.md
├── 14-XXX Information Networking
│   └── 14-736 Distributed System.md
├── 15-XXX Computer Systems
│   ├── 15-513 Introduction to Computer Systems.md
│   ├── 15-619 Cloud Computing.md
│   ├── 15-641 Computer Network.md
│   └── 15-645 Database Systems.md
├── 18-XXX Electrical and Computer Engineering
│   └── 18-746 Storage Systems.md
├── LICENSE
└── README.md
/10-XXX Machine Learning/10-601 Introduction to Machine Learning.md:
--------------------------------------------------------------------------------
1 | [Lecture 1: Supervised and Unsupervised Learning](#Supervised-and-Unsupervised-Learning)
2 | [Lecture 2: Machine Learning as Functional Approximation](#Machine-Learning-as-Functional-Approximation)
3 |
4 | ## Supervised and Unsupervised Learning
5 | ### Supervised models:
6 | · Decision Trees
7 | · KNN
8 | · Naïve Bayes
9 | · Perceptron
10 | · Logistic Regression
11 | · Linear Regression
12 | · Neural Networks
13 |
14 | ### Definition of Machine Learning
15 | Three components: Task T, Performance metric P, and Experience E.
16 |
17 | ### Definitions: Machine Learning Classifier
18 | A **classifier** is a function that takes feature values as input and outputs a label.
19 | A **test dataset** is used to evaluate a classifier’s predictions.
20 | The **error rate** is the proportion of data points on which the prediction is wrong.
21 |
22 | ### Types of Machine Learning Classifiers
23 | **Majority vote classifier**: always predicts the most common label in the **training** dataset.
24 | **Memorizer**: if a set of features exists in the training dataset, predict its corresponding label; otherwise, predict the majority vote.
25 | **Decision Stump**: based on a single feature, $x_d$, predict the most common label in the **training** dataset among all data points that share the same value for $x_d$.
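The first two classifiers above can be sketched in a few lines of Python (function names are hypothetical, not course-provided code):

```python
from collections import Counter

def majority_vote(train_labels):
    """Always predict the most common label in the training dataset."""
    return Counter(train_labels).most_common(1)[0][0]

def memorizer(train_features, train_labels, x):
    """If x was seen in training, predict its stored label;
    otherwise fall back to the majority vote."""
    lookup = dict(zip(map(tuple, train_features), train_labels))
    return lookup.get(tuple(x), majority_vote(train_labels))

X = [[1, 0], [1, 1], [0, 1]]
y = ["+", "+", "-"]
print(memorizer(X, y, [1, 1]))  # seen in training -> "+"
print(memorizer(X, y, [0, 0]))  # unseen -> majority vote "+"
```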
26 |
27 | ### A typical Machine Learning Routine
28 | · Step 1 – **Training**
29 | Input: a labeled training dataset
30 | Output: a classifier
31 | · Step 2 – **Testing**
32 | Inputs: a classifier, a test dataset
33 | Output: predictions for each test data point
34 | · Step 3 – **Evaluation**
35 | Inputs: predictions from step 2, test dataset labels
36 | Output: some measure of how good the predictions are;
37 | usually (but not always) error rate
38 |
39 | ## Machine Learning as Functional Approximation
40 | ### Notations
41 | · **Feature Space**: $\mathcal{X}$
42 | · **Label Space**: $\mathcal{Y}$
43 | · **Unknown target function**: $c^*: \mathcal{X} \to \mathcal{Y}$
44 | · **Training Dataset**: $D = \{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(N)}, y^{(N)})\}$, where $y^{(i)} = c^*(x^{(i)})$
45 | · **Hypothesis Space**: H
46 | · **Goal**: Find the classifier, h ∈ H, that best approximates c\*
47 |
48 | ### Loss Functions and Error Rates
49 | **Loss Function**: $\ell: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}$
50 | This defines how "bad" the predictions, $\hat{y} = h(x)$, are compared to the true labels, $y = c^*(x)$.
51 |
52 | ### Common choices for loss functions:
53 | 1. Squared loss (for regression): $\ell(y, \hat{y}) = (y - \hat{y})^2$
54 | 2. Binary or 0-1 loss (for classification): $\ell(y, \hat{y}) = \mathbb{1}(y \neq \hat{y})$
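Both losses take one or two lines each (a sketch, not course code):

```python
def squared_loss(y, y_hat):
    # squared loss (for regression): (y - y_hat)^2
    return (y - y_hat) ** 2

def zero_one_loss(y, y_hat):
    # 0-1 loss (for classification): 1 if the prediction is wrong, else 0
    return 1 if y != y_hat else 0

print(squared_loss(3.0, 2.5))   # 0.25
print(zero_one_loss("+", "-"))  # 1
```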
55 |
56 | ### Decision Stump Pseudocode:
57 |
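A minimal sketch in Python (helper names are hypothetical; assumes categorical features and the definitions above — train on one feature $x_d$, predict the majority training label for each value of that feature):

```python
from collections import Counter

def train_stump(X, y, d):
    """For each value v of feature d, store the most common training label
    among points with x[d] == v; also store a global majority fallback."""
    by_value = {}
    for x_i, y_i in zip(X, y):
        by_value.setdefault(x_i[d], []).append(y_i)
    return {
        "d": d,
        "majority": Counter(y).most_common(1)[0][0],
        "by_value": {v: Counter(labels).most_common(1)[0][0]
                     for v, labels in by_value.items()},
    }

def predict_stump(stump, x):
    return stump["by_value"].get(x[stump["d"]], stump["majority"])

def error_rate(stump, X, y):
    wrong = sum(predict_stump(stump, x_i) != y_i for x_i, y_i in zip(X, y))
    return wrong / len(y)

X = [[1, 0], [1, 1], [0, 1], [0, 0]]
y = ["+", "+", "-", "-"]
stump = train_stump(X, y, d=0)
print(predict_stump(stump, [1, 9]))  # splits on feature 0 only -> "+"
print(error_rate(stump, X, y))       # feature 0 separates the data -> 0.0
```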
58 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-605 Machine Learning with Large Datasets.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-617 Intermediate Deep Learning.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/10-XXX Machine Learning/10-725 Convex Optimization.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/11-XXX Language Technologies/11-642 Search Engines.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/11-XXX Language Technologies/11-711 Advanced NLP.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/14-XXX Information Networking/14-736 Distributed System.md:
--------------------------------------------------------------------------------
1 | ## Intro to Distributed Systems
2 | ### Distributed Systems Overview
3 | A distributed system can be generically defined as a collection of resources on distinct physical components that appear to function as a single coherent system.
4 |
5 | Examples: the Internet, Embedded SoC, Mobile Gaming
6 |
7 | The Benefits of having a distributed system:
8 | - Geographic distribution of users, peers, providers, etc.
9 | - The distribution of resources can match that of the population
10 | - Cost per resource (processor, memory, storage, network) at scale
11 | - Capacity (processor, memory, storage, network)
12 | - Failure model (high chance of 1 of N failure, lower chance of N of N failure)
13 | - Localized risks
14 |
15 | ## Computing Over Networks
16 | ### Challenges of Distributed Systems
17 |
18 | Fundamental distributed systems challenges:
19 |
20 | - Timing:
21 | - How do processes on different networked devices determine the ordering of events that took place? The "Who shot first?" problem
22 | - Concurrency:
23 | - How do processes on different networked devices safely access/use a commonly shared resource on yet another device?
24 | - Robustness:
25 | - How do processes on different networked devices deal with imperfections in the networks that connect them?
26 | - Consistency:
27 | - How do we guarantee the same outcome from a process regardless of where/how the process runs?
28 |
29 | ### Timing
30 | **Reliability**: The delivery of all packets in order, with correct content, without duplicates.
31 | - Reliable transport provides applications with guaranteed delivery of message segments in order, with verified content, without duplication (e.g., TCP; useful for bulk data delivery)
32 | - Reliability is traded for latency (no bounded delay guarantee) and performance overhead (potentially lots of retransmissions)
33 |
34 | Reliability != timeliness
35 | - The two are often in direct conflict
36 | - Reliability is only guaranteed on a per-session basis, not globally
37 | - In general, **time** is a major challenge to be dealt with.
38 |
39 | ### Concurrency
40 | **Coordination**:
41 | - Finding “stuff” is tricky (content, resources, …)
42 | - Maybe someone/something needs to keep track of all the things?
43 | - Coordination is a new (or at least amplified) challenge
44 | - In general, **management** is a major challenge to be dealt with
45 |
46 | ### Robustness
47 | **(Partial) Failure**:
48 | - Servers can fail
49 | - Especially if the system comprises a large number of cheap/commodity servers instead of one monolithic, high-performance, many-9s availability server
50 | - If one server fails, the entire system doesn’t fail
51 | - Partial failure usually allows for recovery (ideally invisible to the client) and hopefully doesn’t affect availability
52 | - In general, **fault tolerance and recoverability** are highly desirable
53 |
54 | Failures are difficult to handle because it is difficult to know in which specific processing state the failure occurred.
55 |
56 | ### Consistency
57 | **Consistency**:
58 | - If many clients are interacting with the same data, how do we guarantee that everyone agrees on the same value?
59 | - Mutexes/locks are great, but how can we extend that from a single OS to an entire DS?
60 | - If we can’t, can the clients at least detect that there’s a problem?
61 | - Consistency is one of the biggest challenges in asynchronous, concurrent, distributed systems
62 |
63 | ## Coordinating Operations over a Network
64 | ### Remote Operation
65 | - I have a task (e.g., compute something, access some data), and I need another networked component to help me
66 | - This is abstracted as a function/subroutine/… -> y = f(x)
67 | - y: what is the desired outcome that I’m asking for
68 | - x: what inputs am I providing as part of my request
69 | - f: what work is being done to provide my outcome, and I don’t care where/how it’s implemented
70 |
71 | ### Local Operation
72 | - If this was done locally:
73 | - Find f and x in memory
74 | - Load x in register(s)
75 | - Branch to subroutine
76 | - Subroutine computes y
77 | - Load y in register(s)
78 | - Branch return
79 | - Store y in memory
80 |
81 | - The OS hides most of this, so the dev sees a simple **function/method call**
82 |
83 | ### Local vs. Remote Operation
84 | In addition to the steps of a local operation,
85 | - At the remote entity:
86 | - Get messages from the other party
87 | - Unpack f and x (how?)
88 | - Load x in register(s)
89 | - Subroutine, compute y
90 | - Pack into reply message, send
91 |
92 |
93 | ### Remote Procedure Calls (RPC)
94 | - RPC is a way to provide a programmer-friendly abstraction to handle the complexity of coordination required for remote services
95 | - RPC moves inter-procedure communication from stack to network
96 | - Attempts to provide an interface for the programmer that still looks like a local procedure call
97 |
98 | ### Structure of the RPC
99 | - Client-side:
100 | - Client applications only see a local “stub” (aka proxy) that presents a simple function/method/API for local use
101 | - The stub layer abstracts away most networking details
102 | - Client app calls f(x) presented by stub, stub interacts with library to perform all the details of the remote call but doesn't "do" f(x)
103 | - Nothing about transport has to change
104 | - Server-side:
105 | - Same idea on the server side, where the actual service provided doesn’t care about underlying network interactions, just "doing" f(x)
106 | - Similar server stub (aka “skeleton”) abstracts away all networking details
107 | - Stub interacts with the library to handle incoming remote requests, decoding them in a way that’s useful for the server, calls the local subroutine, and handles the reply
108 | - Nothing about transport has to change
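A toy, in-process sketch of the stub/skeleton split (all names are hypothetical; real RPC frameworks generate stubs from an IDL and send the bytes over a network): the client stub marshals the call, the server skeleton unmarshals it, "does" f(x), and marshals the reply.

```python
import json

def f(x):
    # the actual service routine; only the server-side "does" f(x)
    return x * x

def skeleton(request_bytes):
    # server-side stub: unmarshal the request, call the local subroutine,
    # marshal the reply
    request = json.loads(request_bytes.decode())
    return json.dumps({"y": f(request["x"])}).encode()

def stub_f(x):
    # client-side stub: looks like a local function, but only marshals the
    # call and hands it to the transport (simulated here by a direct call)
    request_bytes = json.dumps({"fn": "f", "x": x}).encode()
    reply_bytes = skeleton(request_bytes)  # stand-in for the network hop
    return json.loads(reply_bytes.decode())["y"]

print(stub_f(7))  # the caller never sees the marshalling -> 49
```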
109 |
110 |
111 |
112 | ### Benefits and Difficulties of the RPC
113 | Benefits:
114 | - All the programmer sees when using RPC is the immediate call to the stub and the corresponding return
115 | - This provides:
116 | - Ease of programming with familiar model (just call a function)
117 | - Hidden complexity and architecture of the remote component
118 | - Automation of distributed computation
119 |
120 | Difficulties:
121 | - While the abstraction and ease of programming are great, some challenges need to be addressed
122 | - Caller and callee procedures run on different machines, using data in different address spaces, perhaps in different environments/OS, …
123 | - Must convert to the local representation of data (e.g., variable type, endian conversion)
124 | - Components or networks can fail
125 |
126 | ### Marshalling
127 | - Stubs are responsible for marshalling and unmarshalling these interactions
128 | - Client stub marshals arguments into machine-independent format ⇒ sends request ⇒ waits for response ⇒ unmarshals result
129 | - Server stub unmarshals arguments and builds stack frame ⇒ calls procedure ⇒ marshals results and replies
130 |
131 | - Since marshaling relies on standards and conventions in a variety of places, rather than making some assumption about what those are, stubs rely on an **interface definition language (IDL)**
132 |
133 | - There are many such IDLs supported by various libraries and middleware, often resembling XML, JSON, TLV, etc.
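Marshalling into a machine-independent format can be sketched with Python's `struct` module, packing two ints in network (big-endian) byte order so hosts of either endianness decode them identically (the `"!ii"` layout is just an illustration; real stubs follow the IDL-generated layout):

```python
import struct

# Client stub: marshal the arguments (x=258, k=3) in network byte order.
wire = struct.pack("!ii", 258, 3)
print(wire.hex())  # 0000010200000003, independent of host byte order

# Server stub: unmarshal back into the local representation.
x, k = struct.unpack("!ii", wire)
print(x, k)  # 258 3
```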
134 |
135 | ### RPC Diagram
136 | 
137 |
138 | ### RPC By Reference
139 | - Early RPC implementations were for stateless/idempotent subroutines, so there was no notion of call-by-reference
140 | - There is no shared memory space, so what value is a pointer?
141 |
142 | - RPC is often limited to call-by-value or call-by-copy-and-restore, where values must be explicitly sent over the network
143 | - This creates some complications for complex data types and especially for custom data types
144 | - It is possible to maintain state and create generic "references", like pointers but not for memory
145 | - Different levels of integration are supported by different systems, depending on the hetero/homogeneity of the system, languages, etc.
146 |
147 | ### Challenges of RPC
148 | 1. Failures
149 |
150 | - Partial Failure:
151 | - In local computing:
152 | - If the machine fails, the application fails
153 | - There are much bigger concerns than the application at this point
154 | - In distributed computing:
155 | - If a machine fails, part of the application fails and part doesn’t
156 | - Can you tell the difference between machine failure and network failure?
157 |
158 | ## Replication
159 |
160 | - Reasons for Replication
161 | - Increased throughput/bandwidth/parallelism
162 | - Increased fault tolerance
163 | - Improve latency by storing content closer to distributed consumers, e.g., “at the edge”
164 |
165 | ### Primary and Secondary Replicas
166 | - Primary replicas
167 | - Exact copies of the original
168 | - Common for file servers, databases, etc.
169 |
170 | - Secondary replicas (e.g., a GitHub repo that has not pulled the latest updates)
171 | - Derivable from the original
172 | - Lower fidelity in some way
173 | - Caches (possibly stale)
174 | - Thumbnails (lower resolution, faster to transmit, smaller to store)
175 | - Compressed copies (slower to access, harder to update)
176 |
177 | - Primary takes all updates
178 | - Secondary may get occasional/periodic updates or “listen” to primary updates
179 | - Primary needs additional overhead to manage, secondary can be managed when possible
180 | - Frequency of secondary updates can be balanced for desired overhead, consistency, usefulness, etc.
181 |
182 | - Secondaries may be read-only caches
183 | - Leaving primary as the definitive and most up-to-date copy
184 | - An update of secondary affects how useful it is as a backup for primary
185 |
186 | ### Efficient Replica Management
187 | - Rather than enforce perfect replication (which has high overhead), it’s often reasonable to allow replicas to vary from each other slightly (e.g., one can be out of date
188 | w.r.t. others) in trade for significantly less latency/overhead
189 |
190 | - Need to be careful not to overly complicate the management of replication, or we could end up with more latency/overhead
191 |
192 | - Goal: we want to achieve one-copy semantics, meaning we can perform read and write operations on a collection of replicas with the same results as if there were one nonreplicated object
193 |
194 | ### Quorum
195 | - A group of servers trying to provide one-copy semantics
196 |
197 | - Setup of the Quorum
198 | - N primary replicas, no secondary replicas
199 | - One-copy semantics
200 | - Each "read" makes a local copy of multiple replicas, determines which is "correct" and discards others
201 | - Each "write" overwrites multiple replicas with a single local copy
202 | - Writes are not edits, they are replacements
203 | - A "read-then-write" approach is typically used
204 |
205 | - Concurrency controls are assumed
206 | - All reads and writes are "safe"
207 |
208 | ### Concurrency Control Addresses Conflict
209 | - Replication conflict
210 | - Any situation where concurrent actions break the usefulness of replicas
211 |
212 | - Writes cause conflicts
213 | - Write-write
214 | - If there are 2 replicas, A writes to one while B writes to the other, which is the newest?
215 | - If there is 1 replica, both A and B write, the outcome is uncertain (race condition)
216 |
217 | - Write-read
218 | - If there are 2 replicas, A writes to one while B reads from the other, then B reads an old file
219 | - If there is 1 replica, A writes as B is reading, B may read a mix of old and new content
220 |
221 | - Reads don’t cause conflicts
222 | - Read-read is safe
223 | - But caution is still needed while reading…
224 |
225 | ### Read/Write of Replicated Data
226 | - If a large number of users can access and modify replicated data, we have to be careful about who is reading/writing which replica
227 |
228 | - Ex: if there are four replicas f1, f2, f3, f4 of a file, what happens if one user writes f3 and f4 while another user reads f1 and f2?
229 |
230 | ### Read and Write Quora
231 | - Read quorum: Number of servers R a reader should read from
232 | - Write quorum: Number of servers W a writer should write to
233 | - What is the relationship between these?
234 | - Can we tune them to achieve different goals?
235 |
236 | 
237 |
238 | ### Quora Must Overlap
239 | - The read quorum and the write quorum must overlap
240 | - Why? Without overlap, a reader could miss the latest write and get an old value
241 | - R + W > N is a necessary condition for correctness
242 | - Requires the ability to identify the most recent version among the replicas that are read (more in a few slides)
243 |
244 | 
245 | 
246 |
247 | ### Tuning a Quorum
248 | - It is often true that reading is more common than writing (I access files more often than I update them)
249 |
250 | - Increase W, i.e., larger write quorum
251 | - More redundancy for robustness
252 | - More local to consumers
253 | - Higher write cost (network and device throughput, contention, long tail, etc.)
254 |
255 | - Decrease R, i.e., smaller read quorum
256 | - Lower cost to read (network and device throughput, contention, long tail, etc.)
257 | - More choices of where to read (closer to the user)
258 | - More expensive writes (but that’s ok since they’re less frequent)
259 |
260 | - Larger overlap
261 | - More tolerance for failure
262 |
263 | - Common Practice:
264 | - Read-one / write-all (i.e., R=1, W=N) is the most common
265 | - Reads are more common than writes, so make them easy
266 | - Read gets most current data (requirement)
267 | - Reads can choose any replica (nearby, failure, etc.)
268 | - Reads require low bandwidth
269 | - Makes common case safe and fast
270 |
271 | ### Version Determination
272 | - If every replica has a version number (like a timestamp or sequence number), then determining which read replica is the newest is easy:
273 |   simply choose the replica with the "highest" version number
274 | - Read〈f,tf〉from R servers ⇒〈f1,tf1〉,〈f2,tf2〉,...,〈fR,tfR〉
275 | - Keep fj, where j = argmaxi tfi
276 | - Still need to ensure that the overall newest version is guaranteed to be within any set of R replicas (e.g., via quorum)
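Under the quorum condition R + W > N, the read-then-keep-newest rule above can be sketched as follows (helper names are hypothetical; replicas are modelled as (version, value) pairs):

```python
def quorum_read(replicas, R):
    """Read any R replicas and keep the one with the highest version,
    i.e., keep f_j where j = argmax_i t_fi."""
    read_set = replicas[:R]  # any R replicas would do, given R + W > N
    return max(read_set, key=lambda rv: rv[0])

def quorum_write(replicas, W, version, value):
    """Writes are replacements, not edits: overwrite W replicas."""
    for i in range(W):
        replicas[i] = (version, value)

N, R, W = 5, 2, 4
assert R + W > N  # quora must overlap
# additionally tolerating F failed replicas needs R + W - F > N

replicas = [(1, "old")] * N
quorum_write(replicas, W, version=2, value="new")
print(quorum_read(replicas, R))  # overlap guarantees the newest -> (2, 'new')
```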
277 |
278 | ### Majority Read Quorum
279 | - Make W large enough to guarantee that for any R replicas, strictly more than R/2 of them are the newest version
280 | - the majority of any read quorum represents the most recent version of the replica
281 | - E.g., if 3 of the 5 replicas I read are the same, they are guaranteed to be the most recent version
282 | - the overlap between the write quorum and the read quorum must include the majority of the read quorum
283 | - Overlap must be a majority of any read quorum
284 | - must write to all but a quorum minority
285 | - W > (N - R/2)
286 | - And this doesn’t even account for any read/write failures
287 | 
288 | - This is very expensive
289 | - But in exchange, we don't need any reference of time (version numbers or timestamps) at all, and versioning is still guaranteed!
290 |
291 | ### Write-all
292 | - If W = N (i.e., "write-all"), then it’s impossible for an old replica to be present in the system since every replica is always overwritten
293 | - Any read of any replica guarantees it is the newest overall replica
294 | - However, if one write fails, no one can read from the entire system until that server comes back up.
295 | - This could lock down the whole system...
296 | - In effect, write-all requires the system to be perfect, which makes writes slow and fragile.
297 |
298 | ### Locking
299 | - Uncontrolled concurrent writes can break quorum discipline (and maybe even corrupt data)
300 | - Also could break version number increment
301 |
302 | - What to lock?
303 | - Object at servers, not whole servers
304 |
305 | - How many to lock?
306 | - L ≥ R ⇐ lock quorum must cover most recent version
307 | - L ≥ W ⇐ lock quorum must cover every write to protect version and data
308 | - L ≥ max(R, W) ⇐ lock quorum must cover all reads and writes of updating object
309 |
310 | ### Failures
311 | - If the overlap between read and write Quora is strictly greater than F, then F failures can be tolerated
312 | - With version numbers: R + W - F > N
313 | - Without version numbers: W + R/2 - F > N
314 | - Still guaranteed to see an up-to-date version
315 | - If not covered by overlap, the failure model becomes important
316 |
317 | ### Alternatives to Static Quora
318 | - A static quorum is a pre-defined quorum (like our examples)
319 | - Not adaptive in terms of membership, connectivity, etc.
320 | - We have counted each replica equally (one replica = one vote), but this is not required
321 | - Alternatively, we can assign weights to different hosts based on any reasonable, observable metric; quorum rules still apply, but weighted
322 | - Examples:
323 | - More weight to more reliable hosts (more unreliable hosts for the same robustness)
324 | - Less weight to caches or secondary replicas, due to expiration and staleness
325 |
326 | ### Coda Version Vectors
327 | - CVVs are a form of vector logical timestamp (remember those?)
328 | - A CVV contains one entry for each server, which is the version number of the file on the corresponding server (ideally, all equal)
329 | - If a server doesn’t get an update to the file, its CVV entry will be lower than others
330 |
331 | - A Coda client requests a file via a three-step process:
332 | - It asks all replicas for their version number
333 | - It requests the file from the replica with the largest version number
334 | - If the servers don't agree about the file’s version, the client can detect and inform them of the conflict
335 | - A conflict exists if two CVVs are concurrent, which indicates that each server has seen some but not all changes
336 |
337 | - Ideally, a Coda client writes a file as:
338 | - The client sends the file and original CVV to all servers
339 | - Each server increments its entry in the file's CVV and sends an ACK to the client
340 | - The client merges the entries from all of the servers and sends the new CVV back to each server
341 | - If a conflict is detected, the client can inform the servers, so that it can be resolved automatically, or flagged for mitigation by the user
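The CVV conflict check above is ordinary vector-timestamp comparison, sketched below (hypothetical helper names): v1 dominates v2 if it is ≥ componentwise; if neither dominates, the CVVs are concurrent and the replicas conflict.

```python
def dominates(v1, v2):
    """True if v1 has seen every update v2 has (componentwise >=)."""
    return all(a >= b for a, b in zip(v1, v2))

def conflict(v1, v2):
    """Concurrent CVVs: each server has seen some but not all changes."""
    return not dominates(v1, v2) and not dominates(v2, v1)

print(conflict([2, 2, 1], [1, 2, 1]))  # False: first CVV strictly dominates
print(conflict([2, 1, 1], [1, 2, 1]))  # True: updates went to different servers
```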
342 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-513 Introduction to Computer Systems.md:
--------------------------------------------------------------------------------
1 | [Lecture 1: Bits, Bytes and Integers](#Bits,-Bytes-and-Integers)
2 | [Lecture 2: Machine Programming: Basics](#Machine-Programming-Basics)
3 | [Lecture 3: Machine Programming: Control](#Machine-Programming-Control)
4 |
5 | ## Bits, Bytes and Integers
6 | ### Binary Representations
7 | Binary representation leads to a simple binary, i.e. base-2, numbering system
8 | - 0 represents 0
9 | - 1 represents 1
10 | - Each “place” represents a power of two, exactly as each place in our usual “base 10”, 10-ary numbering system represents a power of 10
11 |
12 | ## Hexadecimal: 0x00 to 0xFF
13 | - Base 16 number representation
14 | - Use characters ‘0’ to ‘9’ and ‘A’ to ‘F’
15 |
16 | Consider 1A2B in hexadecimal:
17 | - 1 \* 16^3 + A \* 16^2 + 2 \* 16^1 + B \* 16^0
18 | - 1 \* 4096 + 10 \* 256 + 2 \* 16 + 11 \* 1 = 6699 (decimal)
19 |
20 | ## Bit-level Manipulations
21 | - **AND** A&B = 1 when both A=1 and B=1
22 | - **OR** A|B = 1 when either A=1 or B=1
23 | - **NOT** ~A = 1 when A=0
24 | - **Exclusive-OR (XOR)** A^B = 1 when either A=1 or B=1, but not both
25 |
26 | - Boolean algebra operates on bit vectors.
27 | - E.g., for the bit vectors 01101001 { 0, 3, 5, 6 } and 01010101 { 0, 2, 4, 6 }:
28 | - & Intersection 01000001 { 0, 6 }
29 | - | Union 01111101 { 0, 2, 3, 4, 5, 6 }
30 | - ^ Symmetric difference 00111100 { 2, 3, 4, 5 }
31 | - ~ Complement (of 01010101) 10101010 { 1, 3, 5, 7 }
32 |
33 | ### Bit Operations in C
34 | The operations &, |, ~, ^ are available in C
35 | - Apply to any “integral” data type
36 | - long, int, short, char, unsigned
37 | - View arguments as bit vectors
38 | - Arguments applied bit-wise
39 |
40 | Examples (Char data type)
41 | - ~0x41 → 0xBE
42 | - ~01000001 → 10111110
43 | - ~0x00 → 0xFF
44 | - ~00000000 → 11111111
45 | - 0x69 & 0x55 → 0x41
46 | - 01101001 & 01010101 → 01000001
47 | - 0x69 | 0x55 → 0x7D
48 | - 01101001 | 01010101 → 01111101
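The same examples, checked in Python (masking with 0xFF to model C's 8-bit char, since Python ints are unbounded):

```python
MASK = 0xFF  # model an 8-bit char

assert ~0x41 & MASK == 0xBE  # ~01000001 -> 10111110
assert ~0x00 & MASK == 0xFF  # ~00000000 -> 11111111
assert 0x69 & 0x55 == 0x41   # 01101001 & 01010101 -> 01000001
assert 0x69 | 0x55 == 0x7D   # 01101001 | 01010101 -> 01111101
print("all bit-level examples check out")
```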
49 |
50 | ### Logical Operations in C
51 | **In contrast to Bit-Level Operators**
52 | - Logic Operations: &&, ||, !
53 | - View 0 as “False”
54 | - Anything nonzero as “True”
55 | - Always return 0 or 1
56 | - Early termination
57 |
58 | Examples (Char data type)
59 | - !0x41 → 0x00
60 | - !0x00 → 0x01
61 | - !!0x41→ 0x01
62 | - 0x69 && 0x55 → 0x01
63 | - 0x69 || 0x55 → 0x01
64 | - p && *p (avoids null pointer access)
65 |
66 | ### Shift Operations
67 | **Left Shift**: x << y
68 | - Shift bit-vector x left y positions
- Throw away extra bits on the left
70 | - Fill with 0’s on the right
71 |
72 | **Right Shift**: x >> y
73 | - Shift bit-vector x right y positions
74 | - Throw away extra bits on the right
75 | - Logical shift
76 | - Fill with 0’s on the left
77 | - Arithmetic shift
78 | - Replicate the most significant bit on the left
79 |
80 | **Undefined Behavior**
81 | - Shift amount < 0 or ≥ word size
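Python ints are arbitrary precision, so the logical/arithmetic distinction for an 8-bit value has to be modelled explicitly with masks (a sketch):

```python
W = 8
MASK = (1 << W) - 1

def logical_shr(x, k):
    # fill with 0's on the left
    return (x & MASK) >> k

def arithmetic_shr(x, k):
    # replicate the most significant (sign) bit on the left
    x &= MASK
    if x & (1 << (W - 1)):   # negative in two's complement
        x -= (1 << W)        # reinterpret as signed
    return (x >> k) & MASK   # Python's >> on negatives is arithmetic

x = 0b10100000
print(f"{logical_shr(x, 2):08b}")     # 00101000
print(f"{arithmetic_shr(x, 2):08b}")  # 11101000
```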
82 |
83 |
84 | ## Integers
85 |
86 | ### Overflow
87 | On a binary number line, if two numbers add up to a result that needs an extra digit, this is considered **overflow**.
88 | For example: 111 + 001 = 1 000; the extra leading 1 is the overflow.
89 |
90 | “Ints” means the finite set of integer numbers that we can represent on a number line enumerated by some fixed number of bits, i.e. **bit width**.
91 | - An **“unsigned” int** is any int on a number line, e.g. of a data type, that doesn’t contain any negative numbers
92 | - A **non-negative number** is a number greater than or equal to (>=) 0 on a number line, e.g. of a data type, that does contain negative numbers
93 |
94 | ### Negative Numbers
95 | We will use the leading bit as the **sign** bit
96 | - 0 means non-negative
97 | - 1 means negative
98 | - This will allow us to represent negative numbers and non-negative numbers
99 | - and make 0 represent 0
100 | - The issue with a naive sign bit (sign-magnitude) is that there is a -0: a second bit pattern that equals 0 yet is "different" from +0
101 |
102 | Given a non-negative number in binary, we can find its negative by flipping each bit and adding 1.
103 | For example:
104 | - 0101 is 5
105 | - 1010 is the one's complement of 5
106 | - 1011 is the two's complement of 5 (i.e., -5)
107 | - 0101 + 1011 = 1 0000 → 0000 (the carry overflows out)
108 | - -x = ~x + 1
109 |
110 | This works because x + ~x = 1111...1 (all ones); adding 1 more overflows to 0, so x + (~x + 1) = 0, i.e. ~x + 1 = -x.
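The identity -x = ~x + 1 can be checked exhaustively for a 4-bit word (a sketch, masking to model the fixed width):

```python
W = 4
MASK = (1 << W) - 1  # 0b1111

for x in range(1 << W):
    neg = (~x + 1) & MASK          # flip the bits, add 1, keep w bits
    assert (x + neg) & MASK == 0   # x + (-x) overflows to 0

print((~0b0101 + 1) & MASK == 0b1011)  # True: -5 is 1011
```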
111 |
112 | ### Numeric Ranges
113 | - Unsigned values:
114 | - UMin = 0
115 | - UMax = 2^w - 1
116 | - Two's complement values:
117 | - TMin = -2^(w-1)
118 | - TMax = 2^(w-1) - 1
119 |
120 | ## Conversion and Casting
121 | The sign bit's large negative weight (-2^(w-1)) is reinterpreted as a large positive weight (+2^(w-1))
122 |
123 | Basic rules for casting Signed <-> Unsigned:
124 | - Bit pattern is maintained
125 | - But reinterpreted
126 | - Can have unexpected effects: adding or subtracting 2^w
127 | - Expression containing signed and unsigned int
128 | - int is cast to unsigned!
129 |
130 | ### Two's Complement -> Unsigned
131 | - Ordering inversion
132 | - Negative -> Big Positive
133 |
134 | ## Expanding and Truncating
135 |
136 | ### Sign Extension
137 | - Given w-bit signed integer x
138 | - Converting it to a w+k-bit integer with the same value
139 | - This can be done by making k copies of the sign bit and extending it to the new bits.
140 |
141 | ### Truncation
142 | - Given a (k+w)-bit signed or unsigned integer X
143 | - Converting it to a w-bit integer X' with the same value for "small enough" X
144 | - This can be done by dropping the top k bits.
145 |
146 | ### Summary
147 | - Expanding (e.g., short int to int)
148 | - Unsigned: zeros added
149 | - Signed: sign extension
150 | - Both yield the expected result
151 | - Truncating (e.g., unsigned to unsigned short)
152 | - Unsigned/Signed: bits are truncated
153 | - Result reinterpreted
154 | - Unsigned: mod operation
155 | - Signed: similar to mod
156 | - For small (in magnitude) numbers, this yields the expected behavior
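Sign extension and truncation can be modelled with fixed-width masks (a sketch; helper names are not from the course):

```python
def sign_extend(x, w, k):
    """Extend a w-bit two's-complement value x to w+k bits
    by making k copies of the sign bit."""
    if x & (1 << (w - 1)):            # sign bit set
        x |= ((1 << k) - 1) << w      # k ones on the left
    return x

def truncate(x, w):
    """Drop the top bits, keeping w; i.e., x mod 2^w."""
    return x & ((1 << w) - 1)

print(f"{sign_extend(0b1010, 4, 4):08b}")  # 11111010: -6 stays -6 in 8 bits
print(f"{truncate(0b111110100, 8):08b}")   # 11110100: top bit dropped
```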
157 |
158 | ## Addition, Negation, Multiplication and Shifting
159 |
160 | ### Addition
161 | - Unsigned Addition:
162 | - Operand: w bits
163 | - True sum: w+1 bits
164 | - Discard Carry: w bits
165 | - This should ignore the carry output
166 |
167 | For example:
168 | 1110 1001 + 1101 0101 = 1 1011 1110
169 | Discard the carry: the result is 1011 1110 -> 190 in decimal (UAdd)
170 |
171 | - Two's Complement Addition:
172 | - Operand: w bits
173 | - True sum: w+1 bits
174 | - Discard Carry: w bits
175 | - The bit-level behavior is the same with TAdd and UAdd
176 |
177 | For example:
178 | 1110 1001 + 1101 0101 = 1 1011 1110
179 | Discard the carry: the result is 1011 1110 -> -66 in decimal (TAdd)
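The worked example above, checked in Python: the bit-level sum is identical for UAdd and TAdd; only the reinterpretation of the result differs.

```python
W = 8
MASK = (1 << W) - 1

s = (0b11101001 + 0b11010101) & MASK          # discard the carry out
print(f"{s:08b}")                             # 10111110

uadd = s                                      # unsigned reading
tadd = s - (1 << W) if s >> (W - 1) else s    # two's complement reading
print(uadd, tadd)                             # 190 -66
```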
180 |
181 | ### Visualizing Additions
182 | 
183 | 
184 | 
185 |
186 | ### Multiplication
187 | Power-of-2 Multiply with Shift
188 | - Operation:
189 | - u << k gives u \* 2^k
190 | - Both signed and unsigned
191 | - Operand: w bits
192 | - True Product: w+k bits
193 | - Discard k bits: w bits
194 |
195 | For example:
196 | u << 3 == u \* 8
197 | (u << 5) - (u << 3) == u \* 24
198 | Most machines shift and add faster than they multiply, so the compiler generates this code automatically.
199 |
200 | ### Division
201 | Unsigned Power-of-2 Divide with Shift
202 | - Operation:
203 | - u >> k gives floor(u / 2^k)
204 | - This would require logical shift.
205 |
206 | Examples:
207 | 
208 |
209 | Signed Power-of-2 Divide with Shift
210 | - Operation
211 | - x >> k gives floor(x / 2^k)
212 | - This would require arithmetic shift.
213 | - Rounds down (toward -∞), not toward zero, which is unlikely to be what is expected and introduces a bias for negative x.
214 |
215 | Examples:
216 | 
217 |
218 | ## Byte Ordering
219 | Big Endian: Sun (Oracle SPARC), PPC Mac, Internet
220 | - The least significant byte has the highest address
221 |
222 | Little Endian: x86, ARM
223 | - The least significant byte has the lowest address
224 |
225 | Byte ordering is a concern when we are communicating data over a network, via files, etc.
226 | Note:
227 | - Bits are not reversed, as the low-order bit is the reference point.
228 | - This doesn't affect chars, or strings (array of chars), as chars are only one byte.
229 |
230 | Example: Given a variable x has a 4-byte value of 0x01234567, the address given by &x is 0x100:
231 | 
232 |
233 | ### Reading Byte-Reversed Listing
234 | Disassembly:
235 | - Text representation of binary machine code
236 | - Generated by the program that reads the machine code
237 |
238 | For example:
239 | - Value: 0x12ab
240 | - Pad to 32 bits: 0x000012ab
241 | - Split into bytes: 00 00 12 ab
242 | - Reverse: ab 12 00 00
243 |
244 | If we want to add this value to %ebx, then the assembly code would be:
245 | add $0x12ab, %ebx
246 | The instruction code corresponding would be:
247 | 81 c3 ab 12 00 00
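The byte reversal above can be reproduced with `int.to_bytes` (a sketch):

```python
value = 0x12AB

big = value.to_bytes(4, "big")        # 00 00 12 ab: pad to 32 bits, MSB first
little = value.to_bytes(4, "little")  # ab 12 00 00: as stored on little-endian x86
print(big.hex(" "), "|", little.hex(" "))
```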
248 |
249 | ## Machine Programming Basics
250 | ### Assembly Basics
251 | - **Architecture**: (also ISA: instruction set architecture) The parts of a processor design that one needs to understand for writing assembly/machine code.
252 | - Examples: instruction set specification, registers, memory model
253 | - The architecture is not the hardware, it is the interface of the hardware
254 | - **Microarchitecture**: Implementation of the architecture
255 | - Examples: cache sizes and core frequency
256 |
257 | - Code Forms:
258 | - **Machine Code**: The byte-level programs that a processor executes
259 | - **Assembly Code**: A text representation of machine code (the human-readable form)
260 |
261 | Example ISAs:
262 | - Intel: x86, IA32, Itanium, x86-64
263 | - ARM: Used in almost all mobile phones
264 | - RISC V: New open-source ISA
265 |
266 | ### Assembly Code and Machine Basics
267 | - **PC: Program counter**
268 | - Address of next instruction
269 | - Called “RIP” (x86-64)
270 | - **Register file**
271 | - Heavily used program data
272 | - Can be accessed directly
273 | - **Condition codes (FLAGS)**
274 | - Store status information about the most recent arithmetic or logical operation
275 | - Used for conditional branching
276 | - **Memory**
277 | - Byte addressable array
278 | - Code and user data
279 | - Stack to support procedures
280 |
281 |
282 | ### Assembly Data Types
283 | - Integer: data of 1, 2, 4, or 8 bytes
284 | - Data values
285 | - Addresses (untyped pointers)
286 | - Floating points: 4, 8 or 10 bytes
287 | - SIMD vector data types of 8, 16, 32 or 64 bytes
288 | - Code: Byte sequences encoding a series of instructions
289 | - No aggregate types such as arrays or structures
290 | - Just contiguously allocated bytes in memory
291 |
292 | Example:
293 | 
294 | These are 64-bit registers, so we know this is a 64-bit add
295 |
296 | ### X86-64 Integer Registers
297 | 
298 |
299 | ### Moving Data
300 | - Moving Data:
301 | - movq Source, Dest
302 | - Operand Types
303 | - Immediate: Constant integer data
304 | - Example: $0x400, $-533
305 | - Like C constant, but prefixed with ‘$’
306 | - Encoded with 1, 2, or 4 bytes
307 | - Register: One of 16 integer registers
308 | - Example: %rax, %r13
309 | - But %rsp reserved for special use
310 | - Others have special uses for particular instructions
311 | - Memory: 8 consecutive bytes of memory at the address given by the register
312 | - Simplest example: (%rax)
313 | - Various other “addressing modes”
314 |
315 | Movq Operand Combinations:
316 |
317 | - The immediate cannot be a dest
318 | - Cannot do memory-memory transfer with a single instruction
319 |
320 | ### Memory Addressing Modes
321 | Normal (R) Mem[Reg[R]]
322 | - Register R specifies the memory address
323 | - Aha! Pointer dereferencing in C
324 | - movq (%rcx),%rax
325 |
326 | Displacement D(R) Mem[Reg[R]+D]
327 | - Register R specifies the start of the memory region
328 | - Constant displacement D specifies the offset
329 | - movq 8(%rbp),%rdx
330 |
331 | **Most General Form**:
332 | D(Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]+ D]
333 | - D: Constant “displacement” 1, 2, or 4 bytes
334 | - Rb: Base register: Any of 16 integer registers
335 | - Ri: Index register: Any, except for %rsp
336 | - S: Scale: 1, 2, 4, or 8 (why these numbers? They match the sizes of the primitive data types)
337 |
338 | Special Cases
339 | - (Rb,Ri) Mem[Reg[Rb]+Reg[Ri]]
340 | - D(Rb,Ri) Mem[Reg[Rb]+Reg[Ri]+D]
341 | - (Rb,Ri,S) Mem[Reg[Rb]+S*Reg[Ri]]
342 |
343 | An example of a swap function:
344 | 
345 |
346 | An example of address computation:
347 | 
348 |
349 | ### Address Computation Instruction
350 | leaq Src, Dst
351 | - Src is address mode expression
352 | - Set Dst to address denoted by the expression
353 | Uses
354 | - Computing addresses without a memory reference
355 | - E.g., translation of p = &x[i];
356 | - Computing arithmetic expressions of the form x + k \* y
357 | - k = 1, 2, 4, or 8
358 |
359 | ### Turning C into Object Code
360 | 
361 |
362 | ## Machine Programming Control
363 | ### Finding pointers
364 | - %rsp and %rip always hold pointers
365 | - Register values that are “close” to %rsp or %rip are probably also pointers
366 | - If a register is being used as a pointer…
367 | - mov (%rsi), %rsi
368 | - …Then its value is expected to be a pointer.
369 | - There might be a bug that makes its value incorrect.
370 |
371 | - Not as obvious with complicated address “modes”
372 | - (%rsi, %rbx) – One of these is a pointer, we don’t know which.
373 | - (%rsi, %rbx, 2) – %rsi is a pointer, %rbx isn’t (why?)
374 | - 0x400570(, %rbx, 2) – 0x400570 is a pointer, %rbx isn’t (why?)
375 | - lea (anything), %rax – (anything) may or may not be a pointer
376 |
377 | ### Control Flow
378 | - jmp instructions act as GOTO statements for changing control flow
379 | 
380 | - We will use condition codes to determine if we need to jump
381 |
382 | - Single-bit registers
383 | - CF Carry Flag (for unsigned)
384 | - SF Sign Flag (for signed)
385 | - ZF Zero Flag
386 | - OF Overflow Flag (for signed)
387 |
388 | - Compare Instruction
389 | - cmp a, b
390 | - Computes b − a (just like sub)
391 | - Sets condition codes based on the result, but…
392 | - Does not change **b**
393 | - Used for if (a < b) { … } whenever b − a isn’t needed for anything else
394 |
395 | - Test Instruction
396 | - test a, b
397 | - Computes b & a (just like and)
398 | - Sets condition codes (only SF and ZF) based on the result, but…
399 | - Does not change **b**
400 | - Most common use: test %rX, %rX to compare %rX to zero
401 | - Second most common use: test %rX, %rY tests if any of the 1-bits in %rY are also 1 in %rX (or vice versa)
402 |
403 | ### Conditional Branches
404 | - jX Instructions
405 | - Jump to different parts of the code depending on condition codes
406 | 
407 | - For example:
408 | - cmp a, b
409 | - je 1000
410 | - If a and b are equal, then jump to instruction 1000
411 |
412 | - SetX Instructions
413 | - Set the low-order byte of the destination to 0 or 1 based on combinations of condition codes
414 | - Does not alter the remaining 7 bytes
415 | 
416 |
417 | - Example
418 | 
419 |
420 | ### Conditional Move Instruction
421 | - Conditional moves are often more efficient than branches on modern processors
422 |
423 | - Conditional Move Instructions
424 | - Instruction supports:
425 | - if (Test) Dest ← Src
426 | - Supported in post-1995 x86 processors
427 | - GCC tries to use them
428 | - But, only when known to be safe
429 | - Why?
430 | - Branches are very disruptive to an instruction flow through pipelines
431 | - Conditional moves do not require control transfer
432 |
433 |
434 | ### Loops
435 | - Use conditional branch to either continue looping or to exit the loop
436 |
437 |
438 | ## Machine Programming Advanced
439 | ### Linux x86-64 Memory Layout
440 | - Stack
441 | - Runtime stack (8MB limit)
442 | - e.g., local variables
443 | - Heap
444 | - Dynamically allocated as needed
445 | - e.g., when calling malloc() or calloc() in C, or new in C++
446 | - Data
447 | - Statically allocated data
448 | - e.g., global vars, static vars, string constants
449 | - Text / Shared Libraries
450 | - Executable machine instructions
451 | - Read-only
452 | 
453 |
454 | - Example layout:
455 | - Here, the struct is allocated on the stack,
456 | - On every function call, the return address is pushed onto the stack,
457 | - The s.a[i] line writes to memory. In lecture 1, fun(6) crashed the program. Why did writing to this location cause the process to crash?
458 | - Because writing past the end of the array overwrote state saved on the stack, including the return address.
459 | 
460 |
461 | ### Buffer Overflow
462 | 
463 | - Generally called a “buffer overflow”
464 | - Occurs when writing beyond the memory allocated for an array
465 | - Why a big deal?
466 | - It’s the #1 technical cause of security vulnerabilities
467 | - #1 overall cause is social engineering/user ignorance
468 | - Most common form
469 | - Unchecked lengths on string inputs
470 | - Particularly for bounded character arrays on the stack
471 | - sometimes referred to as stack-smashing
472 | - This is an issue for the gets() function
473 | - Similar problems with other library functions
474 | - strcpy, strcat: Copy strings of arbitrary length
475 | - scanf, fscanf, sscanf, when given %s conversion specification
476 |
477 | ### View of Buffer Overflow
478 | 
479 | 
480 | - This could be utilized for Stack Smashing Attacks
481 | 
482 | - Overwrite normal return address A with the address of some other code S
483 | - When Q executes ret, will jump to other code
484 | - x86 stores multi-byte values in little-endian order, so the attack string must contain the bytes of the injected address in reverse order.
485 |
486 | ### Code Injection Attacks
487 | 
488 | - Input string contains byte representation of executable code
489 | - Overwrite return address A with the address of buffer B
490 | - When Q executes ret, will jump to exploit code
491 |
492 | ### Code Injection Execution
493 | 
494 |
495 | ### Avoiding Buffer Overflow
496 | 1. Avoid Overflow Vulnerabilities in Code
497 | - For example, use library routines that limit string lengths
498 | - fgets instead of gets
499 | - strncpy instead of strcpy
500 | - Don’t use scanf with %s conversion specification
501 | - Use fgets to read the string
502 | - Or use %ns where n is a suitable integer
503 |
504 | 2. System-Level Protections
505 | - Randomized stack offsets
506 | - At the start of the program, allocate a random amount of space on the stack
507 | - Shifts stack addresses for the entire program
508 | - Makes it difficult for hackers to predict the beginning of inserted code
509 | 
510 |
511 | - Non-executable memory
512 | - Older x86 CPUs would execute machine code from any readable address
513 | - x86-64 added a way to mark regions of memory as not executable
514 | - Immediate crash on jumping into any such region
515 | - Current Linux and Windows mark the stack this way
516 | 
517 |
518 | - Stack Canaries
519 | - Idea
520 | - Place a special value (“canary”) on the stack just beyond the buffer
521 | - Check for corruption before exiting function
522 | - GCC Implementation
523 | - -fstack-protector
524 | - Now the default (disabled earlier)
525 | 
526 |
527 | ### Return-Oriented Programming Attacks
528 | - Challenge (for hackers)
529 | - Stack randomization makes it hard to predict buffer location
530 | - Marking stack non-executable makes it hard to insert binary code
531 | - Alternative Strategy
532 | - Use existing code
533 | - Part of the program or the C library
534 | - String together fragments to achieve an overall desired outcome
535 | - Does not overcome stack canaries
536 | - Construct programs from gadgets
537 | - A gadget is a short sequence of instructions ending in ret
538 | - ret is encoded by the single byte 0xc3
539 | - Code positions fixed from run to run
540 | - Code is executable
541 |
542 | ### ROP Execution
543 | 
544 | - Trigger with ret instruction
545 | - Will start executing Gadget 1
546 | - Final ret in each gadget will start the next one
547 | - ret: pop the address from the stack and jump to that address
548 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-619 Cloud Computing.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-641 Computer Network.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/15-XXX Computer Systems/15-645 Database Systems.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/18-XXX Electrical and Computer Engineering/18-746 Storage Systems.md:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2024 Jiawei (Allen) Zhu
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # CMU Course Notes
2 |
3 | ## Ongoing Updates
4 |
5 | - 10-601 Machine Learning
6 | - 14-736 Distributed Systems
7 | - 15-513 Computer Systems
8 |
9 | ## TBD
10 |
11 | - 10-617 Intermediate Deep Learning
12 | - 10-623 Generative AI
13 | - 10-725 Convex Optimization
14 | - 11-742 Search Engines
15 | - 11-667 Large Language Models
16 | - 15-618 Parallel Computer Architecture and Programming
17 | - 15-619 Cloud Computing
18 | - 15-641 Computer Networks
19 | - 15-645 Database Systems
20 | - 18-746 Storage Systems
21 |
22 | ## Usage
23 |
24 | The repository is organized by course departments. Each department folder contains individual courses, where you'll find notes in PDF and Markdown formats.
25 |
26 | To view a note, simply click on the file. To download, click the 'Download' button or use the `raw` option in the GitHub interface.
27 |
28 | ## How to Contribute
29 |
30 | We welcome contributions for courses not yet listed! To contribute:
31 | 1. Fork the repository.
32 | 2. Make your changes.
33 | 3. Submit a pull request with a clear description of your improvements.
34 |
35 | ## Responsibilities
36 |
37 | Please refrain from uploading code or other unauthorized material from CMU courses. Such pull requests will not be accepted.
38 |
39 | ## License
40 |
41 | This project is licensed under the MIT License. This means you're free to use, modify, and distribute the notes as long as you credit the source. See [LICENSE.md](LINK) for more details.
42 |
43 | ## Acknowledgments
44 |
45 | Special thanks to the faculty and students of CMU who have contributed to and inspired this collection of notes, and to Xuezzou, who inspired me to create this project.
46 |
47 |
--------------------------------------------------------------------------------