├── README.md
├── _config.yml
├── advanced-networking-speciality.md
├── developer-associate.md
├── devops-engineer-professional-02.md
├── devops-engineer-professional.md
├── solutions-architect-associate.md
├── solutions-architect-professional.md
└── sysops-administrator-associate.md


/README.md:
--------------------------------------------------------------------------------
 1 | # AWS Certification Notes
 2 | 
 3 | * [DevOps Engineer Professional DOP-C02](devops-engineer-professional-02.md) (2023)
 4 | * [Advanced Network Speciality](advanced-networking-speciality.md) (2021)
 5 | * [Solutions Architect Professional](solutions-architect-professional.md) (2021)
 6 | * [DevOps Engineer Professional DOP-C01](devops-engineer-professional.md) (2020)
 7 | * [Solutions Architect Associate](solutions-architect-associate.md) (2019)
 8 | * [SysOps Administrator Associate](sysops-administrator-associate.md) (2018)
 9 | * [Developer Associate](developer-associate.md) (2017)
10 | 
11 | ---
12 | 
13 | * View as [GitHub Pages](https://jangroth.github.io/aws-certification-notes/)
14 | * TOCs generated with [MarkDownHelper](https://github.com/jangroth/markdownhelper)
15 | 


--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | markdown: GFM
2 | theme: jekyll-theme-dinky
3 | title: AWS Cert Notes
4 | description: My AWS cert notes.
5 | 


--------------------------------------------------------------------------------
/advanced-networking-speciality.md:
--------------------------------------------------------------------------------
  1 | <!-- toc_start -->
  2 | <a name="top"></a>
  3 | ---
  4 | * [Advanced Networking - Speciality](#1)
  5 | * [Exam Objectives](#2)
  6 |   * [Content](#2_1)
  7 | * [Design and Implement AWS Networks](#3)
  8 |   * [AWS Global Network Infrastructure](#3_1)
  9 |   * [Virtual Private Cloud (VPC)](#3_2)
 10 |   * [Connecting VPCs to other VPCs](#3_3)
 11 |   * [Extending on-premises networks to VPCs](#3_4)
 12 | * [Open](#4)
 13 |   * [Services](#4_1)
 14 |   * [Topics](#4_2)
 15 |   * [Practice/Hands-on](#4_3)
 16 |   * [Supporting Material](#4_4)
 17 | ---
 18 | <!-- toc_end -->
 19 | ---
 20 | <a name="1"></a>
 21 | # [↖](#top)[↓](#2) Advanced Networking - Speciality
 22 | 
 23 | > 8/2021 -
 24 | 
 25 | ---
 26 | 
 27 | <a name="2"></a>
 28 | # [↖](#top)[↑](#1)[↓](#2_1) Exam Objectives
 29 | * Design, develop, and deploy cloud-based solutions using AWS.
 30 | * Implement core AWS services according to basic architectural best practices.
 31 | * Design and maintain network architecture for all AWS services.
 32 | * Leverage tools to automate AWS networking tasks.
 33 | 
 34 | <a name="2_1"></a>
 35 | ## [↖](#top)[↑](#2)[↓](#2_1_1) Content
 36 | <!-- toc_start -->
 37 | * [Domain 1: Design and Implement Hybrid IT Network Architectures at Scale](#2_1_1)
 38 | * [Domain 2: Design and Implement AWS Networks](#2_1_2)
 39 | * [Domain 3: Automate AWS Tasks](#2_1_3)
 40 | * [Domain 4: Configure Network Integration with Application Services](#2_1_4)
 41 | * [Domain 5: Design and Implement for Security and Compliance](#2_1_5)
 42 | * [Domain 6: Manage, Optimize, and Troubleshoot the Network](#2_1_6)
 43 | <!-- toc_end -->
 44 | <a name="2_1_1"></a>
 45 | ### [↖](#2_1)[↑](#2_1)[↓](#2_1_2) Domain 1: Design and Implement Hybrid IT Network Architectures at Scale
 46 | * 1.1 Implement connectivity for hybrid IT
 47 | * 1.2 Given a scenario, derive an appropriate hybrid IT architecture connectivity solution
 48 | * 1.3 Explain the process to extend connectivity using AWS Direct Connect
 49 | * 1.4 Evaluate design alternatives that leverage AWS Direct Connect
 50 | * 1.5 Define routing policies for hybrid IT architectures
 51 | <a name="2_1_2"></a>
 52 | ### [↖](#2_1)[↑](#2_1_1)[↓](#2_1_3) Domain 2: Design and Implement AWS Networks
 53 | * 2.1 Apply AWS networking concepts
 54 | * 2.2 Given customer requirements, define network architectures on AWS
 55 | * 2.3 Propose optimized designs based on the evaluation of an existing implementation
 56 | * 2.4 Determine network requirements for a specialized workload
 57 | * 2.5 Derive an appropriate architecture based on customer and application requirements
 58 | * 2.6 Evaluate and optimize cost allocations given a network design and application data flow
 59 | <a name="2_1_3"></a>
 60 | ### [↖](#2_1)[↑](#2_1_2)[↓](#2_1_4) Domain 3: Automate AWS Tasks
 61 | * 3.1 Evaluate automation alternatives within AWS for network deployments
 62 | * 3.2 Evaluate tool-based alternatives within AWS for network operations and management
 63 | <a name="2_1_4"></a>
 64 | ### [↖](#2_1)[↑](#2_1_3)[↓](#2_1_5) Domain 4: Configure Network Integration with Application Services
 65 | * 4.1 Leverage the capabilities of Route 53
 66 | * 4.2 Evaluate DNS solutions in a hybrid IT architecture
 67 | * 4.3 Determine the appropriate configuration of DHCP within AWS
 68 | * 4.4 Given a scenario, determine an appropriate load balancing strategy within the AWS ecosystem
 69 | * 4.5 Determine a content distribution strategy to optimize for performance
 70 | * 4.6 Reconcile AWS service requirements with network requirements
 71 | <a name="2_1_5"></a>
 72 | ### [↖](#2_1)[↑](#2_1_4)[↓](#2_1_6) Domain 5: Design and Implement for Security and Compliance
 73 | * 5.1 Evaluate design requirements for alignment with security and compliance objectives
 74 | * 5.2 Evaluate monitoring strategies in support of security and compliance objectives
 75 | * 5.3 Evaluate AWS security features for managing network traffic
 76 | * 5.4 Utilize encryption technologies to secure network communications
 77 | <a name="2_1_6"></a>
 78 | ### [↖](#2_1)[↑](#2_1_5)[↓](#3) Domain 6: Manage, Optimize, and Troubleshoot the Network
 79 | * 6.1 Given a scenario, troubleshoot and resolve a network issue
 80 | 
 81 | <a name="3"></a>
 82 | # [↖](#top)[↑](#2_1_6)[↓](#3_1) Design and Implement AWS Networks
 83 | 
 84 | <a name="3_1"></a>
 85 | ## [↖](#top)[↑](#3)[↓](#3_1_1) AWS Global Network Infrastructure
 86 | <!-- toc_start -->
 87 | * [Overview](#3_1_1)
 88 | <!-- toc_end -->
 89 | 
 90 | <a name="3_1_1"></a>
 91 | ### [↖](#3_1)[↑](#3_1)[↓](#3_2) Overview
 92 | AWS has the concept of a **Region**, which is a physical location around the world where we cluster data centers. We call each group of logical data centers an Availability Zone. Each AWS Region consists of multiple, isolated, and physically separate AZs within a geographic area.
 93 | 
 94 | An **Availability Zone (AZ)** is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. AZs give customers the ability to operate production applications and databases that are more highly available, fault tolerant, and scalable than would be possible from a single data center.
 95 | 
 96 | A **transit center** provides redundant connectivity between AZs and internet backbones.
 97 | 
 98 | **Edge locations** are AWS data centers ('endpoints') designed to deliver services with the lowest latency possible. Amazon has dozens of these data centers spread across the world. They’re closer to users than Regions or Availability Zones, often in major cities, so responses can be fast and snappy. A subset of services for which latency really matters use edge locations, including:
 99 | * *CloudFront*, which uses edge locations to cache copies of the content that it serves, so the content is closer to users and can be delivered to them faster.
100 | * *Lambda@Edge*, a feature of Amazon CloudFront that lets you run code closer to users of your application, which improves performance and reduces latency.
101 | * *Route 53*, which serves DNS responses from edge locations, so that DNS queries that originate nearby can resolve faster (and, contrary to what you might think, is also Amazon’s premier database).
102 | * *Web Application Firewall* and *AWS Shield*, which filter traffic in edge locations to stop unwanted traffic as soon as possible.
103 | 
104 | **AWS Local Zones** place compute, storage, database, and other select AWS services closer to end-users. With AWS Local Zones, you can easily run highly-demanding applications that require single-digit millisecond latencies to your end-users such as media & entertainment content creation, real-time gaming, reservoir simulations, electronic design automation, and machine learning:
105 | * A Local Zone is an extension of an AWS Region that is geographically close to your users.
106 | * You can extend any VPC from the parent AWS Region into Local Zones by creating a new subnet and assigning it to the AWS Local Zone. When you create a subnet in a Local Zone, your VPC is extended to that Local Zone. The subnet in the Local Zone operates the same as other subnets in your VPC.
107 | 
108 | <a name="3_2"></a>
109 | ## [↖](#top)[↑](#3_1_1)[↓](#3_2_1) Virtual Private Cloud (VPC)
110 | <!-- toc_start -->
111 | * [Overview](#3_2_1)
112 |   * [Default VPC (Amazon specific)](#3_2_1_1)
113 |   * [Non-default VPC (regular VPC)](#3_2_1_2)
114 |   * [VPC Scenarios](#3_2_1_3)
115 | * [Core Components](#3_2_2)
116 | * [Security Components](#3_2_3)
117 | * [Structure & Packet Flow](#3_2_4)
118 |   * [Packet flow through VPC components](#3_2_4_1)
119 | * [Limits](#3_2_5)
120 | <!-- toc_end -->
121 | 
122 | <a name="3_2_1"></a>
123 | ### [↖](#3_2)[↑](#3_2)[↓](#3_2_1_1) Overview
124 | **Amazon Virtual Private Cloud (Amazon VPC)** is a service that lets you launch AWS resources in a logically isolated virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways. You can use both IPv4 and IPv6 for most resources in your virtual private cloud, helping to ensure secure and easy access to resources and applications.
125 | 
126 | As one of AWS's foundational services, Amazon VPC makes it easy to customize your VPC's network configuration. You can create a public-facing subnet for your web servers that have access to the internet. It also lets you place your backend systems, such as databases or application servers, in a private-facing subnet with no internet access. Amazon VPC lets you use multiple layers of security, including security groups and network access control lists, to help control access to Amazon EC2 instances in each subnet.
127 | * Provisions a logically isolated section of the AWS cloud
128 | * Spans over all AZs in a region
129 | * Allows creating layered architectures
130 | * Shared or dedicated tenancy (exclusive hardware or not)
131 |   * Cannot be changed after VPC creation
132 | * *Security groups* and subnet-level *network ACLs*
133 | * Ability to extend on-premises network to cloud
134 | * Can be extended *after creation* by adding between 1 and 4 additional CIDR blocks
135 | * On AWS
136 | 	* <a href="https://aws.amazon.com/vpc/" target="_blank">Service</a> - <a href="https://aws.amazon.com/vpc/faqs/" target="_blank">FAQs</a> - <a href="https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/vpc-tkv.html" target="_blank">User Guide</a>
137 | 
138 | <a name="3_2_1_1"></a>
139 | #### [↖](#3_2)[↑](#3_2_1)[↓](#3_2_1_2) Default VPC (Amazon specific)
140 | * Gives easy access to a VPC without having to configure it from scratch
141 | * Has subnets in different AZs and an Internet Gateway (HA, spread across all AZs)
142 | * Each launched instance automatically receives a *public IP* (and a private IP); this is usually not the case for non-default VPCs
143 | * A deleted default VPC cannot be restored, but a new default VPC can be created in its place
144 | * Comes with default NACL that allows all inbound/outbound traffic
145 | <a name="3_2_1_2"></a>
146 | #### [↖](#3_2)[↑](#3_2_1_1)[↓](#3_2_1_3) Non-default VPC (regular VPC)
147 | * Only has private IP addresses
148 | * Resources *only* accessible through *Elastic IP*, *VPN* or *Internet Gateways*
149 | <a name="3_2_1_3"></a>
150 | #### [↖](#3_2)[↑](#3_2_1_2)[↓](#3_2_2) VPC Scenarios
151 | * VPC with private subnet only -> single tier apps
152 | * VPC with public and private subnets -> layered apps
153 | * VPC with public, private subnets and hardware connected VPN -> extending apps to on-premises
154 | * VPC with private subnets and hardware connected VPN -> extended VPN
155 | 
156 | <a name="3_2_2"></a>
157 | ### [↖](#3_2)[↑](#3_2_1_3)[↓](#3_2_3) Core Components
158 | * **CIDR range**
159 |   * VPCs are private networks and use RFC1918 ranges
160 |     * 10.0.0.0/8 (-> `10.255.255.255`)
161 |     * 172.16.0.0/12 (-> `172.31.255.255`)
162 |     * 192.168.0.0/16 (-> `192.168.255.255`)
163 |   * This guarantees that VPCs cannot conflict in the public internet
164 | * **Subnet**
165 |   * In exactly one AZ
166 |   * If traffic is routed to an Internet Gateway, the subnet is known as a *public subnet*
167 |     * Gets public IP through Internet Gateway
168 |   * If a subnet doesn't have a route to the Internet Gateway, it's known as a *private subnet*
169 |     * Can get internet access through NAT Gateway
170 |   * EC2 instances are launched into subnets
171 |   * Sometimes grouped into Subnet Groups, e.g. for caching or DB. Typically across AZs
172 | * **Route Table**
173 |   * Contains a set of rules, called *routes* that determine where network traffic is directed to
174 |   * Each VPC automatically comes with a *main route table* that can be configured
175 |     * Each subnet in a VPC must be associated with a route table; the table controls the routing for the subnet.
176 |     * A subnet can only be associated with one route table at a time, but multiple subnets can be associated with the same route table
177 |   * Each route in a table specifies a destination CIDR and a target
178 |   * Every route table contains a local route for communication within the VPC
179 |   * Has a *local route* for communication within the VPC (e.g. `172.31.0.0/16`)
180 |   * Can have a *default route* `0.0.0.0/0` to route everything that doesn't have a specific rule
181 | 
182 | |Route Table Type|Description|
183 | |-|-|
184 | |Main|The route table that automatically comes with your VPC. It controls the routing for all subnets that are not explicitly associated with any other route table.|
185 | |Subnet|A route table that's associated with a subnet.|
186 | |Gateway|A route table that's associated with an internet gateway or virtual private gateway.|
187 | |Local gateway|A route table that's associated with an Outposts local gateway.|
188 | 
189 | * **Elastic IP**
190 |   * Static IPv4 address mapped to an instance or network interface
191 |   * If attached to network interface it's decoupled from the instance's lifecycle
192 |   * Routes to private IP address of instance
193 |   * Can be remapped in case of failure
194 |   * For use in a specific region only
195 |   * Can only map to instances in public subnets
196 | * **Gateways**
197 |   * *Internet Gateway*
198 |     * Horizontally scaled, redundant, and highly available VPC component that allows communication between instances in a VPC and the internet
199 |     * Provides a target in VPC route tables for internet-routable traffic
200 |     * Performs network address translation (NAT) for instances that have been assigned public IPv4 addresses
201 |   * *Egress-Only* Gateway
202 |     * Allows outbound communication over IPv6 from instances in your VPC to the Internet
203 |     * Prevents the Internet from initiating an IPv6 connection with your instances.
204 |     * (IPv6 addresses are globally unique, and are therefore public by default)
205 |   * *Virtual Private* Gateway (VGW)
206 |     * AWS side of Site-to-site VPN
207 |     * Has VPN connection to customer gateway attached
208 |     * Serves as VPN concentrator on the Amazon side of the VPN connection
209 |     * Only one virtual private gateway can be attached to a VPC at a time
210 |   * *Customer Gateway*
211 |     * Customer side of Site-to-site VPN
212 |     * A physical device or software application on your side of the VPN connection
213 | * **NAT**
214 |   * 'One-way valve' that allows access *to* the internet, but not *from*.
215 |   * *NAT Instances*
216 |     * Manually configured instance from a NAT AMI
217 |     * Need to manually disable *source/destination check* on the instance
218 |   * *NAT Gateway*
219 |     * AWS-managed service
220 |     * HA per AZ, create one gateway per AZ
221 | * **DNS**
222 |   * Route53 resolver is provided for VPC (can be disabled)
223 |     * Can provide DHCP options to provide own DNS configuration
224 |   * DNS hostnames are provided (can be disabled)
225 |     * Private (internal) hostname: `ip-private-ipv4-address.region.compute.internal`
226 |     * Public (external) hostname: `ec2-public-ipv4-address.region.compute.amazonaws.com`
227 | <a name="3_2_3"></a>
228 | ### [↖](#3_2)[↑](#3_2_2)[↓](#3_2_4) Security Components
229 | * **Security Groups**
230 |   * Acts as a virtual, distributed firewall to control inbound and outbound traffic to instances
231 |   * Acts on instance level, not subnet level
232 |   * 'Allow rules' for inbound and outbound traffic (*no* explicit deny rules)
233 |     * All outbound traffic is allowed by default
234 |     * All inbound traffic is denied per default
235 |   * Support *allow* rules only
236 |     * Cannot block individual IP addresses (use NACL for that)
237 |   * *Stateful* - will always allow response to (allowed) outbound traffic
238 |   * Can refer to other security groups, e.g. allow traffic from there
239 |   * Can have multiple security groups attached to an instance
240 |   * Can have any number of instances within a security group
241 | * **Network ACL**
242 |   * Subnet level, acting as firewall
243 |   * Each subnet must be associated with exactly one NACL; one NACL can be associated with many subnets
244 |   * Rules for inbound and outbound traffic
245 |   * Rules have numbers and are evaluated from low to high
246 |   * A newly created custom NACL denies everything in and out (the default NACL allows everything)
247 |   * *Stateless*
248 |   * Support *allow* and *deny* rules
249 |   * Can block IP addresses (Security groups can't)
250 |   * **Cannot** block URLs (forward proxies can)
251 | * **VPC Flow Logs**
252 |   * Capture information about the IP traffic going to and from network interfaces in a VPC.
253 |     * Contains description of networking packets, but not their payload
254 |   * Log data can be published to Amazon CloudWatch Logs and Amazon S3
255 |   * Can be created at 3 levels:
256 |     * VPC
257 |     * Subnet
258 |     * Network interface
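
The NACL behaviour listed above (numbered rules, evaluated from low to high, allow and deny, implicit deny at the end) can be sketched in a few lines of Python. This is a simplified illustration only: `NaclRule`, `evaluate` and the `INBOUND` rule set are hypothetical names, and the match looks at source CIDR and port only, ignoring protocol and the stateless return traffic.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    """Hypothetical, simplified NACL rule (no protocol field)."""
    number: int
    cidr: str
    port_from: int
    port_to: int
    action: str  # "allow" or "deny"


def evaluate(rules, source_ip, port):
    """Return the action of the lowest-numbered matching rule."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = src in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= port <= rule.port_to
        if in_cidr and in_ports:
            return rule.action  # first match wins, later rules are ignored
    return "deny"  # nothing matched: implicit deny


# Example inbound rules: block one address, then allow HTTPS from anywhere.
INBOUND = [
    NaclRule(200, "0.0.0.0/0", 443, 443, "allow"),
    NaclRule(100, "203.0.113.7/32", 0, 65535, "deny"),
]
```

Because rule 100 is evaluated before rule 200, the single blocked address is denied even on port 443, which is the kind of IP blocking a security group cannot express.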
259 | 
260 | <a name="3_2_4"></a>
261 | ### [↖](#3_2)[↑](#3_2_3)[↓](#3_2_4_1) Structure & Packet Flow
262 | <a name="3_2_4_1"></a>
263 | #### [↖](#3_2)[↑](#3_2_4)[↓](#3_2_5) Packet flow through VPC components
264 | * VPC (has *CIDR*)
265 | 	* Gateway (Internet or VPN)
266 |   * Router
267 | 	* Route table (one per subnet, can be shared)
268 | 	* Network ACL (one per subnet, can be shared)
269 | 	* Subnets (CIDRs match VPC's CIDR)
270 | 	* Security Group (on VPC level)
271 | 	* Instance (needs public IP for internet communication, either ELB or Elastic IP)
272 | 
273 | 
274 | <a name="3_2_5"></a>
275 | ### [↖](#3_2)[↑](#3_2_4_1)[↓](#3_3) Limits
276 | |Resource|Default limit|
277 | |-|-|
278 | |VPCs per region|5|
279 | |Min/max VPC size|`/28`/`/16`|
280 | |Subnets per VPC|200|
281 | |Customer gateways per region|50|
282 | |Internet gateways per region|5|
283 | |Elastic IPs per account per region|5|
284 | |VPN connections per region|50|
285 | |Route tables per region|200|
286 | |Security groups per region|500|
287 | 
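
To make the `/28` and `/16` size limits concrete, the sketch below computes how many addresses a CIDR block holds. The helper name `subnet_capacity` is hypothetical. The constant of 5 reserved addresses per subnet is not part of these notes; it reflects the AWS documentation (first four addresses and the last one in every subnet) and should be treated as an assumption here.

```python
import ipaddress

# Assumption from the AWS docs, not from these notes:
# AWS reserves 5 addresses in every subnet.
AWS_RESERVED_PER_SUBNET = 5


def subnet_capacity(cidr):
    """Return (total addresses, addresses usable for resources) for a subnet CIDR."""
    net = ipaddress.ip_network(cidr)
    return net.num_addresses, net.num_addresses - AWS_RESERVED_PER_SUBNET
```

For example, `subnet_capacity("10.0.0.0/28")` gives 16 total addresses, 11 of them usable under this assumption.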
288 | ---
289 | 
290 | <a name="3_3"></a>
291 | ## [↖](#top)[↑](#3_2_5)[↓](#3_3_1) Connecting VPCs to other VPCs
292 | <!-- toc_start -->
293 | * [Overview](#3_3_1)
294 | * [VPC Peering](#3_3_2)
295 |   * [Establishing a VPC peering](#3_3_2_1)
296 |   * [Longest prefix match](#3_3_2_2)
297 |   * [Unsupported VPC peering configurations](#3_3_2_3)
298 |   * [Limits](#3_3_2_4)
299 | * [Transit Gateway](#3_3_3)
300 |   * [Overview](#3_3_3_1)
301 |   * [Setting up a Transit Gateway](#3_3_3_2)
302 | * [Transit VPC (=Software VPN, not recommended any more)](#3_3_4)
303 | * [AWS PrivateLink](#3_3_5)
304 | <!-- toc_end -->
305 | 
306 | <a name="3_3_1"></a>
307 | ### [↖](#3_3)[↑](#3_3)[↓](#3_3_2) Overview
308 | 
309 | ||VPC Peering|Transit Gateway|
310 | |-|-|-|
311 | |VPC-Limit|125 peerings|5,000 attachments|
312 | |Bandwidth limit|N/A (intra-region)|50Gb/s per VPC attachment|
313 | |Management|Decentralized|Centralized|
314 | |Cost Dimensions|Data transfer|Data transfer & attachment|
315 | 
316 | <a name="3_3_2"></a>
317 | ### [↖](#3_3)[↑](#3_3_1)[↓](#3_3_2_1) VPC Peering
318 | * Connect VPCs through direct network routing
319 | 	* Cross-region, cross-account
320 | * Allows instances to communicate with each other as if they were in the same network
321 | 	* Full private IP connectivity between VPCs
322 | * Connectivity must be established for each pair of VPCs that need to communicate with one another
323 | * Can reference a security group of a peered VPC (even cross-account)
324 | * Must update route tables in each VPC’s subnets to ensure instances can communicate
325 | * On AWS:
326 | 	* <a href="https://docs.aws.amazon.com/vpc/latest/peering/what-is-vpc-peering.html" target="_blank">Documentation</a> - <a href="https://aws.amazon.com/vpc/faqs/" target="_blank">FAQs</a>
327 | 
328 | <a name="3_3_2_1"></a>
329 | #### [↖](#3_3)[↑](#3_3_2)[↓](#3_3_2_2) Establishing a VPC peering
330 | * Consumer VPC initiates peering request
331 | * Provider VPC accepts peering request
332 | * Route tables on both sides are updated, to ensure traffic can flow
333 | 
334 | <a name="3_3_2_2"></a>
335 | #### [↖](#3_3)[↑](#3_3_2_1)[↓](#3_3_2_3) Longest prefix match
336 | * VPC uses the longest prefix match to select the most specific route
337 | * Another way of saying this is “most specific route”
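
A minimal sketch of longest prefix matching, assuming a route table modelled as a list of `(destination CIDR, target)` pairs. The function `select_route`, the `ROUTES` table and the `igw-hypothetical` target are illustrative names, not AWS identifiers.

```python
import ipaddress


def select_route(routes, destination_ip):
    """Return the (cidr, target) pair that matches with the longest prefix."""
    dest = ipaddress.ip_address(destination_ip)
    candidates = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if dest in ipaddress.ip_network(cidr)
    ]
    if not candidates:
        return None  # no route matches, traffic is dropped
    # The most specific route is the one with the largest prefix length.
    network, target = max(candidates, key=lambda item: item[0].prefixlen)
    return str(network), target


# Hypothetical route table with two overlapping peering routes.
ROUTES = [
    ("172.31.0.0/16", "local"),
    ("10.0.0.0/16", "pcx-aaaabbbb"),
    ("10.0.1.0/24", "pcx-aaaacccc"),
    ("0.0.0.0/0", "igw-hypothetical"),
]
```

With this table, traffic to `10.0.1.25` goes to `pcx-aaaacccc` because `/24` is more specific than `/16`, while `10.0.2.5` falls back to `pcx-aaaabbbb`.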
338 | 
339 | <a name="3_3_2_3"></a>
340 | #### [↖](#3_3)[↑](#3_3_2_2)[↓](#3_3_2_4) Unsupported VPC peering configurations
341 | * *Overlapping CIDR blocks*
342 | 	* Cannot create a VPC peering connection between VPCs with matching or overlapping IPv4 CIDR blocks
343 | * *Transitive peering*
344 | 	* You have a VPC peering connection between VPC A and VPC B (pcx-aaaabbbb), and between VPC A and VPC C (pcx-aaaacccc). There is no VPC peering connection between VPC B and VPC C. You cannot route packets directly from VPC B to VPC C through VPC A.
345 | * *Edge to edge routing through a gateway or private connection*
346 | 	* A VPN connection or an AWS Direct Connect connection to a corporate network
347 | 	* An internet connection through an internet gateway
348 | 	* An internet connection in a private subnet through a NAT device
349 | 	* A gateway VPC endpoint to an AWS service; for example, an endpoint to Amazon S3.
350 | 	* (IPv6) A ClassicLink connection. You can enable IPv4 communication between a linked EC2-Classic instance and instances in a VPC on the other side of a VPC peering connection. However, IPv6 is not supported in EC2-Classic, so you cannot extend this connection for IPv6 communication.
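
The overlapping-CIDR restriction can be checked before requesting a peering connection. The sketch below uses Python's standard `ipaddress` module; `can_peer` is a hypothetical helper, not an AWS API, and it only tests the CIDR condition (not transitive or edge-to-edge routing).

```python
import ipaddress


def can_peer(cidrs_a, cidrs_b):
    """Return True if no CIDR block of VPC A overlaps any CIDR block of VPC B."""
    nets_a = [ipaddress.ip_network(c) for c in cidrs_a]
    nets_b = [ipaddress.ip_network(c) for c in cidrs_b]
    return not any(a.overlaps(b) for a in nets_a for b in nets_b)
```

For example, `can_peer(["10.0.0.0/16"], ["10.1.0.0/16"])` is `True`, whereas a VPC using `10.0.128.0/17` overlaps `10.0.0.0/16` and could not be peered with it.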
351 | 
352 | <a name="3_3_2_4"></a>
353 | #### [↖](#3_3)[↑](#3_3_2_3)[↓](#3_3_3) Limits
354 | ||soft|hard|
355 | |-|-|-|
356 | |Active VPC peering connections per VPC|50|125|
357 | 
358 | <a name="3_3_3"></a>
359 | ### [↖](#3_3)[↑](#3_3_2_4)[↓](#3_3_3_1) Transit Gateway
360 | 
361 | <a name="3_3_3_1"></a>
362 | #### [↖](#3_3)[↑](#3_3_3)[↓](#3_3_3_2) Overview
363 | AWS Transit Gateway connects VPCs and on-premises networks through a central hub. This simplifies your network and puts an end to complex peering relationships. It acts as a cloud router – each new connection is only made once.
364 | 
365 | As you expand globally, inter-Region peering connects AWS Transit Gateways together using the AWS global network. Your data is automatically encrypted, and never travels over the public internet. And, because of its central position, AWS Transit Gateway Network Manager has a unique view over your entire network, even connecting to Software-Defined Wide Area Network (SD-WAN) devices.
366 | * For transitive peering between thousands of VPCs and on-premises networks in a hub-and-spoke (star) topology
367 |   * Private IP connectivity
368 |   * VPCs must be in same region as Transit Gateway
369 |     * However, you can peer Transit Gateways across regions
370 |   * VPCs can be in different accounts
371 | * Transit Gateway Route Tables: control which VPCs can talk to each other
372 | * Works with Direct Connect Gateway, VPN connections
373 | * Instances in a VPC can access a NAT Gateway, NLB, PrivateLink, and EFS in other VPCs attached to the AWS Transit Gateway.
374 | * Share cross-account using Resource Access Manager
375 | 	* AWS Resource Access Manager (AWS RAM) lets you share your resources with any AWS account or through AWS Organizations. If you have multiple AWS accounts, you can create resources centrally and use AWS RAM to share those resources with other accounts.
376 | * Supports *IP Multicast* (not supported by any other AWS service)
377 | * On AWS:
378 | 	* <a href="https://aws.amazon.com/transit-gateway/" target="_blank">Service</a> - <a href="https://aws.amazon.com/transit-gateway/faqs/" target="_blank">FAQs</a> - <a href="https://docs.aws.amazon.com/vpc/latest/tgw/what-is-transit-gateway.html" target="_blank">User Guide</a>
379 | 
380 | <a name="3_3_3_2"></a>
381 | #### [↖](#3_3)[↑](#3_3_3_1)[↓](#3_3_4) Setting up a Transit Gateway
382 | * Connected VPCs route to Transit Gateway
383 | * Transit Gateway Route Table determines which VPCs can talk to each other
384 | 
385 | <a name="3_3_4"></a>
386 | ### [↖](#3_3)[↑](#3_3_3_2)[↓](#3_3_5) Transit VPC (=Software VPN, not recommended any more)
387 | * Not an AWS offering, newer managed solution is Transit Gateway
388 | * Uses the public internet with a software VPN solution
389 | * Allows for transitive connectivity between VPC & locations
390 | * More complex routing rules, overlapping CIDR ranges, network-level packet filtering
391 | 
392 | <a name="3_3_5"></a>
393 | ### [↖](#3_3)[↑](#3_3_4)[↓](#3_4) AWS PrivateLink
394 | ...
395 | 
396 | ---
397 | 
398 | <a name="3_4"></a>
399 | ## [↖](#top)[↑](#3_3_5)[↓](#3_4_1) Extending on-premises networks to VPCs
400 | <!-- toc_start -->
401 | * [AWS VPN](#3_4_1)
402 | * [AWS Direct Connect](#3_4_2)
403 | <!-- toc_end -->
404 | 
405 | <a name="3_4_1"></a>
406 | ### [↖](#3_4)[↑](#3_4)[↓](#3_4_2) AWS VPN
407 | ...
408 | <a name="3_4_2"></a>
409 | ### [↖](#3_4)[↑](#3_4_1)[↓](#4) AWS Direct Connect
410 | ...
411 | 
412 | ---
413 | 
414 | <a name="4"></a>
415 | # [↖](#top)[↑](#3_4_2)[↓](#4_1) Open
416 | <a name="4_1"></a>
417 | ## [↖](#top)[↑](#4)[↓](#4_2) Services
418 | * RAM
419 | <a name="4_2"></a>
420 | ## [↖](#top)[↑](#4_1)[↓](#4_3) Topics
421 | * IPv4 vs IPv6
422 | * Dynamic Routing Protocols (BGP)
423 | <a name="4_3"></a>
424 | ## [↖](#top)[↑](#4_2)[↓](#4_4) Practice/Hands-on
425 | * VPC Peering
426 | * Transit Gateway
427 | * Transit VPC
428 | * PrivateLink/Endpoint service
429 | 
430 | ---
431 | 
432 | <a name="4_4"></a>
433 | ## [↖](#top)[↑](#4_3) Supporting Material
434 | * [Exam Readiness: AWS Certified Advanced Networking - Specialty](https://www.aws.training/Details/Curriculum?id=21330) (free aws training)
435 | * [AWS Networking Fundamentals](https://www.youtube.com/watch?v=hiKPPy584Mg) (youtube)
436 | 
437 | 


--------------------------------------------------------------------------------
/developer-associate.md:
--------------------------------------------------------------------------------
   1 | [toc_start]::
   2 | <a name="top"></a>
   3 | ---
   4 | * [AWS Developer Associate](#1)
   5 | * [AWS Fundamentals](#2)
   6 |   * [Global infrastructure](#2_1)
   7 |   * [Storage overview](#2_2)
   8 |   * [Security Concepts](#2_3)
   9 | * [Services](#3)
  10 |   * [IAM](#3_1)
  11 |   * [Secure Token Service (STS)](#3_2)
  12 |   * [S3](#3_3)
  13 |   * [Dynamo DB](#3_4)
  14 |   * [Elastic Compute Cloud (EC2)](#3_5)
  15 |   * [Elastic Load Balancer (ELB)](#3_6)
  16 |   * [SNS](#3_7)
  17 |   * [SQS](#3_8)
  18 |   * [Cloudformation](#3_9)
  19 |   * [Elastic Beanstalk (EB)](#3_10)
  20 |   * [Simple Workflow Service (SWF)](#3_11)
  21 |   * [Virtual Private Cloud (VPC)](#3_12)
  22 |   * [Relational Database Service (RDS)](#3_13)
  23 | * [Etc](#4)
  24 | ---
  25 | [toc_end]::
  26 | <a name="1"></a>
  27 | # [↖](#top)[↑](#)[↓](#2) Developer Associate
  28 | > 6/2017 - 8/2017
  29 | 
  30 | <a name="2"></a>
  31 | # [↖](#top)[↑](#1)[↓](#2_1) AWS Fundamentals
  32 | 
  33 | <a name="2_1"></a>
  34 | ## [↖](#top)[↑](#2)[↓](#2_2) Global infrastructure
  35 | * **Region** - grouping of data centers
  36 | * **AZ** - individual data center in a region. Redundancy throughout AZs in one region
  37 | * **Edge Location** - location to deliver cached data fast -> Use *Cloudfront* CDN to cache data
  38 | close to where it's being used
  39 | 
  40 | <a name="2_2"></a>
  41 | ## [↖](#top)[↑](#2_1)[↓](#2_2_1) Storage overview
  42 | <a name="2_2_1"></a>
  43 | ### Instance store volumes
  44 | * **Temporary block storage**
  45 | * Physically attached to the host computer of the instance
  46 | * Useful for often-changing data like caches & buffers
  47 | * *Data is lost* when EC2 instance stops or terminates (*ephemeral* data)
  48 | <a name="2_2_2"></a>
  49 | ### Elastic Block Storage (EBS)
  50 | * **Permanent block storage**, independent of the instance
  51 | * Attachable to running EC2 instances (same AZ)
  52 | * Only accessible by a *single instance*
  53 | * Snapshots can be taken
  54 | * Can be encrypted
  55 | * Stores redundantly in single AZ
  56 | * Different volume options:
  57 | 	* General purpose SSD
  58 | 	* Provisioned IOPS
  59 | 	* Magnetic volumes
  60 | <a name="2_2_3"></a>
  61 | ### Elastic File System (EFS)
  62 | * **Scalable file storage** for use with *Amazon EC2 instances*
  63 | * Elastic storage capacity, growing and shrinking as files are added or removed
  64 | * *Multiple EC2 instances* from *multiple AZs* can access an EFS file system at the same time
  65 | * Stores redundantly in multiple AZs
  66 | <a name="2_2_4"></a>
  67 | ### Amazon Glacier
  68 | * Low cost, very slow retrieval
  69 | * Can be integrated with S3 lifecycle policy
  70 | <a name="2_2_5"></a>
  71 | ### Database Storage
  72 | * *DynamoDB*
  73 | * *RDS*
  74 | * DBs on EC2 instances
  75 | * *AWS Redshift* (data warehouse service)
  76 | <a name="2_2_6"></a>
  77 | ### In-memory caching
  78 | * *ElastiCache* (Memcached and Redis)
  79 | * Software on EC2 instances
  80 | <a name="2_2_7"></a>
  81 | ### Storage gateway
  82 | * Integrate existing *on-premises storage* infrastructure and data with the AWS Cloud
  83 | 
  84 | <a name="2_3"></a>
  85 | ## [↖](#top)[↑](#2_2_7)[↓](#3) Security Concepts
  86 | * **Shared responsibility** environment
  87 | * AWS is responsible for:
  88 | 	* Server / Host level and below
  89 | 	* Physical environment security
  90 | 	* Hardware decommissioning
  91 | 	* Traffic security (Networks, ACLs, SSL, DDOS-protection)
  92 | 	* EC2 hypervisor isolation
  93 | * User is responsible for:
  94 | 	* IAM
  95 | 	* MFA
  96 | 	* Password/key-rotation
  97 | 	* Access advisor (shows used permissions)
  98 | 	* Trusted advisor (validates best practices)
  99 | 	* Security groups
 100 | 	* ACL (resource based policy)
 101 | 	* VPC
 102 | 
 103 | <a name="3"></a>
 104 | # [↖](#top)[↑](#2_3)[↓](#3_1) Services
 105 | <a name="3_1"></a>
 106 | ## [↖](#top)[↑](#3)[↓](#3_1_1) IAM
 107 | IAM is a global service that helps to securely control access to AWS resources.
 108 | 
 109 | * **Users** hold credentials
 110 | * **Groups** hold users and typically only provide permission to assume a role
 111 | * **Roles** hold policies.
 112 | 	* Can have **trust relationships** with trusted entities that can *assume* this role
 113 | * **Policies** can be attached to users, groups or roles (preferred)
 114 | * An **instance profile** is a container for an IAM role that you can use to pass role information to an
 115 | 	EC2 instance when the instance starts.
 116 | * Users and / or services assume roles
 117 | 
 118 | <a name="3_1_1"></a>
 119 | ### Policies
 120 | * Any actions on resources that are not explicitly allowed are **denied by default**
 121 | * Structure
 122 | 	* **E** - `effect` (*allow* / *deny*)
 123 | 		* What the effect will be when the user requests the specific action
 124 | 	* **P** - `principal` (*ARN*)
 125 | 		* The account or user who is allowed access to the actions and resources in the statement
 126 | 		* IAM policies do not have a principal (because they are attached to users, groups or roles)
 127 | 	* **A** - `action` or `notaction`
 128 | 		* Describes the specific action or actions that will be allowed or denied
 129 | 	* **R** - `resource` or `notresource`
 130 | 		* Specifies the object or objects that the statement covers
 131 | 	* **C** - `condition`
 132 | 		* Specifies conditions for when a policy is in effect
 133 | * Can use **policy variables**
 134 | 	* `aws:currentTime`, `aws:userid`, ...
 135 | 
 136 | ```
 137 | 	{
 138 | 		"Version": "2012-10-17",
 139 | 		"Statement": [
 140 | 			{
 141 | 				"Effect": "Allow",
 142 | 				"Action": "s3:ListAllMyBuckets",
 143 | 				"Resource": "arn:aws:s3:::*"
 144 | 			},
 145 | 			{
 146 | 				"Effect": "Allow",
 147 | 				"Action": [
 148 | 						"s3:ListBucket",
 149 | 						"s3:GetBucketLocation"
 150 | 				],
 151 | 				"Resource": "arn:aws:s3:::productionapp"
 152 | 			},
 153 | 			{
 154 | 				"Effect": "Allow",
 155 | 				"Action": [
 156 | 					"s3:GetObject",
 157 | 					"s3:PutObject",
 158 | 					"s3:DeleteObject"
 159 | 				],
 160 | 				"Resource": "arn:aws:s3:::productionapp/*"
 161 | 			}
 162 | 		]
 163 | 	}
 164 | ```
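
The default-deny rule and the effect / action / resource fields above can be illustrated with a small evaluator. This is a simplified sketch, not the real IAM evaluation logic: it ignores `Principal`, `Condition`, `NotAction` and `NotResource`, and it uses case-sensitive shell-style wildcards (`fnmatchcase`) as a stand-in for IAM's own matching. The function name `is_allowed` and the sample policy are hypothetical.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def is_allowed(policy, action, resource):
    """Return True only if a statement allows the request and none denies it."""
    allowed = False
    for statement in policy["Statement"]:
        action_match = any(fnmatchcase(action, a) for a in _as_list(statement["Action"]))
        resource_match = any(fnmatchcase(resource, r) for r in _as_list(statement["Resource"]))
        if not (action_match and resource_match):
            continue
        if statement["Effect"] == "Deny":
            return False  # an explicit deny always wins
        allowed = True
    return allowed  # still False if nothing matched: denied by default


# Hypothetical policy, used for illustration only
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"],
         "Resource": "arn:aws:s3:::productionapp/*"},
        {"Effect": "Deny", "Action": "s3:*",
         "Resource": "arn:aws:s3:::productionapp/secret/*"},
    ],
}
```

With this policy, `s3:GetObject` on `arn:aws:s3:::productionapp/a.txt` is allowed, `s3:DeleteObject` on the same object is denied by default, and any action below `secret/` is denied explicitly.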
 165 | <a name="3_1_1_1"></a>
 166 | #### IAM Policies
 167 | * Managed policies (the new way)
 168 | 	* Can be attached to multiple users, groups and roles
 169 | 	* AWS managed policies
 170 | 		* Updated by AWS when new APIs are released
 171 | * Inline policies (the old way)
 172 | 
 173 | <a name="3_1_2"></a>
 174 | ### Limits
 175 | .|.
 176 | -|-
 177 | Groups per account|100
 178 | Instance profiles|100
 179 | Roles|500
 180 | Server certificates|20
 181 | Users|5000
 182 | 
 183 | <a name="3_2"></a>
 184 | ## [↖](#top)[↑](#3_1_2)[↓](#3_2_1) Secure Token Service (STS)
 185 | * Grants **temporary access** to authenticated users
 186 | 	* IAM users
 187 | 	* Web-based identity providers (google, facebook, ...)
 188 | 	* Organization's existing identity system
 189 | * Returns **temporary credentials** that expire after some time:
 190 | 	* Access key (access key ID and secret access key)
 191 | 	* Session token
 192 | 
 193 | <a name="3_2_1"></a>
 194 | ### Terms
 195 | * **Federation**
 196 | 	* Trust relationship between identity provider and AWS
 197 | * **Identity broker**
 198 | 	* Broker in charge of mapping user to the right set of credentials
 199 | * **Identity store**
 200 | 	* Eg Google or Facebook
 201 | * **Identities**
 202 | 	* Users
 203 | 
 204 | <a name="3_2_2"></a>
 205 | ### Scenarios
 206 | * Temporary credentials with EC2
 207 | 	* Assign IAM role to instance
 208 | 	* Get temp credentials from *instance metadata*
 209 | * Temporary credentials with SDK
 210 | 	* Call `assumeRole`, extract temp credentials
 211 | * Options for temporary credentials with API calls
 212 | 	* *Sign request* with temp credentials
 213 | 	* Add AC / SK to request (*header* or *query string*)
 214 | 
 215 | <a name="3_3"></a>
 216 | ## [↖](#top)[↑](#3_2_2)[↓](#3_3_1) S3
 217 | 
 218 | Amazon Simple Storage Service (S3) is object storage with a simple web service interface to store and
 219 | retrieve any amount of data from anywhere on the web. It is designed to deliver 11x9 durability and
 220 | scale past trillions of objects worldwide.
 221 | 
 222 | * **Key**-**value** storage (folder-like structure is only a UI representation)
 223 | * **Bucket** size is unlimited. Objects from 0B to 5TB.
 224 | * HA and scalable, transparent data partitioning
 225 | * Bucket lifecycle events can trigger *SNS*, *SQS* or *AWS Lambda*
 226 | 	* New object created events
 227 | 	* Object removal events
 228 | 	* Reduced Redundancy Storage (RRS) object lost event
 229 | * Bucket names have to be globally unique, should comply with DNS naming conventions.
 230 | 	* `http://bucket.s3.amazonaws.com`
 231 | 	* `http://bucket.s3-aws-region.amazonaws.com`
 232 | 	* `http://s3.amazonaws.com/bucket`
 233 | 	* `http://s3-aws-region.amazonaws.com/bucket`
 234 | 
 235 | <a name="3_3_1"></a>
 236 | ### Performance & Consistency
 237 | * Bucket operations **get** - **list** - **put** - **delete** - **head**
 238 | 	* Implemented through *http* operations: `GET` - `PUT` - `DELETE` - `HEAD`
 239 | 	* *Read-after-write consistency* for `PUT` of *new* objects.
 240 | 	* *Eventual consistency* for *overwrite* `PUT` and `DELETE` (stale reads but low latency).
 241 | * Can only delete a bucket that is empty.
 242 | * *Scales* automatically, up to a certain limit:
 243 | 	* Consistent:
 244 | 		* `>100 PUT/LIST/DELETE/s`
 245 | 		* `>300 GET/s`
 246 | 	* Bursts:
 247 | 		* `>300 PUT/LIST/DELETE/s`
 248 | 		* `>800 GET/s`
 249 | * Key names are used to determine which partition to store the object in.
 250 | 	* Make sure keys are spread out (not sequential)
 251 | 		* E.g. by adding a random prefix to the key name
 252 | * For `GET` requests put *AWS CloudFront* in front of S3 bucket
 253 | 	* Internal caching
 254 | 	* Reduced latency - objects are physically closer to the consumer.
 255 | * **Multipart upload**
 256 | 	* Recommended for objects >=100MB, mandatory for >=5GB
 257 | 	* Supports parallel uploads
 258 | 	* Can pause & resume
 259 | 	* Can upload file while it's being created
 260 | 	* 3 step process:
 261 | 		* Initiate multipart upload
 262 | 			* `POST /ObjectName?uploads HTTP/1.1`
 263 | 		* Upload of all parts
 264 | 			* `PUT /ObjectName?partNumber=PartNumber&uploadId=UploadId HTTP/1.1`
 265 | 		* Complete Multipart upload
 266 | 			* ```
 267 | 				POST /ObjectName?uploadId=UploadId HTTP/1.1
 268 | 				<CompleteMultipartUpload>...
 269 | 				```
 270 | 
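The advice above to spread out key names can be sketched as follows. Instead of a truly random prefix, this variant derives the prefix from a hash of the key, so the same prefix can be recomputed when the object is read back. The function name `spread_key` and the 4-character prefix length are assumptions for illustration.

```python
import hashlib


def spread_key(key, prefix_length=4):
    """Prepend a short hash-derived prefix so sequential keys no longer share a prefix."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_length]}-{key}"


# Sequential names such as these would otherwise land next to each other
for name in ("2017-05-17/log-0001.txt", "2017-05-17/log-0002.txt"):
    print(spread_key(name))
```

Because the prefix is deterministic, a reader only needs the original name to find the object again.
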
 271 | <a name="3_3_2"></a>
 272 | ### Hosting Static Websites
 273 | `<bucket-name>.s3-website-<AWS-Region>.amazonaws.com`
 274 | * Bucket name *must* match domain name. Every hosted bucket receives its own URL
 275 | * Use *AWS Route 53* to integrate custom domains (also to automatically fail-over from dynamic website)
 276 | * Specify `index` & `error` documents
 277 | * In *AWS Route 53*: create hosted zone & record set
 278 | * Might need to add CORS configuration to bucket (cross origin resource sharing)
 279 | 
 280 | <a name="3_3_3"></a>
 281 | ### Access Control
 282 | * **Effect** – This can be either allow or deny
 283 | * **Principal** – Account or user who is allowed access to the actions and resources in the statement
 284 | * **Actions** – For each resource, S3 supports a set of operations
 285 | * **Resources** – Buckets and objects are the resources
 286 | * Authorization works as a *union* of **IAM** & **bucket policies** and **bucket ACLs**
 287 | <a name="3_3_3_1"></a>
 288 | #### Defaults
 289 | * Bucket is *owned* by the AWS account that created it
 290 | 	* Bucket ownership is not transferable
 291 | * Bucket owner gets full permission (ACL)
 292 | * The person paying the bills always has full control.
 293 | * A person uploading an object into a bucket owns it by default.
 294 | <a name="3_3_3_2"></a>
 295 | #### IAM
 296 | * IAM policies (in general) specify what actions are allowed or denied on what AWS resources
 297 | * Defined as JSON
 298 | * Attached to IAM users, groups, or roles (so they cannot grant access to anonymous users)
 299 | * Use if you’re more interested in *“What can this user do in AWS?”*
 300 | <a name="3_3_3_3"></a>
 301 | #### Bucket policies
 302 | * Specify what actions are allowed or denied for which principals on the bucket that the policy is
 303 | attached to
 304 | * Defined as JSON
 305 | * Attached *only* to S3 buckets. Can, however, affect objects in the bucket.
 306 | * Contain *principal* element (unnecessary for IAM)
 307 | * Use if you’re more interested in *“Who can access this S3 bucket?”*
 308 | * Easiest way to grant *cross-account permissions* for all `s3:*` permission. (Cannot do this with ACLs.)
 309 | <a name="3_3_3_4"></a>
 310 | #### ACLs
 311 | * Defined as XML. Legacy, not recommended any more.
 312 | * Can
 313 | 	* be attached to individual objects (bucket policies only bucket level)
 314 | 	* control access to object uploaded into a bucket from a *different* account.
 315 | * Cannot
 316 | 	* have conditions
 317 | 	* explicitly deny actions
 318 | 	* grant permission to bucket sub-resources (eg. lifecycle or static website configurations)
 319 | * Other than *object ACL*s there are *bucket ACL*s as well - only for writing access log objects to a
 320 | bucket.
 321 | <a name="3_3_3_5"></a>
 322 | #### How to specify resources in a policy:
 323 | .|.
 324 | -|-
 325 | `arn:partition:service:region:namespace:relative-id`|`arn:aws:s3:::mybucket`
 326 | `arn:aws:s3:::*`|All buckets and objects in account
 327 | `arn:aws:s3:::mybucket`|`mybucket`
 328 | `arn:aws:s3:::mybucket/*`|All objects in `mybucket`
 329 | `arn:aws:s3:::mybucket/mykey`|`mykey` in `mybucket`
 330 | `arn:aws:s3:::mybucket/developers/${aws:username}/`|Folder matching the accessing user's name
 331 | <a name="3_3_3_6"></a>
 332 | #### Pre-signed URLs
 333 | All objects are private by default. Only the object owner has permission to access these objects.
 334 | However, the object owner can optionally share objects with others by creating a **pre-signed URL**,
 335 | using their own security credentials, to grant time-limited permission to download the objects.
 336 | 
 337 | <a name="3_3_4"></a>
 338 | ### Logging
 339 | * *AWS CloudTrail* logs S3 API calls for bucket-level operations (and much other information) and
 340 | stores them in an S3 bucket. Could also send email notifications or trigger *SNS* notifications for
 341 | specific events.
 342 | * *S3 Server Access Logs* log on object level.
 343 | 
 344 | <a name="3_3_5"></a>
 345 | ### Versioning
 346 | * Works on bucket level (for *all* objects)
 347 | * Versioning can either be *unversioned* (default), *enabled* or *suspended*
 348 | * **Version ids** are automatically assigned to objects
 349 | 	* IDs cannot be changed.
 350 | 	* As long as the bucket is *unversioned*, the ID is set to `null`
 351 | 	* Once enabled, versioning can only be suspended (but not disabled)
 352 | 	* `PUT` creates a new version, `GET` returns the latest version. Specific versions can be requested.
 353 | 	* `DELETE` (without version) marks latest version as deleted and returns a `404` for subsequent `GET`s.
 354 | 		* Older versions (pre-delete) can still be requested.
 355 | 		* Restore an old version by deleting the newer version or by copying the old version into the bucket again.
 356 | 	* `DELETE` (with a version) permanently deletes that version.
 357 | 	* If versioning is *suspended*, S3 automatically adds a `null` version ID to every object
 358 | stored thereafter
 359 | * *Lifecycle Management policies* can automatically handle old versions, e.g. permanently delete or
 360 | move to *AWS Glacier*.
 361 | * Different versions of the same object can have different permissions.
 362 | 
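The `PUT` / `GET` / `DELETE` behaviour of a versioning-enabled bucket described above can be modelled with a toy in-memory class. This is only an illustration of the semantics, not the S3 API: the class name `VersionedBucket` is hypothetical, and real version IDs are opaque strings rather than `v1`, `v2`, ...

```python
import itertools


class VersionedBucket:
    """Toy in-memory model of a versioning-enabled bucket."""

    DELETE_MARKER = object()

    def __init__(self):
        self._versions = {}             # key -> list of (version_id, body), oldest first
        self._ids = itertools.count(1)

    def put(self, key, body):
        """Every PUT adds a new version on top of the existing ones."""
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, body))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest version, or a specific one if a version id is given."""
        versions = self._versions.get(key, [])
        if version_id is not None:
            for vid, body in versions:
                if vid == version_id and body is not self.DELETE_MARKER:
                    return body
            raise KeyError("404 Not Found")
        if not versions or versions[-1][1] is self.DELETE_MARKER:
            raise KeyError("404 Not Found")
        return versions[-1][1]

    def delete(self, key, version_id=None):
        if version_id is None:
            # DELETE without version: put a delete marker on top, keep old versions
            return self.put(key, self.DELETE_MARKER)
        # DELETE with version: permanently remove exactly that version
        self._versions[key] = [
            v for v in self._versions.get(key, []) if v[0] != version_id
        ]
        return version_id
```

Deleting the delete marker itself (by its version ID) makes the previous version the latest one again, which is one of the restore options listed above.
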
 363 | <a name="3_3_6"></a>
 364 | ### Encryption
 365 | <a name="3_3_6_1"></a>
 366 | #### Protecting data in transit
 367 | * Using an AWS KMS–Managed Customer Master Key (CMK)
 368 | 	* Before *uploading* to S3, the client requests a key from KMS and receives a plain text encryption
 369 | 	key and a cipher blob; the cipher blob is uploaded to S3 as object metadata. To decrypt, send the
 370 | 	cipher blob to KMS and use the returned plain text key.
 371 | 	* Before *downloading* from S3, the client first downloads the encrypted object from Amazon S3 along
 372 | 	with the cipher blob version of the data encryption key stored as object metadata. The client then
 373 | 	sends the cipher blob to AWS KMS to get the plain text version of the same, so that it can decrypt
 374 | 	the object data.
 375 | * Using a Client-Side Master Key
 376 | 	* Client provides a master key, S3 client generates a random data
 377 | 	key and encrypts with client's master key.
 378 | 	* *Uploads* material description as part of the object metadata.
 379 | 	* On *download* S3 client uses metadata to determine the right master key to use for decryption.
 380 | * Use *SSL encryption*
 381 | <a name="3_3_6_2"></a>
 382 | #### Protecting data at rest
 383 | * Uses *AES-256* (or others)
 384 | * Encryption can be enforced via bucket policy.
 385 | * Enable server-side encryption by adding specific header to request (`x-amz-server-side-encryption`).
 386 | * Server-Side Encryption with *Amazon S3-Managed Keys* (SSE-S3)
 387 | 	* Each object is encrypted with a unique key employing strong multi-factor encryption
 388 | 	* Furthermore it encrypts the key itself with a master key that is rotated regularly
 389 | * Server-Side Encryption with *AWS KMS-Managed Keys* (SSE-KMS)
 390 | 	* Similar to SSE-S3, with extra benefits
 391 | 	* Separate permissions for the use of an envelope key
 392 | 	* Has audit trail
 393 | * Server-Side Encryption with *Customer-Provided Keys* (SSE-C)
 394 | 	* Key is not stored with AWS (stores a salted HMAC value instead)
 395 | 
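Enforcing encryption via bucket policy, as mentioned above, usually means denying `PutObject` requests whose `x-amz-server-side-encryption` header does not have the expected value. The sketch below builds such a policy for SSE-S3 as a Python dictionary and prints it as JSON. The helper name `require_sse_policy` and the bucket name are hypothetical, and AWS examples often add a second statement with a `Null` condition for requests that omit the header entirely.

```python
import json


def require_sse_policy(bucket_name):
    """Build a bucket policy that rejects uploads not requesting SSE-S3 (AES256)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "AES256"
                    }
                },
            }
        ],
    }


# "mybucket" is a placeholder bucket name
print(json.dumps(require_sse_policy("mybucket"), indent=2))
```
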
 396 | <a name="3_3_7"></a>
 397 | ### Storage classes
 398 | .|.
 399 | -|-
 400 | S3 Standard|Durability 11x9
 401 |  |Availability 4x9
 402 | S3 IA (infrequent access)|Durability 11x9
 403 |  |Availability 3x9
 404 | S3 RRS (reduced redundancy storage)|Durability 4x9
 405 |  |Availability 4x9
 406 | 
 407 | <a name="3_3_8"></a>
 408 | ### Request/response headers
 409 | Request|Response
 410 | -|-
 411 | `x-amz-content-sha256`|`x-amz-delete-marker`
 412 | `x-amz-date`|`x-amz-id-2`
 413 | `x-amz-security-token`|`x-amz-request-id`
 414 |  |`x-amz-version-id`
 415 | 
 416 | <a name="3_3_9"></a>
 417 | ### Error codes
 418 | .|.
 419 | -|-
 420 | 400 Bad Request|`ExpiredToken`
 421 | 400 Bad Request|`InvalidToken`
 422 | 400 Bad Request|`InvalidArgument`
 423 | 400 Bad Request|`InvalidRequest`
 424 | 400 Bad Request|`IncompleteBody`
 425 | 400 Bad Request|`IncompleteDigest`
 426 | 400 Bad Request|`InvalidBucketName`
 427 | 403 Forbidden|`AccessDenied`
 428 | 403 Forbidden|`InvalidAccessKeyId`
 429 | 404 Not Found|`NoSuchBucket`
 430 | 404 Not Found|`NoSuchKey`
 431 | 409 Conflict|`BucketAlreadyExists`
 432 | 409 Conflict|`BucketNotEmpty`
 433 | 
 434 | <a name="3_3_10"></a>
 435 | ### Limits
 436 | .|.
 437 | -|-
 438 | Buckets per account|100
 439 | Bucket policy max size|20KB
 440 | Object size|0B to 5TB
 441 | Object size in a single `PUT`|5GB
 442 | 
 443 | <a name="3_4"></a>
 444 | ## [↖](#top)[↑](#3_3_10)[↓](#3_4_1) Dynamo DB
 445 | 
 446 | <a name="3_4_1"></a>
 447 | ### Overview
 448 | * Fully managed **NoSQL** database
 449 | * *HA* through different AZs, automatically spreads data and traffic across servers
 450 | 	* Data is replicated across 3 geographically distributed facilities within a region
 451 | * Can scale up and down depending on demand (no downtime, no performance degradation)
 452 | * Built-in monitoring
 453 | * User controlled read/write capacity (recently added: *auto-scaling*)
 454 | * Big data: Integrates with *AWS Elastic MapReduce* and *Redshift*
 455 | * No joins - create references to other tables manually (`table1#something`)
 456 | * Choice between **eventual consistency** and **strong consistency**
 457 | * Conditional updates and concurrency control (**atomic counters**)
 458 | 
 459 | <a name="3_4_2"></a>
 460 | ### Core components
 461 | * A **table** is a collection of items.
 462 | 	* Can be updated through a single `UpdateTable` command at a time (`ACTIVE` -> `UPDATING`)
 463 | * An **item** is a group of one or more attributes that is uniquely identifiable among all of the
 464 | other items. (*row* in a traditional db)
 465 | * An **attribute** is a fundamental data element, something that does not need to be broken down any
 466 | further. Can be nested up to 32 levels. (*column* in a traditional db)
 467 | * **Primary keys** are used to uniquely identify each item in a table. Apart from that DynamoDB is
 468 | *schemaless*, which means that neither the attributes nor their data types need to be defined
 469 | beforehand
 470 | * **Secondary indexes** are used to provide more querying flexibility
 471 | * **Control plane** operations create and manage DynamoDB tables
 472 | * **Data plane** operations perform CRUD actions on data in a table
 473 | * **DynamoDB streams** operations capture data modification events in DynamoDB tables
 474 | 
 475 | <a name="3_4_3"></a>
 476 | ### Keys and indexes
 477 | <a name="3_4_3_1"></a>
 478 | #### Partition key (PK)
 479 | * **Partition key** is also called **hash attribute** or **primary key**
 480 | * Must be unique, used for internal hash function (*unordered*)
 481 | * Used to retrieve data
 482 | <a name="3_4_3_2"></a>
 483 | #### PK & Sort key
 484 | * **Composite PK**: *index* composed of hashed PK (*unordered*) and SK (*ordered*)
 485 | * **Sort key** is also called **range attribute** or **range key**
 486 | * Different items can have the same *PK*, must have different *SK*
 487 | 
 488 | <a name="3_4_4"></a>
 489 | ### Secondary indexes
 490 | * Associated with exactly one table, from which it obtains its data
 491 | * Allows querying or scanning data by an *alternate key* (other than PK/SK)
 492 | * All secondary indexes are automatically maintained by DynamoDB as sparse objects
 493 | 	* Items will only appear in an index if they exist in the base table
 494 | 	* Makes querying very efficient
 495 | * Only for `read` operations, `write` is not supported.
 496 | * Tables with secondary indexes need to be created sequentially (`LimitExceededException`)
 497 | <a name="3_4_4_1"></a>
 498 | #### Projected attributes
 499 | * Attributes copied from the base table into an *index*
 500 | * Makes them queryable
 501 | * Different projection types
 502 | 	* *KEYS_ONLY* - Only the index and primary keys are projected into the index
 503 | 	* *INCLUDE* - Only the specified table attributes are projected into the index
 504 | 	* *ALL* - All of the table attributes are projected into the index
 505 | <a name="3_4_4_2"></a>
 506 | #### Local secondary index
 507 | * Uses the *same PK*, but offers different *SK*
 508 | * Every partition of a local secondary index is scoped to a base table partition that has the same
 509 | partition key value
 510 | * Local secondary indexes are extra tables that dynamo keeps in the background
 511 | * Cannot be created after the base table has already been created.
 512 | * Can choose *eventual consistency* or *strong consistency* at *creation* time
 513 | * *Local* as in "co-located on the same partition"
 514 | * Can request *not-projected* attributes for query or scan operation
 515 | * Consumes read/write throughput from the original table.
 516 | <a name="3_4_4_3"></a>
 517 | #### Global secondary index
 518 | * Uses *different PK* and offers additional *SK* (or none).
 519 | * *PK* does not have to be unique (unlike base table)
 520 | * Queries on the global index can span all of the data in the base table, across all partitions
 521 | * Can be created after the base table has already been created.
 522 | * Only support *eventual consistency*
 523 | * Have their own provisioned read/write throughput
 524 | * Global secondary keys are distributed transactions across multiple partitions
 525 | * Global as in "over many partitions"
 526 | * Cannot request not-projected attributes for query or scan operation
 527 | 
 528 | <a name="3_4_5"></a>
 529 | ### Capacity provisioning
 530 | * Unit for operations:
 531 | 	* 1 *strongly consistent* `read` per second (item up to 4KB)
 532 | 	* 2 *eventually consistent* `read`s per second (items up to 4KB each, i.e. 8KB/s)
 533 | 	* 1 `write` per second (item up to 1KB)
 534 | * Algorithm
 535 | 
 536 | .|.
 537 | -|-
 538 | .|*300 strongly consistent reads of 11KB per minute*
 539 | Calculate read / writes per second|`300r/60s = 5r/s`
 540 | Multiply with payload factor (rounded up)|`5r/s * ceil(11KB/4KB) = 5 * 3 = 15cu`
 541 | If eventually consistent, divide by 2 (rounded up)|`15cu / 2 = 7.5 -> 8cu`
 542 | 
 543 | * More throughput -> more reads / writes per second
 544 | * Exceeding allocated throughput may result in throttling of the operation. Check return code.
 545 | * Failing to distribute data across partitions can result in `ProvisionedThroughputExceededException`
 546 | * Local secondary index
 547 | 	* `Read`
 548 | 		* Reading only index keys and projected attributes uses the same calculation as for tables
 549 | 		* Reading attributes that are not projected adds extra latency and read capacity cost
 550 | 			* Uses read capacity from the index *and* for every item fetched from the table
 551 | 	* `Write` (to items in the base table that are indexed)
 552 | 		* 1 for adding an item
 553 | 		* 2 for changing the value of an item
 554 | 		* 1 for deleting an item
 555 | * Global secondary index
 556 | 	* Read
 557 | 		* Only supports eventual consistency, so 8KB/s base unit
 558 | 		* Calculated the same as in tables, except that the size of the index entries is used instead
 559 | 		of the size of the entire item
 560 | 	* Write (to items in the base table that are indexed)
 561 | 		* Putting, updating, or deleting items in a table consumes the index's write capacity units
 562 | 
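The capacity algorithm above (reads per second, times the payload factor rounded up, halved for eventually consistent reads) can be written as a small calculator. The function names are hypothetical; the 4KB read unit and 1KB write unit are taken from the notes above.

```python
import math


def read_capacity_units(reads_per_second, item_size_kb, strongly_consistent=True):
    """Read capacity units needed for items of the given size."""
    units_per_read = math.ceil(item_size_kb / 4)   # one unit covers up to 4KB
    units = reads_per_second * units_per_read
    if not strongly_consistent:
        units = units / 2                           # eventually consistent reads cost half
    return math.ceil(units)


def write_capacity_units(writes_per_second, item_size_kb):
    """Write capacity units needed: one unit covers a write of up to 1KB."""
    return math.ceil(writes_per_second * math.ceil(item_size_kb))


# The worked example from the table: 300 reads of 11KB per minute
print(read_capacity_units(300 / 60, 11))                             # strongly consistent
print(read_capacity_units(300 / 60, 11, strongly_consistent=False))  # eventually consistent
```
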
 563 | <a name="3_4_6"></a>
 564 | ### Query and scan operation
 565 | <a name="3_4_6_1"></a>
 566 | #### Query
 567 | * Finds items based on PK values
 568 | * Can *only* query any table or secondary index that has a composite primary key
 569 | * *Has* to use PK, *can* specify SK
 570 | * Very efficient, only searches index
 571 | * Result is ordered by SK
 572 | * Returns all attributes or only specified subset
 573 | * Eventually consistent per default, can request consistent read
 574 | * Can use *conditional attributes*
 575 | 
 576 | <a name="3_4_7"></a>
 577 | ### Scan
 578 | * Reads every item in table (much worse performance than queries)
 579 | * Can *filter* result (slows down performance)
 580 | * The larger the data set in the table the slower the scan
 581 | * *Eventually consistent* reads by default, can specify *strongly consistent*
 582 | * Try to avoid scans
 583 | * Use *Page Size* to limit how much data is retrieved at the same time
 584 | 
 585 | <a name="3_4_8"></a>
 586 | ### Atomic and conditional updates
 587 | <a name="3_4_8_1"></a>
 588 | #### Atomic Counters
 589 | * Increment or decrement the value of an existing attribute without interfering with other writes
 590 | * Requests are applied in the order they are received
 591 | * *Not idempotent*
 592 | <a name="3_4_8_2"></a>
 593 | #### Conditional updates
 594 | * Only proceed if condition is met
 595 | * *Idempotent*
 596 | 
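Why atomic counters are not idempotent while conditional updates are can be seen in a minimal simulation. The `Item` class, the `version` attribute used as the condition, and both function names are assumptions for illustration and do not reflect DynamoDB's API.

```python
class Item:
    """Toy item with a counter and a version number used as update condition."""

    def __init__(self, views=0, version=1):
        self.views = views
        self.version = version


def atomic_increment(item, by=1):
    """Always applies: sending the same request twice counts twice."""
    item.views += by
    return item.views


def conditional_set(item, new_views, expected_version):
    """Only applies if the item is still at the version the caller has seen."""
    if item.version != expected_version:
        return False            # condition failed, item left untouched
    item.views = new_views
    item.version += 1
    return True


item = Item()
atomic_increment(item)
atomic_increment(item)          # a blind retry of the same request: views is now 2
first = conditional_set(item, 10, expected_version=1)   # succeeds
retry = conditional_set(item, 10, expected_version=1)   # retry is rejected, state unchanged
```

A retried atomic increment can over-count after a timeout, so it suits values where small errors are acceptable; the conditional update can be retried safely.
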
 597 | <a name="3_4_9"></a>
 598 | ### How to grant temporary access
 599 | * *Web Identity Federation* - use existing OpenId provider, eg. Amazon, Google, Facebook
 600 | * *Amazon Cognito* does Web Identity Federation, also synchronizes app data
 601 | * *IAM* - contains role for users to assume
 602 | 
 603 | <a name="3_4_10"></a>
 604 | ### API
 605 | * Control Plane
 606 | 
 607 | Create and manage tables|.
 608 | -|-
 609 | `CreateTable`|Creates a table and specifies the primary index used for data access
 610 | `DescribeTable`|Returns information such as primary key schema, throughput settings, index information
 611 | `ListTables`|Returns the names of all of your tables in a list
 612 | `UpdateTable`|Modifies the settings of a table or its indexes
 613 | `DeleteTable`|Removes a table and all of its dependent objects
 614 | 
 615 | * Data Plane
 616 | 
 617 | Creating data|.|conditional?
 618 | -|-|-
 619 | `PutItem`|Creates a new item, or replaces an old item with a new item|yes
 620 | `BatchWriteItem`|Puts or deletes multiple items in one or more tables|no
 621 |  |Called in a loop it typically checks for unprocessed items and submits a new `BWI` request for those
 622 | 
 623 | Reading data|.|conditional?
 624 | -|-|-
 625 | `GetItem`|Returns a set of Attributes for an item that matches the PK|no
 626 | `BatchGetItem`|Returns the attributes for multiple items from multiple tables using their PKs|no
 627 | `Query`|Gets one or more items using the table *PK*, or from a secondary index using the index key|no
 628 | `Scan`|Gets all items and attributes by performing a full scan across the table or a secondary index|no
 629 | 
 630 | Updating data|.|conditional?
 631 | -|-|-
 632 | `UpdateItem`|Modifies one or more attributes in an item|yes
 633 | 
 634 | Deleting data|.|conditional?
 635 | -|-|-
 636 | `DeleteItem`|Deletes a single item in a table by primary key|yes
 637 | `BatchWriteItem`|Puts or deletes multiple items in one or more tables|no
 638 |  |Called in a loop it typically checks for unprocessed items and submits a new `BWI` request for those
 639 | 
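The `BatchWriteItem` loop mentioned in the tables above (submit a batch, then resubmit whatever came back unprocessed) can be sketched like this. `send_batch` is a hypothetical callable standing in for the real API call: it takes a list of at most 25 requests and returns the unprocessed ones. A production loop would also wait with exponential backoff between rounds.

```python
def chunks(items, size=25):
    """Split the requests into batches of at most 25, the BatchWriteItem limit."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_write_all(send_batch, items, max_rounds=10):
    """Write all items, resubmitting unprocessed ones until each batch is done."""
    for batch in chunks(items):
        pending = batch
        rounds = 0
        while pending:
            if rounds == max_rounds:
                raise RuntimeError("items still unprocessed after retries")
            pending = send_batch(pending)   # returns the unprocessed items
            rounds += 1
```
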
 640 | <a name="3_4_11"></a>
 641 | ### Limits
 642 | .|.
 643 | -|-
 644 | Tables per account/region|256
 645 | Max read / write per table partition|3000 reads / 1000 writes
 646 | Partition key|min 1B, max 2048B
 647 | Sort key|min 1B, max 1024B
 648 | Local secondary index per table|5
 649 | Global secondary index per table|5
 650 | Item size|1B to 400KB, including name & value
 651 | Simultaneous `CreateTable`, `UpdateTable`, `DeleteTable`|up to 10
 652 | Single `BatchGetItem`|Max 100 items, must be <16MB
 653 | Single `BatchWriteItem`|Up to 25 *PutItem* or *DeleteItem*, must be <16MB
 654 | *Query* and *Scan* result set limit|1MB data per call
 655 | 
 656 | <a name="3_5"></a>
 657 | ## [↖](#top)[↑](#3_4_11)[↓](#3_5_1) Elastic Compute Cloud (EC2)
 658 | * Resizable **compute capacity** in the cloud
 659 | * Amazon Machine Image (AMI)
 660 | 	* Unit of deployment
 661 | 	* Packaged-up environment that includes all the necessary bits to set up and boot an instance
 662 | 	* Can create AMI from configured *EC2* instance
 663 | 
 664 | <a name="3_5_1"></a>
 665 | ### Different options
 666 | * Payment models
 667 | 	* **On-demand instances**
 668 | 		* Pay for compute capacity by the hour, no long-term commitment
 669 | 	* **Reserved instances**
 670 | 		* Provide a significant discount compared to On-Demand pricing and
 671 | 		provide a capacity reservation when used in a specific Availability Zone
 672 | 		* Can transfer between AZs
 673 | 	* **Spot instances**
 674 | 		* Bid on spare Amazon EC2 computing capacity, can be terminated by Amazon, not available for all instance types
 675 | 	* **Dedicated hosts**
 676 | 		* A physical server with EC2 instance capacity fully dedicated to your use
 677 | * Instance sizes & types
 678 | 	* *Sizes*: nano / micro / small / medium / large
 679 | 	* *Types*: general purpose / compute optimized / memory optimized / gpu / storage optimized
 680 | * Pricing by
 681 | 	* Compute time
 682 | 	* Data transfer
 683 | 	* Storage
 684 | 	* Elastic IP address
 685 | 	* Monitoring
 686 | 	* Elastic load balancer
 687 | 
 688 | <a name="3_5_2"></a>
 689 | ### Instance metadata & userdata
 690 | * Data about an instance that can be used to configure or manage the running instance
 691 | * Available from *running instance* under `http://169.254.169.254/latest/meta-data/`
 692 | * Contains various data about the current instance (static & dynamic)
 693 | * Can specify user-data
 694 | 	* Allows launching individually configured instances from the same AMI
 695 | 
 696 | <a name="3_5_3"></a>
 697 | ### API
 698 | .|.
 699 | -|-
 700 | `DescribeImages`|Describe an Amazon Machine Image
 701 | `RegisterImage`|Final process of creating an AMI
 702 | 
 703 | <a name="3_5_4"></a>
 704 | ### Limits:
 705 | .|.
 706 | -|-
 707 | Elastic IP addresses for EC2-Classic|5
 708 | 
 709 | <a name="3_6"></a>
 710 | ## [↖](#top)[↑](#3_5_4)[↓](#3_6_1) Elastic Load Balancer (ELB)
 711 | * **Distributes traffic** between instances that belong to the ELB group
 712 | * Stops sending requests to unhealthy instances
 713 | * Can store SSL certificates (offloads encryption to load balancer level)
 714 | * Can configure session stickiness:
 715 | 	* *LB issued cookie*
 716 | 		* Easy to implement, not best balancing
 717 | 	* *Application issued cookie*
 718 | 		* Cookies based on application session, marginally better
 719 | 	* *ElastiCache*
 720 | 		* Better distribution, requires state to be stored in *DB* or in *EC* memory.
 721 | 		* EC memory is the much better option
 722 | * Relies on DNS / *Route53*
 723 | * Can route traffic into instances running in private subnets
 724 | 	* Needs to be configured with (empty) public subnets though.
 725 | 
 726 | <a name="3_6_1"></a>
 727 | ### Limits:
 728 | .|.
 729 | -|-
 730 | Total load balancers per region (ALB & ELB)|20
 731 | 
 732 | <a name="3_7"></a>
 733 | ## [↖](#top)[↑](#3_6_1)[↓](#3_7_1) SNS
 734 | * **Publishes** messages to **subscribers** via topic
 735 | * **Pub-Sub-Service** for messaging
 736 | 	* Scenarios:
 737 | 		* *Fanout*: Many subscribers process an event in parallel and asynchronously
 738 | 		* *Push to SQS*: Services pull from SQS, when they become available
 739 | 		* *Alert*: Notification triggered by event or threshold
 740 | * **Mobile Notifications** to mobile devices
 741 | 	* Sends *push notifications* to iOS, Android, Fire OS, Windows and Baidu-based devices
 742 | 
 743 | <a name="3_7_1"></a>
 744 | ### Components
 745 | * **Publisher** (producer)
 746 | 	* Communicates asynchronously with subscribers
 747 | 	* Policies determine which topic(s) publishers can write to
 748 | * **Topics**
 749 | 	* Unique name up to 256 characters
 750 | 	* Stored redundantly on multiple servers and datacenters
 751 | * **Subscribers** (consumer)
 752 | 	* Subscribes to topic
 753 | 	* Endpoints like mobile app, web server, email, *AWS SQS*, *AWS Lambda*
 754 | 		* Email subscriptions need to be confirmed
 755 | * **Messages**
 756 | 	* JSON-formatted key-value pairs
 757 | 		* Fixed set + additional attributes if required
 758 | 			* `POST`s to https endpoints with specific headers
 759 | 				* Contains topics- and subscription-ARN
 760 | 				* To identify messages without parsing the body
 761 | 		* Up to 10 for SQS.
 762 | 		* Provider-specific for mobile push notifications
 763 | 	* Messages can be signed and verified
 764 | 	* Message data:
 765 | 		* `Message`, `MessageId`, `Signature`, `SignatureVersion`, `SigningCertURL`, `Subject`,
 766 | 		`Timestamp`, `TopicArn`, `Type`, `UnsubscribeURL`
 767 | 
 768 | <a name="3_7_2"></a>
 769 | ### Managing access
 770 | * Owner creates topic and controls access to it
 771 | * Can use own API (Access Control) and / or *IAM*, similar to *S3*
 772 | 	* Access control policies
 773 | 		* Default denies, needs explicit allow
 774 | 		* Can grant access across account (API call: *AddPermission*)
 775 | 	* IAM
 776 | 		* More fine grained or very coarse, can include conditions
 777 | 		* Can grant temporary security credentials
 778 | 
 779 | <a name="3_7_3"></a>
 780 | ### Mobile push notifications
 781 | * Does not push to endpoint, but to PN-service (platform/provider specific)
 782 | 	1. Request *credentials* from mobile platforms (ADM, APNS, etc...)
 783 | 	2. Request a *token* from mobile platforms (*registration id* for some platforms)
 784 | 	3. Create a *platform application object*
 785 | 	4. Create a *platform endpoint object*
 786 | 	5. Publish a *message* to the mobile endpoint
 787 | * A single message can contain different data for different platforms
 788 | 
 789 | <a name="3_7_4"></a>
 790 | ### API
 791 | .|.
 792 | -|-
 793 | `CreateTopic`|Create a new topic.
 794 | `DeleteTopic`|Delete a topic and all its subscriptions.
 795 | `Publish`| Publish a new message to the topic.
 796 | `ListTopics`|List of topics owned by a particular user (AWS ID).
 797 | `ListSubscriptions`|List subscriptions owned by a particular user (AWS ID)
 798 | `ListSubscriptionsByTopic`|List of subscriptions for a particular topic
 799 | `Subscribe`|Register a new subscription on a topic, will generate a confirmation message from Amazon SNS
 800 | `ConfirmSubscription`|Respond to a confirmation message, confirming to receive notifications from the topic
 801 | `Unsubscribe`|Cancel a previously registered subscription
 802 | 
 803 | <a name="3_7_5"></a>
 804 | ### Limits:
 805 | .|.
 806 | -|-
 807 | Topics per account|100,000
 808 | 
 809 | <a name="3_8"></a>
 810 | ## [↖](#top)[↑](#3_7_5)[↓](#3_8_1) SQS
 811 | * Scalable **message queue** service
 812 | * Allows *loose coupling* and *asynchronous processing*
 813 | * **Pull** from *SQS* (*Push* to *SNS*)
 814 | * PCI compliant
 816 | * Protection against data loss on application failure
 817 | 
 818 | <a name="3_8_1"></a>
 819 | ### Core features
 820 | * Redundant infrastructure
 821 | * Multiple readers / writers at the same time
 822 | * Access control via *SQS policies* (similar to *IAM*)
 823 | * **Standard queue**
 824 | 	* Guarantees message delivery *at least once*
 825 | 	* *No guarantee on message order*
 826 | 	* No guarantee on not receiving duplicates (app has to deal with it)
 827 | * **FIFO queue**
 828 | 	* Guaranteed order
 829 | 	* Exactly-once processing
 830 | 	* *Message groups* - multiple ordered message groups within a single queue
 831 | 	* Name ends in `.fifo`
 832 | * **Delay queues**
 833 | 	* Controls when a message becomes available
 834 | 	* Between 0s and 15min, default 0s
 835 | * **Visibility timeout**
 836 | 	* Controls when a polled message becomes visible again
 837 | 	* Configurable and extendable for individual messages
 838 | 	* Between 0s and 12h, default 30s
 839 | * **Message retention period**
 840 | 	* Amount of time a message will live in the queue if it's not deleted
 841 | 	* Between 1min and 14d, default 4d
 842 | * **In flight message**
 843 | 	* Sent to a client but have not yet been deleted or have not yet reached the end of their
 844 | 	 visibility window
 845 | * **Dead-letter queue**
 846 | 	* Queue that other queues can send messages to when these were not successfully
 847 | 	processed.
 848 | * **Receive message wait time**
 849 | 	* Value >0 enables *long polling*
 850 | 	* Between 0s and 20s, default 0s (*short polling*)
 851 | 
 852 | <a name="3_8_2"></a>
 853 | ### Message lifecycle
 854 | * Component 1 sends message A to queue
 855 | 	* `SendMessage`/`SendMessageBatch`
 856 | * Component 2 retrieves A from queue.
 857 | 	* A remains in queue while it's being processed, but is not returned to any other components
 858 | 	* Message is now considered to be *in flight*.
 859 | 	* `ReceiveMessage`
 860 | * Component 2 deletes A from queue during visibility timeout
 861 | 	* Otherwise it will get processed again
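 862 | 	* SQS does not delete a message when it is received; the consumer has to delete it (otherwise it stays until the retention period expires)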
 863 | 	* `DeleteMessage`/`DeleteMessageBatch`
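
The lifecycle above can be sketched with a small in-memory toy model. This is not the SQS API: `ToyQueue` and its methods are hypothetical names, time is passed in explicitly as `now` (in seconds), and only the visibility-timeout behaviour is modelled.

```python
import itertools


class ToyQueue:
    """In-memory toy model of the message lifecycle; not the SQS API."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self._messages = {}            # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def send_message(self, body):
        message_id = next(self._ids)
        self._messages[message_id] = [body, 0.0]
        return message_id

    def receive_message(self, now):
        """Return the first visible message and hide it for the timeout."""
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if visible_at <= now:
                entry[1] = now + self.visibility_timeout  # now "in flight"
                return message_id, body
        return None

    def delete_message(self, message_id):
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30)
queue.send_message("A")                      # component 1 sends A
first = queue.receive_message(now=0)         # component 2 receives A
hidden = queue.receive_message(now=10)       # A is in flight, nothing returned
again = queue.receive_message(now=31)        # not deleted in time: A comes back
queue.delete_message(again[0])               # consumer deletes A
empty = queue.receive_message(now=100)       # queue is empty now
print(first, hidden, again, empty)
```

The redelivery at `now=31` is the reason consumers of a standard queue have to tolerate processing the same message more than once.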
 864 | 
 865 | <a name="3_8_3"></a>
 866 | ### Long polling vs short polling
 867 | * **Short polling** returns immediately, could be *false empty* (e.g. message not fully propagated yet)
 868 | * **Long polling** won't return unless there's a message in the queue or receive message wait time is
 869 | exceeded. Also checks *every server* to avoid false empty responses
 870 | 
 871 | <a name="3_8_4"></a>
 872 | ### API
 873 | .|.
 874 | -|-
 875 | `SendMessage`/`SendMessageBatch`|Delivers a message to the specified queue (batch: up to 10 messages, <= 256KB)
 876 | `ReceiveMessage`|Retrieves one or more messages (up to 10), `WaitTimeSeconds` for long poll
 877 | `ChangeMessageVisibility`/`ChangeMessageVisibilityBatch`|Changes the visibility timeout of a message
 878 | `DeleteMessage`/`DeleteMessageBatch`|Deletes the specified message from the specified queue
 879 | `SetQueueAttributes`|e.g. `DelaySeconds`, `MessageRetentionPeriod`
 880 | `GetQueueUrl`|
 881 | `CreateQueue`|
 882 | `DeleteQueue`|
 883 | `ListQueues`|
 884 | 
 885 | <a name="3_8_5"></a>
 886 | ### Limits:
 887 | .|.
 888 | -|-
 889 | Max message size|256KB
 890 | Max inflight messages|120,000
 891 | 
 892 | <a name="3_9"></a>
 893 | ## [↖](#top)[↑](#3_8_5)[↓](#3_9_1) CloudFormation
 894 | * Lets you create and provision **resources** in a reusable **template** fashion
 895 | 	* A *CloudFormation* template is a `JSON` or `YAML` formatted text file
 896 | * Related resources are managed in a single unit called a **stack**
 897 | 	* All the resources in a stack are defined by the stack's *CloudFormation* template
 898 | * Two ways to update a stack
 899 | 	* *Direct update*
 900 | 	* Create **change set**
 901 | 		* Summary of proposed changes
 902 | * Will **rollback** stack if it fails to create (can be disabled via API / console)
 903 | 
 904 | <a name="3_9_1"></a>
 905 | ### Anatomy of template
 906 | * *AWSTemplateFormatVersion*
 907 | * *Description*
 908 | * *Metadata*
 909 | 	* Details about the template
 910 | * *Parameters*
 911 | 	* Values to pass in right before template creation
 912 | 	* Allows validation per *regular expression*
 913 | * *Mappings*
 914 | 	* Maps keys to values (e.g. different values for different regions)
 915 | * *Conditions*
 916 | 	* Check values before deciding what to do
 917 | * *Resources*
 918 | 	* Creates resources
 919 | * *Outputs*
 920 | 	* Values to be exposed from the console or from API calls.
 921 | 	* Can be used in a different stack (*cross stack references*)
 922 | 
 923 | <a name="3_9_2"></a>
 924 | ### Intrinsic Functions
 925 | * Used to pass in values that are not available until runtime
 926 | * Usable in *resource* properties, *metadata* attributes, and *update policy* attributes (auto-scaling)
 927 | * `Fn::GetAtt`
 928 | 	* Returns the value of an attribute from a resource in the template
 929 | * `Fn::FindInMap`
 930 | 	* Returns the value corresponding to keys in a two-level map that is declared in the *Mappings* section
 931 | * `Fn::Join`
 932 | 	* Appends a set of values into a single value, separated by the specified delimiter
 933 | * `Fn::GetAZs`
 934 | 	* Returns an array that lists *Availability Zones* for a specified region
 935 | * `Fn::Select`
 936 | 	* Returns a single object from a list of objects by index
 937 | * `Fn::ImportValue`
 938 | 	* Returns the value of an *Output* exported by another stack
 939 | * `Fn::Split`
 940 | 	* Split a string into a list of string values so that you can select an element from the resulting
 941 | string list
 942 | * `Fn::Sub`
 943 | 	* Substitutes variables in an input string with values that you specify
 944 | * `Ref`
 945 | 	* Returns the value of the specified parameter or resource
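
To make the template sections and intrinsic functions more concrete, the sketch below builds a minimal template as a Python dictionary and serialises it to JSON. The resource name `WebServer`, the mapping `RegionMap`, the output name and the AMI IDs are hypothetical placeholders; the template is illustrative and has not been deployed.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch only; AMI IDs are placeholders",
    "Parameters": {
        "InstanceType": {
            "Type": "String",
            "Default": "t2.micro",
            # Validation per regular expression
            "AllowedPattern": "^[a-z0-9]+\\.[a-z0-9]+$",
        }
    },
    "Mappings": {
        # Two-level map: region -> key -> value
        "RegionMap": {
            "us-east-1": {"AMI": "ami-00000000"},
            "eu-west-1": {"AMI": "ami-11111111"},
        }
    },
    "Resources": {
        "WebServer": {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": {"Ref": "InstanceType"},
                "ImageId": {
                    "Fn::FindInMap": ["RegionMap", {"Ref": "AWS::Region"}, "AMI"]
                },
                # First AZ of the current region
                "AvailabilityZone": {"Fn::Select": [0, {"Fn::GetAZs": ""}]},
            },
        }
    },
    "Outputs": {
        "WebServerUrl": {
            "Value": {
                "Fn::Join": [
                    "",
                    ["http://", {"Fn::GetAtt": ["WebServer", "PublicDnsName"]}],
                ]
            },
            # Exported so another stack could use Fn::ImportValue
            "Export": {"Name": {"Fn::Sub": "${AWS::StackName}-WebServerUrl"}},
        }
    },
}

template_body = json.dumps(template, indent=2)
print(template_body)
```

`Ref` and `Fn::FindInMap` are resolved when the stack is created, which is why the image ID can differ per region without changing the template.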
 946 | 
 947 | <a name="3_9_3"></a>
 948 | ### Limits:
 949 | .|.
 950 | -|-
 951 | Max stacks per region|200
 952 | Max templates per region|unlimited
 953 | Parameters|60
 954 | Mappings|100
 955 | Resources|200
 956 | Outputs|60
 957 | 
 958 | <a name="3_10"></a>
 959 | ## [↖](#top)[↑](#3_9_3)[↓](#3_10_1) Elastic Beanstalk (EB)
 960 | * **Full stack** that provisions *capacity*, sets up *load balancing* and *auto-scaling* and configures
 961 | *monitoring*
 962 | * No need to create / manage infrastructure
 963 | * Not good if full control of resource configuration is needed
 964 | * Not everything fits into the EB model
 965 | 
 966 | <a name="3_10_1"></a>
 967 | ### AWS-Stack
 968 | * *EC2* instance
 969 | * Instance *Security Group*
 970 | * *Elastic Load Balancer*
 971 | * *Load Balancer Security Group*
 972 | * *Auto Scaling Group*
 973 | 	* Automatically replaces instances if they become unavailable
 974 | * *S3 Bucket*
 975 | 	* Source code, logs & other artifacts
 976 | * *CloudWatch Alarm*
 977 | 	* 2 alarms that monitor load on instances & Auto Scaling group scaling up / down
 978 | * *CloudFormation stack*
 979 | * *Domain name*
 980 | 	* `subdomain.region.elasticbeanstalk.com`
 981 | 
 982 | <a name="3_10_2"></a>
 983 | ### Supports
 984 | * Platform-specific application *source bundle* (e.g. Java `war` for Tomcat)
 985 | 	* Go
 986 | 	* Java with Tomcat
 987 | 	* Java SE
 988 | 	* .NET on Windows Server with IIS
 989 | 	* Node.js
 990 | 	* PHP
 991 | 	* Python
 992 | 	* Ruby (Passenger Standalone)
 993 | 	* Ruby (Puma)
 994 | 	* Single Container Docker
 995 | 	* Multicontainer Docker
 996 | 	* Preconfigured Docker (Glassfish)
 997 | 	* Preconfigured Docker (Python 3.x)
 998 | 	* Preconfigured Docker (Go)
 999 | 
1000 | <a name="3_10_3"></a>
1001 | ### Core components
1002 | * **Application**
1003 | 	* Logical collection of *Elastic Beanstalk* components, including *environments*, *versions*, and
1004 | 	*environment configurations*. In Elastic Beanstalk an application is conceptually similar to a
1005 | 	folder.
1006 | * **Application version**
1007 | 	* An *application version* refers to a specific, labeled iteration of deployable code for a web
1008 | 	application
1009 | * **Environment**
1010 | 	* An environment is a version that is deployed onto AWS resources
1011 | 	* Runs only a single application version at a time
1012 | 	* Can run the same version or different versions in many environments at the same time
1013 | * **Environment Configuration**
1014 | 	* Collection of parameters and settings that define how an environment and its associated resources behave
1015 | 	* Updating a configuration will cause AWS to automatically apply the changes
1016 | * **Configuration template**
1017 | 	* Starting point for creating unique environment configurations
1018 | 
1019 | <a name="3_10_4"></a>
1020 | ### Limits:
1021 | .|.
1022 | -|-
1023 | Applications|75
1024 | Application Versions|1000
1025 | Environments|200
1026 | 
1027 | <a name="3_11"></a>
1028 | ## [↖](#top)[↑](#3_10_4)[↓](#3_11_1) Simple Workflow Service (SWF)
1029 | * **Task coordination** and **state management** service
1030 | * Distributed, scales up and down depending on task
1031 | * Works with *on-premise* and *cloud* apps
1032 | * Allows for *synchronous* or *asynchronous* processing
1033 | * Can contain human events
1034 | * Guaranteed order of execution
1035 | * Tasks can live up to one year (`31,536,000 seconds`)
1036 | 
1037 | <a name="3_11_1"></a>
1038 | ### Core components
1039 | * **Workflow**
1040 | 	* A workflow is a set of *activities* that carry out some objective, together with logic that
1041 | 	coordinates the activities.
1042 | * **Domain**
1043 | 	* Scope of a *workflow*
1044 | 	* An account can have multiple *domains*, each of which can contain multiple *workflows*
1045 | 	* *Workflows* in different *domains* cannot interact
1046 | * **Workflow Starter**
1047 | 	* Any application that can initiate workflow executions
1048 | * **Activity**
1049 | 	* Things carried out by a *workflow*
1050 | * **Activity Task**
1051 | 	* Represents one invocation of an *activity*
1052 | 	* Can run synchronously or asynchronously
1053 | 	* Gets assigned to worker
1054 | 	* *Decision task* tells decider that state of workflow has changed
1055 | * **Activity Worker**
1056 | 	* Is a program that receives *activity tasks*, performs them, and provides results back
1057 | 	* Might be used by a person
1058 | 	* Can live on *EC2* or on-premise
1059 | * **Decider**
1060 | 	* Coordination logic in a *workflow*
1061 | 	* Schedules *activity tasks*, provides input data to the *activity workers*, processes events that
1062 | 		arrive while the *workflow* is in progress, and ultimately ends (or closes) the *workflow* when the
1063 | 		objective has been completed.
1064 | 
1065 | <a name="3_11_2"></a>
1066 | ### Limits:
1067 | .|.
1068 | -|-
1069 | Maximum registered domains|100
1070 | 
1071 | <a name="3_12"></a>
1072 | ## [↖](#top)[↑](#3_11_2)[↓](#3_12_1) Virtual Private Cloud (VPC)
1073 | * Provisions a logically isolated section of the AWS cloud
1074 | * Spans over all AZs in a region
1075 | * Allows creating layered architectures
1076 | * Shared or dedicated tenancy (exclusive hardware or not)
1077 | * *Security groups* and subnet *network ACLs*
1078 | * Ability to extend on-premise network to cloud
1079 | 
1080 | <a name="3_12_1"></a>
1081 | ### Overview
1082 | <a name="3_12_1_1"></a>
1083 | #### Default VPC (Amazon specific)
1084 | * Gives easy access to a VPC without having to configure it from scratch
1085 | * Has a default subnet in each AZ and an internet gateway attached to the VPC
1086 | * Each instance launched automatically receives a *public IP* (very different to non-default VPC)
1087 | * Cannot be restored if deleted
1088 | <a name="3_12_1_2"></a>
1089 | #### Non-default VPC (regular VPC)
1090 | * Only has private IP addresses
1091 | * Resources *only* accessible through *Elastic IP*, *VPN* or *internet gateways*
1092 | <a name="3_12_1_3"></a>
1093 | #### VPC Peering
1094 | * Connect VPCs through direct network routing
1095 | * Can occur between different accounts and VPCs, in the same region or across regions (inter-region peering)
1096 | * Allows instances to communicate with each other as if they were in the same network
1097 | <a name="3_12_1_4"></a>
1098 | #### VPC Scenarios
1099 | * VPC with private subnet only -> single tier apps
1100 | * VPC with public and private subnets -> layered apps
1101 | * VPC with public, private subnets and hardware connected VPN -> extending apps to on-premise
1102 | * VPC with private subnets and hardware connected VPN -> extended VPN
1103 | 
1104 | <a name="3_12_2"></a>
1105 | ### Components
1106 | * **Subnet**
1107 | 	* In exactly one AZ
1108 | 	* If traffic is routed to an Internet gateway, the subnet is known as a public subnet
1109 | 	* If a subnet doesn't have a route to the Internet gateway, it's known as a private subnet
1110 | 	* EC2 instances are launched into subnets
1111 | 	* Use ssh-agent forwarding to connect from public to private instances
1112 | 	* Sometimes grouped into Subnet Groups, e.g. for caching or DB. Typically across AZs
1113 | * **Route Table**
1114 | 	* Contains a set of rules, called routes that determine where network traffic is directed to
1115 | 	* Each VPC automatically comes with a main route table that can be configured
1116 | 	* Each subnet in a VPC must be associated with a route table; the table controls the routing
1117 | 	for the subnet. A subnet can only be associated with one route table at a time, but multiple
1118 | 	subnets can be associated with the same route table
1119 | 	* Each route in a table specifies a destination CIDR and a target
1120 | 	* Every route table contains a local route for communication within the VPC
1121 | 	* Can have a *default route* 0.0.0.0/0 to route everything that doesn't have a specific rule
1122 | * **Elastic IP**
1123 | 	* Static IPv4 address mapped to an instance or network interface
1124 | 	* If attached to network interface it's decoupled from the instance's lifecycle
1125 | 	* Routes to private IP address of instance
1126 | 	* Can be remapped in case of failure.
1127 | 	* For use in a specific region only
1128 | 	* Can only map to instances in public subnets
1129 | * **Gateways**
1130 | 	* *Internet Gateway*
1131 | 		* Horizontally scaled, redundant, and highly available VPC component that allows communication
1132 | 		between instances in a VPC and the internet
1133 | 		* Provides a target in VPC route tables for internet-routable traffic
1134 | 		* Performs network address translation (NAT) for instances that have been assigned public
1135 | 		IPv4 addresses
1136 | 	* *Virtual Private Gateway*
1137 | 		* Has VPN connection to customer gateway attached
1138 | 		* Serves as VPN concentrator on the Amazon side of the VPN connection
1139 | 	* *Customer Gateway*
1140 | 		* A physical device or software application on your side of the VPN connection
1141 | * **NAT**
1142 | 	* *NAT Instances*
1143 | 		* Manually configured instance from a NAT AMI
1144 | 	* *NAT Gateway*
1145 | 		* AWS-managed service
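
How a route table picks a target can be sketched as follows: among all routes whose destination CIDR contains the address, the most specific one (longest prefix) wins, so the `0.0.0.0/0` default route only applies when nothing else matches. The route table contents and the target name `igw-example` are hypothetical, and `resolve_target` is an illustrative helper, not an AWS API.

```python
import ipaddress

# Hypothetical route table: destination CIDR -> target
ROUTES = {
    "10.0.0.0/16": "local",        # local route for traffic inside the VPC
    "0.0.0.0/0": "igw-example",    # default route to an internet gateway
}


def resolve_target(routes, destination_ip):
    """Return the target of the most specific route matching the address."""
    address = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if address in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # Longest prefix = most specific route
    return max(matches, key=lambda match: match[0])[1]


print(resolve_target(ROUTES, "10.0.1.5"))   # inside the VPC CIDR
print(resolve_target(ROUTES, "8.8.8.8"))    # only the default route matches
```

A subnet associated with this table would count as public, because its default route points at an internet gateway.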
1146 | <a name="3_12_2_1"></a>
1147 | #### Structure & packet flow
1148 | * VPC (has *CIDR*)
1149 | 	* Gateway (Internet or VPN)
1150 | 	* Routes (one per subnet, can be shared)
1151 | 	* Network ACL (one per subnet, can be shared)
1152 | 	* Subnets (CIDRs match VPC's CIDR)
1153 | 	* Security Group (on VPC level)
1154 | 	* Instance (needs public IP for internet communication, either ELB or Elastic IP)
1155 | 
1156 | <a name="3_12_3"></a>
1157 | ### Security
1158 | <a name="3_12_3_1"></a>
1159 | #### Network ACL
1160 | * Subnet level, acting as firewall
1161 | * Rules for inbound and outbound traffic
1162 | * Rules have numbers and are evaluated from low to high
1163 | * *Stateless*
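
The numbered evaluation can be illustrated with a simplified model: rules are checked from the lowest number upwards, the first matching rule decides, and traffic that matches no rule is denied. Real network ACL rules also match on protocol, CIDR and port ranges; here a rule matches on a single port only, and `AclRule` and `evaluate` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclRule:
    number: int
    port: int
    action: str  # "allow" or "deny"


def evaluate(rules, port):
    """Return the action of the lowest-numbered rule matching the port."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port:
            return rule.action
    return "deny"  # nothing matched: traffic is denied


# Hypothetical inbound rules, deliberately listed out of order
INBOUND = [
    AclRule(number=200, port=22, action="allow"),
    AclRule(number=100, port=22, action="deny"),
    AclRule(number=110, port=443, action="allow"),
]

print(evaluate(INBOUND, 22))    # rule 100 is evaluated before rule 200
print(evaluate(INBOUND, 443))
print(evaluate(INBOUND, 80))
```

Because network ACLs are stateless, a matching rule for the return traffic would be needed in the outbound rules as well.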
1164 | <a name="3_12_3_2"></a>
1165 | #### Security Groups
1166 | * Acts as a virtual firewall to control inbound and outbound traffic to instances
1167 | * Acts on instance level, not subnet level
1168 | * Rules for inbound and outbound traffic
1169 | * *Stateful* - will always allow response to (allowed) outbound traffic
1170 | * Can refer to other security group, e.g. allow traffic from there
1171 | 
1172 | <a name="3_12_4"></a>
1173 | ### Limits:
1174 | .|.
1175 | -|-
1176 | VPCs per region|5
1177 | Subnets per VPC|200
1178 | Customer gateways per region|50
1179 | Gateway per region|5 Internet
1180 | Elastic IPs per account per region|5
1181 | VPN connections per region|50
1182 | Route tables per region|200
1183 | Security groups per region|500
1184 | 
1185 | <a name="3_13"></a>
1186 | ## [↖](#top)[↑](#3_12_4)[↓](#4) Relational Database Service (RDS)
1187 | * Set up, operate, and scale a **relational database** in the cloud
1188 | * Supports
1189 | 	* Amazon Aurora
1190 | 	* MySQL
1191 | 	* MariaDB
1192 | 	* Oracle
1193 | 	* SQL Server
1194 | 	* PostgreSQL
1195 | * Automates common administrative tasks such as performing **backups** and software **patching**
1196 | 	* *Automated backups*
1197 | 		* Performs a full daily snapshot
1198 | 		* Enables point-in-time recovery
1199 | 	* *DB Snapshots*
1200 | 		* User-initiated
1201 | 		* As frequent as desired
1202 | * Supports *encryption at rest* for all database engines
1203 | * **DB instance**
1204 | 	* Database environment in the cloud with specified *compute* and *storage* resources
1205 | * **Multi-AZ deployments**
1206 | 	* Provide enhanced availability and durability for DB Instances, making them a natural fit for
1207 | 	production database workloads
1208 | * **DB subnet group**
1209 | 	* Collection of subnets designated for the RDS DB Instances in a VPC
1210 | * **Maintenance window**
1211 | 	* Needs to be specified (or defaults to weekly) for maintenance events like scaling and patching
1212 | * **DB Parameter group**
1213 | 	* Acts as a “container” for engine configuration values that can be applied to one or more DB
1214 | 	Instances
1215 | 
1216 | <a name="4"></a>
1217 | # [↖](#top)[↑](#3_13)[↓](#) Etc
1218 | * *us-east-1* is the default region for all SDKs
1219 | * *Penetration tests* need to be announced
1220 | 


--------------------------------------------------------------------------------
/devops-engineer-professional-02.md:
--------------------------------------------------------------------------------
  1 | 
  2 | # DevOps Engineer Professional (C02)
  3 | 
  4 | ## Comments per Service
  5 | 
  6 | ### CodeStar
  7 | 
  8 | #### CodeCommit
  9 | 
 10 | - CodeCommit requires CloudWatch Events rule to trigger CodePipeline
 11 | - Can trigger lambda functions out of CodeCommit events
 12 | - AWS provides several managed policies:
 13 | 	- `AWSCodeCommitFullAccess` , `AWSCodeCommitPowerUser` , `AWSCodeCommitReadOnly`
 14 | - Can use Approval Rule templates to e.g. trigger unit tests via CodeBuild
 15 | 
 16 | #### CodePipeline
 17 | 
 18 | - CodePipeline can execute cross-region actions
 19 | - CodePipeline can deploy straight into S3
 20 | - CodePipeline can have custom actions that invoke job workers
 21 | 
 22 | #### CodeBuild
 23 | 
 24 | - CodeBuild can be triggered directly from Github via web hook
 25 | - CodeBuild supports build badges, which provide an embeddable, dynamically generated image (_badge_) that displays the status of the latest build for a project
 26 | 
 27 | #### CodeDeploy
 28 | 
 29 | - In EC2/On-Premises deployment, a CodeDeploy **deployment group** is a set of individual instances targeted for a deployment. A deployment group contains individually tagged instances, Amazon EC2 instances in Amazon EC2 Auto Scaling groups, or both.
 30 | - CodeDeploy can terminate the original instances in the deployment group with a waiting period of 1 hour.
 31 | - CodeDeploy has a default timeout of 1 hour to wait for scripts to finish
 32 | - CodeDeploy failing on `AllowTraffic` can mean that health checks on ELB are misconfigured
 33 | - Notifies via CloudWatch Events
  34 | 	- Lambda
 35 | 	- SNS
 36 | 	- Kinesis streams
 37 | 	- SQS
 38 | 	- Built-in targets (CloudWatch Alarms actions)
 39 | 
 40 | #### CodeGuru
 41 | 
 42 | - Amazon CodeGuru **Profiler** helps developers understand the runtime behaviour of their applications, improve performance, and decrease infrastructure costs.
 43 | - Amazon CodeGuru **Reviewer** is an automated code review service that identifies critical defects and deviation from coding best practices for Java and Python code. Works on PRs
 44 | - Reviewer can protect secrets and suggest code changes to mitigate
 45 | 
 46 | ### IaC
 47 | 
 48 | #### CloudFormation
 49 | 
  50 | - CFN custom resources send their response to a pre-signed URL
 51 | - In a stackset, global resources (like S3) have to be unique
 52 | - CloudFormation drift detection requires manual intervention; use AWS Config to automate detection.
 53 | 
 54 | #### Service Catalog
 55 | 
 56 | - By using a launch role via **launch constraint**, you can instead limit the end users’ permissions to the minimum they require for that product
 57 | - The **template constraint** limits the options that are available to end-users when they launch a product. It works by narrowing the allowable values for parameters that are defined in the product’s underlying AWS CloudFormation template
 58 | 	- Apply template constraints to ensure that the end users can use products without breaching the compliance requirements of your organization
 59 | 
 60 | #### OpsWorks
 61 | 
  62 | - OpsWorks can create time-based instances for scaling of predictable workloads, or load-based instances using CPU utilisation, load, or memory utilisation
 63 | 
 64 | ### Compute
 65 | 
 66 | #### EC2
 67 | 
 68 | - EC2 memory metrics are not collected by default and need to have CloudWatch agent installed
 69 | - EC2 can use built-in **instance recovery**
 70 | - An instance is scheduled to be retired when AWS detects irreparable failure of the underlying hardware that hosts the instance.
 71 | 	- When an instance reaches its scheduled retirement date, it is stopped or terminated by AWS.
 72 | 	- AWS also sends an AWS Health event, which you can monitor and manage by using Amazon CloudWatch Events.
 73 | 
 74 | #### ASG
 75 | 
 76 |   - ASG lifecycle states:
 77 |     - `Pending` (hooks `Pending:Wait`, `Pending:Proceed`)
 78 |     - `InService`
 79 |     - `Terminating` (hooks `Terminating:Wait`, `Terminating:Proceed`)
 80 |     - `Terminated`
 81 | - `Pending:Wait` lifecycle hook can allow AMI upgrades before bringing them into service
 82 | - `Terminating:Wait` lifecycle hook to collect instance data (e.g. logs) before final termination
 83 | - Tags mentioned in the Auto Scaling group are _not_ propagated to EBS volumes
 84 | - ASG: A warm pool gives you the ability to decrease latency for your applications that have exceptionally long boot times, for example, because instances need to write massive amounts of data to disk.
 85 | 	- Can keep instances in pool _running_ or _stopped_
 86 | - ASG can notify via SNS on failed instance launch
 87 | - Can use Amazon EventBridge or Amazon CloudWatch Events to track the Auto Scaling Events
 88 | 	- Can trigger Lambdas from ASG by filtering on EventBridge events
 89 | - CloudFormation + ASGs:
 90 |     - `AutoScalingReplacingUpdate`: `WillReplace` `true` will wait for a complete replacement of the ASG and its instances before deleting the old ASG
  91 |     - `AutoScalingRollingUpdate`: replaces existing instances in the ASG; valid options: `MaxBatchSize`, `MinInstancesInService`, `MinSuccessfulInstancesPercent`, `PauseTime`, `SuspendProcesses`, `WaitOnResourceSignals`
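
As a sketch of where these update policies live in a template, the two variants could look like this when written as Python dictionaries. The chosen values are arbitrary examples, and the Auto Scaling group `Properties` are reduced to a stub (a launch template or launch configuration would be needed in a real template).

```python
import json

# Variant 1: replace instances in batches inside the existing ASG
rolling_update = {
    "AutoScalingRollingUpdate": {
        "MaxBatchSize": 1,
        "MinInstancesInService": 1,
        "MinSuccessfulInstancesPercent": 100,
        "PauseTime": "PT5M",                      # ISO 8601 duration
        "SuspendProcesses": ["AlarmNotification"],
        "WaitOnResourceSignals": True,
    }
}

# Variant 2: create a new ASG and delete the old one after it succeeded
replacing_update = {"AutoScalingReplacingUpdate": {"WillReplace": True}}

asg_resource = {
    "Type": "AWS::AutoScaling::AutoScalingGroup",
    # Stub only: launch template, subnets etc. are omitted in this sketch
    "Properties": {"MinSize": "2", "MaxSize": "4"},
    "UpdatePolicy": rolling_update,
}

print(json.dumps(asg_resource, indent=2))
```

`UpdatePolicy` sits next to `Properties` on the resource, not inside it.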
 92 | 
 93 | #### Storage Gateway
 94 | 
 95 | - Storage Gateway does not automatically refresh the cache if the files were added directly to S3. `RefreshCache` can be used to refresh the cache periodically.
  96 | 	- **Tape gateway** is backed by Glacier, meant for backups etc
  97 | 	- **File gateway** gets on-premises data into the cloud
 98 | 	- **Volume gateway** is cloud-backed iSCSI block storage volumes
 99 | 
100 | #### SSM
101 | 
102 | - ``AWS-AmazonLinuxDefaultPatchBaseline`` is a predefined patch baseline, doesn't do custom patches
103 | - `aws:runDocument` plugin runs SSM documents stored in Systems Manager or on a local share
104 | - `aws:downloadContent` plugin downloads an SSM document from a remote location to a local share
105 | - Can use SSM to create AMIs
106 | 
107 | #### ELB
108 | 
109 | - ALBs can be configured for 'dual stack' mode that allows IPv4 and IPv6
110 | - ALBs can have weightings between target groups
111 | 
112 | ### Security
113 | 
114 | #### IAM
115 | 
 116 | - `iam:PassRole` passes a role to a service. E.g. a developer role to CloudFormation
117 | 
118 | #### Firewall Manager
119 | 
120 | - Firewall Manager can be used to configure and apply WAF ACLs to the ALBs in an AWS account. It can help centrally manage as well as apply them to new accounts added to the Organization in the future.
121 | 
122 | #### KMS
123 | 
124 | - **KMS grants** are commonly used by AWS services that integrate with AWS KMS to encrypt your data at rest.
125 | 	- The service creates a grant on behalf of a user in the account, uses its permissions, and retires the grant as soon as its task is complete.
126 | 
127 | ### Compliance
128 | 
129 | #### GuardDuty
130 | 
131 | - Can be used for org-wide compliance
132 | - AWS recommends a separate delegated GuardDuty administrator account
133 | - Can auto-enable GuardDuty for all future Org accounts
134 | - Can configure GuardDuty **Trusted IP** list and **Threat IP** list and work with findings based on those
135 | - GuardDuty needs EventBridge for filtering
136 | 
137 | #### Config
138 | 
139 | - AWS Config can ensure all EC2 instances are managed by AWS Systems Manager.
140 | - AWS Config can find `ec2-volume-inuse-check`, but cannot detect how long a volume was unused for
141 | - `cloudformation-stack-drift-detection-check`  checks if the actual configuration of a CloudFormation stack differs, or has drifted
 142 | - `s3-bucket-ssl-requests-only` checks whether S3 buckets have policies that require requests to use SSL
143 | - Can deploy **conformance packs** into org accounts (from a delegated admin account)
144 | - Config itself is per region, use **Config Aggregator** for centralised collection of findings across regions & accounts
145 | 	- Uses aggregator account
146 | - By default, AWS Config will not automatically remediate the accounts that disabled its CloudTrail. You must manually set this up using a CloudWatch Events rule and a custom Lambda function that calls the StartLogging API to enable CloudTrail back again. Furthermore, the `cloudtrail-enabled` AWS Config managed rule is only available for the periodic trigger type and not Configuration changes.
147 | 
148 | #### ControlTower
149 | 
150 | - Use EventBridge to get notifications on Control Tower events like `CreateManagedAccount`
151 | - **Customizations for AWS Control Tower (CfCT)** helps you customize your AWS Control Tower landing zone and stay aligned with AWS best practices. Customizations are implemented with AWS CloudFormation templates and service control policies (SCPs).
152 | 	- CfCT capability is integrated with AWS Control Tower lifecycle events so that your resource deployments remain synchronized with your landing zone
153 | 
154 | #### Org Policies
155 | 
156 | - Are inherited down the path `org root` -> `ou` -> `accounts`
157 | 
158 | #### Trusted Advisor
159 | 
160 | - AWS Trusted Advisor checks identify ways to optimize your AWS infrastructure, improve security and performance, reduce costs, and monitor service quotas
161 | 	-  Cost Optimization, Performance, Security, Fault Tolerance, and Service Limits
162 | - TrustedAdvisor can check for under-utilized EC2
163 | - Trusted Advisor's primary integration point is CloudWatch Events
 164 | - With Trusted Advisor’s **Service Limit Dashboard**, you can view, refresh, and export utilization and limit data on a per-limit basis.
165 | 	- Metrics are published on Amazon CloudWatch in which you can create custom alarms
166 | 
167 | #### Health
168 | 
169 | - AWS Health is scanning public repos and can send events for compromised keys
170 | - On detection of an exposed IAM access key, AWS Health generates an `AWS_RISK_CREDENTIALS_EXPOSED` CloudWatch Event.
171 | - Also lists AWS Scheduled maintenance events on Health Dashboard
172 | 	- Can use CloudWatch Events/EventBridge to trigger workflows based on events
 173 | - Can monitor AWS Health events using Amazon EventBridge or CloudWatch Events by calling the AWS Health API
174 | 
175 | #### CloudTrail
176 | 
177 | - Can set up trails for
178 | 	- **Data events**: These events provide insight into the resource operations performed on or within a resource. These are also known as data plane operations.
179 | 		- For S3 or Lambda data events
180 | 	- **Management events**: Management events provide insight into management operations that are performed on resources in your AWS account. These are also known as control plane operations.
181 | 
182 | ### Networking
183 | 
184 | #### VPC
185 | 
186 | - NAT gateway does not span multiple AZs (instead: one gateway per AZ)
187 | - Can send VPC Flow Logs to CloudWatch Logs
188 | 
189 | ### Storage
190 | 
191 | #### Aurora
192 | 
193 | - Read replicas are always asynchronous
194 | - AWS Aurora Global Database uses storage-based replication with typical latency of less than 1 second, using dedicated infrastructure that leaves your database fully available to serve application workloads.
195 | 	- 1 primary region (read/write), up to 5 secondary regions (read)
 196 | 	- In the event of a regional degradation or outage, one of the secondary regions can be promoted to read and write capabilities in less than 1 minute.
197 | - Aurora endpoints
198 | 	- single built-in cluster endpoint, connects to the primary instance of the cluster
199 | 	- reader endpoint for read-only connections for your Aurora cluster
 200 | 	- can have custom cluster endpoints (managed by Aurora) that can be READER, WRITER or ANY
201 | 
202 | #### RDS
203 | 
204 | - RDS creates and saves automated backups of your DB instance or Multi-AZ DB cluster during the backup window of your database.
205 | 	- default: 30min backup during 8h per-region night
206 | - Amazon RDS uses SNS to provide notification when an Amazon RDS event occurs.
 207 | 	- Can also use CloudWatch Events/EventBridge
208 | - Failover:
209 |     - AZ outages => RDS multi-AZ deployment
210 |     - Regional outages => RDS read replica
 211 |     - Multi-region deployments are like multi-AZ deployments, but other regions can be used for reads. Read replicas can be in the same AZ, same region, or cross-region
212 |     - Read replicas have best RTO/RPO, but highest cost
213 | 
214 | #### DynamoDB
215 | 
 216 | - In DynamoDB, `ThrottledWriteRequests` can help decide whether to increase the maximum write capacity units in the table's Auto Scaling policy.
217 | - `WriteThrottleEvents` are requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.
218 | - Can use Kinesis Data Streams to capture changes to DynamoDB
219 | - Amazon DynamoDB global tables provide a single-digit millisecond latency and make sure the data is available across regions.
220 | - DynamoDB Global Tables requires
221 | 	- tables are created in each region already
222 | 	- DynamoDB Streams is enabled
223 | - Don't have multiple lambdas read from DynamoDB Streams
224 | 	- Only one process per shard!
225 | 	- Better to use fan-out pattern
226 | 
227 | #### Glue
228 | 
229 | - AWS Glue is an efficient way to store object metadata. Combination: S3 - Glue - Athena - QuickSight
230 | 
231 | #### S3
232 | 
233 | - Can include a pre-calculated checksum as part of your request. Amazon S3 compares the provided checksum to the checksum that it calculates by using your specified algorithm
234 | - Can activate access logs and use Athena for analysis/queries
235 | - S3 cross-region replication is push-based: source bucket gets a replication rule, destination bucket gets a bucket policy, source needs IAM role for S3 service to assume
236 | 	- Configure a replication rule within the source bucket to activate the replication process.
237 | 	- Create a bucket policy in the destination bucket that grants the source bucket permission to replicate objects into it.
238 | 	- In the source AWS account, create an IAM role that Amazon S3 can assume to replicate objects. Enable versioning in both buckets.
239 | - AWS CloudTrail only logs bucket-level actions in your Amazon S3 buckets by default. If you want to record all object-level API activity in your S3 bucket, you can set up data events in CloudTrail.
240 | 
241 | ### Serverless
242 | 
243 | #### API Gateway
244 | 
245 | - API Gateway does not have specific metrics for individual HTTP error codes like 403, only a generic `4XXError` metric
246 | - Can enable API **caching** in Amazon API Gateway to cache your endpoint's responses
247 | 
248 | #### ECS
249 | 
250 | - can set ECS tasks as a target of CloudWatch events
251 | - ECS/Fargate logs
252 | 	- add the required `logConfiguration` parameters to your task definition to turn on the `awslogs` log driver
253 | - ECS/EC2
254 | 	- container instances have an attached IAM role that contains `logs:CreateLogStream` and `logs:PutLogEvents`
255 | 	- to turn on the `awslogs` log driver, your Amazon ECS container instances require at least version 1.9.0 of the container agent
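
The `awslogs` setup mentioned above could look roughly like the following container definition fragment, written here as a Python dict. This is a minimal sketch: `awslogs_config` is a hypothetical helper (not part of any AWS SDK), and the log group, region, stream prefix, container name and image are placeholder values.

```python
import json


def awslogs_config(log_group: str, region: str, stream_prefix: str) -> dict:
    """Build the logConfiguration block of an ECS container definition."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": log_group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


# Placeholder values for illustration only.
container_definition = {
    "name": "web",
    "image": "nginx:latest",
    "logConfiguration": awslogs_config("/ecs/web", "eu-west-1", "ecs"),
}

print(json.dumps(container_definition, indent=2))
```

The printed JSON is the shape that would go into the `containerDefinitions` list of a task definition.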
256 | 
257 | ### Application Auto Scaling
258 | 
259 | - Is a web service for automatically scaling scalable resources for individual AWS services beyond Amazon EC2
260 | 	- Lambda function provisioned concurrency
261 | 	- DynamoDB tables and global secondary indexes
262 | 	- Aurora replicas
263 | 	- Amazon Elastic Container Service (ECS) services
264 | 	- ...
265 | 
266 | - **Target** tracking scaling – Scale a resource based on a target value for a specific CloudWatch metric.
267 | - **Step** scaling – Scale a resource based on a set of scaling adjustments that vary based on the size of the alarm breach.
268 | - **Scheduled** scaling – Scale a resource one time only or on a recurring schedule.
269 | 
270 | ### Content Delivery
271 | 
272 | #### CloudFront
273 | 
274 | - **OriginGroup**: An origin group includes two origins (a primary origin and a secondary origin to fail over to) and failover criteria that you specify.
275 | 
276 | ### Notifications/Events
277 | 
278 | #### SNS
279 | 
280 | - SNS defines a **delivery policy** for each delivery protocol. The delivery policy defines how Amazon SNS retries the delivery of messages when server-side errors occur (when the system that hosts the subscribed endpoint becomes unavailable).
281 | 	- When the delivery policy is exhausted, Amazon SNS stops retrying the delivery and discards the message
282 | 	- -> unless a **dead-letter queue** is attached to the subscription.
283 | - For ECS notifications on **essential task** stopped, use EventBridge
284 | - For S3 fanout, use SNS and subscribe consumers to it
285 | 
286 | ### Logging/Monitoring/Notification
287 | 
288 | #### CloudWatch
289 | 
290 | - CloudWatch Logs are always encrypted
291 | - CloudWatch _Metrics_ filters can be used to filter CloudWatch _Logs_
292 | - Can create CloudWatch **Alarm** for the `StatusCheckFailed_System` metric and select the EC2 action to recover the instance
293 | - **CloudWatch Logs Subscription** for near realtime feed of log events
294 | 	- "Getting logs out of CloudWatch for further processing"
295 | 	- from CloudWatch Logs, to _Kinesis_, _ElasticSearch_ or _Lambda_
296 | - CloudWatch has a predefined dashboard for CodeBuild metrics
297 | - You can call the EC2 `CreateSnapshot` API directly as a target from CloudWatch Events.
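
As a rough sketch of the `StatusCheckFailed_System` recovery alarm noted above, the parameters for CloudWatch's `PutMetricAlarm` call could be assembled like this. `recover_alarm_params` is a hypothetical helper, and the alarm name, instance id, region, period and evaluation count are assumptions chosen for illustration.

```python
def recover_alarm_params(instance_id: str, region: str) -> dict:
    """Build keyword arguments for put_metric_alarm with an EC2 recover action."""
    return {
        "AlarmName": f"recover-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 2,
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # The recover action is addressed through a special "automate" ARN.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


params = recover_alarm_params("i-0123456789abcdef0", "eu-west-1")
# With credentials configured, the alarm could then be created via boto3:
#   boto3.client("cloudwatch", region_name="eu-west-1").put_metric_alarm(**params)
print(params["AlarmActions"])
```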
298 | 
299 | #### KMS
300 | 
301 | - KMS publishes metrics to CloudWatch, so alarms and alerts can be defined on them
302 | 
303 | #### Xray
304 | 
305 | - Can run X-Ray daemon on AWS Elastic Beanstalk
306 | - X-Ray daemon uses UDP port 2000
307 | 
308 | ---
309 | 
310 | ## Comments per Topic
311 | 
312 | ### Implement CI/CD Pipelines
313 | 
314 | - CodeDeploy states + lifecycle hooks
315 | - CodeCommit IAM policies
316 | - CodeCommit needs CloudWatch Events/EventBridge to detect PRs
317 | 	- (EventBridge is the same service as CloudWatch Events, just with a new interface and more features exposed.)
318 | - GitHub needs a web hook to start a CodePipeline
319 | - CodeDeploy lifecycle hooks (reserved for CodeDeploy in parentheses):
320 | 	- `ApplicationStop`
321 | 	- (`DownloadBundle`)
322 | 	- `BeforeInstall`
323 | 	- (`Install`)
324 | 	- `AfterInstall`
325 | 	- `ApplicationStart`
326 | 	- `ValidateService`
327 | 	- `BeforeBlockTraffic`
328 | 	- (`BlockTraffic`)
329 | 	- `AfterBlockTraffic`
330 | 	- `BeforeAllowTraffic`
331 | 	- (`AllowTraffic`)
332 | 	- `AfterAllowTraffic`
333 | - Integrate automated testing into CI/CD pipelines
334 | 	- CloudWatch Logs + EventBridge to automate based on CodeBuild job results
335 | 	- CodeDeploy + EventBridge to automate based on CodeDeploy job results
336 | 	- EventBridge for CodePipeline scheduled events
337 | 	- CodeDeploy can integrate with CloudWatch Alarms to pause deployments
338 | - Build and manage artifacts
339 | 	- CodeBuild + CodePipeline + CodeDeploy + S3 for artifacts
340 | 	- S3 versioning + encryption required for CodePipeline
341 | - Implement deployment strategies for instance, container, and serverless environments
342 | 	- Elastic Beanstalk policies
343 | 		- All at once - fastest, but causes downtime; all remaining options have zero downtime
344 | 		- Rolling - still uses batches
345 | 		- Rolling with additional batch - to maintain full capacity during deploy
346 | 		- Immutable for when new & old versions must not be mixed and for fast rollback
347 | 		- Traffic splitting: for canary deploys
348 | 		- Blue/Green deployments: swap environment URLs; keep RDS in a separate stack; requires DNS change (all previous ones do not)
349 | 	- Lambda
350 | 		- canary deployments via alias weights
351 | 		- use CodeDeploy default deploy options:
352 | 			- Lambda: `LambdaLinear10PercentEvery10Minutes` (10% of traffic shifted at a time), `LambdaCanary10Percent10Minutes` (one 10% and one 90% deploy)
353 | 			- EC2: `AllAtOnce`, `OneAtATime`, `HalfAtATime`
354 | 	  - ALB + EC2 + Route53 alias record swaps
355 | 	  - OpsWorks Stack cloning + Route53 alias swaps
356 | 		  - OpsWorks lifecycle stages
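
To make the difference between the linear and canary Lambda deployment options above concrete, here is a simplified model of how traffic moves to the new version over time. It is only an illustration of the naming scheme: `linear_schedule` and `canary_schedule` are hypothetical helpers, and the assumption that the first shift happens at minute 0 is a simplification.

```python
from typing import List, Tuple


def linear_schedule(step_percent: int, interval_minutes: int) -> List[Tuple[int, int]]:
    """Return (minute, cumulative percent on new version) pairs for a linear shift."""
    schedule = []
    shifted, minute = 0, 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule


def canary_schedule(canary_percent: int, wait_minutes: int) -> List[Tuple[int, int]]:
    """Return the two steps of a canary shift: a small slice first, the rest after the wait."""
    return [(0, canary_percent), (wait_minutes, 100)]


# Modelled on LambdaLinear10PercentEvery10Minutes: ten equal steps.
print(linear_schedule(10, 10))
# Modelled on LambdaCanary10Percent10Minutes: 10% first, remaining 90% after 10 minutes.
print(canary_schedule(10, 10))
```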
357 | 
358 | ### Config Management and IaC
359 | 
360 | - Define cloud infrastructure and reusable components to provision and manage systems throughout their lifecycle
361 | 	- CloudFormation cross-stack references use exports + Fn::ImportValue
362 | 	- Inline Lambda functions in CFN
363 | 	- When a custom resource invokes a Lambda function from CloudFormation, the request includes a pre-signed URL. The Lambda function is responsible for sending a response to that pre-signed URL to indicate whether the resource creation was successful or not.
364 | - Deploy automation to create, onboard, and secure AWS accounts in a multi-account/multi-region environment
365 | - Design and build automated solutions for complex tasks and large-scale environments
366 | 	- CloudFormation + ASGs:
367 | 		- `AutoScalingReplacingUpdate`: `WillReplace:true` will wait for a complete replacement of the ASG and its instances before deleting the old ASG
368 | 		- `AutoScalingRollingUpdate`: replaces existing instances in the ASG; valid options: `MaxBatchSize`, `MinInstancesInService`, `MinSuccessfulInstancesPercent`, `PauseTime`, `SuspendProcesses`, `WaitOnResourceSignals`
369 |   - OpsWorks can create _time-based_ instances for scaling of predictable workload, or _load-based_ using CPU utilisation or load, or memory utilisation
370 |   - Collecting on-prem info:
371 | 	  - Application Discovery Agent (install on each VM) or Agentless Discovery Connector (separate VM)
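
The custom resource flow noted above (the Lambda function replies to the pre-signed URL it was given) could be sketched as follows, using only the Python standard library. This is a minimal sketch rather than a complete handler: the physical resource id is a made-up placeholder, and the actual create/update/delete work is left out.

```python
import json
import urllib.request


def build_response(event: dict, status: str, physical_id: str, reason: str = "") -> bytes:
    """Build the JSON body CloudFormation expects back from a custom resource."""
    body = {
        "Status": status,  # "SUCCESS" or "FAILED"
        "Reason": reason,
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": {},
    }
    return json.dumps(body).encode("utf-8")


def send_response(event: dict, body: bytes) -> None:
    """PUT the response body to the pre-signed URL from the request."""
    request = urllib.request.Request(
        event["ResponseURL"],
        data=body,
        method="PUT",
        headers={"Content-Type": ""},
    )
    urllib.request.urlopen(request)


def handler(event, context):
    physical_id = "my-hypothetical-resource-id"  # placeholder for illustration
    try:
        # Create, update or delete the resource based on event["RequestType"] here.
        body = build_response(event, "SUCCESS", physical_id)
    except Exception as error:
        body = build_response(event, "FAILED", physical_id, str(error))
    send_response(event, body)
```

If the function never sends a response, the stack operation waits until it times out, so the failure path is as important as the success path.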
372 | 
373 | ### Resilient Cloud Solutions
374 | 
375 | - Implement highly available solutions to meet resilience and business requirements
376 | 	- RDS:
377 | 		- AZ outages => RDS multi-AZ deployment
378 | 		- Regional outages => RDS read replica
379 | 		- Multi-region deployments are like multi-AZ deployments, but other regions can be used for reads. Read replicas can be in the same AZ, same region, or cross-region
380 | 		- Read replicas have best RTO/RPO, but highest cost
381 | 	- Frontend traffic switching => Route53 failover
382 | 	- AutoScaling with a min & max of 1 is actually sensible - it makes the instance auto-redeploy if it dies
383 | 	- Route53 policies: `simple`, `failover`, `geolocation`, `geoproximity`, `latency`, `multi-value answer`, `weighted`
384 | - Implement solutions that are scalable to meet business requirements
385 | 	- ASG lifecycle states:
386 | 		- `Pending` (hooks `Pending:Wait`, `Pending:Proceed`)
387 | 		- `InService`
388 | 		- `Terminating` (hooks `Terminating:Wait`, `Terminating:Proceed`)
389 | 		- `Terminated`
390 | 	- EC2 Auto Scaling `Pending:Wait` lifecycle hook can allow AMI upgrades before bringing instances into service
391 | 	- `Terminating:Wait` lifecycle hook to collect instance data (e.g. logs) before final termination
392 | 	- EKS: k8s cluster autoscaler or karpenter
393 |   - EKS networking:
394 | 	- VPC CNI plugin
395 | 	- Load Balancer Controller
396 | 	- CoreDNS
397 | 	- kube-proxy
398 | 	- Calico
399 | - Hybrid environment patching
400 | - Implement automated recovery processes to meet RTO/RPO requirements
401 | 
402 | ### Monitoring and Logging
403 | 
404 | - Configure the collection, aggregation, and storage of logs and metrics
405 | 	- AWS Config Aggregator for centralised collection of findings across regions & accounts
406 | 	- EC2 custom logging requirements => CloudWatch Logs Agent
407 | 	- ECS Fargate logs => awslogs driver on task definition
408 | 	- CloudWatch has a predefined dashboard for CodeBuild metrics
409 | - Audit, monitor, and analyze logs and metrics to detect issues
410 | 	- near real time dashboards => QuickSight
411 | 	- near real time processing on CloudWatch logs:
412 | 		- Lambda subscription filter
413 | 		- Kinesis stream filter
414 | 		- ElasticSearch (OpenSearch) subscription filter
415 | 	- CloudTrail has log integrity checking which must be turned on
416 | - Automate monitoring and event management of complex environments
417 | 	- Service limit alerting => Trusted Advisor + CloudWatch Alarms + ServiceLimitUsage metric
418 | 
419 | ### Incident and Event Response
420 | 
421 | - Manage event sources to process, notify, and take action in response to events
422 | 	- S3 event notifications for data notifications like file deletion
423 | 	- RDS event notifications for multi-AZ failover events
424 | 	- EventBridge + AWS Health for notification about IAM credentials being exposed on GitHub, and for notifications about instance outages, etc.
425 | 	- CloudTrail _data_ events for object-level activity on S3
426 | 	- EC2 Auto Scaling groups => EventBridge
427 | 	- CodePipeline stage => EventBridge
428 | 	- CodeDeploy => CloudWatch Alarm + `MinimumHealthyHosts` metric can be used for rollbacks
429 | 	- OpsWorks self-healing => EventBridge
430 | - Implement configuration changes in response to events
431 | - Troubleshoot system and application failures
432 | 
433 | ### Security and Compliance
434 | 
435 | - Implement techniques for identity and access management at scale
436 | 	- Limit CodeCommit permissions via IAM policy which matches repo
437 | 	- S3 bucket policies for requiring TLS
438 | - Apply automation for security controls and data protection
439 | 	- Lifecycle management + auto-rotation of secrets => Secrets Manager
440 | 	- Cost-effective => SSM Parameter Store SecureStrings
441 | 	- Patching => SSM Patch Manager
442 | - Implement security monitoring and auditing solutions
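
For the "S3 bucket policies for requiring TLS" item above, one common shape is a `Deny` statement conditioned on `aws:SecureTransport`. The sketch below builds such a policy as a Python dict; `deny_insecure_transport_policy`, the statement id and the bucket name are illustrative assumptions.

```python
import json


def deny_insecure_transport_policy(bucket_name: str) -> dict:
    """Build a bucket policy that denies any request not made over TLS."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }


policy = deny_insecure_transport_policy("my-example-bucket")
print(json.dumps(policy, indent=2))
```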
443 | 


--------------------------------------------------------------------------------
/sysops-administrator-associate.md:
--------------------------------------------------------------------------------
   1 | [toc_start]::
   2 | <a name="top"></a>
   3 | ---
   4 | * [AWS-SysOps-Administrator-Associate](#1)
   5 | * [Monitoring And Metrics](#2)
   6 |   * [Virtualization Types](#2_1)
   7 |   * [EC2 Instance Types](#2_2)
   8 |   * [EC2 Monitoring](#2_3)
   9 |   * [EBS Monitoring](#2_4)
  10 |   * [EFS Monitoring](#2_5)
  11 |   * [CloudWatch](#2_6)
  12 | * [Costs](#3)
  13 |   * [Consolidated Billing](#3_1)
  14 |   * [Billing Metrics & Alarms](#3_2)
  15 |   * [Costs Optimization](#3_3)
  16 |   * [Cost Explorer](#3_4)
  17 | * [High Availability](#4)
  18 |   * [Scalability & Elasticity Fundamentals](#4_1)
  19 |   * [Reserved Instances](#4_2)
  20 |   * [Autoscaling vs Resizing](#4_3)
  21 |   * [Load Balancers](#4_4)
  22 |   * [RDS HA](#4_5)
  23 |   * [HA for IP-based Applications](#4_6)
  24 |   * [HA/Fault Tolerance for Bastion Hosts](#4_7)
  25 | * [Analysis](#5)
  26 |   * [Optimize the environment to ensure maximum performance](#5_1)
  27 |   * [Identify Performance Bottlenecks and Implement Remedies](#5_2)
  28 |   * [Identify Potential Issues on a Given Application Deployment](#5_3)
  29 | * [OpsWorks](#6)
  30 |   * [Overview and components](#6_1)
  31 |   * [Cloudformation](#6_2)
  32 | * [Backups & Recovery](#7)
  33 |   * [AWS Services with automated backups](#7_1)
  34 |   * [Disaster Recovery Scenarios](#7_2)
  35 |   * [Storing log files and backups](#7_3)
  36 | * [Security](#8)
  37 |   * [Implement and Manage Security Policies](#8_1)
  38 |   * [Ensure Data Integrity and Access Controls when Using the AWS Platform](#8_2)
  39 |   * [Share responsibility model](#8_3)
  40 |   * [AWS and IT Audits](#8_4)
  41 | * [Networking](#9)
  42 |   * [Route53 Routing Policies](#9_1)
  43 |   * [VPC Essentials](#9_2)
  44 |   * [Limits:](#9_3)
  45 | * [Etc](#10)
  46 |   * [Accessing the OS](#10_1)
  47 |   * [SQS](#10_2)
  48 |   * [DynamoDb](#10_3)
  49 | ---
  50 | [toc_end]::
  51 | <a name="1"></a>
  52 | # [↖](#top)[↑](#)[↓](#2) SysOps Administrator Associate
  53 | > 5/2018 - 9/2018
  54 | 
  55 | ---
  56 | 
  57 | <a name="2"></a>
  58 | # [↖](#top)[↑](#1)[↓](#2_1) Monitoring And Metrics
  59 | 
  60 | <a name="2_1"></a>
  61 | ## [↖](#top)[↑](#2)[↓](#2_2) Virtualization Types
  62 | 
  63 | Linux Amazon Machine Images use one of two types of virtualization:
  64 | 
  65 | AMI|Type|Effect
  66 | -|-|-
  67 | **PV**|Paravirtual|Historically better performance than HVM, but no longer the case
  68 | **HVM**|Hardware virtual machine|More modern, same or better performance than PV
  69 | 
  70 | <a name="2_2"></a>
  71 | ## [↖](#top)[↑](#2_1)[↓](#2_3) EC2 Instance Types
  72 | 
  73 | **General Purpose**|Balance of compute, memory and networking
  74 | -|-
  75 | **M5**<br/>(2017)|* Require HVM AMIs<br/> * Instance store via EBS or NVMe SSD (physically connected to the host server)
  76 | **M4**<br/>(2015)|* Allows *enhanced networking*<br/> * EBS-optimized
  77 | **M3**<br/>(2012)|* SSD (instance) store
  78 | **T3**<br/>(2018)|* 30% better price performance
  79 | **T2**<br/>(2014)|* Intended for workloads that do not use the full CPU constantly (e.g. web server)<br/> * Allows *burstable performance* <br/>* Burst credits allow the instance to 'burst' past the baseline performance up to 100%<br/> * 1 credit = 100% load per core per minute<br/> * Credits are earned per hour, expire after 24h<br/> * EBS storage only
  80 | 
  81 | **Compute optimized**|Lowest price for *compute* performance
  82 | -|-
  83 | **C5**<br/>(2016)| * Intel Skylake<br/> * Use Nitro, Amazon’s lightweight hardware accelerated hypervisor<br/> * Better performance and pricing than C4
  84 | **C4**<br/>(2015)| * Intel Haswell<br/> * Optimized for EC2<br/> * Allows *enhanced networking* and *clustering*<br/> * EBS-optimized
  85 | **C3**<br/>(2013)| * SSD (instance) store<br/> * Allows *enhanced networking* and *clustering*
  86 | 
  87 | **Memory optimized**|Lowest price for *memory* performance
  88 | -|-
  89 | **Z1d**<br/>(2018)| * Offer both high compute capacity and a high memory footprint<br/> * Ideal for workloads with high per-core licensing costs
  90 | **X1**<br/>(2016)| * One of the lowest price per GiB of RAM<br/> * SSD storage and EBS-optimized by default<br/> * **X1e** has even more RAM
  91 | **R5**<br/>(2018)| * Use Nitro, Amazon’s lightweight hardware accelerated hypervisor
  92 | **R4**<br/>(2016)| * Improved networking and EBS performance
  93 | **R3**<br/>(2014)| * SSD (instance) store<br/> * High memory capacity<br/> * Allows *enhanced networking*
  94 | 
  95 | **GPU optimized**|.
  96 | -|-
  97 | **P3**<br/>(2017)| * Faster than P2
  98 | **P2**<br/>(2016)| * Intended for general-purpose GPU compute applications
  99 | **G3**<br/>(2017)| * Optimized for graphics-intensive applications<br/>* Faster than G2
 100 | **G2**<br/>(2013)| * High frequency processors<br/> * High-performance NVIDIA GPUs
 101 | 
 102 | 
 103 | **Storage optimized**|Very fast SSD-backed instance storage optimized for high random I/O and high IOPS
 104 | -|-
 105 | **H1**<br/>(2017)| * HDD-based local storage<br/> * deliver high disk throughput<br/> * Balance of compute and memory
 106 | **I3**<br/>(2016)| * (NVMe) SSD-backed instance storage optimized for low latency<br/> * very high random I/O performance
 107 | **D2**<br/>(2015)| * Lowest price per disk throughput performance
 108 | **I2**<br/>(2013)| * SSD (instance) store<br/> * Allows *enhanced networking*<br/> * Supports *TRIM* (more efficient SSD operations)
 109 | 
 110 | **RDS instance types**|Optimized to fit different relational database use cases
 111 | -|-
 112 | **db.**|General purpose, memory optimized, burstable performance
 113 | 
 114 | .*
 115 | 
 116 | <a name="2_3"></a>
 117 | ## [↖](#top)[↑](#2_2)[↓](#2_3_1) EC2 Monitoring
 118 | 
 119 | <a name="2_3_1"></a>
 120 | ### EC2 Status Checks
 121 | * AWS performs automated checks on every running EC2 instance
 122 | * Performed every minute
 123 | * Each returns a pass or a fail status
 124 | 
 125 | **System Status Check**
 126 | * Loss of network connectivity
 127 | * Loss of system power
 128 | * Hardware/software issues on physical host
 129 | * Solution
 130 |   * Stop and start instance
 131 |   * Terminate and re-launch instance
 132 |   * Contact AWS
 133 | * Can configure for *auto-recovery*
 134 |   * Instance will be rebooted and retain instance id, (e)ip address, EBS volumes et al
 135 | 
 136 | **Instance Status Check**
 137 | * Failed system status check
 138 | * Network/startup configuration issues
 139 | * Memory/disk problems
 140 | * Kernel compatibility issues
 141 | * Solution
 142 |   * Fix problem
 143 |   * Stop and start instance
 144 |   * Terminate and re-launch instance, potentially with more memory/network/disk/...
 145 | 
 146 | <a name="2_4"></a>
 147 | ## [↖](#top)[↑](#2_3_1)[↓](#2_4_1) EBS Monitoring
 148 | 
 149 | <a name="2_4_1"></a>
 150 | ### EBS Status Checks
 151 | * Run every 5 minutes
 152 |   * `insufficient data` if checks are still running
 153 |   * `ok` if all checks pass
 154 |   * `warning` typically has to do with performance **degradation** from provisioned IOPS
 155 |   * `impaired` if a check fails, e.g. the volume is **stalled** or not available
 156 | 
 157 | * If Amazon EBS finds that data on a volume might be inconsistent, it disables I/O to that volume.
 158 |   * Changes status to `impaired`
 159 |   * This behaviour can be disabled
 160 | 
 161 | <a name="2_4_2"></a>
 162 | ### EBS Performance Essentials
 163 | **IOPS** (Input/Output Operations Per Second) is a common performance measurement used to benchmark
 164 | computer storage devices like hard disk drives (HDD), solid state drives (SSD), and storage area
 165 | networks (SAN).
 166 | * I/O size is capped at 256 KiB for SSD volumes and 1,024 KiB for HDD volumes because SSD volumes handle
 167 |   small or random I/O much more efficiently than HDD volumes.
 168 | * SSDs deliver constant performance for both random and sequential I/O
 169 | * HDDs have optimal performance for large and sequential I/O
 170 | * HDDs can deliver more throughput but drastically fewer IOPS
 171 | 
 172 | .|`gp2`|`io1`|`st1`|`sc1`
 173 | -|-|-|-|-
 174 | Volume type|General purpose SSD|Provisioned IOPS SSD|Throughput optimized HDD|Cold HDD
 175 | Purpose|Balances price and performance|For mission-critical low-latency or high-throughput workloads|Low cost HDD volume designed for frequently accessed, throughput-intensive workloads |Lowest cost HDD volume designed for less frequently accessed workloads
 176 | Volume Size|1 GiB - 16 TiB|4 GiB - 16 TiB|500 GiB - 16 TiB|500 GiB - 16 TiB
 177 | Max. IOPS(1)/Volume|10,000|32,000|500|250
 178 | Max. Throughput/Volume|160 MiB/s|500 MiB/s|500 MiB/s|250 MiB/s
 179 | IOPS|* 3 IOPS per GB (larger volume means more IOPS)<br/>* 100 IOPS <-> 10,000 IOPS<br/>* Can burst to 3,000 IOPS if volume size is < 1TB<br/>* Bursting consumes credits, which accrue at 3 IOPS/GB/second<br/>* Max 5.4 million credits (also initial value), enough for 3,000 IOPS for 30min<br/>* Running out of credits reverts volume back to baseline performance|* 30 IOPS per GB (larger volume means more IOPS), up to 20,000<br/>* Does not burst, delivers consistent IOPS rate instead|.|.
 180 | 
 181 | > (1) gp2/io1 based on 16 KiB I/O size, st1/sc1 based on 1 MiB I/O size
 182 | 
 183 | * Using *EBS optimized* instances guarantees optimal networking between EBS and EC2
 184 | * Pre-warming/initialization
 185 |   * No longer needed for new EBS volumes
 186 |   * Storage blocks on volumes restored from snapshots do need to be initialized (read from)
 187 | 
 188 | <a name="2_5"></a>
 189 | ## [↖](#top)[↑](#2_4_2)[↓](#2_5_1) EFS Monitoring
 190 | 
 191 | * Two throughput modes to choose from for your file system
 192 |   * **Bursting** Throughput - throughput on Amazon EFS scales as your file system grows
 193 |   * **Provisioned** Throughput - you can instantly provision the throughput of your file system (in MiB/s) independent of the amount of data stored.
 194 | 
 195 | <a name="2_5_1"></a>
 196 | ### Performance comparison
 197 | .|Amazon EFS|Amazon EBS Provisioned IOPS (`io1`)
 198 | -|-|-
 199 | Per-operation latency|Low, consistent latency.|Lowest, consistent latency.
 200 | Throughput scale|10+ GB per second.|Up to 2 GB per second.
 201 | 
 202 | <a name="2_5_2"></a>
 203 | ### Storage Characteristics Comparison
 204 | 
 205 | .|Amazon EFS|Amazon EBS Provisioned IOPS
 206 | -|-|-
 207 | Availability and durability|Data is stored redundantly across multiple AZs.|Data is stored redundantly in a single AZ.
 208 | Access|Up to thousands of Amazon EC2 instances, from multiple AZs, can connect concurrently to a file system.|A single Amazon EC2 instance in a single AZ can connect to a file system.
 209 | Use cases|Big data and analytics, media processing workflows, content management, web serving, and home directories.|Boot volumes, transactional and NoSQL databases, data warehousing, and ETL.
 210 | 
 211 | <a name="2_5_3"></a>
 212 | ### S3 vs EFS vs EBS Comparison
 213 | 
 214 | Amazon S3|Amazon EBS|Amazon EFS
 215 | -|-|-
 216 | Can be publicly accessible|Accessible only via the given EC2 Machine|Accessible via several EC2 machines and AWS services
 217 | Web interface|File System interface|Web and file system interface
 218 | Object storage|Block storage|File storage
 219 | Scalable|Hardly scalable|Scalable
 220 | Slower than EBS and EFS|Faster than S3 and EFS|Faster than S3, slower than EBS
 221 | Good for storing backups|Is meant to be EC2 drive|Good for shareable applications and workloads
 222 | 
 223 | <a name="2_6"></a>
 224 | ## [↖](#top)[↑](#2_5_3)[↓](#2_6_1) CloudWatch
 225 | Monitoring service that plugs into many other services
 226 | 
 227 | * **Metrics**
 228 |   * Based on currently used service
 229 |   * Not everything is available out of the box, e.g. no data on memory usage of EC2 instances
 230 | * **Alarms**
 231 |   * Based on thresholds defined on metrics
 232 |   * Can be added to dashboard
 233 |   * Invoke *Lambda*, *SNS*, email, ...
 234 |   * Takes place once, at a specific point in time
 235 |     * Disable with `mon-disable-alarm-actions` via CLI
 236 | * **Logs**
 237 |   * Log into *log groups*
 238 | * **Events**
 239 |   * Define actions on things that happened
 240 |   * Define `cron`-based events
 241 |   * Events are recorded constantly over time
 242 | 
 243 | <a name="2_6_1"></a>
 244 | ### Key metrics for EC2
 245 | 
 246 | * EC2 metrics are based on what is exposed to the hypervisor.
 247 | * *Basic Monitoring* (default) submits values every 5 minutes, *Detailed Monitoring* every minute
 248 | * Can install CloudWatch agent (new)
 249 |   * Provides access to more metrics
 250 | * Can use CloudWatch monitoring scripts (old) to provide more metrics
 251 |   * Perl scripts provided by AWS, need to manually install on instance
 252 |   * Use `cron` to automate sending data to CloudWatch
 253 | 
 254 | Metric|Effect
 255 | -|-
 256 | `CPUUtilization`|The total CPU resources utilized within an instance at a given time.
 257 | `DiskReadOps`,`DiskWriteOps`|The number of read (write) operations performed on all instance store volumes. This metric is applicable for instance store-backed AMI instances.
 258 | `DiskReadBytes`,`DiskWriteBytes`|The number of bytes read (written) on all instance store volumes. This metric is applicable for instance store-backed AMI instances.
 259 | `NetworkIn`,`NetworkOut`|The number of bytes received (sent) on all network interfaces by the instance
 260 | `NetworkPacketsIn`,`NetworkPacketsOut`|The number of packets received (sent) on all network interfaces by the instance
 261 | `StatusCheckFailed`,`StatusCheckFailed_Instance`,`StatusCheckFailed_System`|Reports whether the instance has passed both/instance/system status check in the last minute.
 262 | 
 263 | * Can **not** monitor **memory usage**, **available disk space**, **swap usage**
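
Memory usage is not among the default metrics, but it can be published as a custom metric, which is roughly what the agent or monitoring scripts mentioned above do. The sketch below parses `/proc/meminfo`-style text and builds a payload for CloudWatch's `PutMetricData` call; the helper names, the `Custom/System` namespace and the `MemoryUsedPercent` metric name are assumptions made for illustration.

```python
def memory_used_percent(meminfo_text: str) -> float:
    """Compute used memory in percent from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])  # values are reported in kB
    total = values["MemTotal"]
    return 100.0 * (total - values["MemAvailable"]) / total


def memory_metric_payload(instance_id: str, used_percent: float) -> dict:
    """Build keyword arguments for put_metric_data."""
    return {
        "Namespace": "Custom/System",  # hypothetical namespace
        "MetricData": [
            {
                "MetricName": "MemoryUsedPercent",  # hypothetical metric name
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Unit": "Percent",
                "Value": used_percent,
            }
        ],
    }


sample = "MemTotal:        8000000 kB\nMemFree:         1000000 kB\nMemAvailable:    2000000 kB\n"
payload = memory_metric_payload("i-0123456789abcdef0", memory_used_percent(sample))
# On an instance with credentials, this could be sent from a cron job via boto3:
#   boto3.client("cloudwatch").put_metric_data(**payload)
print(payload)
```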
 264 | 
 265 | <a name="2_6_2"></a>
 266 | ### Key metrics for EBS
 267 | Metric|Effect
 268 | -|-
 269 | `VolumeReadBytes`,`VolumeWriteBytes`|`sum` reports total bytes transferred, `average` also useful
 270 | `VolumeReadOps`,`VolumeWriteOps`|total number of IO operations
 271 | `VolumeQueueLength`|Number of read/write operation requests waiting to finish
 272 | `VolumeTotalReadTime`,`VolumeTotalWriteTime`|Total number of seconds spent by all operations in a given time
 273 | `VolumeThroughputPercentage`|Percentage of IOPS that was achieved out of total provisioned IOPS
 274 | `VolumeConsumedReadWriteOps`|Total amount of r/w operations consumed within a specific time period
 275 | 
 276 | * Can **not** monitor **disk usage percentage**
 277 | 
 278 | <a name="2_6_3"></a>
 279 | ### Key metrics for EFS
 280 | Metric|Effect
 281 | -|-
 282 | `BurstCreditBalance`|The number of burst credits that a file system has.
 283 | `ClientConnections`|The number of client connections to a file system.
 284 | `DataReadIOBytes`,`DataWriteIOBytes`|The number of bytes for each file system read(write) operation.
 285 | `MetadataIOBytes`|The number of bytes for each metadata operation.
 286 | `PercentIOLimit`|Shows how close a file system is to reaching the I/O limit of the General Purpose performance mode.
 287 | `PermittedThroughput`|The maximum amount of throughput a file system is allowed.
 288 | `TotalIOBytes`|The number of bytes for each file system operation, including data read, data write, and metadata operations.
 289 | 
 290 | <a name="2_6_4"></a>
 291 | ### Key metrics for ELB (classic load balancer)
 292 | Metric|Effect
 293 | -|-
 294 | `Latency`|Time it takes to receive a response. Measure `max` and `average`
 295 | `BackendConnectionErrors`|Number of connections to registered instances that were not successfully established. Measure `sum` and look at the difference between `min` and `max`
 296 | `SurgeQueueLength`|Total number of request waiting to get routed, look at `max` and `average`
 297 | `SpilloverCount`|Dropped requests because of exceeded surge queue. Look at `sum`
 298 | `HTTPCode_ELB_3XX_Count`<br/>`HTTPCode_ELB_4XX_Count`<br/>`HTTPCode_ELB_5XX_Count`|The number of HTTP XXX server error codes that originate from the *load balancer*. This count does *not* include any response codes generated by the targets.
 299 | `RequestCount`|Number of completed requests
 300 | `HealthyHostCount`,`UnhealthyHostCount`|Self-explanatory
 301 | 
 302 | * In case of sudden and very large increases in traffic it's possible to contact AWS and have them
 303 | 'pre-warm' the *ELB*.
 304 | 
 305 | > spillover and surge queue give an indication of the ELB being overloaded
 306 | 
 307 | * Typically this means that the backend system cannot process requests as fast as they are coming in
 308 |   * Ideally load balance into an autoscaling group.
 309 | 
 310 | <a name="2_6_5"></a>
 311 | ### Key metrics for ALB (application load balancer)
 312 | Metric|Effect
 313 | -|-
 314 | `RequestCount`|Number of completed requests
 315 | `HealthyHostCount`,`UnhealthyHostCount`|Self-explanatory
 316 | `TargetResponseTime`|The time elapsed after the request leaves the load balancer until a response from the target is received.
 317 | `HTTPCode_ELB_3XX_Count`<br/>`HTTPCode_ELB_4XX_Count`<br/>`HTTPCode_ELB_5XX_Count`|The number of HTTP XXX server error codes that originate from the *load balancer*. This count does *not* include any response codes generated by the targets.
 318 | 
 319 | <a name="2_6_6"></a>
 320 | ### Key metrics for NLB (network load balancer)
 321 | Metric|Effect
 322 | -|-
 323 | `ProcessedBytes`|The total number of bytes processed by the load balancer, including TCP/IP headers.
 324 | `TCP_Client_Reset_Count`|The total number of reset (RST) packets sent from a client to a target.
 325 | `TCP_ELB_Reset_Count`|The total number of reset (RST) packets generated by the load balancer.
 326 | `TCP_Target_Reset_Count`|The total number of reset (RST) packets sent from a target to a client.
 327 | 
 328 | <a name="2_6_7"></a>
 329 | ### Key metrics for ElastiCache
 330 | Supports *memcached* and *redis*
 331 | 
 332 | Metric|**memcached**|**redis**
 333 | -|-|-
 334 | .|Designed for simplicity|Supports a much richer set of features. Can be backed up if in *cluster* mode
 335 | `cpu utilization`|* multithreaded<br/>* stay under 90%/#cores<br/>* -> increase # read replicas or use larger cache instance|* single threaded<br/>* stay under 90%<br/>* -> increase size of node or add more nodes
 336 | `evictions`|* -> increase size or add nodes to cluster|* -> increase node size
 337 | `concurrent connections`|* -> check application logic|* -> check application logic
 338 | `swap usage`|* avoid swapping<br/>* -> increase `memcached_connections_overhead`|* avoid swapping<br/>* -> increase node size<br/>* -> increase `memory connection overhead` (will decrease memory available for cache)
 339 | 
 340 | .*
 341 | 
 342 | <a name="2_6_8"></a>
 343 | ### Key metrics for RDS
 344 | Metric|Effect
 345 | -|-
 346 | `CPUUtilization`|Percentage of CPU utilization
 347 | `DatabaseConnections`|Number of connections that we have at a given point in time
 348 | `DiskQueueDepth`|Number of read/write requests waiting to access the disk
 349 | `FreeableMemory`|Amount of available RAM
 350 | `FreeStorageSpace`|Amount of available storage space
 351 | `SwapUsage`|Amount of swap space used, i.e. data that belongs in memory but is stored on disk. An increase usually has to do with running out of available RAM
 352 | `ReadIOPS/WriteIOPS`|Number of I/O operations completed per second. If we don't have enough IOPS, performance will slow down
 353 | `ReadLatency/WriteLatency`|* Average amount of time taken per disk I/O operation (input/output)<br/>* High latency can be solved with more IOPS
 354 | `ReadThroughput/WriteThroughput`|Average number of bytes read from or written to disk per second
 355 | 
 357 | 
 358 | * Also look at *RDS Events*
 359 | 
 360 | ---
 361 | 
 362 | <a name="3"></a>
 363 | # [↖](#top)[↑](#2_6_8)[↓](#3_1) Costs
 364 | 
 365 | <a name="3_1"></a>
 366 | ## [↖](#top)[↑](#3)[↓](#3_2) Consolidated Billing
 367 | Set up a **billing account** to pay for multiple **linked accounts** at the same time.
 368 | 
 369 | * Allows for **consolidated billing**. Does *not* give IAM visibility into linked accounts.
 370 | * Enables **volume discounts** across linked accounts.
 371 | * If one account uses *reserved instances*, other accounts running similar *on-demand* instances
 372 | will be billed at the reserved instance price. Similar for *RDS* instances.
 373 | * All *credits* earned while linked will be applied to consolidated bill.
 374 | 
 375 | Limits:
 376 | * Up to 20 linked accounts
 377 | 
 378 | <a name="3_2"></a>
 379 | ## [↖](#top)[↑](#3_1)[↓](#3_3) Billing Metrics & Alarms
 380 | * Only shows metrics of services that have been used.
 381 | * Set up *billing alarms* based on billing metrics.
 382 |   * *Overall* billing alarm, or *service-specific* alarms
 383 |   * Can still be account-specific, even with consolidated billing
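
A billing alarm is a normal CloudWatch alarm on the `EstimatedCharges` metric in the `AWS/Billing` namespace. The sketch below only builds the parameters for such an alarm, it does not call AWS. The function name, alarm naming scheme and the SNS topic ARN are made up for illustration; the six hour period is an assumption, not a requirement.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn, service_name=None):
    """Build keyword arguments for a CloudWatch alarm on estimated charges."""
    dimensions = [{"Name": "Currency", "Value": "USD"}]
    if service_name is not None:
        # Service-specific alarm: narrow the metric down to a single service.
        dimensions.append({"Name": "ServiceName", "Value": service_name})
    suffix = service_name or "total"
    return {
        "AlarmName": f"billing-{suffix}-over-{threshold_usd}-usd",  # hypothetical naming scheme
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": dimensions,
        "Statistic": "Maximum",
        "Period": 21600,  # assumed: evaluate every six hours
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


overall = billing_alarm_params(100, "arn:aws:sns:us-east-1:111122223333:billing-alerts")
ec2_only = billing_alarm_params(50, "arn:aws:sns:us-east-1:111122223333:billing-alerts",
                                service_name="AmazonEC2")
```

With boto3 installed, the resulting dict could presumably be passed on as `cloudwatch.put_metric_alarm(**overall)`, using a client in `us-east-1` where billing metrics are published.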
 384 | 
 385 | <a name="3_3"></a>
 386 | ## [↖](#top)[↑](#3_2)[↓](#3_4) Costs Optimization
 387 | * Purchase **EC2 Reserved Instances**
 388 |   * Commit for 1-3 years and get a discount
 389 | * Minimize the number of running instances
 390 |   * Set up *CloudWatch* alarms to spin down underutilized instances
 391 |   * Find balance between acceptable downtime & costs to eliminate this downtime
 392 | * Remove unused **Load Balancers**
 393 | * Look for idle (unattached) **EBS** volumes
 394 |   * Delete unused volumes
 395 |     * Take a *snapshot* to keep the data
 396 |   * Downsize volumes that aren't near full capacity
 397 |   * Look for over-provisioned **IOPS**
 398 | * Look for unassociated **Elastic IP** addresses
 399 | * Look for idle **RDS** instances
 400 |   * Check for 0 connections
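
Finding idle resources is mostly a filtering exercise. The sketch below works on plain dicts that are assumed to have the shape of the `Volumes` and `Addresses` lists returned by the EC2 `DescribeVolumes` and `DescribeAddresses` calls; the sample IDs are invented.

```python
def find_idle_resources(volumes, addresses):
    """Return IDs of unattached EBS volumes and unassociated Elastic IPs."""
    idle_volumes = [
        volume["VolumeId"]
        for volume in volumes
        # 'available' means the volume exists but is not attached to an instance.
        if volume.get("State") == "available" and not volume.get("Attachments")
    ]
    idle_addresses = [
        address["PublicIp"]
        for address in addresses
        # Associated addresses carry an AssociationId, idle ones do not.
        if "AssociationId" not in address
    ]
    return idle_volumes, idle_addresses


sample_volumes = [
    {"VolumeId": "vol-1", "State": "in-use", "Attachments": [{"InstanceId": "i-1"}]},
    {"VolumeId": "vol-2", "State": "available", "Attachments": []},
]
sample_addresses = [
    {"PublicIp": "203.0.113.10", "AssociationId": "eipassoc-1"},
    {"PublicIp": "203.0.113.11"},
]
idle_volumes, idle_addresses = find_idle_resources(sample_volumes, sample_addresses)
```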
 401 | 
 402 | <a name="3_4"></a>
 403 | ## [↖](#top)[↑](#3_3)[↓](#4) Cost Explorer
 404 | * Costs per *time frame* per *service*, various grouping and filtering options
 405 | * Provides forecasts
 406 | * **Pricing API** allows downloading pricing information for specific services
 407 | 
 408 | ---
 409 | 
 410 | <a name="4"></a>
 411 | # [↖](#top)[↑](#3_4)[↓](#4_1) High Availability
 412 | 
 413 | <a name="4_1"></a>
 414 | ## [↖](#top)[↑](#4)[↓](#4_2) Scalability & Elasticity Fundamentals
 415 | * Pay only for *what* you need *when* you need it
 416 |   * Define minimum capacity
 417 |   * Define what needs to stretch out
 418 | 
 419 | .|**Elasticity**|**Scalability**
 420 | -|-|-
 421 | .|*Scaling up/down on demand*|*Scaling for growth in order to meet long term requirements<br/>typically does not focus on shrinking back*
 422 | *DynamoDB*|Can provision more or less throughput|Stores as much data as we like, scales transparently
 423 | *EC2*|Use autoscaling|More instances or bigger instance types
 424 | *RDS*|./.|Bigger instances, more read replicas
 425 | 
 426 | <a name="4_2"></a>
 427 | ## [↖](#top)[↑](#4_1)[↓](#4_3) Reserved Instances
 428 | * *Reserve* instances for a specific period of time
 429 |   * *Standard* reserved instances (fixed instance type)
 430 |   * *Convertible* reserved instances (can be exchanged against another convertible instance type)
 431 |   * *Scheduled* reserved instances (purchased by the hour on a set schedule with a set instance type)
 432 | * Up to 50% cheaper than a *fully utilized* on-demand instance (because we commit upfront to a certain usage)
 433 | * Guarantees to *not* run into '*insufficient instance capacity*' issues if AWS is unable to provision instances in that AZ
 434 | * Can resell reserved capacity on *Reserved Instance Marketplace*
 435 | * Available for:
 436 |   * EC2
 437 |   * RDS (*reserved instances*)
 438 |   * DynamoDB (*reserved capacity*)
 439 |   * ElastiCache (*reserved nodes*)
 440 |   * CloudFront (*reserved capacity*)
 441 |   * Elastic MapReduce (*reserved EC2 instances*)
 442 |   * ECS (*reserved EC2 instances*)
 443 | 
 444 | <a name="4_3"></a>
 445 | ## [↖](#top)[↑](#4_2)[↓](#4_4) Autoscaling vs Resizing
 446 | * **Auto Scaling** distributes load across multiple instances
 447 |   * *Scheduled Scaling* allows scaling out or in on a schedule
 448 |   * Relatively complex to set up
 449 |   * Applications need to be designed to benefit from multiple instances
 450 |   * Components
 451 |     * *Launch Configuration*
 452 |     * *Autoscaling Group*
 453 |     * *Scaling Policy*
 454 |     * *CloudWatch Alarms*
 455 | 
 456 | * **Changing instance size** increases/decreases available resources to the running application
 457 |   * *EBS* backed instances need to be stopped before resizing
 458 |   * *Instance storage* needs to be migrated across
 459 |   * Not as flexible as auto scaling. Not elastic
 460 |   * Within an autoscaling group the to-be-resized instance might be treated as unhealthy
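
How alarm, scaling policy and group limits interact can be sketched in a few lines. This is a toy model, not the Auto Scaling algorithm: the CPU thresholds and the step size are hypothetical, and real policies are triggered by CloudWatch alarms rather than a direct function call.

```python
def desired_capacity(current, cpu_percent, min_size, max_size,
                     scale_out_above=70.0, scale_in_below=30.0, step=1):
    """Return the new desired capacity for a simple step-style policy."""
    if cpu_percent > scale_out_above:
        target = current + step   # alarm 'high CPU' -> scale out
    elif cpu_percent < scale_in_below:
        target = current - step   # alarm 'low CPU' -> scale in
    else:
        target = current          # within the band, nothing to do
    # The group's min/max size always wins over the policy.
    return max(min_size, min(max_size, target))


new_size = desired_capacity(current=2, cpu_percent=85.0, min_size=1, max_size=4)
```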
 461 | 
 462 | <a name="4_4"></a>
 463 | ## [↖](#top)[↑](#4_3)[↓](#4_4_1) Load Balancers
 464 | 
 465 | .|**ALB**|**NLB**|**ELB**
 466 | -|-|-|-
 467 | .|Application Load Balancer|Network Load Balancer|Classic Load Balancer
 468 | Layer|7 (application layer)|4 (transport layer)|EC2-classic network (deprecated)
 469 | Protocol|HTTP, HTTPS|TCP|TCP, SSL, HTTP, HTTPS
 470 | Health checks|✔|✔|✔
 471 | Cloudwatch metrics|✔|✔|✔
 472 | Logging|✔|✔|✔
 473 | Zone failover|✔|✔|✔
 474 | Connection draining|✔|✔|✔
 475 | Load balancing to different ports on the same instance|✔|✔|.
 476 | WebSockets|✔|✔|.
 477 | IP Addresses as targets|✔|✔|.
 478 | Load balancing deletion protection|✔|✔|.
 479 | Path-based routing|✔|.|.
 480 | Host-based routing|✔|.|.
 481 | Native http/2|✔|.|.
 482 | Configurable idle connection timeout|✔|.|✔
 483 | Cross zone load-balancing|✔|✔|✔
 484 | SSL offloading|✔|.|✔
 485 | Server-name indication|✔|.|✔
 486 | Sticky-sessions|✔|.|✔
 487 | Backend server encryption|✔|.|✔
 488 | Static IP|.|✔|.
 489 | Elastic IP|.|✔|.
 490 | Preserve source IP address|.|✔|.
 491 | Resource-based IAM permissions|✔|✔|✔
 492 | Tag-based IAM permissions|✔|✔|.
 493 | Slow start|✔|.|.
 494 | User authentication|✔|.|.
 495 | Redirects|✔|.|.
 496 | Fixed responses|✔|.|.
 497 | 
 498 | <a name="4_4_1"></a>
 499 | ### Elastic Load Balancer ('Classic LB')
 500 | 
 501 | <a name="4_4_2"></a>
 502 | ### Overview
 503 | * *External* load balancer
 504 |   * Public facing
 505 |   * Often used to distribute load between web servers
 506 |   * Provides public DNS host name
 507 | * *Internal* load balancer
 508 |   * Often used to distribute load between backend servers
 509 |   * Provides internal DNS host name
 510 | * Configure (in AWS console)
 511 |   * Internal and external load balancer
 512 |   * Subnets for each AZ that traffic should be routed to
 513 |     * Can route into private subnets
 514 |   * Cross-zone load balancing
 515 |   * Connection draining (maximum time for the load balancer to keep connections alive before reporting the instance as
 516 |     de-registered)
 517 | 
 518 | <a name="4_4_3"></a>
 519 | ### Sticky Sessions
 520 | * Need to make sure that session is maintained between instances
 521 |   * Load Balancer generated stickiness (*duration based* session stickiness)
 522 |   * Application generated stickiness (*application based* session stickiness)
 523 |   * For HA, use *ElastiCache* to persist and share session state. So maintaining
 524 |     stickiness doesn't matter any more
 525 | 
 526 | <a name="4_5"></a>
 527 | ## [↖](#top)[↑](#4_4_3)[↓](#4_6) RDS HA
 528 | * Create *subnets* in different AZs
 529 | * Create *subnet group* in RDS dashboard
 530 |   * Collection of subnets (typically private) in a VPC that is designated for DB instances
 531 |   * Should have subnets in at least two Availability Zones in a given region
 532 | * Configure RDS for **multi-AZ-deployments** and turn replication on
 533 |   * Keeps a *synchronous* standby replica in a different AZ
 534 |     * Recommendation is use of Provisioned IOPS
 535 |   * Automatic failover in case of planned or unplanned outage of the first AZ
 536 |     * Most likely still has downtime
 537 |     * Can *force* failover by *rebooting*
 538 |   * Other benefits
 539 |     * Patching
 540 |     * Backups
 541 |   * *Aurora* can replicate across 3 AZs
 542 | * Failover process is automated
 543 |   * AWS detects an issue and starts the failover process
 544 |   * DNS records are modified to point to the standby instance
 545 |   * Application re-establishes existing DB connections
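
Because failover swaps the DNS record, the application has to drop its old connections and reconnect. A minimal sketch of such a reconnect loop with exponential backoff follows; `connect` stands for whatever callable the database driver offers, and the attempt count and delays are assumptions.

```python
import time


def connect_with_retry(connect, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds, backing off between failed attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            # A fresh connect resolves the endpoint again and so picks up
            # the DNS record that now points at the former standby.
            return connect()
        except ConnectionError as error:
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise last_error
```

The `sleep` parameter only exists so that the waiting can be replaced in tests.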
 546 | 
 547 | <a name="4_6"></a>
 548 | ## [↖](#top)[↑](#4_5)[↓](#4_7) HA for IP-based Applications
 549 | * If the application requires specific IPs (that are hardcoded somewhere), autoscaling cannot be used
 550 | * Use *Elastic IP* and standby instances in different AZs instead
 551 |   * Cannot use Elastic IP across different regions though
 552 |   * Scale by increasing instance size (vertical scaling)
 553 | 
 554 | <a name="4_7"></a>
 555 | ## [↖](#top)[↑](#4_6)[↓](#5) HA/Fault Tolerance for Bastion Hosts
 556 | * Assign Elastic IP to bastion host in AZ 1
 557 |   * This IP can also be whitelisted to comply with corporate regulations
 558 | * Have another instance on standby in different AZ
 559 | * Could be in *ASG* (min/max 1), so that it gets immediately replaced
 560 | * Place 2 instances behind ELB and enable *SSH Keep Alive*
 561 | * Place 1 instance behind ELB, configure *auto recovery*
 562 | 
 563 | ---
 564 | 
 565 | <a name="5"></a>
 566 | # [↖](#top)[↑](#4_7)[↓](#5_1) Analysis
 567 | 
 568 | <a name="5_1"></a>
 569 | ## [↖](#top)[↑](#5)[↓](#5_1_1) Optimize the environment to ensure maximum performance
 570 | 
 571 | <a name="5_1_1"></a>
 572 | ### Offloading database workload
 573 | * Using **read replicas**
 574 |   * Read queries are routed to *read replicas*, reducing load on primary db instance
 575 |     (*source instance*)
 576 |     * Table indexes can be created on read replicas directly (and not on the master)
 577 |     * Some use cases (e.g. data analytics) can be performed exclusively against read replicas
 578 |   * To create read replicas, AWS initially creates a snapshot of the source instance
 579 |     * Multi-AZ failover instance (if enabled) is used for snapshotting
 580 |     * After that, all changes to the source instance are *asynchronously* replicated to the read replica
 581 |     * Implies data latency, which typically is acceptable.
 582 |       * `ReplicaLag` can be monitored and *Cloudwatch* alarms can be configured
 583 |   * *Read replicas* are **not** the same as *multi-AZ failover* instances which
 584 |     * are *synchronously* updated
 585 |     * are designed to handle failover
 586 |     * don't receive any load unless failover actually happens
 587 |   * Often it is beneficial to have both read replicas and multi-AZ failover instances
 588 |     * Read replicas themselves can not use the Multi-AZ feature
 589 |   * A single master can have **up to 5** read replicas
 590 |     * Can be in different regions
 591 | 
 592 | * Setting up a read replica
 593 |   * Configure from master instance or other read replica
 594 |     * Requires 'automated backups' to be enabled on source instance
 595 |   * Choice of db engine matters, because internal engine features are being used
 596 |   * Usually pick same database instance type as source instance uses
 597 |   * AWS provisions a different *endpoint* for each read replica
 598 |   * Configure use of endpoint on application level
 599 | 
 600 | * Read replicas can be promoted to normal instances
 601 |   * E.g. use read replica to implement bigger changes on db level, after these have been finished
 602 |   promote to master instance
 603 |   * Useful for database sharding, could create replicas for each shard
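
Using the replica endpoint on application level could look roughly like the sketch below: reads are spread over the replica endpoints, everything else goes to the primary. The class name and endpoint strings are hypothetical, and the `SELECT` check is deliberately naive.

```python
import itertools


class EndpointRouter:
    """Pick a database endpoint per statement: replicas for reads, primary otherwise."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over the replica endpoints, if there are any.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = EndpointRouter(
    primary="mydb.example.internal",
    replicas=["mydb-replica-1.example.internal", "mydb-replica-2.example.internal"],
)
```

Since replication is asynchronous, reads that must see a write that just happened should still go to the primary.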
 604 | 
 605 | <a name="5_1_2"></a>
 606 | ### Looking at EBS volumes
 607 | * EBS *pre-warming*
 608 |   * Used to be required for maximum performance
 609 |   * Performance is reduced the very first time each block is accessed
 610 |   * Has been renamed to *initialization* and is no longer required if new EBS volumes are used
 611 |   * Still required for volumes that are restored from snapshots
 612 |     * Storage blocks must be initialized (pulled down from Amazon S3 and written to the volume)
 613 |     * Use `dd` or `fio` to *read* from every block
 614 |     * Only required if performance matters, obviously
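
Initialization just means reading every block once. In practice `dd` or `fio` do this on the block device; the Python sketch below only illustrates the idea and works on any readable path. The 1 MiB block size is an assumption, and reading a real device would need root permissions.

```python
def read_all_blocks(path, block_size=1024 * 1024):
    """Read a file or block device front to back and return the bytes read."""
    total = 0
    # buffering=0 keeps Python from adding its own read-ahead buffer.
    with open(path, "rb", buffering=0) as device:
        while True:
            chunk = device.read(block_size)
            if not chunk:
                break  # end of device reached
            total += len(chunk)
    return total
```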
 615 | 
 616 | <a name="5_1_3"></a>
 617 | ### Prewarming ELBs
 618 | * ELB is designed to increase its resource capacity gradually
 619 | * Prevents `HTTP 503` (ELB cannot handle any more requests)
 620 | * Can contact AWS to `pre-warm` ELB
 621 |   * This should not really be required. Maybe if TV ads are running or so.
 622 |   * Use load testing tools to get a rough estimate of what the current ELB can handle
 623 |     * Increase at a rate no more than 50% per 5min.
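
The 50% per 5 minutes rule translates into a simple ramp-up plan for a load test. A small sketch, with the function name and the requests-per-second unit chosen for illustration:

```python
def ramp_schedule(start_rps, target_rps, growth=0.5, step_minutes=5):
    """Return (minute, requests per second) pairs for a gradual load ramp."""
    if start_rps <= 0 or growth <= 0:
        raise ValueError("start_rps and growth must be positive")
    schedule = [(0, start_rps)]
    minute, rate = 0, start_rps
    while rate < target_rps:
        minute += step_minutes
        # Grow by at most `growth` per step and never overshoot the target.
        rate = min(target_rps, rate * (1 + growth))
        schedule.append((minute, rate))
    return schedule


plan = ramp_schedule(start_rps=100, target_rps=300)
```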
 624 | 
 625 | <a name="5_2"></a>
 626 | ## [↖](#top)[↑](#5_1_3)[↓](#5_2_1) Identify Performance Bottlenecks and Implement Remedies
 627 | 
 628 | <a name="5_2_1"></a>
 629 | ### Resizing or changing EBS root volumes
 630 | * If EBS is at capacity
 631 |   * Either upgrade volume size to increase the amount of IOPS available
 632 |   * Or switch to provisioned IOPS volumes (`io1`)
 633 | * Resizing
 634 |   * Create snapshot of EBS volume first
 635 |     * Incrementally stored on S3
 636 |     * Can continue to use EBS volume while the snapshot is taking place
 637 |   * Create new volume from snapshot
 638 |   * Stop instance
 639 |   * Attach new volume
 640 | 
 641 | <a name="5_2_2"></a>
 642 | ### Setting up certificates for Elastic Load Balancers
 643 | * Offloading overhead from the instances behind the ELB
 644 |   * Create ELB and configure https
 645 |   * Certificate from
 646 |     * ACM (AWS managed)
 647 |     * IAM (for external certificates)
 648 |     * Upload directly
 649 | 
 650 | <a name="5_2_3"></a>
 651 | ### Network bottlenecks
 652 | * Primary network bottlenecks
 653 |   * EC2 instances
 654 |     * Instances in different AZs or regions
 655 |     * Different instance types get different bandwidth capacities
 656 |       * No absolute numbers communicated by AWS though
 657 |     * Not using *enhanced network capabilities* (not supported by some instance types)
 658 |     * Check for performance issues with `iperf3` (github)
 659 |       * Measures performance for ip-based networks
 660 |     * Use VPC Peering to create a reliable connection
 661 |       * No single point of failure
 662 |   * Connection to on-prem networks
 663 |     * Use `Direct Connect`
 664 | 
 665 | <a name="5_3"></a>
 666 | ## [↖](#top)[↑](#5_2_3)[↓](#5_3_1) Identify Potential Issues on a Given Application Deployment
 667 | 
 668 | <a name="5_3_1"></a>
 669 | ### EBS Root Devices on Terminated Instances - Ensuring Data Durability
 670 | * *EBS root volumes* will be deleted on instance termination as per default option
 671 |   * Could create snapshot before termination to backup data
 672 |   * Could change default settings
 673 | * *Instance store root volumes* will be left untouched on instance termination
 674 | 
 675 | <a name="5_3_2"></a>
 676 | ### Troubleshooting Auto Scaling Issues
 677 | * Attempting to use wrong subnet
 678 | * AZ no longer available or supported (outage)
 679 | * Security group does not exist
 680 | * Associated keypair does not exist
 681 | * Auto scaling configuration is not working correctly
 682 | * Instance type specification does not exist in that AZ
 683 | * Auto scaling is not enabled on that subnet
 684 | * Invalid EBS device mapping
 685 | * Attempt to attach EBS block device to instance-store AMI
 686 | * AMI issues
 687 | * Attempt to use *placement groups* with instance types that don't support that
 688 | * AWS running out of capacity in that AZ
 689 | * If an instance is stopped, e.g. for updating it, autoscaling will consider it unhealthy and
 690 |   terminate - restart it. Need to suspend autoscaling first.
 691 | 
 692 | ---
 693 | 
 694 | <a name="6"></a>
 695 | # [↖](#top)[↑](#5_3_2)[↓](#6_1) OpsWorks
 696 | 
 697 | <a name="6_1"></a>
 698 | ## [↖](#top)[↑](#6)[↓](#6_1_1) Overview and components
 699 | * Declarative desired state engine
 700 |   * Automate, monitor and maintain deployments
 701 | * **Cookbooks** define **recipes**
 702 | * AWS' implementation of *Chef*
 703 | 	* Original Chef
 704 | 	* AWS-bespoke orchestration components
 705 | * Components
 706 | 	* **Stack**
 707 | 		* Set of resources that is managed as a group
 708 | 		* Whole service stack
 709 | 	* **Layer**
 710 | 		* Represent and configure components of a stack
 711 | 		* E.g. loadbalancer layer, app layer, db layer
 712 | 		* Share common configuration elements
 713 | 	* **Instance**
 714 | 		* Units of compute within the platform
 715 | 		* Must be associated with at least one layer
 716 | 		* Can run
 717 | 			* 24/7
 718 | 			* Load-based
 719 | 			* Time-based
 720 | 	* **Application**
 721 | 		* Applications that are deployed on one or more instances
 722 | 		* Deployed through source code repo or S3
 723 | * Recipes
 724 |   * Written in Ruby, used to customize different layers
 725 |   * Run at stack lifecycle events
 726 |     * `setup`
 727 |       * *Instance* has finished booting
 728 |     * `configure`
 729 |       * *Instance* enters or leaves the `online` state
 730 |       * *Elastic IP* is associated or disassociated
 731 |       * *Load balancer* is attached or detached
 732 |       * Event is executed on *all* instances, not only the impacted one
 733 |     * `deploy`
 734 |       * *Deploy command* is run on an instance
 735 |     * `undeploy`
 736 |       * *Undeploy command* is run on an instance
 737 |       * *App* is deleted
 738 |     * `shutdown`
 739 |       * When *instance* is shutdown, before termination
 740 |       * Allows cleanup
 741 | * Under the hood
 742 | 	* *OpsWorks* **agent**
 743 | 		* Configuration of machines
 744 | 	* *OpsWorks* **automation engine**
 745 | 		* *Create*, *update* & *delete* of various AWS components
 746 | 		* Handles *loadbalancing*, *autoscaling* and *autohealing*
 747 | 		* Supports *lifecycle* events
 748 | 
 749 | <a name="6_1_1"></a>
 750 | ### Berkshelf
 751 | * Addresses a shortcoming of older *OpsWorks* versions: only one repository for recipes
 752 | * Was added to *OpsWorks* with Chef 11.10 support and allows installing cookbooks from multiple repositories
 753 | 
 754 | TODO: Quickstart OpsWorks
 755 | 
 756 | <a name="6_2"></a>
 757 | ## [↖](#top)[↑](#6_1_1)[↓](#6_2_1) CloudFormation
 758 | 
 759 | <a name="6_2_1"></a>
 760 | ### Overview
 761 | * Allows creating and provisioning **resources** in a reusable **template** fashion
 762 | 	* A *CloudFormation* template is a `JSON` or `YAML` formatted text file
 763 | * Related resources are managed in a single unit called a **stack**
 764 | 	* Controls lifecycle of managed resources
 765 | 	* All the resources in a stack are defined by the stack's *CloudFormation* template
 766 | 	* Stack has `name` & `id`
 767 | * Two ways to update a stack
 768 | 	* *Direct update*
 769 | 		* Directly applies changes (if any)
 770 | 	* *Change set*
 771 | 		* Summary of proposed changes, can be applied or rejected
 772 | * Will **rollback** stack if it fails to create (can be disabled via API/console)
 773 | * A **stack policy** is an *IAM*-style policy statement that governs who can do what
 774 | 
 775 | <a name="6_2_2"></a>
 776 | ### Templates
 777 | * `AWSTemplateFormatVersion`
 778 | * `Description`
 779 | * `Metadata`
 780 | 	* Details about the template
 781 | * `Parameters`
 782 | 	* Values to pass in right before template creation
 783 | 		* Type
 784 | 			* `String`, `Number`, `List`, `CommaDelimitedList`
 785 | 			* AWS-specific types like `AWS::EC2::KeyPair::KeyName`
 786 | 		* Description
 787 | 		* Default Value
 788 | 		* Allowed Values
 789 | 		* Allowed Pattern
 790 | 			* Validation per *regular expression*
 791 | 		* MinLength/MaxLength
 792 | 		* MinValue/MaxValue
 793 | 	* Problem:
 794 | 		* Usage of parameters *might* make it hard to instantiate stacks without human interaction
 795 | 		* *CloudFormation* is able to auto-generate many resources attributes, e.g. name
 796 | * `Mappings`
 797 | 	* Maps keys to values (eg different values for different regions)
 798 | * `Conditions`
 799 | 	* Check values before deciding what to do
 800 | * `Resources`
 801 | 	* Creates resources. Only mandatory section in a template.
 802 | 	* Can have `Condition` element to toggle creation
 803 | * `Outputs`
 804 | 	* Values to be exposed from the console or from API calls.
 805 | 	* Can be used in a different stack (*cross stack references*)
 806 | 	* Can be:
 807 | 		* Constructed value
 808 | 		* Parameter reference
 809 | 		* Pseudo parameter
 810 | 		* Output from a function like `Fn::GetAtt` or `Ref`
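
To see how the sections fit together, here is a minimal template built as a Python dict and serialized to JSON. Resource, parameter and export names are invented for this sketch; it is one possible layout, not a recommended one.

```python
import json


def build_template(default_bucket_name=""):
    """Return a small CloudFormation template as a dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: one S3 bucket with an optional name",
        "Parameters": {
            "BucketName": {
                "Type": "String",
                "Default": default_bucket_name,
                "Description": "Leave empty to let CloudFormation generate a name",
            }
        },
        "Conditions": {
            # True only if a non-empty bucket name was passed in.
            "HasBucketName": {"Fn::Not": [{"Fn::Equals": [{"Ref": "BucketName"}, ""]}]}
        },
        "Resources": {
            "LogBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "BucketName": {
                        "Fn::If": ["HasBucketName", {"Ref": "BucketName"}, {"Ref": "AWS::NoValue"}]
                    }
                },
            }
        },
        "Outputs": {
            "LogBucketArn": {
                "Value": {"Fn::GetAtt": ["LogBucket", "Arn"]},
                "Export": {"Name": "LogBucketArn"},  # usable via Fn::ImportValue elsewhere
            }
        },
    }


template_body = json.dumps(build_template(), indent=2)
```

Leaving the parameter empty by default sidesteps the problem mentioned above: the stack can be created without human input and CloudFormation generates the bucket name.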
 811 | 
 812 | <a name="6_2_3"></a>
 813 | ### Intrinsic Functions
 814 | * Used to pass in values that are not available until runtime
 815 | * Usable in `resource` properties, `metadata` attributes, and `update policy` attributes (auto-scaling)
 816 | * `Ref`
 817 | 	* Returns the value of the specified parameter, or the *default* identifier of the specified resource (usually its physical id, e.g. the instance id)
 818 | * `Fn::GetAtt`
 819 | 	* Returns the value of an attribute from an object, either the default or the specified attribute
 820 | 	* Object is either from the same or a nested template
 821 | * `Fn::Join`
 822 | 	* Joins a set of values into a single value separated by the specified delimiter
 823 | * `Fn::Sub`
 824 | 	* Substitutes variables in an input string with values that you specify
 825 | * `Fn::FindInMap`
 826 | 	* Returns the value corresponding to keys in a two-level map that is declared in the *Mappings*
 827 | 	section
 828 | * `Fn::Select`
 829 | 	* Returns a single object from a list of objects by index
 830 | * `Fn::Base64`
 831 | 	* Provides encoding, converts from plain text into base64
 832 | * `Fn::GetAZs`
 833 | 	* Returns an array that lists *Availability Zones* for a specified region
 834 | 	* If region is omitted return AZs from the region the template is applied in
 835 | * `Fn::ImportValue`
 836 | 	* Returns the value of an *Output* exported by another stack
 837 | * `Fn::Split`
 838 | 	* Split a string into a list of string values so that you can select an element from the resulting
 839 | 	string list
 840 | * `Fn::If`
 841 | 	* Takes a list of arguments (`condition`, `value1`, `value2`)
 842 | 	* Returns `value1` if `condition` evaluates to `true`, `value2` otherwise
 843 | * `Fn::And`, `Fn::Equals`, `Fn::Or`, `Fn::Not`
 844 | 	* Good for `condition` element
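
The semantics of a few of these functions can be illustrated with a toy resolver. This is not how CloudFormation is implemented: it only covers `Ref` (parameters only), `Fn::Join`, `Fn::Split`, `Fn::Select`, `Fn::FindInMap` and `Fn::If`, and all sample values are made up.

```python
def resolve(node, params, mappings, conditions):
    """Recursively replace a handful of intrinsic functions with their values."""
    def rec(value):
        return resolve(value, params, mappings, conditions)

    if isinstance(node, list):
        return [rec(item) for item in node]
    if not isinstance(node, dict):
        return node  # plain string, number, ...
    if len(node) == 1:
        (name, arg), = node.items()
        if name == "Ref":
            return params[arg]
        if name == "Fn::If":
            condition_name, if_true, if_false = arg
            return rec(if_true if conditions[condition_name] else if_false)
        if name == "Fn::Join":
            delimiter, values = rec(arg)
            return delimiter.join(values)
        if name == "Fn::Split":
            delimiter, text = rec(arg)
            return text.split(delimiter)
        if name == "Fn::Select":
            index, values = rec(arg)
            return values[int(index)]
        if name == "Fn::FindInMap":
            map_name, top_key, second_key = rec(arg)
            return mappings[map_name][top_key][second_key]
    # Ordinary mapping: resolve its values and keep the keys.
    return {key: rec(value) for key, value in node.items()}


params = {"Env": "prod", "AWS::Region": "eu-west-1", "Subnets": "a,b,c"}
mappings = {"RegionMap": {"eu-west-1": {"AMI": "ami-hypothetical"}}}
conditions = {"IsProd": True}
name = resolve({"Fn::Join": ["-", ["app", {"Ref": "Env"}]]}, params, mappings, conditions)
```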
 845 | 
 846 | ---
 847 | 
 848 | <a name="7"></a>
 849 | # [↖](#top)[↑](#6_2_3)[↓](#7_1) Backups & Recovery
 850 | 
 851 | <a name="7_1"></a>
 852 | ## [↖](#top)[↑](#7)[↓](#7_2) AWS Services with automated backups
 853 | * RDS
 854 |   * Backups
 855 |     * *Transactional* storage engine recommended as DB engine
 856 |     * Degrades performance if multi-AZ is not enabled (taken from standby if enabled)
 857 |     * Deleting an instance deletes all *automated* backups
 858 |     * Backups are stored internally on S3
 859 |     * Point-in-time recovery (PITR) to within 5 minutes
 860 | 
 861 |   * Restoring
 862 |     * When restoring, only default parameters and security groups are associated with instance
 863 |     * Can change to different storage engine if closely related and enough space available
 864 | 
 865 | * ElastiCache
 866 |   * Backups
 867 |     * Available to Redis cluster only
 868 |     * Taking snapshots can degrade performance, should be performed on read replica
 869 |     * Backups are stored internally on S3
 870 | 
 871 | * Redshift
 872 |   * Backups
 873 |     * Provides free storage equal to the storage capacity of the cluster
 874 |     * Snapshots can be automated or manual and are incremental
 875 |     * Backups are stored internally on S3
 876 |   * Restoring
 877 |     * Creates a new cluster and imports the data
 878 | 
 879 | * EC2
 880 |   * Backups
 881 |     * No built-in automated backup solution
 882 |     * Snapshots of EBS volumes are incremental; taking one can degrade performance
 883 |     * Every snapshot will restore *all* data, even if older snapshots are deleted
 884 |     * Backups are stored internally on S3
 885 | 
 886 | <a name="7_2"></a>
 887 | ## [↖](#top)[↑](#7_1)[↓](#7_2_1) Disaster Recovery Scenarios
 888 | 
 889 | <a name="7_2_1"></a>
 890 | ### DR of on-prem infra
 891 | * Use AWS as backup solution by storing VMs, snapshots and other data
 892 | * 'Pilot light' - have bare minimum infra always ready and scale up as required
 893 | * 'Hot standby' (aka 'multi site') - has everything ready to go
 894 | 
 895 | <a name="7_2_2"></a>
 896 | ### DR of cloud infra
 897 | * Duplicate the environment from one region to another
 898 | 
 899 | <a name="7_2_3"></a>
 900 | ### DR of RDS data
 901 | * Protection from multiple AZs being down
 902 | * Reduce latency for global audience
 903 | * Replica lag will most likely go up
 904 |   * Data transfer across regions is getting charged
 905 |   * May potentially run into bandwidth issues
 906 | * Create read replica from existing DB instance, pick different region
 907 |   * Trigger setup process that will take some time
 908 | 
 909 | <a name="7_3"></a>
 910 | ## [↖](#top)[↑](#7_2_3)[↓](#8) Storing log files and backups
 911 | * Implement centralized logging
 912 |   * From there
 913 |     * Send to 3rd party tool for analysis
 914 |     * Backup to S3
 915 |       * 11x9 durability
 916 |       * Versioning
 917 |       * Lifecycle policies
 918 | 
 919 | * Other logging options
 920 |   * S3 access logs
 921 |   * CloudTrail
 922 |   * CloudWatch
 923 | 
 924 | ---
 925 | 
 926 | <a name="8"></a>
 927 | # [↖](#top)[↑](#7_3)[↓](#8_1) Security
 928 | 
 929 | <a name="8_1"></a>
 930 | ## [↖](#top)[↑](#8)[↓](#8_1_1) Implement and Manage Security Policies
 931 | 
 932 | <a name="8_1_1"></a>
 933 | ### IAM
 934 | IAM is a global service that helps to securely control access to AWS resources.
 935 | 
 936 | * **Users** hold credentials
 937 | * **Groups** hold users, typically only provide permission to assume a role
 938 | * **Roles** hold policies.
 939 | 	* Can have **trust relationships** with trusted entities that can *assume* this role
 940 | * **Policies** can be attached to users, groups or roles (preferred)
 941 | * An **instance profile** is a container for an IAM role that you can use to pass role information to an
 942 | 	EC2 instance when the instance starts.
 943 | * Users and/or services assume roles
 944 | 
 945 | <a name="8_1_1_1"></a>
 946 | #### Policies
 947 | * Any actions on resources that are not explicitly allowed are **denied by default**
 948 | * Structure
 949 | 	* **E** - `effect` (*allow*/*deny*)
 950 | 		* What the effect will be when the user requests the specific action
 951 | 	* **P** - `principal` (*ARN*)
 952 | 		* The account or user who is allowed access to the actions and resources in the statement
 953 | 		* IAM policies do not have a principal (because they are attached to users, groups or roles)
 954 | 	* **A** - `action` or `notaction`
 955 | 		* Describes the specific action or actions that will be allowed or denied
 956 | 	* **R** - `resource` or `notresource`
 957 | 		* Specifies the object or objects that the statement covers
 958 | 	* **C** - `condition`
 959 | 		* Specifies conditions for when a policy is in effect
 960 | * Can use **policy variables**
 961 | 	* `aws:currentTime`, `aws:userid`, ...
 962 | 
 963 | ```
 964 | 	{
 965 | 		"Version": "2012-10-17",
 966 | 		"Statement": [
 967 | 			{
 968 | 				"Effect": "Allow",
 969 | 				"Action": "s3:ListAllMyBuckets",
 970 | 				"Resource": "arn:aws:s3:::*"
 971 | 			},
 972 | 			{
 973 | 				"Effect": "Allow",
 974 | 				"Action": [
 975 | 						"s3:ListBucket",
 976 | 						"s3:GetBucketLocation"
 977 | 				],
 978 | 				"Resource": "arn:aws:s3:::productionapp"
 979 | 			},
 980 | 			{
 981 | 				"Effect": "Allow",
 982 | 				"Action": [
 983 | 					"s3:GetObject",
 984 | 					"s3:PutObject",
 985 | 					"s3:DeleteObject"
 986 | 				],
 987 | 				"Resource": "arn:aws:s3:::productionapp/*"
 988 | 			}
 989 | 		]
 990 | 	}
 991 | ```
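
How *deny by default* and explicit denies play together can be sketched with a tiny evaluator. This is a strong simplification and not the real IAM evaluation logic: it ignores principals, conditions, `NotAction`/`NotResource`, treats action names as case-sensitive, and uses shell-style wildcard matching. The second policy is invented to show an explicit deny.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _matches(patterns, value):
    return any(fnmatchcase(value, pattern) for pattern in _as_list(patterns))


def is_allowed(policies, action, resource):
    """Explicit deny wins, then explicit allow, otherwise deny by default."""
    allowed = False
    for policy in policies:
        for statement in _as_list(policy["Statement"]):
            if not _matches(statement["Action"], action):
                continue
            if not _matches(statement["Resource"], resource):
                continue
            if statement["Effect"] == "Deny":
                return False  # an explicit deny cannot be overridden
            allowed = True
    return allowed


app_policy = {"Version": "2012-10-17", "Statement": [
    {"Effect": "Allow", "Action": "s3:ListAllMyBuckets", "Resource": "arn:aws:s3:::*"},
    {"Effect": "Allow", "Action": ["s3:GetObject", "s3:PutObject"],
     "Resource": "arn:aws:s3:::productionapp/*"},
]}
deny_policy = {"Version": "2012-10-17", "Statement": [
    {"Effect": "Deny", "Action": "s3:PutObject",
     "Resource": "arn:aws:s3:::productionapp/secret/*"},
]}
```
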
 992 | <a name="8_1_1_2"></a>
 993 | #### IAM Policies
 994 | * Managed policies (the new way)
 995 | 	* Can be attached to multiple users, groups and roles
 996 | 	* AWS managed policies
 997 | 		* Updated by AWS when new APIs come out
 998 | 	* Customer managed policies
 999 | * Inline policies (the old way)
1000 | 
1001 | <a name="8_1_1_3"></a>
1002 | #### IAM roles and EC2
1003 | 
1004 | * Create an IAM role.
1005 |   * Define which accounts or AWS services can assume the role.
1006 |     * EC2 here, could be other services
1007 |   * Define which API actions and resources the application can use after assuming the role.
1008 |   * Specify the role when you launch your instance, or attach the role to a running or stopped instance.
1009 |   * Have the application retrieve a set of temporary credentials and use them.
1010 | 
1011 | * Only one role can be assigned to an EC2 instance, and all applications share the same role and permissions
1012 | 
1013 | <a name="8_1_2"></a>
1014 | ### S3 IAM and bucket policy concepts
1015 | 
1016 | <a name="8_1_2_1"></a>
1017 | #### Defaults
1018 | * Bucket is *owned* by the AWS account that created it
1019 | 	* Bucket ownership is not transferable
1020 | * Bucket owner gets full permission (ACL)
1021 | * The person paying the bills always has full control.
1022 | * A person uploading an object into a bucket owns it by default.
1023 | 
1024 | <a name="8_1_2_2"></a>
1025 | #### Bucket policies (resource level)
1026 | * Specify what actions are allowed or denied for which principals on the bucket that the policy
1027 |   is attached to
1028 | * Attached *only* to S3 buckets. Can however affect objects in buckets.
1029 | * Contains *principal* element (unnecessary for IAM policies)
1030 | * Use if you’re more interested in *“Who can access this S3 bucket?”*
1031 | * Easiest way to grant *cross-account permissions* for all `s3:*` permission. (Cannot do this
1032 |   with ACLs.)
1033 | * Explicit *deny* in bucket policy overrides explicit *allow* in IAM policy
1034 | * Defined as JSON
1035 | 
1036 | ```
1037 | {
1038 | "Version":"2012-10-17",
1039 | "Statement":
1040 |   [
1041 |     {
1042 |       "Sid":"PutObjectAcl",
1043 |       "Effect":"Allow",
1044 |       "Principal":
1045 |       {
1046 |         "AWS":
1047 |           [
1048 |            "arn:aws:iam::111122223333:user/tom", "arn:aws:iam::444455556666:user/chris"
1049 |           ]
1050 |       },
1051 |       "Action":
1052 |         [
1053 |           "s3:PutObject",
1054 |           "s3:PutObjectAcl"
1055 |         ],
1056 |         "Resource":
1057 |         [
1058 |           "arn:aws:s3:::examplebucket/*"
1059 |         ]
1060 |     }
1061 |   ]
1062 | }
1063 | ```
1064 | 
1065 | <a name="8_1_2_3"></a>
1066 | #### ACLs
1067 | * Defined as XML. Legacy, not recommended any more.
1068 | * Can
1069 | 	* be attached to individual objects (bucket policies only bucket level)
1070 | 	* control access to object uploaded into a bucket from a *different* account.
1071 | * Cannot..
1072 | 	* have conditions
1073 | 	* explicitly deny actions
1074 | 	* grant permission to bucket sub-resources (eg. lifecycle or static website configurations)
1075 | * Other than *object ACL*s there are *bucket ACL*s as well - only for writing access log objects to a
1076 | bucket.
1077 | ```
1078 | <?xml version="1.0" encoding="UTF-8"?>
1079 | <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
1080 |   <Owner>
1081 |     <ID>*** Owner-Canonical-User-ID ***</ID>
1082 |     <DisplayName>owner-display-name</DisplayName>
1083 |   </Owner>
1084 |   <AccessControlList>
1085 |     <Grant>
1086 |       <Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
1087 |                xsi:type="CanonicalUser">
1088 |         <ID>*** Owner-Canonical-User-ID ***</ID>
1089 |         <DisplayName>display-name</DisplayName>
1090 |       </Grantee>
1091 |       <Permission>FULL_CONTROL</Permission>
1092 |     </Grant>
1093 |   </AccessControlList>
1094 | </AccessControlPolicy> 
1095 | ```
1096 | 
1097 | <a name="8_1_2_4"></a>
1098 | #### IAM policies (user level)
1099 | * IAM policies (in general) specify what actions are allowed or denied on what AWS resources
1100 | * Attached to IAM users, groups, or roles (so they cannot grant access to anonymous users)
1101 | * Use if you’re more interested in *“What can this user do in AWS?”*
1102 | 
1103 | ARN|Matches
1104 | -|-
1105 | `arn:partition:service:region:namespace:relative-id`|General ARN format, e.g. `arn:aws:s3:::mybucket`
1106 | `arn:aws:s3:::*`|All buckets and objects in account
1107 | `arn:aws:s3:::mybucket`|`mybucket`
1108 | `arn:aws:s3:::mybucket/*`|All objects in `mybucket`
1109 | `arn:aws:s3:::mybucket/mykey`|`mykey` in `mybucket`
1110 | `arn:aws:s3:::mybucket/developers/${aws:username}/`|Folder matching the accessing user's name
1111 | 
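A minimal sketch of how a per-user folder ARN like the one in the table might be used in an IAM policy, built here as a Python dictionary. The bucket name `mybucket`, the `developers/` prefix, the `Sid` values and the helper `user_folder_policy` are assumptions for illustration. `${aws:username}` is an IAM policy variable and is kept as a literal string, since AWS resolves it when the request is evaluated.

```python
import json


def user_folder_policy(bucket: str) -> dict:
    """Build an IAM policy that limits each user to their own folder.

    Hypothetical helper: the folder layout (developers/<user name>/) is an
    assumption taken from the ARN table, not a fixed AWS convention.
    """
    # Plain string on purpose: IAM substitutes ${aws:username} at request time.
    folder = "developers/${aws:username}"
    return {
        # Policy variables require the 2012-10-17 policy version.
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOwnFolder",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                # Listing is a bucket-level action, so the folder is
                # restricted through the s3:prefix condition key.
                "Condition": {"StringLike": {"s3:prefix": [f"{folder}/*"]}},
            },
            {
                "Sid": "ReadWriteOwnFolder",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{folder}/*",
            },
        ],
    }


def as_json(policy: dict) -> str:
    """Serialize the policy the way it would be pasted into IAM."""
    return json.dumps(policy, indent=2)
```

Calling `as_json(user_folder_policy("mybucket"))` yields a JSON document that could be attached to a user, group or role. Note that there is no `Principal` element, in contrast to the bucket policy shown earlier.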
1112 | <a name="8_1_2_5"></a>
1113 | #### CloudFront
1114 | * Can use a CloudFront Origin Access Identity to restrict access to S3 objects
1115 | 
1116 | <a name="8_2"></a>
1117 | ## [↖](#top)[↑](#8_1_2_5)[↓](#8_2_1) Ensure Data Integrity and Access Controls when Using the AWS Platform
1118 | 
1119 | <a name="8_2_1"></a>
1120 | ### MFA
1121 | * *Should* be turned on for all console access
1122 | * *Can* be enabled for API access as well
1123 |   * The administrator configures an AWS MFA device for each user who needs to make API requests that
1124 |   require MFA authentication.
1125 |   * The administrator creates policies for the users that include a *Condition* element that checks
1126 |   whether the user authenticated with an AWS MFA device.
1127 |   * The user calls one of the AWS STS API operations that support the MFA parameters, `AssumeRole` or
1128 |   `GetSessionToken`, depending on the scenario for MFA protection. As part of the
1129 |   call, the user includes the device identifier for the device that's associated with the user. The
1130 |   user also includes the time-based one-time password (TOTP) that the device generates. In either case,
1131 |   the user gets back temporary security credentials that the user can then use to make additional
1132 |   requests to AWS.
1133 |   * This is not supported by all services (supported by *SQS*, *SNS*, *S3*)
1134 | 
1135 | * *MFA delete* can be enabled by the root account (bucket owner); MFA is then required before an object is permanently deleted
1136 | 
1137 | ```
1138 | {
1139 |   "Version": "2012-10-17",
1140 |   "Statement": [{
1141 |     "Effect": "Allow",
1142 |     "Principal": {"AWS": ["ALICE", "BOB"]},
1143 |     "Action": [ "s3:PutObject", "s3:DeleteObject" ],
1144 |     "Resource": ["arn:aws:s3:::Alice-Bucket/*"],
1145 |     "Condition": {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
1146 |   }]
1147 | }
1148 | ```
1149 | 
1150 | <a name="8_2_2"></a>
1151 | ### Security Token Service (STS)
1152 | * Grants **temporary access** to authenticated users
1153 | 	* IAM users
1154 | 	* Web-based identity providers (Google, Facebook, ...)
1155 | 	* Organization's existing identity system
1156 | * Returns **temporary credentials** that expire after some time:
1157 | 	* Access key (access key ID and secret access key)
1158 | 	* Session token
1159 | 
1160 | <a name="8_2_2_1"></a>
1161 | #### Terms
1162 | * **Federation**
1163 | 	* Trust relationship between identity provider and AWS
1164 | * **Identity broker**
1165 | 	* Broker in charge of mapping user to the right set of credentials
1166 | * **Identity store**
1167 | 	* E.g. Google or Facebook
1168 | * **Identities**
1169 | 	* Users
1170 | 
1171 | <a name="8_2_2_2"></a>
1172 | #### Scenarios
1173 | * Temporary credentials with EC2
1174 | 	* Assign IAM role to instance
1175 | 	* Get temp credentials from *instance metadata*
1176 | * Temporary credentials with SDK
1177 | 	* Call `assumeRole`, extract temp credentials
1178 | * Options for temporary credentials with API calls
1179 | 	* *Sign request* with temp credentials
1180 | 	* Add AK/SK (access key/secret key) to request (*header* or *query string*)
1181 | 
1182 | <a name="8_3"></a>
1183 | ## [↖](#top)[↑](#8_2_2_2)[↓](#8_4) Shared responsibility model
1184 | * **Shared responsibility** environment
1185 | * AWS is responsible for:
1186 | 	* Server/Host level and below
1187 | 	* Physical environment security
1188 | 	* Hardware decommissioning
1189 | 	* Traffic security (Networks, ACLs, SSL, DDoS protection)
1190 | 	* EC2 hypervisor isolation
1191 | * User is responsible for:
1192 | 	* IAM
1193 | 	* MFA
1194 | 	* Password/key-rotation
1195 | 	* Access advisor (shows used permissions)
1196 | 	* Trusted advisor (validates best practices)
1197 | 	* Security groups
1198 | 	* ACL (resource based policy)
1199 | 	* VPC
1200 | 
1201 | <a name="8_4"></a>
1202 | ## [↖](#top)[↑](#8_3)[↓](#9) AWS and IT Audits
1203 | * AWS performs self-audits of changes to key services to monitor quality, maintain high standards, and
1204 | facilitate continuous improvement of the change management process
1205 | * For audits, AWS provides:
1206 |   * *Security of the cloud*
1207 |   * Information regarding their global infrastructure
1208 |   * Everything from the host operating system and virtualization layer down to the physical security of facilities
1209 |   * Annual certifications and reports (like the Service Organization Control (SOC) reports, ISO 27001
1210 |   cert, PCI assessments)
1211 | * For audits, the customer provides:
1212 |   * *Security in the cloud*
1213 |   * Anything their organization puts on (or connects to) their AWS assets
1214 |     * Examples: guest operating system, apps on virtual machine instances, objects in S3, databases like RDS,
1215 |     etc.
1216 | 
1217 | ---
1218 | 
1219 | <a name="9"></a>
1220 | # [↖](#top)[↑](#8_4)[↓](#9_1) Networking
1221 | 
1222 | <a name="9_1"></a>
1223 | ## [↖](#top)[↑](#9)[↓](#9_1_1) Route53 Routing Policies
1224 |   * *Simple*
1225 |   * *Weighted*
1226 |   * *Latency*
1227 |   * *Failover*
1228 |   * *Geolocation*
1229 | 
1230 | <a name="9_1_1"></a>
1231 | ### DNS Failover
1232 | * Can set up *health checks* for endpoints or domains from within *Route53*
1233 |   * Route 53 has health checkers in locations around the world. When you create a health check that
1234 |     monitors an endpoint, health checkers start to send requests to the endpoint that you specify
1235 |     to determine whether the endpoint is healthy.
1236 |   * `evaluate target health` lets an alias record inherit the health of the resource it points to
1237 | * DNS records are then associated with health checks and can be configured to fail over as
1238 |   well (1 primary and n secondary record sets)
1239 | 
1240 | <a name="9_1_2"></a>
1241 | ### Weighted
1242 | * Control distribution of traffic with DNS entries
1243 |   * This can be based on a certain percentage
1244 |   * Set *routing policy* to weighted (instead of failover)
1245 | 
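The percentage mentioned above follows from the weights: each record is expected to receive its weight divided by the sum of all weights in the group. A small sketch of that arithmetic, where the record names `blue` and `green` and the helper `traffic_share` are made up for illustration and health checks are ignored:

```python
def traffic_share(weights: dict[str, int]) -> dict[str, float]:
    """Return the expected fraction of DNS responses per weighted record.

    Simplified model: health checks are ignored. If every weight is 0,
    Route 53 answers with all records with equal probability, which is
    mirrored here.
    """
    if not weights:
        raise ValueError("need at least one record")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must not be negative")

    total = sum(weights.values())
    if total == 0:
        equal = 1 / len(weights)
        return {name: equal for name in weights}
    return {name: w / total for name, w in weights.items()}


# Hypothetical blue/green rollout: send about 10% of traffic to "green".
shares = traffic_share({"blue": 90, "green": 10})
```

Because DNS answers are cached by resolvers for the record's TTL, the observed split only approximates these fractions.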
1246 | <a name="9_1_3"></a>
1247 | ### Latency-based
1248 | * Control distribution of traffic based on latency.
1249 | 
1250 | <a name="9_2"></a>
1251 | ## [↖](#top)[↑](#9_1_3)[↓](#9_2_1) VPC Essentials
1252 | * Provisions a logically isolated section of the AWS cloud
1253 | * Spans all AZs in a region
1254 | * Allows creating layered architectures
1255 | * Shared or dedicated tenancy (exclusive hardware or not)
1256 | * *Security groups* and subnet *network ACLs*
1257 | * Ability to extend an on-premises network to the cloud
1258 | 
1259 | <a name="9_2_1"></a>
1260 | ### Default VPC (Amazon specific)
1261 | * Gives easy access to a VPC without having to configure it from scratch
1262 | * Has a default subnet in each AZ and one internet gateway attached to the VPC
1263 | * Each instance launched automatically receives a *public IP* (very different to non-default VPC)
1264 | * A deleted default VPC cannot be restored, but a new default VPC can be created
1265 | 
1266 | <a name="9_2_2"></a>
1267 | ### Non-default VPC (regular VPC)
1268 | * Instances only receive private IP addresses by default
1269 | * Resources *only* accessible through *Elastic IP*, *VPN* or *internet gateways*
1270 | * Does not have an internet gateway attached by default
1271 | 
1272 | <a name="9_2_3"></a>
1273 | ### VPC Peering
1274 | * Connect VPCs through direct network routing
1275 | * Can occur between VPCs in different accounts and in different regions (inter-region peering)
1276 | * Allows instances to communicate with each other as if they were in the same network
1277 | * CIDRs must not overlap
1278 | 
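The non-overlap requirement can be checked up front with Python's standard `ipaddress` module. This is only a sketch of the CIDR rule; the function name `can_peer` and the example ranges are illustrative, and a real peering request has further preconditions.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Return True if two VPC CIDR blocks do not overlap.

    Only the CIDR rule is checked here; accounts, regions and peering
    limits are outside the scope of this sketch.
    """
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


# Disjoint ranges: peering is possible as far as CIDRs are concerned.
ok = can_peer("10.0.0.0/16", "10.1.0.0/16")

# The second range lies inside the first one, so the blocks overlap.
blocked = can_peer("10.0.0.0/16", "10.0.128.0/17")
```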
1279 | <a name="9_2_4"></a>
1280 | ### VPC Scenarios
1281 | * VPC with a single public subnet -> single tier apps
1282 | * VPC with public and private subnets -> layered apps
1283 | * VPC with public and private subnets and hardware VPN access -> extending apps to on-premises
1284 | * VPC with a private subnet only and hardware VPN access -> extending the on-premises network into the cloud
1285 | 
1286 | <a name="9_2_5"></a>
1287 | ### Components
1288 | * **Subnet**
1289 | 	* In exactly one AZ
1290 | 	* If a subnet doesn't have a route to the Internet gateway, it's known as a *private* subnet
1291 | 		* Instances receive
1292 | 			* *Private IP* address
1293 | 			* Internal DNS hostname
1294 | 	* If traffic is routed to an Internet gateway, the subnet is known as a *public* subnet
1295 | 		* Instances receive
1296 | 			* *Public IP* address
1297 | 			* External DNS hostname
1298 | 	* EC2 instances are launched into subnets
1299 | 	* Use ssh-agent forwarding to connect from public to private instances
1300 | 	* Sometimes grouped into Subnet Groups, e.g. for caching or DB. Typically across AZs
1301 | * **Route Table**
1302 | 	* Contains a set of rules, called routes, that determine where network traffic is directed
1303 | 	* Each VPC automatically comes with a main route table that can be configured
1304 | 	* Each subnet in a VPC must be associated with a route table; the table controls the routing
1305 | 	for the subnet. A subnet can only be associated with one route table at a time, but multiple
1306 | 	subnets can be associated with the same route table
1307 | 	* Each route in a table specifies a destination CIDR and a target
1308 | 	* Every route table contains a local route for communication within the VPC
1309 | 	* Can have a *default route* 0.0.0.0/0 to route everything that doesn't have a specific rule
1310 | * **Elastic IP**
1311 | 	* Static IPv4 address mapped to an *instance* or *network interface*
1312 | 	* If attached to network interface it's decoupled from the instance's lifecycle
1313 | 	* Routes to *private IP* address of instance
1314 | 	* Can be remapped in case of failure.
1315 | 	* For use in a specific region only
1316 | 	* Can only map to instances in public subnets
1317 | * **Gateways**
1318 | 	* *Internet Gateway*
1319 | 		* Horizontally scaled, redundant, and highly available VPC component that allows communication
1320 | 		between instances in a VPC and the internet
1321 | 		* Provides a target in VPC route tables for internet-routable traffic
1322 | 		* Performs network address translation (NAT) for instances that have been assigned public
1323 | 		IPv4 addresses
1324 | 	* *Virtual Private Gateway*
1325 | 		* Has VPN connection to customer gateway attached
1326 | 		* Serves as VPN concentrator on the Amazon side of the VPN connection
1327 | 	* *Customer Gateway*
1328 | 		* A physical device or software application on your side of the VPN connection
1329 | * **NAT**
1330 | 	* *NAT Instances*
1331 | 		* Manually configured instance from a NAT AMI
1332 | 	* *NAT Gateway*
1333 | 		* AWS-managed service
1334 | 
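To make the route table bullets more concrete, here is a rough sketch of how a destination is matched against routes: the most specific matching CIDR (longest prefix) wins, and `0.0.0.0/0` catches everything else. The table contents and the target names `vgw-hypothetical` and `igw-hypothetical` are invented for the example.

```python
import ipaddress

# Hypothetical route table: (destination CIDR, target).
ROUTES = [
    ("10.0.0.0/16", "local"),             # local route inside the VPC
    ("172.16.0.0/12", "vgw-hypothetical"),  # on-premises range via VPN
    ("0.0.0.0/0", "igw-hypothetical"),      # default route
]


def lookup(routes, destination_ip):
    """Return the target of the most specific route matching the address.

    Returns None if no route matches, i.e. there is no default route.
    """
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr, target in routes:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # Longest prefix = most specific route.
    return max(matches, key=lambda match: match[0])[1]
```

With this table, `lookup(ROUTES, "10.0.1.5")` stays on the local route, while an address such as `8.8.8.8` only matches the default route and is sent to the internet gateway.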
1335 | <a name="9_2_6"></a>
1336 | ### Security
1337 | <a name="9_2_6_1"></a>
1338 | #### Network ACL
1339 | * Subnet level, acting as firewall
1340 | * Rules for inbound and outbound traffic
1341 | * Rules have numbers and are evaluated from low to high, first matching rule wins, others are *not* evaluated
1342 | * *Stateless*
1343 | 
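The evaluation order described above can be sketched as follows. This is a simplified model with assumptions: a rule matches on a source CIDR and a single port only (real rules also carry protocol and port ranges), and the `Rule` class and `evaluate` function are made up for the example. The final `return False` stands in for the implicit deny rule that applies when nothing else matches.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int   # lower numbers are evaluated first
    cidr: str     # source range the rule applies to
    port: int     # simplified: one port instead of a range
    allow: bool   # True = ALLOW, False = DENY


def evaluate(rules, source_ip, port):
    """Return True if the packet is allowed by the first matching rule."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port and ip in ipaddress.ip_network(rule.cidr):
            # First match wins; later rules are not evaluated.
            return rule.allow
    # No rule matched: implicit deny.
    return False


# Rule 100 denies one range before rule 200 allows SSH from anywhere.
RULES = [
    Rule(200, "0.0.0.0/0", 22, True),
    Rule(100, "203.0.113.0/24", 22, False),
]
```

Because the network ACL is stateless, return traffic would need its own matching rule in the opposite direction; that part is not modelled here.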
1344 | <a name="9_2_6_2"></a>
1345 | #### Security Groups
1346 | * Acts as a virtual firewall to control inbound and outbound traffic to instances
1347 | * Acts on instance level, not subnet level
1348 | * Rules for inbound and outbound traffic
1349 | * *Stateful* - will always allow response to (allowed) outbound traffic
1350 | * Can refer to other security groups, e.g. allow traffic from there
1351 | 
1352 | <a name="9_2_6_3"></a>
1353 | #### Structure & packet flow
1354 | * VPC (has *CIDR*)
1355 | 	* Gateway (Internet or VPN)
1356 | 	* Route table (one per subnet, can be shared)
1357 | 	* Network ACL (one per subnet, can be shared)
1358 | 	* Subnets (CIDRs must fall within the VPC's CIDR)
1359 | 	* Security Group (on VPC level)
1360 | 	* Instance (needs public IP for internet communication, either ELB or Elastic IP)
1361 | 
1362 | * Flow from internet
1363 |   * Internet Gateway
1364 |   * VPC Router (routes into desired subnet)
1365 |   * Route Table (of that subnet)
1366 |   * NACL
1367 |   * Security Group
1368 |   * Instance
1369 | 
1370 | <a name="9_2_6_4"></a>
1371 | #### Connection To On-prem Network/Direct Connect
1372 | * VPC
1373 |   * (has attached) Virtual Private Gateway
1374 |   * (has attached) VPN Connection
1375 |   * (has attached) Customer Gateway
1376 | 
1377 | TODO: VPN vs direct connect. Can I use VPN instead of DC?
1378 | 
1379 | <a name="9_3"></a>
1380 | ## [↖](#top)[↑](#9_2_6_4)[↓](#10) Limits:
1381 | Resource|Default limit
1382 | -|-
1383 | VPCs per region|5
1384 | Subnets per VPC|200
1385 | Customer gateways per region|50
1386 | Virtual private gateways per region|5
1387 | Virtual private gateways per VPC|1
1388 | Internet gateways per region|5
1389 | Elastic IPs per account per region|5
1390 | VPN connections per region|50
1391 | Route tables per region|200
1392 | Security groups per region|500
1393 | 
1394 | <a name="10"></a>
1395 | # [↖](#top)[↑](#9_3)[↓](#10_1) Etc
1396 | 
1397 | <a name="10_1"></a>
1398 | ## [↖](#top)[↑](#10)[↓](#10_2) Accessing the OS
1399 | * Services that allow access to the underlying OS
1400 |   * EC2
1401 |   * ECS
1402 |   * EB (Elastic Beanstalk)
1403 |   * EMR (Elastic MapReduce)
1404 |   * OpsWorks
1405 | * Services that hide the OS away (managed services)
1406 |   * DynamoDB
1407 |   * RDS
1408 | 
1409 | <a name="10_2"></a>
1410 | ## [↖](#top)[↑](#10_1)[↓](#10_3) SQS
1411 |   * Default message retention period: 4 days (max 14 days)
1412 |   * `DelaySeconds` will delay a message appearing in the queue
1413 |   * Setting `WaitTimeSeconds` will enable *long polling* (can be more cost efficient)
1414 | 
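A small sketch of how these three settings could be expressed as queue attributes, with the documented value ranges checked locally. The helper `queue_attributes` is hypothetical. The attribute names are the queue-level ones, where `ReceiveMessageWaitTimeSeconds` is the queue-wide counterpart of the per-call `WaitTimeSeconds` parameter. No AWS call is made here.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def queue_attributes(retention_days: int = 4,
                     delay_seconds: int = 0,
                     wait_seconds: int = 0) -> dict:
    """Build an SQS attribute map after validating the value ranges.

    Hypothetical helper. Ranges: retention 60 s to 14 days, delay 0 to
    900 s, long-polling wait 0 to 20 s. Attribute values are strings.
    """
    retention = retention_days * SECONDS_PER_DAY
    if not 60 <= retention <= 14 * SECONDS_PER_DAY:
        raise ValueError("retention must be between 60 seconds and 14 days")
    if not 0 <= delay_seconds <= 900:
        raise ValueError("delay must be between 0 and 900 seconds")
    if not 0 <= wait_seconds <= 20:
        raise ValueError("wait time must be between 0 and 20 seconds")

    return {
        "MessageRetentionPeriod": str(retention),
        "DelaySeconds": str(delay_seconds),
        # Any value above 0 turns on long polling for the queue.
        "ReceiveMessageWaitTimeSeconds": str(wait_seconds),
    }


# Defaults plus the maximum long-polling wait.
attrs = queue_attributes(wait_seconds=20)
```

A map like this could then be passed as the `Attributes` argument when creating the queue with an SDK.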
1415 | <a name="10_3"></a>
1416 | ## [↖](#top)[↑](#10_2)[↓](#) DynamoDB
1417 |   * Prefix partition key with hash to enforce even distribution of IO across many partitions
1418 | 
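One possible reading of the note above, sketched in Python: a frequently written partition key value (for example a date) gets a shard prefix derived from a hash of another item attribute, so writes spread over several partition key values while readers can still recompute or enumerate the prefixes. The function names, the `#` separator and the shard count of 10 are assumptions, not DynamoDB requirements.

```python
import hashlib


def sharded_key(hot_key: str, item_id: str, shards: int = 10) -> str:
    """Prefix a hot partition key with a shard number derived from item_id.

    The shard is calculated deterministically (not randomly), so a reader
    that knows item_id can rebuild the exact key.
    """
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    return f"{shard}#{hot_key}"


def all_shard_keys(hot_key: str, shards: int = 10) -> list[str]:
    """List every key variant, e.g. to query all shards and merge results."""
    return [f"{shard}#{hot_key}" for shard in range(shards)]


# Hypothetical example: orders written under the same date.
key = sharded_key("2024-01-01", "order-1")
```

The trade-off is on the read side: fetching everything for one date means querying each key returned by `all_shard_keys` and combining the results.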


--------------------------------------------------------------------------------