├── README.md ├── SAA-C02_notes.md └── pic ├── AD_compatibility.png ├── CDCI.png ├── CMK_sym.png ├── CMKs.png ├── Cognito_pool.png ├── Container.png ├── DMS_source_target.png ├── EBS_TYPES.png ├── ECS_components.png ├── ECS_security.png ├── HA_bastion_1.png ├── HA_bastion_2.png ├── HA_example.png ├── NAT.png ├── NAT_comparison_J.png ├── SAM.png ├── Set_dx.png ├── VPC_pic.png ├── WAF_CloudFront.png ├── bastion.png ├── cross_zone_lb.png ├── dx.png ├── elasticache_2.png ├── endpoint.png ├── port.png ├── private_link.png ├── s3_tier.png ├── serverless.png ├── site-to-site-vpn.png └── transit_gateway.png /README.md: -------------------------------------------------------------------------------- 1 | # AWS_SAA_study_material 2 | notes and materials for SAA exams 3 | -------------------------------------------------------------------------------- /SAA-C02_notes.md: -------------------------------------------------------------------------------- 1 | [toc] 2 | 3 | 4 | # To-do 5 | - take [exam readiness](https://aws.amazon.com/training/course-descriptions/exam-workshop-solutions-architect-associate/) 6 | - read AWS FAQ 7 | - Jayendra's blog 8 | - practice exams 9 | - A Cloud Guru (2) 10 | - John Bonso on Udemy (6) 11 | - Whizlab (7+14) 12 | 13 | # To practice 14 | - create your own blog (e.g. 
WordPress) 15 | - create your own VPN 16 | - create your own scraping server 17 | - with database backend and dashboard in frontend 18 | 19 | # About the Exam 20 | ## Format and Logistics 21 | - valid for **2 years**; re-certify at a discount afterwards 22 | - **130** minutes 23 | - **65** multiple choice questions 24 | - some are multiple selections 25 | - **no penalty** for guessing 26 | - always guess an answer anyway 27 | - can mark questions for later review 28 | - identify key AWS features in the question 29 | - look for key phrases like "with minimum cost" 30 | 31 | ## Test Axioms 32 | - a single-AZ design is unlikely to be the answer 33 | - using AWS managed services should always be preferred 34 | - fault tolerance and high availability are not the same 35 | - expect everything will fail at some point and design accordingly 36 | - if data is unstructured, S3 is usually a preferred solution 37 | - security groups only allow; NACLs can deny 38 | - IAM roles are preferred to access keys 39 | - for *flexible schema*, DynamoDB is preferred 40 | - SAML <=> SSO 41 | - DDoS <=> AWS Shield (+ AWS WAF) 42 | - "Orchestration" 43 | - Container <=> ECS 44 | - Serverless <=> AWS Step Functions 45 | - Tasks <=> SWF 46 | - IPSec <=> VPC VPN 47 | - usually bad practice to use SQS with database for performance; use replicas, ElastiCache or auto scaling if possible 48 | - WORM <=> object lock (in S3) 49 | - FIPS 140-2 Level 2 <=> KMS 50 | - FIPS 140-2 Level 3 <=> CloudHSM 51 | - most AWS services use VPC *Interface* Endpoint except for S3 and DynamoDB, which use VPC *Gateway* Endpoint 52 | 53 | ### To remember for the exam only 54 | 55 | - when creating a VPC, **subnet** and **IGW** are NOT created automatically; **route table**, **NACL** and **security group** are created by default 56 | - When copying an AMI (e.g. during a migration), AWS does NOT copy **launch permissions**, **user-defined tags**, or Amazon **S3 bucket permissions** from the source AMI to the new AMI 57 | - 3 levels of VPC flow log 58 | - 
**VPC** level 59 | - **subnet** level 60 | - **Network Interface** level 61 | - **Gateway** endpoint for S3 and DynamoDB; **Interface** endpoint for rest of all AWS services 62 | - CloudFormation major sections: 63 | - Format Version 64 | - Description 65 | - Metadata 66 | - Parameters 67 | - Mappings 68 | - Conditions 69 | - Transform 70 | - Resources (required) 71 | - Outputs 72 | 73 | ## Domains 74 | 75 | ### Design Resilient Architecture (30%) 76 | - reliable and resilient storage 77 | - EFS 78 | - EBS 79 | - S3 80 | - design decoupling mechanisms 81 | - SQS 82 | - load balancer 83 | - elastic IP: decouple IP address from server 84 | - multi-tier architecture solutions 85 | - high availability and/or fault tolerant solutions 86 | - high availability: user can access service under any circumstance; can allow certain performance degrade 87 | - fault tolerance: user does not experience any issue; more strict requirement 88 | - RTO vs. RPO: 89 | - RTO (Recovery Time Objective): how much time an application can be down without causing significant damage to the business 90 | - RPO (Recovery Point Objective): the amount of data that can be lost before significant harm to the business occurs 91 | 92 | #### HA: Highly Available architecture 93 | - always design for failure 94 | - use multiple AZs and regions wherever you can 95 | - multi-AZ vs. read replicas for RDS 96 | - scaling out vs. 
scaling up 97 | - scaling out: use auto scaling groups (add instances) 98 | - scaling up: increase resources inside EC2 instance (upgrade RAM or CPU) 99 | - beware of cost element 100 | - know different S3 storage classes 101 | ![HA](pic/HA_example.png) 102 | - HA bastion hosts 103 | - option 1: separate hosts in separate AZ; use a network load balancer with static IP address and health checks 104 | - cannot use application load balancer, because it is layer 7 and we need layer 4 105 | ![HA_bastion1](pic/HA_bastion_1.png) 106 | - option 2: one host in an AZ behind an auto scaling group with health checks and a fixed EIP. 107 | - if the host fails, the health check will fail and the auto scaling group will automatically provision a new EC2 instance 108 | - not 100% fault tolerant; will have some downtime 109 | - lowest cost option 110 | ![HA_bastion2](pic/HA_bastion_2.png) 111 | 112 | ### Define Performant Solutions (28%) 113 | - performant storage and databases 114 | - EBS: different types 115 | - S3: host static files (instead of keeping on web server) 116 | - RDS vs. DynamoDB vs. Redshift 117 | - read replicas 118 | - apply caching 119 | - design solutions for elasticity and scalability 120 | 121 | #### HPC: High Performance Computing 122 | - data transfer: see [Jayendra's blog](https://jayendrapatil.com/aws-data-transfer-services/) for more details 123 | - Snowball, Snowmobile 124 | - DataSync 125 | - Direct Connect 126 | - cache 127 | - CloudFront 128 | - API Gateway 129 | - ElastiCache - Memcached and Redis 130 | - DynamoDB Accelerator (DAX) 131 | - compute and network 132 | - EC2 instances (GPU or CPU optimized) 133 | - EC2 fleets (e.g. 
spot fleets) 134 | - placement groups (cluster placement groups for low latency) 135 | - enhanced networking (ENA, VF, EFA) 136 | - storage 137 | - instance-attached storage 138 | - EBS 139 | - instance store 140 | - network storage 141 | - S3: distributed object-based storage; not a file system 142 | - EFS: scale IOPS based on total size, or use provisioned IOPS 143 | - FSx for Lustre: HPC-optimized distributed file system; millions of IOPS; backed by S3 144 | - orchestration and automation 145 | - AWS Batch: run many batch computation jobs 146 | - AWS ParallelCluster 147 | 148 | 149 | ### Specify Secure Applications and Architectures (24%) 150 | - secure application tier 151 | - secure data 152 | - in transit 153 | - SSL 154 | - VPN 155 | - Snowball 156 | - at rest 157 | - data on S3 is private by default and need credentials to access 158 | - networking infrastructure for VPC 159 | - subnets 160 | - security groups and NACL 161 | - IGW; NAT instance/gateway 162 | - bastion hosts 163 | - shared responsibility model 164 | - AWS responsibility: AZ, region, edge locations, EC2 165 | - principle of least privilege 166 | 167 | ### Design Cost-optimized Architectures (18%) 168 | - storage 169 | - compute 170 | - serverless architecture 171 | - CloudFront 172 | - no charge for data transfer between S3 and CloudFront 173 | - key principles 174 | - pay as you need 175 | - pay less when reserved 176 | - pay less per unit when use more 177 | 178 | ### Define Operationally Excellent Architectures (0%) 179 | - prepare-operate-update 180 | - perform operations with code 181 | - annotate documentation 182 | - make frequent, small, reversible changes 183 | - refine operations procedures frequently 184 | - anticipate failure 185 | - related services 186 | - AWS Config 187 | - CloudFormation 188 | - VPC flow logs 189 | - CloudTrail 190 | - CloudWatch 191 | - AWS Trusted Advisor 192 | 193 | 194 | # Topic notes 195 | 196 | ## Management 197 | 198 | ### IAM 199 | - key entities 200 | 
- **users** 201 | - **groups** 202 | - **roles** 203 | - **policies** (in JSON format) 204 | - identity policy 205 | - resource policy 206 | - universal: not specific to region 207 | - new users have **NO permissions** when first created 208 | - Access key ID and secret access keys are assigned to new users 209 | - not same as passwords; can only be used via APIs and command line 210 | - can only be viewed once 211 | - (for EC2) better to use **IAM roles** instead of keeping credentials 212 | - you can give federated users single sign-on (SSO) access to AWS management console with SAML (Security Assertion Markup Language) 213 | - Amazon Resource Name (ARN): uniquely identifies any AWS resource 214 | - begins with `arn:partition:service:region:account_id:`; ends with `resource` or `resource_type` 215 | - e.g. `arn:aws:ec2:us-east-1:1234523121:instance/*` 216 | - IAM policy has no effect until attached 217 | - IAM policies rules 218 | - not explicitly allowed means **implicitly denied** 219 | - explicit deny > everything else 220 | - AWS joins all applicable policies 221 | - AWS-managed vs. customer-managed 222 | - can control access based on tags (e.g. different access for resources tagged as prod vs. 
dev) 223 | - inline policy: embedded in, and applicable only to, a single user, group, or role 224 | - permission boundaries 225 | - used to delegate admin to other users 226 | - prevent privilege escalation or unnecessarily broad permissions 227 | - control maximum permissions an IAM policy can grant 228 | - "owner" (in permission policy) refers to the **identity** and **email address** used to create the AWS account 229 | 230 | ### AWS Organization 231 | - paying account should be used for billing purpose only; do not deploy resources in paying account 232 | - enable/disable AWS services using **Service Control Policies** (SCP) on either an OU (Organizational Unit) or individual accounts 233 | - SCPs affect **only IAM users and roles** that are managed by accounts that are part of the organization (including root user). SCPs don't affect resource-based policies directly. They also don't affect users or roles from accounts outside the organization 234 | - **RAM: Resource Access Manager** 235 | - can share AWS resources between accounts 236 | - e.g. 
EC2, Aurora, Route 53, resource groups 237 | - sharing must be enabled with the master account 238 | - only resources owned by the account are shared; cannot be re-shared from other accounts 239 | - resource sharing can be done at an individual account basis if RAM is not enabled 240 | - SSO helps centrally manage access to AWS accounts 241 | - exam tip: if you see SAML in question, look for SSO in answers 242 | 243 | 244 | ### AWS Directory Service 245 | - a family of managed services heavily integrated with Microsoft Active Directory (AD) 246 | - connect AWS resources with on-premise AD 247 | - standalone directory in the cloud 248 | - use existing corporate credentials 249 | - enable SSO to any domain-joined EC2 instance 250 | - provides AD domain controllers (DCs) running Windows Servers 251 | - reachable by applications in VPC 252 | - extend existing AD to on-premises using AD Trust 253 | - **Simple AD**: standalone managed directory 254 | - support Windows workloads that need basic AD features 255 | - easier to manage EC2 256 | - does not support trusts 257 | - **AD Connector** 258 | - directory **gateway for on-premises AD** 259 | - avoid caching information in the cloud 260 | - if using with SSO, does not cache user information; only forwards to on-premise AD 261 | - allow on-premise users to log in to AWS using AD 262 | - join EC2 instances to existing AD domain 263 | - useful for on-premise applications 264 | - scale across multiple AD connectors 265 | - Cloud Directory 266 | - directory-based store for developers 267 | - use cases: org charts 268 | - fully managed service 269 | - Cognito User Pools 270 | - managed user directory for SaaS applications 271 | - sign-up and sign-in for web or mobile 272 | - works with social media identities 273 | - compatibility 274 | ![AD Compatibility](pic/AD_compatibility.png) 275 | 276 | 277 | ### Cognito: Web Identity Federation 278 | - **Web Identity Federation** lets you give users access to AWS resources after they 
are authenticated with a web-based identity provider like Google 279 | - after authentication, user gets an authorization code from the web ID provider, which can be traded for temporary AWS credentials 280 | - Cognito provides Web Identity Federation with the following features 281 | - sign-up and sign-in to your apps 282 | - access for guest users 283 | - acts as an Identity Broker between your application and web ID provider; no need to write additional code 284 | - synchronize user data for multiple devices 285 | - recommended for all mobile AWS services 286 | - Cognito brokers between the app and Google/Facebook to provide temporary credentials which map to an IAM role allowing access to the required resources 287 | - no need for the application to embed or store AWS credentials locally on the device 288 | - used for user **authentication**; NOT for providing access to your AWS resources 289 | - **user pools** are user directories used to manage sign-up and sign-in functionality 290 | - users can sign in directly to the User Pool or using Facebook/Google etc. 291 | - Cognito acts as an identity broker between the identity provider and AWS 292 | - successful authentication generates a JSON Web token (JWT) 293 | - **identity pools** enabled to provide temporary AWS credentials to access AWS services like S3 or DynamoDB 294 | - identity pools are about authorizing access to AWS resources 295 | - user pools are about users (e.g. email address, passwords...) 
296 | ![pool](pic/Cognito_pool.png) 297 | - track the association between user identity and the various different devices they sign in from 298 | - uses Push Synchronization to push updates and synchronize user data across multiple devices 299 | - uses SNS to send a notification to all the devices associated with a given user identity whenever data stored in the cloud changes 300 | 301 | 302 | ### CloudWatch 303 | - a **monitoring** service for AWS resources and applications run on AWS 304 | - CloudTrail: a record of your management console activities and API calls (a log of who did what and when) 305 | - CloudWatch is for **monitoring** performance 306 | - e.g. CPU, disk reads, network packets, queue size 307 | - NOT included by default: memory, disk swap, disk space and page file utilization, log collection 308 | - standard monitoring with EC2 is every **5** minutes; detailed monitoring is **1** minute 309 | - by default, need **15** minutes to trigger the first alarm 310 | - CloudWatch Events can monitor state changes 311 | - CloudWatch can be used to create dashboards, set alarms, monitor events, use logs 312 | - Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances 313 | - can take these four actions directly; no need to use AWS Config, Lambda, or any other services 314 | - Amazon CloudWatch stores metrics for terminated Amazon EC2 instances or deleted Elastic Load Balancers for 2 weeks 315 | - Amazon CloudWatch monitoring charge does not vary by Amazon EC2 instance type 316 | - **CloudWatch Logs agent** provides an automated way to send log data to CloudWatch Logs from Amazon EC2 instances 317 | 318 | ### CloudTrail 319 | 320 | - **log**, continuously monitor, and retain account activity related to actions across your AWS infrastructure 321 | - for **auditing** and tracking suspicious activities 322 | - You can configure CloudTrail to deliver log files from multiple regions to a 
single S3 bucket for a single account 323 | - can be used with API 324 | - When you change an existing single-region trail to log all regions, CloudTrail logs events from all regions in your account 325 | - In the console, by default, you create a trail that logs events in **all AWS Regions** 326 | - to log events in **single** region, use AWS **CLI** 327 | - By default, CloudTrail event log files are encrypted using Amazon **S3 server-side encryption** (SSE) 328 | - can use **log file integrity validation** feature to determine whether a log file was modified, deleted, or unchanged 329 | - SHA-256 for hashing and SHA-256 with RSA for digital signing 330 | 331 | ### AWS Config 332 | 333 | - a service that enables you to assess, audit, and evaluate the configurations of your AWS resources 334 | - **Evaluate your AWS resource configurations for desired settings**. 335 | - **Get a snapshot of the current configurations** of the supported resources that are associated with your AWS account. 336 | - Retrieve configurations of one or more resources that exist in your account. 337 | - Retrieve historical configurations of one or more resources. 338 | - Receive a notification whenever a resource is created, modified, or deleted. 339 | - View relationships between resources. For example, you might want to find all resources that use a particular security group. 
340 | - continuously monitors and records your AWS resource configurations and allows you to **automate the evaluation of recorded configurations against desired configurations** 341 | - you can review changes in configurations and relationships between AWS resources, dive into detailed resource configuration histories, and determine your overall compliance against the configurations specified in your internal guidelines 342 | - enables you to simplify compliance auditing, security analysis, change management, and operational troubleshooting 343 | - The AWS Config dashboard shows the **compliance status** of your rules and resources. You can verify if your resources comply with your desired configurations and learn which specific resources are noncompliant 344 | - difference from CloudTrail: can **enforce** rules to comply with organization policy; CloudTrail is only a logging service 345 | 346 | ### Auto Scaling 347 | 348 | - **groups** 349 | - logical component: webserver group, application group, or database group 350 | - may reference an ELB 351 | - health check 352 | - **configuration templates** 353 | - a **launch template** or a **launch configuration** for its EC2 instances 354 | - basically an instruction for what instances to launch, what size they are and etc. 
355 | - to specify information such as AMI ID, instance type, key pair, security groups 356 | - **scaling options** 357 | - ways to scale groups 358 | - one or more may be attached to an auto scaling group 359 | - can scale based on occurrence of a specified condition (dynamic scaling) - such as CPU utilization - or on a schedule 360 | - 6 options: 361 | - maintain current instance levels at all times 362 | - scale manually 363 | - simple scaling: Increase or decrease the current capacity of the group based on a single scaling adjustment 364 | - scheduled scaling: based on a schedule that allows you to set your own scaling schedule for predictable load changes 365 | - target-tracking scaling: Increase or decrease the current capacity of the group based on a target value for a specific metric 366 | - step scaling: Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as *step adjustments*, that vary based on the size of the alarm breach 367 | - components: auto scaling; CloudWatch; Elastic Load Balancer (optional) 368 | - **termination logic** 369 | 1. If there are instances in multiple Availability Zones, choose the **AZ with the most instances** and at least one instance that is not protected from scale in. If there is more than one Availability Zone with this number of instances, choose the Availability Zone with the instances that use the **oldest launch configuration**. 370 | 2. Determine which unprotected instances in the selected Availability Zone use the oldest launch configuration. If there is one such instance, terminate it. 371 | 3. If there are multiple instances to terminate based on the above criteria, determine which unprotected instances are **closest to the next billing hour**. (This helps you maximize the use of your EC2 instances and manage your Amazon EC2 usage costs.) If there is one such instance, terminate it. 372 | 4. 
If there is more than one unprotected instance closest to the next billing hour, choose one of these instances at random. 373 | - **lifecycle hooks** 374 | - The lifecycle hook puts the instance into a **wait state** (`Pending:Wait` or `Terminating:Wait`). The instance is paused until you continue or the timeout period ends 375 | - During this wait state, you can perform custom activities (e.g. retrieve operational data) 376 | - By default, the instance remains in a wait state for **1** hour, and then the Auto Scaling group continues the launch or terminate process 377 | - **cool down** period 378 | - It ensures that the Auto Scaling group does not launch or terminate additional EC2 instances before the previous scaling activity takes effect 379 | - Its default value is **300** seconds 380 | - It is a configurable setting for your Auto Scaling group 381 | - EC2 Auto Scaling groups are **regional** constructs. They can span Availability Zones, but not AWS regions 382 | - When an impaired instance fails a health check, Amazon EC2 Auto Scaling automatically **terminates** it and replaces it with a new one 383 | 384 | ### General Security Risks 385 | - bad actors: typically automated processes 386 | - content scrapers 387 | - bad bots 388 | - fake user agent 389 | - Denial of Service (DoS) 390 | - in addition to NACL where you can specify (a range of) IP address to block, you can also use host-based firewall 391 | - NACL operates on **layer 4** 392 | - when using an Application Load Balancer, the connection from bad actor terminates at ALB, not at the EC2 instance behind 393 | - a host-based firewall will be ineffective in this case; still need a NACL 394 | - can set up a WAF before ALB 395 | - operates on **layer 7** 396 | - works best for SQL injection attacks, cross-site scripting attacks 397 | - may have a configuration of CloudFront 398 | ![WAF_CloudFront](pic/WAF_CloudFront.png) 399 | - when using a Network Load Balancer, traffic goes to EC2 instance -> only 
counter-measure is to use NACL 400 | 401 | ### KMS: Key Management Service 402 | - **regional** secure key management for encryption and decryption 403 | - if you have an encrypted resource in one region and you want to move it to another region, must first decrypt, then move, then encrypt in new region 404 | - manages **customer master keys (CMKs)** 405 | - each associated with a key policy 406 | - rotated periodically 407 | - 3 types of CMKs 408 | ![CMK](pic/CMKs.png) 409 | - CMK symmetry 410 | ![CMK sym](pic/CMK_sym.png) 411 | - ideal for S3 objects, database passwords and API keys 412 | - can import your own keys, disable and re-enable keys, and define key management roles in IAM 413 | - encrypt and decrypt data up to 4 KB in size 414 | - can use Data Encryption Key (DEK) for larger size 415 | - integrated with most AWS services 416 | - pay per API call 417 | - has audit capability using CloudTrail 418 | - **FIPS 140-2 Level 2** 419 | - just need to show proof of tampering 420 | - CloudHSM is level 3 (more stringent) 421 | 422 | ### STS: AWS Security Token Service 423 | 424 | - create and provide trusted users with **temporary security credentials** that can control access to your AWS resources 425 | - Temporary security credentials work almost identically to the long-term access key credentials that your IAM users can use 426 | 427 | ### CloudHSM: Hardware Security Module 428 | 429 | - **dedicated** HSM 430 | - **FIPS 140-2 Level 3** 431 | - manage your own keys 432 | - no access to the AWS-managed component 433 | - runs within a VPC in your account 434 | - **single tenant**, dedicated hardware, multi-AZ cluster 435 | - not highly available by default; need to provision HSMs across AZs 436 | - industry-standard: no AWS APIs 437 | - good for software with compliance requirements - e.g. 
438 | - PKCS#11 439 | - Java Cryptography Extensions (JCE) 440 | - Microsoft CryptoNG (CNG) 441 | - irretrievable if lost 442 | - When an HSM is zeroized, all keys, certificates, and other data on the HSM are destroyed 443 | - You can use your cluster's security group to prevent an unauthenticated user from zeroizing your HSM 444 | - backups 445 | - When AWS CloudHSM makes a backup from the HSM, the HSM encrypts all of its data before sending it to AWS CloudHSM 446 | - HSM uses a unique, ephemeral encryption key known as the ephemeral backup key (EBK) 447 | - sent to S3 in the same region 448 | 449 | ### AWS Secrets Manager 450 | 451 | - an AWS service that makes it easier for you to manage secrets 452 | - *Secrets* can be database credentials, passwords, third-party API keys, and even arbitrary text 453 | - can **store and control access** to these secrets centrally by using the Secrets Manager console, the Secrets Manager command line interface (CLI), or the Secrets Manager API and SDKs 454 | - enables you to **replace hardcoded credentials** in your code (including passwords), with an API call to Secrets Manager to retrieve the secret programmatically 455 | - can configure Secrets Manager to automatically rotate the secret for you according to a schedule that you specify 456 | 457 | ### Parameter Store 458 | - component of AWS Systems Manager (SSM) 459 | - secure **serverless** storage for configuration and secrets 460 | - passwords 461 | - database connection strings 462 | - license codes 463 | - API keys 464 | - values can be stored encrypted (KMS) or plain text 465 | - separate data from source control 466 | - can store parameters in *hierarchies* 467 | - can track versions 468 | - can set TTL to expire values such as passwords 469 | - can be integrated with CloudFormation 470 | - does NOT rotate parameters by default 471 | 472 | 473 | 474 | 475 | 476 | ## Storage 477 | 478 | ### S3: Simple Storage Service 479 | - S3 is **object-based** 480 | - key (name of the 
object) 481 | - value (data) 482 | - version ID 483 | - metadata 484 | - system metadata (e.g. last-modified date) 485 | - user-defined metadata 486 | - sub-resources 487 | - access control lists (ACL) 488 | - torrent 489 | 490 | - Object can be retrieved as a whole or partially 491 | 492 | - partial retrieval through **Range HTTP header** 493 | 494 | - Files can be between **0-5TB** 495 | 496 | - 5GB for a single PUT 497 | 498 | - objects are **private by default** 499 | 500 | - **unlimited** storage space 501 | 502 | - pay as you use 503 | 504 | - cannot install operating system on 505 | 506 | - does not provide file system access semantics (EFS does) 507 | 508 | - cannot tag individual folders within an S3 bucket 509 | 510 | - Files are stored in **Buckets** 511 | - up to 100 buckets per account by default (maximum 1000 per account) 512 | - by default, all new buckets are **private** 513 | - buckets can be configured to create access logs which log all requests made to the S3 bucket 514 | - Bucket ownership is not transferable 515 | - Buckets **cannot be nested** and cannot have bucket within another bucket 516 | - Bucket name and region cannot be changed, once created 517 | 518 | - S3 is a **universal** namespace (globally) 519 | - URL styles 520 | - virtual style: ```https://-s3-.amazonaws.com/``` 521 | - path style: ```https://s3-.amazonaws.com/``` 522 | - static hosting: ```https://-s3-website-``` 523 | - legacy global endpoint (limited support and discouraged) 524 | - Even though S3 is a global service, buckets are created within a region specified during the creation of the bucket 525 | 526 | - **HTTP 200 code** returned to browser and MD5 checksum if upload to S3 is successful 527 | 528 | - consistency model 529 | - **read after write (strong) consistency for PUTS of new Objects** 530 | - **eventual consistency for overwrite PUTS and DELETES** 531 | 532 | - standard availability 533 | - **99.99%** availability by Amazon (chance of the file being 
available) 534 | - **11x9s** durability (chance of not losing the file) 535 | 536 | - S3 features 537 | - **tiered storage** 538 | - **standard**: cheaper for access but expensive for storage 539 | - **IA** (infrequent accessed): less frequent but rapid access; cheaper for storage but expensive for access; 99.9% availability 540 | - **One Zone-IA** (no multiple AZ): similar to Reduced Redundancy; good to store data that is easy to reproduce; 99.5% availability 541 | - same high durability, high throughput, and low latency as standard and IA 542 | - S3 One Zone-IA storage class is set at the object level and can exist in the same bucket as S3 Standard and S3 Standard-IA 543 | - **Intelligent tiering** (by ML): standard + IA; 99.9% availability 544 | - **Glacier**: retrieval time can be configured; 99.9% availability 545 | - retrieval type: standard (a few hours); expedited (1-5 minutes); bulk (5-12 hours) 546 | - Glacier Console can access the vaults and the objects in them; cannot restore them (need to use S3 console instead) 547 | - can enable provisioned retrieval capacity (with a cost): ensures that your retrieval capacity for expedited retrievals is available when you need it 548 | - **Glacier Deep Archive**: retrieval time about 12 hours; minimum storage duration of 180 days; 99.9% availability 549 | 550 | ![s3 tier](pic/s3_tier.png) 551 | 552 | - **lifecycle** management 553 | - 2 types of behavior 554 | - Transition in which the storage class for the objects change 555 | - Expiration where the objects expire and are permanently deleted 556 | - automates moving the objects between different storage tiers 557 | - integrates with **versioning**; can be on current or previous versions 558 | - Object’s lifecycle management applies to both Non Versioning and Versioning enabled buckets 559 | - Lifecycle configuration on MFA-enabled buckets is not supported 560 | - can apply to multipart uploads (e.g. 
remove such uploads if failing to complete in a specific time period). 561 | - Objects must be stored **at least 30 days** in the current storage class before you can transition them to STANDARD_IA or ONEZONE_IA 562 | - **versioning** 563 | - once enabled, cannot be disabled, only suspended 564 | - integrates with lifecycle rules 565 | - has MFA Delete capability: has to provide MFA auth to delete a file 566 | - only the bucket owner (root account) can enable MFA delete 567 | - **Permissions are set at the version level**. Each version has its own object owner; an AWS account that creates the object version is the owner. You can set different permissions for different versions of the same object 568 | - Versioning does NOT prevent Bucket deletion and must be backed up 569 | - **cross region replication** 570 | - a **bucket-level** feature that enables automatic, asynchronous copying of objects across buckets in different AWS regions 571 | - S3 can replicate all or a subset of objects with specific key name prefixes 572 | - **must have versioning enabled on both source and destination buckets** 573 | - when turned on, files in bucket before turning on are not automatically replicated 574 | - delete markers and deletions are NOT replicated cross region 575 | - S3 encrypts all data in transit across AWS regions using SSL 576 | - Objects created with server-side encryption using AWS KMS–managed encryption (SSE-KMS) keys are not replicated, by default 577 | - S3 does not replicate objects in the source bucket for which the bucket owner does not have permissions. 
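The lifecycle notes above (transition, expiration, cleanup of incomplete multipart uploads, and the 30-day minimum before moving to STANDARD_IA or ONEZONE_IA) can be sketched as a rule set. This is a minimal sketch, not a definitive implementation: the dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts, while the rule ID `archive-logs`, the `logs/` prefix, the day counts, and the helper `find_early_ia_transitions` are all hypothetical and chosen only for illustration. Nothing here calls AWS.

```python
# Assumption: the 30-day minimum and the IA class names come from the notes above.
MIN_DAYS_BEFORE_IA = 30
IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}

# Hypothetical rule set (made-up ID, prefix, and day counts).
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-logs",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            # Transition: the storage class of matching objects changes.
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            # Expiration: matching objects are permanently deleted.
            "Expiration": {"Days": 365},
            # Remove multipart uploads that never completed.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}


def find_early_ia_transitions(config):
    """Return (rule ID, storage class, days) for each day-based transition
    to an IA class that is scheduled before the 30-day minimum."""
    problems = []
    for rule in config["Rules"]:
        for transition in rule.get("Transitions", []):
            days = transition.get("Days")
            if days is None:
                continue  # date-based transitions are not checked in this sketch
            if transition["StorageClass"] in IA_CLASSES and days < MIN_DAYS_BEFORE_IA:
                problems.append((rule["ID"], transition["StorageClass"], days))
    return problems


if __name__ == "__main__":
    print(find_early_ia_transitions(lifecycle_configuration))
```

With credentials configured, a dictionary like this could be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; checking it locally first is just a convenient way to remember the 30-day rule.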
578 | - **encryption** 579 | - encryption **in transit** is achieved by SSL/TLS (HTTPS) 580 | - encryption at rest at **server** side 581 | - S3 managed keys: **SSE-S3 (AES-256)** 582 | - AWS key management service: **SSE-KMS** (has quota for number of requests; upload and download count towards quota; can provide audit trails) 583 | - Customer provided keys: **SSE-C** 584 | - encryption at rest at **client** side 585 | - AWS KMS-managed Customer Master Key (**CMK**) 586 | - **Client-side master key** 587 | - If you need server-side encryption for all of the objects that are stored in a bucket, use a bucket policy 588 | - if you chose to use server-side encryption with customer-provided encryption keys (SSE-C), you must provide encryption key information 589 | - e.g. with request header `x-amz-server-side-encryption-customer-key-MD5` 590 | - MFA Delete 591 | - Notification 592 | - S3 notification feature enables notifications to be triggered when certain events happen in your bucket 593 | - Notifications are enabled at **bucket level** 594 | - Notifications can be configured to be filtered by the prefix and suffix of the key name of objects. However, filtering rules cannot be defined with overlapping prefixes, overlapping suffixes, or prefix and suffix overlapping 595 | - S3 can publish the following events 596 | - New Objects created event 597 | - Object Removal event 598 | - Reduced Redundancy Storage (RRS) object lost event 599 | - S3 can publish events to the following destination 600 | - SNS topic 601 | - SQS queue 602 | - AWS Lambda 603 | - S3 object lock 604 | - write once, read many (**WORM**) model 605 | - prevents an object from being deleted or overwritten for a fixed amount of time or indefinitely 606 | - can be on single or multiple objects 607 | - governance (can't overwrite or delete without permission) vs. 
compliance mode (can't be modified by anyone, including root user) 608 | - legal hold (no retention period) 609 | - S3 Glacier vault lock 610 | - with vault lock policy: once locked, policy cannot be changed 611 | - performance enhancement 612 | - **prefix**: increase performance by spreading reads across prefixes 613 | - more prefixes, more requests you can do at the same time 614 | - **multipart uploads** (required for files >5 GB) 615 | - S3 **byte-range fetches**: download large files by parts 616 | - **S3 Select**: retrieve partial data of an object by SQL 617 | - S3 Select works on objects stored in CSV, JSON, or Apache Parquet format 618 | - also works with objects that are compressed with GZIP or BZIP2 (for CSV and JSON objects only), and server-side encrypted objects 619 | - Glacier Select also available 620 | - share S3 buckets across accounts (3 ways) 621 | - Bucket policy & IAM (bucket level; programmatic only) - need to explicitly allow 622 | - Bucket ACL & IAM (object level; programmatic only) 623 | - Cross-account IAM roles (programmatic and console) 624 | - **Pre-signed URLs** 625 | - Pre-signed URLs allow a user to download or upload a specific object without requiring AWS security credentials or permissions 626 | - A pre-signed URL allows anyone access to the object identified in the URL, provided the creator of the URL has permissions to access that object 627 | - Creating a pre-signed URL requires the creator to provide their security credentials and specify a bucket name, an object key, an HTTP method (GET for downloading an object, PUT for uploading an object), and an expiration date and time 628 | - Pre-signed URLs are valid only until the expiration date & time 629 | - Transfer Acceleration through CloudFront 630 | - **Website hosting** 631 | - S3 can be used for Static Website hosting with Client side scripts 632 | - S3 does **not support server-side scripting** (e.g.
PHP, JSP, or ASP.NET) 633 | - S3, in conjunction with Route 53, supports hosting a website at the root domain which can point to the S3 website endpoint 634 | - S3 website endpoints do not support HTTPS 635 | - For S3 website hosting the content should be made publicly readable, which can be done using a bucket policy or an ACL on an object 636 | - You can configure the index document and the error document, as well as conditional routing based on the object key name 637 | - The bucket must have the **same name as your domain or subdomain** 638 | - For example, if you want to use the subdomain `portal.tutorialsdojo.com`, the name of the bucket must be `portal.tutorialsdojo.com`. 639 | - Deletion 640 | - S3 allows deletion of a single object or multiple objects (max 1000) in a single call 641 | - For deletions in versioned buckets, if only an object key is provided, S3 inserts a delete marker and the previous current version becomes a noncurrent version 642 | - if an object key with a version ID is provided, the object version is permanently deleted 643 | - if the version ID is that of the delete marker, the delete marker is removed and the previous noncurrent version becomes the current version 644 | - Deletion can be MFA enabled for adding extra security 645 | - Other operations 646 | - Copying of an object up to 5GB can be performed using a single operation and multipart upload can be used for uploads up to 5TB 647 | 648 | - S3 cost 649 | - storage (depending on tiers) 650 | - (number of) requests and data retrieval 651 | - storage management pricing 652 | - data transfer (out) 653 | - transfer acceleration (use CloudFront) 654 | - cross region replication 655 | - free: 656 | - data transfer into S3 657 | 658 | - Athena 659 | - interactive query service to query data in S3 with SQL 660 | - serverless 661 | - can be used to query logs, generate business reports, etc.
662 | - support JSON, Apache Parquet, Apache ORC 663 | 664 | - Macie 665 | - ML and NLP solution to identify and protect sensitive data stored in S3 666 | - e.g. Personally Identifiable Information (PII) 667 | - continuously monitors data access activity for anomalies, and delivers alerts when it detects risk of unauthorized access or inadvertent data leaks 668 | - can be used to analyze logs 669 | 670 | ### Snowball 671 | - Snowball 672 | - petabyte-scale data transport solution 673 | - **50TB** or **80TB** 674 | - can import to and export from S3 675 | - Snowball Edge: **100TB** device with on-board storage and compute capabilities 676 | - e.g. carried in aircrafts for testing 677 | - portable AWS (without having to access cloud/internet) 678 | - Snowmobile 679 | - Exabyte-scale data transport solution; essentially a truck 680 | 681 | ### DataSync 682 | 683 | - copy large datasets with millions of files, without having to build custom solutions with open source tools or license and manage expensive commercial network acceleration software 684 | - **simplifies**, **automates**, and **accelerates** copying large amounts of data to and from AWS storage services over the internet or AWS Direct Connect 685 | - good for **S3** (including Glacier and Glacier Deep Archive), **EFS**, **FSx** 686 | - used with **NFS**- and **SMB**-compatible file systems 687 | - replication can be done hourly, daily, or weekly 688 | - install the DataSync agent to start the replication 689 | - can be used to replicate EFS to EFS 690 | - for speed, data verification can be disabled during transfer and can be enabled post transfer for data integrity 691 | 692 | ### Storage Gateway 693 | 694 | - connects **on-premise** software appliance with **cloud-based** storage 695 | - a virtual or physical device to replicate your on-prem data on AWS 696 | - The software appliance, or gateway, is deployed into your on-premises environment as a virtual machine (VM) 697 | - 3 types of Storage Gateway 698 
| - **File Gateway** (for flat files; stored directly on S3) 699 | - network file system (NFS) & SMB 700 | - "a file system mount on S3" 701 | - **Volume Gateway** (iSCSI) 702 | - **stored volume**: entire local data stored on site and asynchronously backed up to S3 703 | - **cached volume**: entire data stored on S3 and frequently accessed data cached on site 704 | - **Tape Gateway** (VTL: virtual tape library) 705 | - Not suitable for transferring large amounts of data 706 | - mainly used in providing **low-latency access** to data by caching frequently accessed data on-premises while storing archive data securely and durably in Amazon cloud storage services 707 | - optimizes data transfer to AWS by sending only changed data and compressing data 708 | - use DataSync to migrate large amounts of data 709 | - good for establishing a **hybrid cloud storage architecture** 710 | 711 | ### EFS: Elastic File System 712 | - file storage system for EC2 713 | - **Linux only**; does not support Windows instances 714 | - can be **shared** across different EC2 instances (unlike EBS) 715 | - Amazon EFS supports one to thousands of Amazon EC2 instances connecting to a file system concurrently 716 | - grows and shrinks automatically (unlike EBS); only pay for the storage you use 717 | - petabyte-scale storage 718 | - useful for **distributed**, **highly resilient** storage 719 | - can support thousands of **concurrent NFS connections** 720 | - good for big data analytics, media processing workflows, content management, web serving, and home directories 721 | - supports **Network File System version 4 (NFSv4) protocol** 722 | - one of the first network file sharing protocols native to Unix and Linux 723 | - data is stored across multiple AZ's within a region 724 | - read after write consistency 725 | - **Performance mode** 726 | - General Purpose Performance Mode 727 | - ideal for latency sensitive use cases 728 | - Max I/O Performance Mode 729 | - higher levels of aggregate throughput
and operations per second 730 | - with a tradeoff of slightly higher latency 731 | - **Throughput mode** 732 | - Bursting Throughput Mode 733 | - throughput scales as your file system grows 734 | - Provisioned Throughput Mode 735 | - can specify file system's throughput independent of the amount of data stored 736 | - good for applications with high throughput-to-storage (MiB/s per TiB) ratios 737 | - process to deploy 738 | - create EFS 739 | - create mount points in a VPC 740 | - EFS can only be linked to **one VPC at a time** 741 | - lifecycle management 742 | - When enabled, lifecycle management migrates files that have not been accessed for a set period of time to the Infrequent Access (IA) storage class 743 | - After lifecycle management moves a file into the IA storage class, the file remains there **indefinitely** 744 | - Amazon EFS lifecycle management uses an internal timer to track when a file was last accessed 745 | - Metadata operations, such as listing the contents of a directory, don't count as file access 746 | - maximum days for the EFS lifecycle policy is **90** days 747 | - encryption 748 | - supports encryption **at rest**; can only be done **during creation** 749 | - encryption in transit is not an option on EFS during or after creation 750 | - You can enable encryption of data in transit when you mount the file system with NFS protocol 751 | - in that case, EFS Mount helper uses TLS 1.2 to encrypt data in transit 752 | 753 | ### FSx 754 | - **Windows FSx** 755 | - Windows Server that runs Windows **SMB**-based (Server Message Block) file services 756 | - designed for Windows and Windows applications 757 | - support AD users, access control lists, groups and security policies, along with Distributed File System (DFS) namespaces and replications 758 | - Integrated with CloudWatch to monitor storage capacity and file system activity 759 | - Integrated with CloudTrail to monitor all Amazon FSx API calls 760 | - Amazon FSx file systems is 
accessible from the on-premises environment using an AWS Direct Connect or AWS VPN connection 761 | - Amazon FSx is accessible from multiple VPCs, AWS accounts, and AWS Regions using VPC Peering connections or AWS Transit Gateway 762 | - Amazon FSx automatically replicates the data within an Availability Zone (AZ) to protect it from component failure 763 | - Amazon FSx supports Multi-AZ deployment 764 | - Amazon FSx supports automatic backups of the file systems; backups are incremental, storing only the changes made after the most recent backup 765 | - Amazon FSx stores backups in Amazon S3 766 | - **Lustre FSx** 767 | - file system optimized for **compute-intensive workloads**, such as HPC, ML, media data processing workflows, electronic design automation (EDA) 768 | - can store data directly on S3 769 | - Amazon FSx provides multiple deployment options to optimize cost 770 | - **Scratch** file systems 771 | - designed for temporary storage and short-term processing of data. 772 | - data is not replicated and does not persist if a file server fails. 773 | - **Persistent** file systems 774 | - designed for long-term storage and workloads. 775 | - is highly available, and data is automatically replicated within the AZ that is associated with the file system. 776 | - data volumes attached to the file servers are replicated independently from the file servers to which they are attached.
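To make the S3 lifecycle notes from this storage section concrete (transition vs. expiration, the 30-day minimum before moving to STANDARD_IA or ONEZONE_IA, and cleanup of incomplete multipart uploads), here is a minimal sketch in plain Python. It only builds and checks a rule as a dict and makes no AWS call. The helper name `build_lifecycle_rule`, the rule ID, and the prefix are made up for illustration; the dict layout is assumed to follow the shape that the S3 lifecycle configuration API accepts.

```python
from typing import Optional

# Per the notes: objects must stay at least 30 days in the current storage
# class before they can transition to STANDARD_IA or ONEZONE_IA.
MIN_DAYS_BEFORE_IA = 30
IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}


def build_lifecycle_rule(
    rule_id: str,
    prefix: str,
    transition_days: int,
    storage_class: str,
    expiration_days: Optional[int] = None,
    abort_multipart_days: Optional[int] = None,
) -> dict:
    """Build one lifecycle rule as a plain dict (hypothetical helper)."""
    if storage_class in IA_CLASSES and transition_days < MIN_DAYS_BEFORE_IA:
        raise ValueError(
            f"{storage_class} needs at least {MIN_DAYS_BEFORE_IA} days before transition"
        )
    if expiration_days is not None and expiration_days <= transition_days:
        raise ValueError("expiration must come after the transition")

    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Transition: the storage class of the objects changes.
        "Transitions": [{"Days": transition_days, "StorageClass": storage_class}],
    }
    if expiration_days is not None:
        # Expiration: the objects are permanently deleted.
        rule["Expiration"] = {"Days": expiration_days}
    if abort_multipart_days is not None:
        # Remove multipart uploads that fail to complete in time.
        rule["AbortIncompleteMultipartUpload"] = {
            "DaysAfterInitiation": abort_multipart_days
        }
    return rule


config = {
    "Rules": [
        build_lifecycle_rule(
            "archive-logs", "logs/", 30, "STANDARD_IA",
            expiration_days=365, abort_multipart_days=7,
        )
    ]
}
```

If boto3 is available, a dict like `config` would presumably be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`; that step is left out here so the sketch runs without credentials.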
777 | 778 | 779 | 780 | ## Compute 781 | 782 | ### EC2: Elastic Compute Cloud 783 | - a web service that provides resizable compute capacity 784 | - pricing model 785 | - **On demand** 786 | - **Reserved**: a significant discount on EC2 usage when you commit to a one-year or three-year term 787 | - types of reserved instances 788 | - standard reserved 789 | - convertible reserved (can change instance types) 790 | - scheduled reserved instances 791 | - metrics 792 | - RI utilization: RI used hours divided by total purchased RI hours - essentially RI usage efficiency 793 | - RI coverage: RI used hours divided by total EC2 on-demand and RI hours - essentially instance usage covered by RI (as opposed to other instances) 794 | - billing: if I have unused RI instances, another AWS account can launch an RI instance in **any AZ** in my RI's **region** to enjoy benefits of Consolidated billing 795 | - **Spot**: bid price for instance capacity 796 | - if interrupted by AWS, the instance will be **terminated** (not stopped) 797 | - if a Spot instance is terminated by AWS, no charge for the partial hour of usage; 798 | - if you terminate the instance yourself, you will be charged for any hour in which the instance ran 799 | - set a max Spot price; the instance will be provisioned as long as the Spot price is lower 800 | - can use **Spot block** to keep the instance running for a fixed duration of 1-6 hours, even if the Spot price rises above your max price 801 | - not good for persistent workloads, critical jobs, and databases 802 | - **Spot Fleets**: a collection of Spot instances (and optionally on-demand instances) 803 | - Linux/UNIX, Windows Server and Red Hat Enterprise Linux (RHEL) are available. Windows Server with SQL Server is not currently available 804 | - **Dedicated host** 805 | - when stopping and starting again, you can transition between *dedicated* and *host* mode 806 | - can manage actual server - e.g.
number of sockets, number of cores to use 807 | - For a new AWS account, soft limit of 20 instances per *region* 808 | - EC2 Fleet 809 | - With a single API call, EC2 Fleet lets you provision compute capacity across different instance types, Availability Zones and across On-Demand, Reserved Instances (RI) and Spot Instances purchase models to help optimize scale, performance and cost 810 | - Spot Fleet and EC2 Fleet offer the same functionality. There is no requirement/need to migrate 811 | - multi-region EC2 Fleets are NOT supported 812 | - Reserved Instance Marketplace 813 | - online marketplace that provides AWS customers the flexibility to sell their EC2 Reserved Instances to other businesses and organizations 814 | - Public IP address is not managed on the instance 815 | - an alias applied as a network address translation of the private IP address 816 | - underlying Hypervisor: Xen and Nitro 817 | - **AMI** (Amazon Machine Image) is simply a packaged-up environment that includes all the necessary bits to set up and boot your instance 818 | - Your AMIs are your unit of deployment 819 | - When migrating, AWS does NOT copy **launch permissions**, **user-defined tags**, or Amazon **S3 bucket permissions** from the source AMI to the new AMI 820 | - AMI can be selected/configured based on 821 | - region 822 | - operating system 823 | - architecture (32 vs. 64 bit) 824 | - launch permissions 825 | - storage for root device 826 | - Storage options 827 | - **instance store** (ephemeral storage): template stored in S3 828 | - can scale to millions of IOPS 829 | - can only be terminated or rebooted; cannot be stopped.
If underlying host fails, you will lose your data 830 | - **once stopped, data will be lost** 831 | - cannot add instance store volume after provisioned; can still add EBS volume 832 | - fixed capacity 833 | - supported only by certain EC2 instance types 834 | - cannot be seen under EBS Volume in the Console (because it's not EBS) 835 | - root volume will be deleted on termination. no option to keep root device (unlike EBS volumes) 836 | - an inexpensive way to launch instances where data is not stored to the root device 837 | - generally used for caching temporary data with fast access 838 | - **EBS backed volumes** 839 | - see next section for more details 840 | - scale up to 64000 IOPS 841 | - attachable storage; a volume is linked with one EC2 instance at a time 842 | - i.e. one volume to one instance (not multiple instances); an instance can still have several EBS volumes attached 843 | - supports encryption and snapshots 844 | - By using Amazon EBS, data on the root device will persist independently from the lifetime of the instance. 845 | - **encryption** (of root device) 846 | - EBS root volumes of your default AMIs can be encrypted (unlike before); additional volumes can be encrypted 847 | - to encrypt a volume after it is provisioned: create a snapshot, make a copy, encrypt the copied snapshot, create an AMI based on it, and launch an instance from the AMI 848 | - snapshots must be unencrypted to be shared 849 | - **instance types** 850 | - **General Purpose Instances**: provide a balance of compute, memory, and networking resources, and can be used for a variety of workloads 851 | - **Compute Optimized Instances**: compute-bound applications that benefit from high-performance processors, such as batch processing workloads and media transcoding 852 | - Accelerated Computing instance family is a family of instances which use hardware accelerators, or co-processors, to perform some functions, such as floating-point number calculation and graphics processing, more efficiently than is possible in software running on CPUs 853 | - **Memory Optimized
Instances**: deliver fast performance for workloads that process large data sets in memory 854 | - **Storage Optimized Instances**: designed for workloads that require high, sequential read and write access to very large data sets on local storage; optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications 855 | - **security groups** 856 | - specify inbound and outbound rules 857 | - rule change takes effect **immediately** 858 | - security groups are **stateful**: if traffic is allowed in one direction, the response traffic is automatically allowed in the other direction 859 | - unlike NACLs, which are stateless 860 | - e.g. if you create an inbound rule for HTTP, responses to those requests are allowed out without a matching outbound rule 861 | - **all inbound traffic is blocked by default**; all outbound traffic is allowed 862 | - cannot block particular port, type or IP address; can specify allow rules but not deny rules 863 | - can have multiple security groups attached to one EC2 864 | - can have multiple EC2 instances with the same security group 865 | - **security groups can reference each other**: e.g. set up inbound rule to allow traffic from another security group 866 | - network options 867 | - **ENI** (Elastic Network Interface) 868 | - essentially a virtual network card 869 | - allows IPv4 addresses from the range of your VPC; comes with security groups, MAC address, etc. 870 | - use case: basic networking; create a management network; use network and security appliances in your VPC; create dual-homed instances 871 | - When an ENI is moved from one instance to another, network traffic is redirected to the new instance 872 | - Multiple ENIs can be attached to an instance (e.g.
for low-budget, highly available solution; to create management network; create dual-homed instances with workloads on distinct subnets) 873 | - attach a network interface to an EC2 instance in the following ways 874 | - When it's running (hot attach) 875 | - When it's stopped (warm attach) 876 | - When the instance is being launched (cold attach) 877 | - EN (Enhanced Networking) 878 | - uses single root I/O virtualization to provide high-performance networking capabilities; provides higher bandwidth, higher packets per second (PPS), low latency 879 | - use case: when you want good network performance (speeds of 10-100 Gbps) 880 | - can be enabled via Elastic Network Adapter (**ENA**) for up to 100Gbps or Virtual Function (**VF**) for up to 10 Gbps (usually go with ENA) 881 | - **EFA** (Elastic Fabric Adapter) 882 | - a network device you can attach to EC2 instance to accelerate High Performance Computing (**HPC**) and **ML** applications 883 | - can use OS-bypass: enable HPC and ML applications to bypass operating system kernel and to communicate directly with EFA device. **Linux only** 884 | - EFA support can be enabled either at the launch of the instance or added to a stopped instance. EFA devices cannot be attached to a running instance 885 | - use case: HPC and ML applications; OS-bypass 886 | - Examples of HPC applications include computational fluid dynamics (CFD), crash simulations, and weather simulations 887 | - **hibernation** 888 | - saves the contents from the **instance memory (RAM) to EBS root volume** 889 | - instance RAM must be less than 150GB 890 | - cannot be enabled if there is only instance store volume 891 | - If the EBS root volume does not have enough space, hibernation will fail and the instance will be shut down instead 892 | - operating system performs hibernation (suspend-to-disk); not rebooted 893 | - instance boots much faster.
good for long-running processes and services that take time to initialize 894 | - **previously attached data volumes are reattached**; instance ID remains the same (unlike stop-restart) 895 | - only public IPv4 address is released and re-assigned upon restart; private IPv4 and any IPv6 addresses are retained 896 | - needs to **enable** hibernation when provisioning the instance; root volume must be encrypted 897 | - can't be hibernated for more than 60 days 898 | - available for **on-demand** and **reserved** instances 899 | - not supported with EC2 instances in an auto scaling group 900 | - Hibernating instances are charged at standard EBS rates for storage. As with a stopped instance, you do not incur instance usage fees while an instance is hibernating 901 | - when in hibernation, pricing is calculated based on the following 902 | - elastic IP address 903 | - EBS volume attached to EC2 instance 904 | - (no charge for compute capacity) 905 | - **placement group** 906 | - **clustered** 907 | - grouping of instances within a **single AZ** (cannot span multiple AZ's) 908 | - low network latency, high network throughput 909 | - homogeneous instances are recommended within clustered placement groups 910 | - **spread** 911 | - a group of instances that are each placed on distinct underlying hardware 912 | - recommended for applications that have a small number of **critical** instances that should be kept separate from each other 913 | - up to 7 running instances per AZ 914 | - **partitioned** 915 | - each partition has its **own rack**, network and power sources 916 | - each partition can have multiple instances, but each partition is separate from each other 917 | - help reduce the likelihood of **correlated hardware failures** for your application 918 | - can have a maximum of 7 partitions per AZ 919 | - other features 920 | - unique placement group name within AWS account 921 | - cannot merge placement groups 922 | - only certain types of instances can be
launched in a placement group (Compute Optimized, GPU, Memory Optimized, Storage Optimized) 923 | - existing instance can be moved to a placement group; needs to be stopped first; cannot be moved via console yet 924 | - If you receive a **capacity error** when launching an instance in a placement group that already has running instances, **stop and restart** all of the instances in the placement group, and try the launch again. Restarting the instances may migrate them to hardware that has capacity for all the requested instances 925 | - VM Import/Export 926 | - enables customers to import Virtual Machine (VM) images in order to create Amazon EC2 instances 927 | - Customers can also export previously imported EC2 instances to create VMs 928 | - VMDK is a file format that specifies a virtual machine hard disk encapsulated within a single file 929 | - The virtual machine must be in a stopped state before generating the VMDK or VHD image 930 | - metadata 931 | - information about the instance itself (e.g. public and private IPv4 address) 932 | - can be retrieved via a special URL (```http://169.254.169.254```) or using the API via CLI or an SDK 933 | - can assign your own metadata in the form of tags 934 | - billing and instance status 935 | - `pending` - The instance is preparing to enter the running state. An instance enters the pending state when it launches for the first time, or when it is restarted after being in the stopped state. You will not be billed in this state 936 | - `running` - The instance is running and ready for use. You are billed in this state 937 | - `stopping` - The instance is preparing to be stopped. Take note that you will not be billed if it is preparing to stop; however, you will **still be billed** if it is preparing to hibernate 938 | - `stopped` - The instance is shut down and cannot be used.
The instance can be restarted at any time 939 | - `shutting-down` - The instance is preparing to be terminated 940 | - `terminated` - The instance has been permanently deleted and cannot be restarted. Take note that Reserved Instances that applied to terminated instances are **still billed** until the end of their term according to their payment option 941 | - management and configuration 942 | - AWS Systems Manager **Run Command** lets you remotely and securely manage the configuration of your managed instances (without having to login to each instance) 943 | - automate common administrative tasks and perform ad hoc configuration changes at scale 944 | - Troubleshoot: you might be unable to log into an EC2 instance if 945 | - You're using an SSH private key but the corresponding public key is not in the authorized_keys file. 946 | - You don't have permissions for your authorized_keys file. 947 | - You don't have permissions for the .ssh folder. 948 | - Your authorized_keys file or .ssh folder isn't named correctly. 949 | - Your authorized_keys file or .ssh folder was deleted. 
950 | - Your instance was launched without a key, or it was launched with an incorrect key 951 | 952 | ### EBS: Elastic Block Store 953 | - provides persistent **block storage** volumes for use with EC2 instances (essentially a virtual hard disk drive) 954 | - block storage: can change single bytes of data 955 | - in contrast, S3 (and Glacier) is object-based storage, where you must update the whole object each time 956 | 957 | - automatically replicated within **AZ** 958 | 959 | - termination protection is turned off by default 960 | 961 | - provides **lowest-latency** access to data from a single EC2 instance (compared to EFS and S3) 962 | 963 | - on an EBS-backed instance, the default action is for the root EBS volume to be deleted when the instance is terminated 964 | 965 | - additional volumes will remain (by default) 966 | 967 | - types 968 | ![EBS types](pic/EBS_TYPES.png) 969 | - SSD-backed storage for **transactional** workloads (performance depends primarily on IOPS) and HDD-backed storage for **throughput** workloads (performance depends primarily on throughput, measured in MB/s) 970 | - **SSD-backed** volumes are designed for transactional, IOPS-intensive database workloads, boot volumes, and workloads that require high IOPS; good for **random** access 971 | - **HDD-backed (magnetic)** volumes are designed for throughput-intensive and big-data workloads, large I/O sizes, and sequential I/O patterns; good for **sequential** access; lower IOPS; always cheaper than SSD 972 | 973 | - | | General Purpose SSD | Provisioned IOPS SSD | 974 | | :---: | :---: | --- | 975 | | **Volume type** | `gp2` | `io2` | 976 | | **Durability** | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | 99.999% durability (0.001% annual failure rate) | 977 | | **Use cases** | Boot volumes; low-latency interactive apps; development and test environments | Workloads that require sustained IOPS performance or more than 16,000 IOPS or 250 MiB/s of throughput per volume; I/O-intensive database workloads | 978 | | **Volume size** | 1 GiB - 16 TiB | 4 GiB - 16 TiB | 979 | | **Max IOPS per volume** (16 KiB I/O) | 16,000 * | 64,000 † | 980 | | **Max throughput per volume** | 250 MiB/s * | 1,000 MiB/s † | 981 | | **Amazon EBS Multi-attach** | Not supported | Supported | 982 | 983 | - | | Throughput Optimized HDD | Cold HDD | 984 | | :--- | :--- | --- | 985 | | **Volume type** | `st1` | `sc1` | 986 | | **Durability** | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | 99.8% - 99.9% durability (0.1% - 0.2% annual failure rate) | 987 | | **Use cases** | Big data; data warehouses; log processing | Throughput-oriented storage for data that is infrequently accessed; scenarios where the lowest storage cost is important | 988 | | **Volume size** | 500 GiB - 16 TiB | 500 GiB - 16 TiB | 989 | | **Max IOPS per volume** (1 MiB I/O) | 500 | 250 | 990 | | **Max throughput per volume** | 500 MiB/s | 250 MiB/s | 991 | | **Amazon EBS Multi-attach** | Not supported | Not supported | 992 | 993 | - EBS volume needs to be in the **same AZ as EC2** 994 | 995 | - an EC2 instance can have multiple EBS volumes attached; a volume attaches to one instance at a time unless Multi-Attach is enabled (Provisioned IOPS SSD volumes only) 996 | 997 | - volume can be changed (type and size) after EC2 is provisioned; may take some time to take effect 998 | 999 | - to move to different AZ, create a snapshot; then create an image (AMI) based on the snapshot (with hardware-assisted virtualization); launch EC2 from AMI in another AZ or copy AMI to another region 1000 | 1001 | - additional EBS volume (not root device) can be detached without stopping the instance 1002 | 1003 | - best to unmount the volume from the instance first 1004 | 1005 | - **Snapshots** 1006 | - EBS volumes can be backed up by creating a
snapshot of the volume, which is stored in S3 1007 | - EBS snapshots are only available through the Amazon EC2 APIs (not S3 APIs) 1008 | - snapshots are **incremental**; only the blocks that have changed are captured in latest snapshots 1009 | - the snapshot deletion process is designed so that you need to retain only the most recent snapshot in order to restore the volume 1010 | - latest snapshot is both incremental and complete; just need to maintain latest snapshot 1011 | - best to take snapshots when instance is stopped (can still take snapshot if running) 1012 | - EBS Snapshots can be used to migrate or create EBS Volumes in different AZs or regions 1013 | - Snapshots are **constrained to the region** in which they are created and can be used to launch EBS volumes within the same region only 1014 | - Snapshots can be shared by making them public or with specific AWS accounts by modifying the access permissions of the snapshots 1015 | - Encrypted snapshots cannot be made available publicly 1016 | - EBS snapshots fully support EBS encryption 1017 | - Snapshots of encrypted volumes are automatically encrypted 1018 | - Volumes created from encrypted snapshots are automatically encrypted 1019 | - You can use Amazon **Data Lifecycle Manager** (Amazon DLM) to automate the creation, retention, and deletion of **snapshots** taken to back up your Amazon EBS volumes 1020 | 1021 | - **Encryption** 1022 | - Encryption occurs on the servers that host EC2 instances, providing encryption of data-in-transit from EC2 instances to EBS storage. 
1023 | - EBS encryption is supported with all EBS **volume** types (gp2, io1, st1 and sc1), and has the same IOPS performance on encrypted volumes as with unencrypted volumes, with a minimal effect on latency 1024 | - EBS encryption is only available on **select instance** types 1025 | - Snapshots of encrypted volumes and volumes created from encrypted snapshots are **automatically encrypted** using the same volume encryption key 1026 | - EBS encryption uses AWS Key Management Service (AWS KMS) customer master keys (CMK) when creating encrypted volumes and any snapshots created from the encrypted volumes. 1027 | - EBS volumes can be encrypted using either 1028 | - a default CMK that is created for you automatically, or 1029 | - a CMK that you created separately using AWS KMS, giving you more flexibility, including the ability to create, rotate, disable, define access controls, and audit the encryption keys used to protect your data. 1030 | - Public snapshots of encrypted volumes are not supported, because other accounts would be able to decrypt your data; snapshots encrypted with the default CMK cannot be shared with other accounts either (re-encrypt a copy with a customer-managed CMK, or migrate the data to an unencrypted volume, before sharing) 1031 | - An encrypted snapshot can be created from an unencrypted snapshot by creating an encrypted copy of the unencrypted snapshot 1032 | - An unencrypted volume cannot be created directly from an encrypted volume; the data needs to be migrated 1033 | 1034 | 1035 | ### Lambda 1036 | - takes care of provisioning and managing the servers to run your **stateless** code 1037 | - makes it easy to execute code **in response to events**, such as changes to Amazon S3 buckets, updates to an Amazon DynamoDB table, or custom events generated by your applications or devices 1038 | - use cases 1039 | - event-driven compute service where Lambda runs your code in response to events (e.g.
data change in S3 bucket) 1040 | - run code in response to HTTP requests using API Gateway or API calls 1041 | - use Lambda to directly interact with backend database (and skip EC2 instances where you need to manage all the configurations) 1042 | ![serverless](pic/serverless.png) 1043 | - continuously **scaling out** (automatically) 1044 | - **each event triggers an individual Lambda function** 1045 | - **Lambda functions can trigger Lambda functions** 1046 | - priced based on: 1047 | - **number of requests**: first 1 million requests are free; $0.20 per 1 million requests thereafter 1048 | - **duration**: rounded up to 100ms; also depends on **memory** allocated - e.g. $0.00001667 per GB-second 1049 | - maximum time-out is 15 minutes (the default time-out is 3 seconds) 1050 | - not the best choice for long-running heavy workloads (compared to EC2) 1051 | - AWS **X-Ray** allows you to debug serverless applications (as architecture can get complex) 1052 | - Lambda can do things **globally** (e.g. back up S3 buckets to another S3 bucket) 1053 | - **Lambda@Edge** 1054 | - a feature of Amazon CloudFront that lets you **run code closer to users of your application**, which improves performance and reduces latency 1055 | - a scalable solution to **segregate different types of users** accessing web applications 1056 | - By using Lambda@Edge and Kinesis together, you can process real-time streaming data so that you can track and analyze globally-distributed user activity on your website and mobile applications, including clickstream analysis 1057 | - possible [triggers](https://docs.aws.amazon.com/lambda/latest/dg/lambda-services.html) 1058 | - API Gateway 1059 | - CloudWatch Events 1060 | - Elastic Load Balancer 1061 | - DynamoDB 1062 | - Kinesis 1063 | - SNS 1064 | - SQS 1065 | - MQ 1066 | - Cognito 1067 | - CloudFront 1068 | - ...
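
The Lambda pricing bullets above (requests plus duration) can be turned into a quick back-of-the-envelope estimate. This is only a sketch: the constants are the figures quoted in these notes (actual prices vary by region and over time), the function name `estimate_lambda_cost` is made up for illustration, and it assumes no free allowance on duration.

```python
import math

# Figures copied from the notes above; treat them as assumptions, not current prices.
FREE_REQUESTS = 1_000_000            # first 1 million requests are free
PRICE_PER_MILLION_REQUESTS = 0.20    # USD per 1 million requests thereafter
PRICE_PER_GB_SECOND = 0.00001667     # USD per GB-second of duration
BILLING_INCREMENT_MS = 100           # duration is rounded up to this increment


def estimate_lambda_cost(requests, avg_duration_ms, memory_mb):
    """Rough monthly cost in USD (hypothetical helper; ignores any free duration allowance)."""
    # Request charge: only requests beyond the free million are billed.
    billable_requests = max(0, requests - FREE_REQUESTS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS

    # Duration charge: round each invocation up, then convert to GB-seconds.
    billed_ms = math.ceil(avg_duration_ms / BILLING_INCREMENT_MS) * BILLING_INCREMENT_MS
    gb_seconds = requests * (billed_ms / 1000) * (memory_mb / 1024)
    duration_cost = gb_seconds * PRICE_PER_GB_SECOND

    return request_cost + duration_cost
```

For example, `estimate_lambda_cost(3_000_000, 120, 512)` bills 2 million requests ($0.40) and rounds 120 ms up to 200 ms at 0.5 GB, i.e. 300,000 GB-seconds (about $5.00), for a total of roughly $5.40.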
1069 | - RDS can NOT trigger Lambda 1070 | - Lambda supports hyper-threading on one or more virtual CPUs 1071 | - can use CloudWatch logs for accessing many Lambda results (e.g. print statement) and logs 1072 | - encryption 1073 | - When you create or update Lambda functions that use environment variables, AWS Lambda encrypts them using the AWS Key Management Service (KMS) 1074 | - if you wish to use encryption helpers and use KMS to encrypt environment variables after your Lambda function is created, you must create your own AWS KMS key and choose it instead of the default key 1075 | 1076 | ### Elastic Beanstalk 1077 | 1078 | - can quickly deploy and manage applications in AWS **without managing the infrastructure** 1079 | - no knowledge of AWS needed; can just upload application code and let AWS set up everything 1080 | - create Web Server environments and Worker environments 1081 | - automatically handles capacity provisioning, load balancing, scaling, health monitoring 1082 | - you can modify settings (e.g. add auto scaling) after provision 1083 | - can deploy applications based on a single Dockerfile (i.e. container-based application) 1084 | - for RDS: 1085 | - good for provisioning test/dev environments; 1086 | - not good for production environment; best practice is to launch production database outside of Elastic Beanstalk environment and establish a connection afterwards 1087 | 1088 | 1089 | 1090 | 1091 | 1092 | ## Network 1093 | 1094 | ### CloudFront 1095 | - content delivery network (CDN) 1096 | - see [Jayendra's blog](https://jayendrapatil.com/aws-cloudfront/) for more details 1097 | - components 1098 | - **edge location**: the location where content is cached 1099 | - not read only; can write to edge locations too 1100 | - objects are cached for the life of the TTL (time to live) 1101 | - you can clear cached objects (invalidation), but you will be charged 1102 | - **origin**: origin of all the files to be distributed. 
Can be S3, EC2 instance, Elastic Load Balancer, or Route 53... 1103 | - origin group: may contain two origins: a primary and a secondary; failover can be enabled 1104 | - **distribution**: the name given to the CDN, which consists of a collection of edge locations 1105 | - web distribution vs. RTMP (for media streaming) 1106 | - benefit: use AWS backbone network, instead of traversing internet 1107 | - does NOT have the capability to route the traffic to the closest edge location via an **Anycast static IP address** 1108 | - Global Accelerator can 1109 | - troubleshoot: if origin server keeps being hit instead of edge location: 1110 | - cache-control max-age directive might be too low 1111 | - the object may never have been requested before (so it is not cached yet) 1112 | - Restricted access 1113 | - CloudFront **signed URLs**: 1 signed URL per file 1114 | - CloudFront **signed cookie**: 1 cookie for multiple files 1115 | - does not support Real-Time Messaging Protocol (RTMP) distribution 1116 | - policy attached when signed URL or cookie is created 1117 | - URL expiration 1118 | - IP range 1119 | - trusted signers 1120 | - use CloudFront signed URL/cookie for EC2, etc.
1121 | - S3 signed URL is an alternative for S3 only 1122 | - **S3 signed URL** 1123 | - issues a request as the IAM user who creates the pre-signed URL 1124 | - limited lifetime 1125 | - only if the user can access S3 directly (not usually the case) 1126 | - use RTMP distribution (not supported by signed cookie) 1127 | - Signed URLs are used to restrict access to files in CloudFront edge caches; it cannot prevent users from fetching files directly through S3 URLs 1128 | - if you need to restrict access so that users cannot view the files directly by using the S3 URLs, you can create **origin access identity (OAI)** and associate it with the distribution 1129 | - versioning 1130 | - control the versions of files that are served from your distribution 1131 | - enables you to control which file a request returns even when the user has a version cached either locally or behind a corporate caching proxy 1132 | - If you invalidate the file instead of using versioned files, the user might continue to see the old version until it expires from those caches 1133 | - CloudFront access logs include the names of your files, so versioning makes it easier to analyze the results of file changes 1134 | - provides a way to serve different versions of files to different users 1135 | - less expensive: You still have to pay for CloudFront to transfer new versions of your files to edge locations, but you don't have to pay for invalidating files 1136 | 1137 | ### WAF: Web Application Firewall 1138 | - monitor **HTTP** and **HTTPS** requests 1139 | - **layer-7** aware firewall: specific to application 1140 | - 3 types of behaviors: 1141 | - allow all requests except the ones you specify 1142 | - block all requests except the ones you specify 1143 | - count the requests that match the properties you specify 1144 | - protection against web attacks with the following possible conditions: 1145 | - IP address (IP match) 1146 | - country 1147 | - values in request headers 1148 | - strings that 
appear in requests (string match) 1149 | - length of requests (size constraint) 1150 | - malicious SQL code (SQL injection) 1151 | - malicious script (cross-site scripting) 1152 | 1153 | ### AWS Shield 1154 | 1155 | - a managed Distributed Denial of Service (**DDoS**) protection service 1156 | - network and transport layer protections 1157 | - provides always-on detection and automatic inline mitigations that minimize application downtime and latency, so there is no need to engage AWS Support to benefit from DDoS protection 1158 | - two tiers of AWS Shield: Standard and Advanced 1159 | - All AWS customers benefit from the automatic protections of AWS Shield Standard, at no additional charge 1160 | - AWS WAF alone is not enough to fully protect your VPC from DDoS. You still need to use AWS Shield in combination 1161 | 1162 | ### Route 53: DNS 1163 | 1164 | - converts human-friendly domain names to Internet Protocol (IP) addresses 1165 | - IPv4 (32 bits) addresses are running out, hence IPv6 (128 bits) 1166 | - top-level domain (e.g. `.com`, or `.uk` in `.co.uk`) and second-level domain (e.g. `.co` in `.co.uk`) 1167 | - top-level domain names are controlled by the Internet Assigned Numbers Authority (IANA) in a root zone database 1168 | - IANA [database](http://www.iana.org/domains/root/db) 1169 | - domain registrar: an authority to assign domain names 1170 | - e.g. Amazon, GoDaddy.com 1171 | - may take up to 3 days to register (with Amazon) 1172 | - Start of Authority (SOA) record: every DNS zone begins with one; it stores 1173 | - name of the server that supplied the data for the zone 1174 | - administrator of the zone 1175 | - current version of the data file 1176 | - default time-to-live (TTL), in seconds, for resource records (usually 48 hours) 1177 | - **Name Server (NS)** records: used by top level domain servers to direct traffic to the content DNS server 1178 | - **A Record** (A=Address): translates domain name to numerical IP address 1179 | - e.g.
`www.acloud.guru` to `123.10.10.80` 1180 | - Use AAAA record to enable IPv6 resolution 1181 | - **CNAME** (Canonical Name): resolve one domain name to another 1182 | - e.g. `m.acloud.guru` and `mobile.acloud.guru` mapped to same IP 1183 | - cannot be used for naked domain names (e.g. cannot work for `acloud.guru`) 1184 | - If a CNAME record is created for a subdomain, no other resource record sets can be created for that subdomain 1185 | - **Alias Records**: map resource record sets to load balancers, CloudFront distributions or S3 (configured as websites) 1186 | - like CNAME, but an alias record can be created for both the **root domain** (**zone apex**) and subdomains 1187 | - when given the choice, always choose an Alias Record over a CNAME 1188 | - Route 53 currently supports the following DNS record types 1189 | - A (address record) 1190 | - AAAA (IPv6 address record) 1191 | - CNAME (canonical name record) 1192 | - CAA (certification authority authorization) 1193 | - MX (mail exchange record) 1194 | - NAPTR (name authority pointer record) 1195 | - NS (name server record) 1196 | - PTR (pointer record) 1197 | - SOA (start of authority record) 1198 | - SPF (sender policy framework) 1199 | - SRV (service locator) 1200 | - TXT (text record) 1201 | - **Hosted Zone** 1202 | - Hosted Zone is a container for records, which include information about how to route traffic for a domain and all of its subdomains. 1203 | - A hosted zone has the same name as the corresponding domain. 1204 | - Routing Traffic to the Resources 1205 | - Create a hosted zone, either public or private: 1206 | - Public hosted zone: for routing internet traffic to your resources for a specific domain and its subdomains 1207 | - Private hosted zone: for routing traffic within a VPC 1208 | - Create records in the hosted zone 1209 | - Records define where to route traffic for each domain name or subdomain name.
1210 | - name of each record in a hosted zone must end with the name of the hosted zone. 1211 | - Elastic Load Balancer (ELB) does NOT have pre-defined IPv4 addresses; resolve to them using a DNS name 1212 | - use Route 53 health checking to configure **failover** configurations 1213 | - **active-active** failover 1214 | - any routing policy (or combination of routing policies) 1215 | - if you want all of your resources to be available the majority of the time 1216 | - all the records that have the same name, the same type (such as A or AAAA), and the same routing policy (such as weighted or latency) are active unless Route 53 considers them unhealthy 1217 | - no distinction of primary/secondary resource 1218 | - **active-passive** failover 1219 | - failover routing policy only 1220 | - if you want a primary resource or group of resources to be available the majority of the time and you want a secondary resource or group of resources to be on standby in case all the primary resources become unavailable 1221 | - When responding to queries, Route 53 includes only the healthy primary resources 1222 | - **Routing Policy** 1223 | - **simple routing** 1224 | - one record with multiple IP addresses 1225 | - if multiple values specified in record, Route 53 returns all values in a random order 1226 | - **weighted routing** 1227 | - split traffic based on weights assigned 1228 | - Weights can be assigned any number from **0 to 255** 1229 | - good for testing new (different) versions of applications (e.g. blue-green deployment) 1230 | - **latency-based routing** 1231 | - route traffic based on lowest network latency for end user 1232 | - Latency between hosts on the Internet can change over time as a result of changes in network connectivity and routing. 
Latency-based routing is based on latency measurements performed over a period of time, and the measurements reflect these changes 1233 | - **failover routing** 1234 | - for active/passive (primary/secondary) setup 1235 | - if active record set fails health check, traffic is directed to passive record set 1236 | - applicable for Public hosted zones only 1237 | - **geolocation routing** 1238 | - send traffic based on geographic location of the end user 1239 | - for overlapping geographic regions, priority goes to the smallest geographic region 1240 | - **geoproximity routing** (traffic flow only) 1241 | - route traffic based on geographic location of user and resource 1242 | - can also use bias to assign traffic -> expands or shrinks the size of a geographic region 1243 | - must use Route 53 traffic flow 1244 | - **multivalue answer routing** 1245 | - like simple routing and allows to put health check on each record set 1246 | - can set **health checks** on individual record sets; if fails, will be removed from Route 53 until it passes the check 1247 | - **Elastic IP address** 1248 | - By default, all accounts are limited to **5** Elastic IP addresses per region 1249 | - You do not need an Elastic IP address for all your instances 1250 | - The public address of your EC2 instance is associated exclusively with the instance until it is stopped, terminated or replaced with an Elastic IP address 1251 | 1252 | ### VPC: Virtual Private Cloud 1253 | - provision a logically isolated section of AWS cloud 1254 | - consists of IGWs (or Virtual Private Gateways), Route Tables, Network Access Control Lists, Subnets, and Security Groups 1255 | ![VPC](pic/VPC_pic.png) 1256 | - VPC **Sizing** 1257 | - VPC needs a set of IP addresses in the form of a Classless Inter-Domain Routing (**CIDR**) block for e.g, 10.0.0.0/16, which allows 2^16 (65536) IP address to be available 1258 | - Allowed CIDR block size is between 1259 | - /28 netmask (minimum with 2^4 – 16 available IP address) and 
1260 | - /16 netmask (maximum with 2^16 – 65536 IP address) 1261 | - CIDR block from private (non-publicly routable) IP address can be assigned 1262 | - 10.0.0.0 – 10.255.255.255 (10/8 prefix) 1263 | - 172.16.0.0 – 172.31.255.255 (172.16/12 prefix) 1264 | - 192.168.0.0 – 192.168.255.255 (192.168/16 prefix) 1265 | - It’s possible to specify a range of publicly routable IP addresses; however, direct access to the Internet is not currently supported from publicly routable CIDR blocks in a VPC 1266 | - Each VPC is separate from any other VPC created with the same CIDR block even if it resides within the same AWS account 1267 | - By default, Amazon EC2 and Amazon VPC use the **IPv4** addressing protocol 1268 | - **Subnets** 1269 | - **1 subnet = 1 AZ** (spans a single AZ); cannot span across AZs 1270 | - Subnet can be configured with an Internet gateway to enable communication over the Internet, or virtual private gateway (VPN) connection to enable communication with your corporate network 1271 | - Subnet can be Public or Private and it depends on whether it has Internet connectivity i.e. 
is able to route traffic to the Internet through the IGW 1272 | - Instances within the Public Subnet should be assigned a **Public** IP or **Elastic** IP address to be able to communicate with the Internet 1273 | - For Subnets not connected to the Internet, but has traffic routed through Virtual Private Gateway only is termed as VPN-only subnet 1274 | - Subnets can be configured to Enable assignment of the Public IP address to all the Instances launched within the Subnet by default, which can be overridden during the creation of the Instance 1275 | - Each Subnet is associated with a route table which controls the traffic 1276 | - By default, nondefault subnets have the IPv4 public addressing attribute set to `false`, and default subnets have this attribute set to `true` 1277 | - Security Groups are *stateful*; Network ACLs are *stateless* 1278 | - VPC **Peering** 1279 | - A VPC peering connection is a networking connection between two VPCs that enables routing of traffic between them using private IP addresses 1280 | - VPC peering connection can be established between your own VPCs, or with a VPC in another AWS account in **same or different regions** 1281 | - VPC peering connection **cannot** be created between VPCs that have **matching or overlapping CIDR blocks** 1282 | - **NO transitive peering** between VPCs 1283 | - VPC peering does **NOT** support **Edge to Edge Routing** Through a Gateway or Private Connection 1284 | - Only one VPC peering connection can be established between the same two VPCs at the same time 1285 | - A placement group can span peered VPCs that are in the same region 1286 | - Any tags created for the VPC peering connection are only applied in the account or region in which they were created 1287 | - VPC Peering can be applied to create shared services or perform authentication with an on-premises instance 1288 | - all resources in each VPC have access to resources in other VPC 1289 | - security groups in VPC 1290 | - security groups 
**cannot** span VPCs (a security group does not show up in another) 1291 | - security groups always work at the instance level, not subnet level 1292 | - when you create a new security group, all outbound traffic is allowed by default 1293 | - when creating a VPC, **subnet** and **IGW** are NOT created automatically; **route table**, **NACL** and **security group** are created by default 1294 | - **route tables** 1295 | - Route table defines rules, termed as routes, which determine where network traffic from the subnet would be routed 1296 | - Each VPC has an implicit router to route network traffic 1297 | - Each VPC has a Main Route table, and can have multiple custom route tables created 1298 | - Each Subnet within a VPC must be associated with a single route table at a time, while a route table can have multiple subnets associated with it 1299 | - Subnet, if not explicitly associated to a route table, is implicitly associated with the main route table 1300 | - Every route table contains a local route that enables communication within a VPC which cannot be modified or deleted 1301 | - Route priority is decided by matching the most specific route in the route table that matches the traffic 1302 | - Route tables need to be updated to define routes for Internet gateways, Virtual Private gateways, VPC Peering, VPC Endpoints, NAT Device etc. 1303 | - **IGW: Internet Gateway** 1304 | - An Internet gateway is a **horizontally scaled, redundant, and highly available** VPC component that allows communication between instances in the VPC and the Internet. 1305 | - IGW imposes no availability risks or bandwidth constraints on the network traffic.
1306 | - only **1 IGW per VPC** 1307 | - An Internet gateway serves two purposes: 1308 | - To provide a target in the VPC route tables for Internet-routable traffic, 1309 | - To perform network address translation (NAT) for instances that have been assigned public IPv4 addresses 1310 | - Enabling Internet access to an Instance requires 1311 | - Attaching Internet gateway to the VPC 1312 | - Subnet's route table should have a route pointing to the Internet gateway 1313 | - Instances should have a Public IP or Elastic IP address assigned 1314 | - Security groups and NACLs associated with the Instance should allow relevant traffic 1315 | - Egress-only Internet Gateway 1316 | - a horizontally scaled, redundant, and highly available VPC component 1317 | - allows **outbound** communication over **IPv6** from instances in your VPC to the internet 1318 | - prevents the internet from initiating an IPv6 connection with your instances 1319 | - for use with IPv6 traffic only; to enable outbound-only internet communication over IPv4, use a NAT gateway instead 1320 | - **DNS** 1321 | - When you launch an EC2 instance into a **default** VPC, AWS provides it with public and private DNS hostnames that correspond to the public IPv4 and private IPv4 addresses for the instance 1322 | - when you launch an instance into a **non-default** VPC, AWS provides the instance with a **private DNS hostname** only 1323 | - A new instance is given a public DNS hostname only if both VPC DNS attributes (DNS resolution and DNS hostnames) are enabled and the instance has a public IPv4 address 1324 | - hence, DNS resolution and DNS hostnames need to be enabled so that your instance in a new VPC has an associated DNS hostname 1325 | - can create an Elastic IP and associate it with your instance 1326 | - by default, AWS DNS does not respond to requests from outside the VPC 1327 | - The workaround is to create an EC2-hosted DNS
instance that does zone transfers from the internal DNS, and allows itself to be queried by external servers 1328 | - **IP address** 1329 | - AWS reserves the first 4 and the last 1 IP addresses in each subnet's CIDR block 1330 | - in EC2-Classic, an instance does not retain its private IP when it is stopped and started 1331 | - in a VPC, your EC2 instance receives a static private IPv4 address from the address range of your VPC, which it keeps across stop/start 1332 | 1333 | | Characteristic | EC2-Classic | Default VPC | Nondefault VPC | 1334 | | :--------------------------------------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | :----------------------------------------------------------- | 1335 | | Public IPv4 address (from Amazon's public IP address pool) | Your instance receives a public IPv4 address from the EC2-Classic public IPv4 address pool. | Your instance launched in a default subnet receives a public IPv4 address by default, unless you specify otherwise during launch, or you modify the subnet's public IPv4 address attribute. | Your instance doesn't receive a public IPv4 address by default, unless you specify otherwise during launch, or you modify the subnet's public IPv4 address attribute. | 1336 | | Private IPv4 address | Your instance receives a private IPv4 address from the EC2-Classic range each time it's started. | Your instance receives a static private IPv4 address from the address range of your default VPC. | Your instance receives a static private IPv4 address from the address range of your VPC. | 1337 | | Multiple private IPv4 addresses | We select a single private IP address for your instance; multiple IP addresses are not supported. | You can assign multiple private IPv4 addresses to your instance. | You can assign multiple private IPv4 addresses to your instance. | 1338 | | Elastic IP address (IPv4) | An Elastic IP is disassociated from your instance when you stop it.
| An Elastic IP remains associated with your instance when you stop it. | An Elastic IP remains associated with your instance when you stop it. | 1339 | | Associating an Elastic IP address | You associate an Elastic IP address with an instance. | An Elastic IP address is a property of a network interface. You associate an Elastic IP address with an instance by updating the network interface attached to the instance. | An Elastic IP address is a property of a network interface. You associate an Elastic IP address with an instance by updating the network interface attached to the instance. | 1340 | | Reassociating an Elastic IP address | If the Elastic IP address is already associated with another instance, the address is automatically associated with the new instance. | If the Elastic IP address is already associated with another instance, the address is automatically associated with the new instance. | If the Elastic IP address is already associated with another instance, it succeeds only if you allowed reassociation. | 1341 | | Tagging Elastic IP addresses | You cannot apply tags to an Elastic IP address. | You can apply tags to an Elastic IP address. | You can apply tags to an Elastic IP address. | 1342 | | DNS hostnames | DNS hostnames are enabled by default. | DNS hostnames are enabled by default. | DNS hostnames are disabled by default. | 1343 | | Security group | A security group can reference security groups that belong to other AWS accounts. | A security group can reference security groups for your VPC, or for a peer VPC in a VPC peering connection. | A security group can reference security groups for your VPC only. | 1344 | | Security group association | You can't change the security groups of your running instance. 
You can either modify the rules of the assigned security groups, or replace the instance with a new one (create an AMI from the instance, launch a new instance from this AMI with the security groups that you need, disassociate any Elastic IP address from the original instance and associate it with the new instance, and then terminate the original instance). | You can assign up to 5 security groups to an instance. You can assign security groups to your instance when you launch it and while it's running. | You can assign up to 5 security groups to an instance. You can assign security groups to your instance when you launch it and while it's running. | 1345 | | Security group rules | You can add rules for inbound traffic only. | You can add rules for inbound and outbound traffic. | You can add rules for inbound and outbound traffic. | 1346 | | Tenancy | Your instance runs on shared hardware. | You can run your instance on shared hardware or single-tenant hardware. | You can run your instance on shared hardware or single-tenant hardware. | 1347 | | Accessing the Internet | Your instance can access the Internet. Your instance automatically receives a public IP address, and can access the Internet directly through the AWS network edge. | By default, your instance can access the Internet. Your instance receives a public IP address by default. An Internet gateway is attached to your default VPC, and your default subnet has a route to the Internet gateway. | By default, your instance cannot access the Internet. Your instance doesn't receive a public IP address by default. Your VPC may have an Internet gateway, depending on how it was created. | 1348 | | IPv6 addressing | IPv6 addressing is not supported. You cannot assign IPv6 addresses to your instances. | You can optionally associate an IPv6 CIDR block with your VPC, and assign IPv6 addresses to instances in your VPC.
| You can optionally associate an IPv6 CIDR block with your VPC, and assign IPv6 addresses to instances in your VPC. | 1349 | 1350 | - US-East-1A AZ in my AWS account is not the same as in someone else's AWS account - all randomized 1351 | - need a minimum of 2 public subnets to deploy an internet facing (application) load balancer 1352 | - Some vulnerability scans are allowed without alerting AWS; some require you to alert AWS 1353 | - by default, instances in new subnets in a custom VPC can communicate with each other across AZ 1354 | - can create up to **5** VPCs in each AWS **region** 1355 | - Once a VPC is set to Dedicated hosting, it can be changed back to default hosting via the CLI, SDK or API. Note that this will not change hosting settings for existing instances, only future ones 1356 | - Deletion of the VPC is possible only after terminating all instances within the VPC, and deleting all the components with the VPC 1357 | - **Network Address Translation (NAT)**: ways to let private subnets to connect to Internet 1358 | - needs to be associated with an Elastic IP address (or public IP address) 1359 | - should have the **Source/Destination flag disabled** to route traffic from the instances in the private subnet to the Internet and send the response back 1360 | - should have a Security group associated that 1361 | - allows Outbound Internet traffic from instances in the private subnet 1362 | - disallows Inbound Internet traffic from everywhere 1363 | - Instances in the private subnet should have the Route table configured to direct all Internet traffic to the NAT device 1364 | ![NAT](pic/NAT.png) 1365 | - **NAT instance** 1366 | - EC2 instance 1367 | - need to **disable source/destination checks** 1368 | - neither the source nor the destination for the traffic and merely acts as a gateway 1369 | - must be in public subnet 1370 | - must be a route out of the private subnet to the NAT instance 1371 | - always **behind a security group** 1372 | - can easily 
become a bottleneck; instance size dictates the upper limit of traffic 1373 | - can create high availability using auto-scaling groups, multiple subnets in different AZs, and a script to automate failover 1374 | - can be created by using Amazon Linux AMIs 1375 | - being gradually phased out (in favor of NAT gateways) 1376 | - **NAT gateway** 1377 | - **highly available** gateway 1378 | - created in a specific AZ and redundant inside that AZ 1379 | - starts at 5 Gbps and scales up to 45 Gbps automatically 1380 | - **cannot be associated with a security group** 1381 | - Security groups can be configured on the instances in the private subnets to control the traffic 1382 | - no need to patch 1383 | - requires an Elastic IP address, which is chosen when the gateway is created 1384 | - NAT gateway **cannot send traffic** over VPC endpoints, VPN connections, AWS Direct Connect, or VPC peering connections. Private subnet’s **route table** should be modified to route the traffic directly to these devices 1385 | - cannot be shared across VPCs 1386 | - **1 NAT gateway per AZ** 1387 | - if multiple AZs share one NAT gateway and its AZ fails, resources in the other AZs lose Internet access 1388 | - good practice to create 1 NAT gateway in each AZ 1389 | - ![NAT Compare](pic/NAT_comparison_J.png) 1390 | - Network ACL: Network **Access Control List** 1391 | - **default** network ACL **allows** all outbound and inbound traffic 1392 | - any **custom** network ACL **denies** all inbound and outbound traffic by default 1393 | - each subnet in the VPC must be associated with a network ACL; if not specified, then default ACL 1394 | - block IP addresses with network ACLs, not Security Groups 1395 | - can associate a network ACL with multiple subnets 1396 | - a subnet can only be associated with one ACL at a time 1397 | - ACL contains a numbered list of rules; rules are evaluated in numerical order 1398 | - if you want to deny something, always put it before the allow rule 1399 | - always have separate inbound and outbound rules; each rule can
allow or deny traffic 1400 | - ***stateless***: responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa) 1401 | - act first, before security groups 1402 | - VPC **Flow Logs** 1403 | - capture information about **IP traffic** in/out of network interfaces 1404 | - stored using AWS CloudWatch logs 1405 | - need to have a CloudWatch log group in place 1406 | - can also send to S3 1407 | - 3 levels 1408 | - **VPC** level 1409 | - **subnet** level 1410 | - **Network Interface** level 1411 | - flow logs for peered VPC must be in the same account 1412 | - can tag flow logs 1413 | - Flow logs do not capture real-time log streams for network interfaces 1414 | - after creation, cannot change flow log configuration (e.g. cannot change IAM role) 1415 | - **not all** IP traffic is monitored, e.g. 1416 | - traffic generated by instances when they contact Amazon DNS server (unless you use your own DNS server) 1417 | - traffic generated by Windows instance for Amazon Windows license activation 1418 | - traffic to and from `169.254.169.254` for instance metadata 1419 | - DHCP traffic 1420 | - traffic to the reserved IP address for the default VPC router 1421 | - **Bastion Host** 1422 | - a special purpose computer on a network specifically designed and configured to **withstand attacks** 1423 | - Bastion host launched in the Public subnets would act as a primary **access point** from the Internet and acts as a **proxy to other instances** 1424 | - allows you to login to instances in the Private subnet securely without having to store the private keys on the Bastion host 1425 | - generally hosts a single application (e.g. 
proxy server) 1426 | - in comparison, a NAT Gateway or NAT instance is used to provide outbound Internet access to EC2 instances in a private subnet 1427 | - The best way to implement a bastion host is to create a small EC2 instance whose security group only allows access from a particular IP address for maximum security 1428 | - This blocks SSH brute-force attempts from any other address on your bastion host 1429 | - It is also recommended to use a small instance rather than a large one because this host will only act as a jump server to connect to other instances in your VPC and nothing else 1430 | - a bastion is used to securely administer EC2 instances (using SSH or RDP) 1431 | - Deploy a Bastion host within each Availability Zone for HA, because if the Bastion instance or the AZ hosting the Bastion server goes down, the ability to connect to your private instances is lost completely 1432 | - cannot use a NAT Gateway as a Bastion host 1433 | ![bastion](pic/bastion.png) 1434 | - VPC **endpoints** 1435 | - enable you to **privately connect your VPC to supported AWS services** powered by Private Link 1436 | - Instances in VPC **do not require public IP addresses** to communicate with resources in the service 1437 | - traffic between VPC and the other service **does not leave the Amazon network** 1438 | - endpoints are virtual devices that are **horizontally scaled, redundant and highly available** VPC components 1439 | - cannot configure an inter-region VPC endpoint directly (use VPC peering instead) 1440 | - VPC **endpoint policy** is an IAM resource policy attached to an endpoint for controlling access from the endpoint to the specified service 1441 | - You can modify the endpoint policy attached to your endpoint and add or remove the route tables used by the endpoint 1442 | - if you want to allow traffic to a service (e.g.
S3 bucket), you need to specify the policy **for the service**, not the VPC itself 1443 | - **interface endpoints** 1444 | - an elastic network interface (ENI) with a private IP address that serves as an entry point for traffic destined to a supported service 1445 | - enables connectivity to services powered by AWS Private Link 1446 | - For each interface endpoint, only one subnet per Availability Zone can be selected 1447 | - supports **TCP** traffic only 1448 | - **gateway endpoints** 1449 | - like a NAT gateway 1450 | - a target for a specified route in the route table, used for traffic destined to a supported AWS service 1451 | - cannot be created between a VPC and an AWS service in a different region 1452 | - cannot be transferred from one VPC to another, or from one service to another 1453 | - connections cannot be extended out of a VPC i.e. resources across the VPN connection, VPC peering connection, AWS Direct Connect connection cannot use the endpoint 1454 | - support S3 and DynamoDB 1455 | ![endpoint](pic/endpoint.png) 1456 | - **Private Link** 1457 | - best way to expose a service VPC to many (e.g. thousands of) customer VPCs 1458 | - **does not require VPC peering**; no route tables, NAT, IGWs, etc. 
1459 | - requires a **Network Load Balancer** on the service VPC and an **ENI** on the customer VPC 1460 | ![privatelink](pic/private_link.png) 1461 | - **transit gateway** 1462 | - a way to **simplify network topology** 1463 | - allows **transitive peering between many VPCs and data centers** 1464 | - works on a **hub-and-spoke** model 1465 | - works on a **regional** basis; can have it across multiple regions 1466 | - can use it across multiple AWS accounts using **RAM** (Resource Access Manager) 1467 | - can use route tables to limit how VPCs talk to one another 1468 | - works with Direct Connect as well as VPN connections 1469 | - supports **IP multicast** (which is not supported by any other AWS service) 1470 | ![transit](pic/transit_gateway.png) 1471 | - VPC **VPN** 1472 | - VPC VPN connections are used to **extend on-premises data centers to AWS** 1473 | - VPC VPN connections provide secure **IPSec** connections from on-premises computers/services to AWS 1474 | - **AWS hardware VPN** 1475 | - Connectivity can be established by creating an IPSec, hardware VPN connection between the VPC and the remote network. 1476 | - On the AWS side of the VPN connection, a Virtual Private Gateway (VGW) provides two VPN endpoints for automatic failover. 1477 | - On the customer side, a customer gateway (CGW) needs to be configured, which is the physical device or software application on the remote side of the VPN connection 1478 | - **AWS Direct Connect** 1479 | - AWS Direct Connect provides a dedicated private connection from a remote network to your VPC. 1480 | - Direct Connect can be combined with an AWS hardware VPN connection to create an IPsec-encrypted connection 1481 | - **AWS VPN CloudHub** 1482 | - For more than one remote network *e.g.
multiple branch offices*, multiple AWS hardware VPN connections can be created via the VPC to enable communication between these networks 1483 | - **Software VPN** 1484 | - VPN connection can be set up by running a software VPN appliance such as OpenVPN on an EC2 instance in the VPC 1485 | - AWS does not provide or maintain software VPN appliances; however, there is a range of products provided by partners and open source communities 1486 | - fast and cost-effective way to establish IPSec connectivity 1487 | - For each VPN tunnel, AWS provides two different VPN endpoints. ECMP (Equal Cost Multi Path) can be used to carry traffic on both VPN endpoints which can increase performance 1488 | - ECMP must be enabled on client end device (not IGW end) 1489 | - **Site-to-site VPN** 1490 | - two VPN tunnels between a virtual private gateway or a transit gateway on the AWS side, and a customer gateway (which represents a VPN device) on the remote (on-premises) side 1491 | - ![site vpn](pic/site-to-site-vpn.png) 1492 | - **Virtual Private Gateway** 1493 | - the VPN concentrator on the Amazon side of the Site-to-Site VPN connection 1494 | - When you create a virtual private gateway, you can specify the private Autonomous System Number (ASN) for the Amazon side of the gateway; otherwise it is created with the default ASN (64512) 1495 | - cannot change the ASN after you've created the virtual private gateway 1496 | - **Transit Gateway** 1497 | - A transit gateway is a transit hub that you can use to interconnect your virtual private clouds (VPC) and on-premises networks 1498 | - can replace virtual private gateway 1499 | - **Customer Gateway Device** 1500 | - a physical device or software application on your side of the Site-to-Site VPN connection 1501 | - By default, your customer gateway device must bring up the tunnels for your Site-to-Site VPN connection by generating traffic and initiating the Internet Key Exchange (IKE) negotiation process 1502 | - **Customer Gateway** 1503 | - a
resource that you create in AWS that represents the customer gateway device in your on-premises network 1504 | - needs to have a (static) public address 1505 | - VPN **CloudHub** 1506 | - if you have multiple sites, each with its own VPN connection, you can use AWS VPN CloudHub to connect those sites together 1507 | - **only for VPN**, not for VPCs 1508 | - cannot connect many VPCs that each have multiple VPN connections to data centers spanning multiple AWS Regions 1509 | - **hub-and-spoke** model 1510 | - suitable for customers with multiple branch offices and existing Internet connections 1511 | - low cost; easy to manage 1512 | - operates over public network, but all traffic between the customer gateway and the AWS VPN CloudHub is encrypted 1513 | - VGW can be used to connect multiple locations; each location using existing Internet link and customer routers will set up a VPN connection to VGW 1514 | - BGP peering will be configured between customer gateway router and VGW using unique BGP ASN at each location 1515 | - if BGP ASN is not unique, additional `allowas-in` configuration will be required 1516 | - VGW will receive prefixes from each location and re-advertise to other peers 1517 | - **Shared VPCs** 1518 | - VPC sharing allows multiple AWS accounts to create their application resources, such as EC2 instances, RDS databases, Redshift clusters, and AWS Lambda functions, into shared, centrally-managed VPCs. 1519 | - In this model, the account that owns the VPC (owner) shares one or more subnets with other accounts (participants) that belong to the same organization from AWS Organizations. 1520 | - After a subnet is shared, the participants can view, create, modify, and delete their application resources in the subnets shared with them. Participants cannot view, modify, or delete resources that belong to other participants or the VPC owner.
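
The network ACL notes earlier in this section (numbered rules, evaluated in numerical order, deny placed before allow) can be illustrated with a small simulation. This is only a sketch of the evaluation order, not AWS code: the `Rule` class, the `evaluate` function and the sample addresses are hypothetical names made up for illustration, and each rule matches a single port to keep it short.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    """Hypothetical, simplified stand-in for one network ACL entry."""
    number: int   # lower numbers are evaluated first
    cidr: str     # source range for an inbound rule
    port: int     # single port, to keep the sketch small
    allow: bool   # True = ALLOW, False = DENY


def evaluate(rules, ip, port):
    """Return True if traffic is allowed; the first matching rule wins."""
    for rule in sorted(rules, key=lambda r: r.number):
        if ip_address(ip) in ip_network(rule.cidr) and port == rule.port:
            return rule.allow
    return False  # nothing matched: fall through to the final deny


inbound = [
    Rule(90, "203.0.113.7/32", 443, allow=False),  # block one address first
    Rule(100, "0.0.0.0/0", 443, allow=True),       # then allow HTTPS from anywhere
]

print(evaluate(inbound, "203.0.113.7", 443))    # blocked by rule 90
print(evaluate(inbound, "198.51.100.20", 443))  # allowed by rule 100
print(evaluate(inbound, "198.51.100.20", 22))   # no rule matches, denied
```

If the deny rule were numbered 110 instead of 90, rule 100 would match first and the blocked address would get through, which is why the deny goes before the allow. Because network ACLs are stateless, the reply traffic would be checked against a separate outbound rule list in the same way.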
1521 | 1522 | ### IP Addresses 1523 | Instances launched in the VPC can have private, public and Elastic IP addresses assigned to them; these addresses are properties of the ENI (network interface) 1524 | - **Private IP Addresses** 1525 | - Private IP addresses are not reachable over the Internet, and can be used for communication only between the instances within the VPC 1526 | - All instances are assigned a private IP address, within the IP address range of the subnet, to the default network interface 1527 | - Primary IP address is associated with the network interface for its lifetime, even when the instance is stopped and restarted and is released only when the instance is terminated 1528 | - Additional private IP addresses, known as secondary private IP addresses, can be assigned to the instances and these can be reassigned from one network interface to another 1529 | - **Public IP address** 1530 | - Public IP addresses are reachable over the Internet, and can be used for communication between instances and the Internet, or with other AWS services that have public endpoints 1531 | - Public IP address assignment to the instance depends on whether public IP addressing is enabled for the subnet. 1532 | - Public IP address can also be assigned to the Instance by enabling the Public IP addressing during the creation of the instance, which overrides the subnet’s public IP addressing attribute 1533 | - Public IP address is assigned from the AWS pool of IP addresses; it is not associated with the AWS account and hence is released when the instance is stopped and restarted or terminated. 1534 | - **Elastic IP address** 1535 | - Elastic IP addresses are static, persistent public IP addresses which can be associated and disassociated with the instance, as required 1536 | - Elastic IP address is allocated for use in a VPC and is owned by the account until released 1537 | - A Network Interface can be assigned either a Public IP or an Elastic IP.
If you assign an Elastic IP to an instance that already has a public IP, the public IP is released 1538 | - Elastic IP addresses can be moved from one instance to another, which can be within the same or different VPC within the same account 1539 | - Elastic IP addresses are charged when not in use, i.e. when an address is not associated, or is associated with a stopped instance or an unattached network interface 1540 | - An Elastic IP address doesn’t incur charges as long as the following conditions are true 1541 | - The Elastic IP address is associated with an Amazon EC2 instance. 1542 | - The instance associated with the Elastic IP address is running. 1543 | - The instance has only one Elastic IP address attached to it. 1544 | 1545 | ### Direct Connect 1546 | 1547 | - a cloud service solution to establish a **dedicated private network connection from your premises to AWS** 1548 | - essentially just connects your data center directly to AWS 1549 | - **not traversing internet** at all 1550 | - **does not encrypt traffic** in transit (in comparison, VPN does encrypt traffic) 1551 | - useful for **high throughput** workloads (i.e.
a lot of network traffic) or workloads that need a **stable** and **reliable** secure connection 1552 | - A link aggregation group (**LAG**) is a logical interface that uses the Link Aggregation Control Protocol (LACP) to **aggregate** multiple connections at a single AWS Direct Connect endpoint, treating them as a single, managed connection 1553 | - see [Jayendra's blog](https://jayendrapatil.com/aws-direct-connect-dx/) for more details 1554 | ![dx](pic/dx.png) 1555 | ![set_dx](pic/Set_dx.png) 1556 | 1557 | 1558 | ### Global Accelerator 1559 | - **improve availability and performance** for local and global users 1560 | - provides **static IP addresses** that act as a **fixed entry point** to your application endpoints in a single or multiple AWS Regions 1561 | - good for gaming, media, mobile and financial applications that need very low latency 1562 | - can add or remove endpoints in the AWS Regions, run blue/green deployment, and A/B test without needing to update the IP addresses in your client applications 1563 | - includes the following components 1564 | - **Static IP addresses** 1565 | - provides **two** static IP addresses 1566 | - can bring your own IP address 1567 | - requests directed to nearest healthy instance 1568 | - **Accelerator** 1569 | - directs traffic to optimal endpoints over AWS 1570 | - includes one or more listeners 1571 | - **DNS name** 1572 | - similar to `12312342abes.awsglobalaccelerator.com` - points to the static IP address that GA assigned to you 1573 | - **Network zone** 1574 | - services the static IP address for accelerator from a unique IP subnet 1575 | - like AZ: isolated unit with its own physical infrastructure 1576 | - by default, has two IP addresses.
If one fails, clients can try the other 1577 | - **listener** 1578 | - processes inbound connections from clients to GA 1579 | - supports both TCP and UDP protocols 1580 | - each listener has one or more endpoint groups associated with it 1581 | - association by specifying the regions that you want to distribute traffic to 1582 | - traffic is distributed to optimal endpoints within the endpoint groups associated with a listener 1583 | - **endpoint group** 1584 | - associated with a specific AWS region 1585 | - includes one or more endpoints 1586 | - can increase or reduce the percentage of traffic that would be otherwise directed to an endpoint group by adjusting a setting called **traffic dial** 1587 | - traffic dial lets you easily do performance testing or blue/green deployment testing for new releases 1588 | - **endpoint** 1589 | - can be a network load balancer, an application load balancer, an EC2 instance, or an Elastic IP address 1590 | - can be internet-facing or internal 1591 | - traffic is routed to endpoints based on configuration options (e.g. endpoint **weights**) 1592 | 1593 | ### Network costs 1594 | - use private IP addresses over public to save on costs 1595 | - if you want to cut cost, group EC2 instances in the same AZ and use private IP addresses.
This will be cost free, but beware of single point of failure 1596 | 1597 | ### ELB: Elastic Load Balancer 1598 | - 3 types of load balancer 1599 | - **application load balancer** 1600 | - best suited for load balancing **HTTP** and **HTTPS** traffic 1601 | - operates at **layer 7** and is application aware 1602 | - can create **advanced request routing**, sending specified requests to specific web servers (by creating if-then rules) 1603 | - content-based, host-based, path-based routing 1604 | - provides dynamic port mapping 1605 | - **network load balancer** 1606 | - best suited for load balancing of **TCP** traffic where **extreme** performance is required 1607 | - low latency and ability to scale 1608 | - provides static IP address 1609 | - operates at the connection level (**layer 4**) 1610 | - capable of handling millions of requests per second, while maintaining ultra-low latencies 1611 | - supports **UDP** protocol (not supported by ALB) 1612 | - can be used to terminate TLS connections 1613 | - to negotiate TLS connections with clients, NLB uses a security policy consisting of *protocols* and *ciphers* 1614 | - **classic load balancer** 1615 | - legacy elastic load balancer 1616 | - low cost 1617 | - can load balance **HTTP(S)** applications and use **layer 7** specific features 1618 | - can use strict layer 4 load balancing for applications that rely purely on the TCP protocol 1619 | - [ALB vs NLB vs CLB](https://jayendrapatil.com/aws-classic-load-balancer-vs-application-load-balancer/) 1620 | - balance traffic in **one region**, not multiple regions 1621 | - if application stops responding, the ELB responds with a **504 error** 1622 | - means the application is not responding within the idle timeout period 1623 | - means you need to troubleshoot the application - Web server or database server?
1624 | - if you need the IPv4 address of your end user, look for the **X-Forwarded-For** header 1625 | - **no IP address** for ELB; **only DNS names** (application and classic) 1626 | - you can get a static IP address for network load balancer 1627 | - instances monitored by ELB are reported as `InService` or `OutOfService` 1628 | - provides **access logs** that capture detailed information about requests sent to your load balancer 1629 | - optional feature 1630 | - each log contains information such as the time the request was received, the client's IP address, latencies, request paths, and server responses 1631 | - load balancer routes requests to the targets in a **target group** 1632 | - you can set the targets and health checks 1633 | - not needed for classic load balancer 1634 | - **sticky sessions** allow you to bind a user's session to a specific EC2 instance 1635 | - ensures that all requests from the user during the session are sent to the same instance 1636 | - e.g. online editing of a file: edits do not go to separate instances 1637 | - can be useful if you are storing information locally on that instance 1638 | - NOT used when you want to share users' sessions among EC2 fleet 1639 | - **cross zone load balancing** 1640 | - distribute traffic (evenly) across different AZs 1641 | ![cross zone lb](pic/cross_zone_lb.png) 1642 | - **path patterns**: can create a listener with rules to direct traffic (forward requests) based on URL path 1643 | - also known as path-based routing 1644 | - e.g.
route general requests to one target group and requests to render images to another target group 1645 | 1646 | 1647 | 1648 | 1649 | 1650 | ## Databases 1651 | 1652 | ### RDS: Relational Database Service 1653 | - 6 RDS engines on AWS: SQL Server, Oracle, MySQL, PostgreSQL, Aurora, MariaDB 1654 | - 2 key features 1655 | - **multi-AZ** 1656 | - exact ***synchronous*** copy of production database in **another AZ** in the **same region** 1657 | - for **disaster recovery** 1658 | - fail-over is automatic 1659 | - can force a fail-over by rebooting the RDS instance 1660 | - not applicable to Aurora (due to its own unique fault-tolerant design) 1661 | - cannot be promoted to be a read-replica 1662 | - **read replicas** 1663 | - read-only copy of the production database; can have multiple (up to 5) 1664 | - ***asynchronous*** replication from the primary RDS instance to the read replica 1665 | - subject to replication lag; might miss latest transactions 1666 | - for **performance** (especially for read-intensive workload); not for disaster recovery 1667 | - must have backups turned on to create read replicas 1668 | - can have read replicas of read replicas 1669 | - each read replica has its own DNS endpoint 1670 | - can create read replicas of multi-AZ source databases 1671 | - read replicas **can be promoted** to be their own databases. This breaks the replication 1672 | - can have a read replica in a different region 1673 | - not applicable to SQL Server and Oracle 1674 | - RDS is for **OLTP**; Redshift is for **OLAP** 1675 | - OLTP: Online Transaction Processing: e.g. order number 323245 1676 | - prefer provisioned IOPS (instead of standard storage) 1677 | - OLAP: Online Analytical Processing: e.g.
net profit for EMEA 1678 | - RDS runs on virtual machine; you don't have access to the operating systems (cannot SSH to it) 1679 | - patching of RDS operating system and DB is Amazon's responsibility 1680 | - RDS is not serverless (except for Aurora Serverless) 1681 | - manage DB engine configuration through the use of parameters in DB parameter group 1682 | - backups 1683 | - **automated backup** 1684 | - **point-in-time recovery** within a retention period 1685 | - done in a scheduled window 1686 | - once a day (24 hours) 1687 | - capture transaction logs every 5 minutes 1688 | - same size as your database 1689 | - enabled by default 1690 | - performs a *storage volume* snapshot of your DB instance 1691 | - **database snapshot** 1692 | - initiated manually 1693 | - stored after DB is terminated 1694 | - restored version (either approach) will be a new RDS instance with a new DNS endpoint 1695 | - failover 1696 | - In Amazon RDS, failover is **automatically** handled so that you can resume database operations as quickly as possible without administrative intervention in the event that your primary database instance went down 1697 | - RDS performs an automatic failover in the event of any of the following 1698 | - Loss of availability in primary Availability Zone 1699 | - Loss of network connectivity to primary 1700 | - Compute unit failure on primary 1701 | - Storage failure on primary 1702 | - When failing over, Amazon RDS simply flips the canonical name record (**CNAME**) for your DB instance to point at the standby, which is in turn promoted to become the new primary 1703 | - encryption 1704 | - encryption at rest supported by all 6 RDS services 1705 | - use AWS Key Management Service (**KMS**) 1706 | - underlying data encrypted all together (including backups, read replicas) 1707 | - You can only enable encryption for an Amazon RDS DB instance when you create it, not after the DB instance is created 1708 | - you can encrypt a copy of an unencrypted DB snapshot 
1709 | - For Microsoft SQL Server: 1710 | - When you create an SQL Server DB instance, Amazon RDS creates an SSL certificate for it. The SSL certificate includes the DB instance endpoint as the Common Name (CN) for the SSL certificate to guard against spoofing attacks 1711 | - 2 ways to use SSL to connect to your SQL Server DB instance 1712 | - Force SSL for all connections (use the `rds.force_ssl` parameter; set to `true` instead of default `false`) 1713 | - Encrypt specific connections 1714 | - authentication 1715 | - You can authenticate to your DB instance using AWS Identity and Access Management (IAM) **database authentication** 1716 | - works with MySQL and PostgreSQL 1717 | - don't need to use a password when you connect to a DB instance; can just use **authentication token** 1718 | - can create a short-lived authentication token (with `AWSAuthenticationPlugin` plugin) 1719 | - RDS instance port number is automatically applied to RDS DB security group 1720 | - recommended storage engine for MySQL is InnoDB 1721 | 1722 | ### DynamoDB 1723 | - one of the NoSQL databases 1724 | - fully managed 1725 | - serverless 1726 | - stored on SSD storage 1727 | - can store session state data (so can ElastiCache) 1728 | - spread across 3 geographically distinct data centers 1729 | - **regional** service; no need to explicitly provision multi-AZ deployment 1730 | - automatically shards data and spreads it across servers 1731 | - good for simple GET/PUT requests and queries 1732 | - an item (all attribute names and values combined) should not exceed **400KB** 1733 | - eventually consistent reads (default) 1734 | - consistency across all copies usually reached within 1 second 1735 | - strongly consistent reads 1736 | - returns a result that reflects all writes that received a successful response prior to the read 1737 | - DynamoDB Accelerator (**DAX**) 1738 | - fully managed, highly available, in-memory **cache** 1739 | - reduces request time 1740 | - allows both read and write 1741 | - transaction:
prepare/commit reads or writes for "all-or-nothing" operations (e.g. money transfer) 1742 | - table structure and performance 1743 | - The optimal usage of a table's provisioned throughput depends not only on the workload patterns of individual items, but also on the partition-key design 1744 | - the **more distinct partition key values** that your workload accesses, the more those requests will be spread across the partitioned space 1745 | - you will use your provisioned throughput more efficiently as the ratio of partition key values accessed to the total number of partition key values increases 1746 | - the less distinct partition key values, the less evenly spread it would be across the partitioned space, which effectively slows the performance 1747 | - on-demand capacity: 1748 | - pay-per-request pricing 1749 | - no charge for read/write when idling - only storage and backups 1750 | - higher cost per request than with provisioned capacity 1751 | - can use **auto scaling** 1752 | - on-demand backup and restore 1753 | - full backups at any time 1754 | - zero impact on table performance or availability 1755 | - consistent within seconds and retained until deleted 1756 | - operates within same region as the source table 1757 | - point-in-time recovery (PITR) 1758 | - protects against accidental writes or deletes 1759 | - restore to any point in the last 35 days 1760 | - incremental backups 1761 | - not enabled by default 1762 | - latest restorable: 5 minutes in the past 1763 | - **DynamoDB Stream** 1764 | - time-ordered sequence of item-level changes in a table 1765 | - shard: a collection of stream records (data) 1766 | - stored for **24** hours 1767 | - integrates with Lambda to create a *trigger* that leads application to react to data modifications in DynamoDB 1768 | - global tables 1769 | - managed multi-master, multi-region replication 1770 | - globally distributed applications 1771 | - based on DynamoDB streams 1772 | - multi-region redundancy for disaster 
recovery 1773 | - no application rewrites 1774 | - replication latency under 1 second 1775 | - security 1776 | - encryption at rest using **KMS** 1777 | - site-to-site VPN 1778 | - Direct Connect 1779 | - IAM policies and roles 1780 | - fine-grained access 1781 | - CloudWatch and CloudTrail 1782 | - VPC endpoints 1783 | - Valid header attributes 1784 | - `host` 1785 | - `x-amz-date` 1786 | - `x-amz-target` 1787 | - `content-type` 1788 | 1789 | ### Redshift 1790 | - fully managed **data warehouse** service for BI 1791 | - Configuration 1792 | - single node (160 GB) 1793 | - multi-node 1794 | - leader node (manages client connections and receives queries) 1795 | - compute node (stores data and performs computations) - up to 128 compute nodes 1796 | - advanced compression 1797 | - compresses data based on column (instead of row) 1798 | - doesn't require indexes or materialized views 1799 | - massively parallel processing (MPP) 1800 | - automatically distributes data and query load across nodes 1801 | - backups 1802 | - enabled by default with a 1-day retention period 1803 | - max retention period is 35 days 1804 | - Redshift always attempts to maintain **3 copies** of your data (original and replica on compute node and a backup in S3) 1805 | - can configure Amazon Redshift to **asynchronously** replicate **snapshots** to **S3** in another **region** for disaster recovery 1806 | - can take manual snapshots; will remain indefinitely and won't be automatically deleted 1807 | - pricing 1808 | - based on compute node hours: billed for 1 unit per node per hour 1809 | - no charge for leader node 1810 | - charge for backup; charge for data transfer within VPC 1811 | - security 1812 | - in transit using SSL 1813 | - at rest using **AES-256** encryption 1814 | - by default Redshift takes care of key management 1815 | - manage your own key through HSM 1816 | - AWS Key Management Service 1817 | - availability 1818 | - only available in **1 AZ** (no multi-AZ) 1819 | - can restore snapshots
to new AZs 1820 | - Redshift **Enhanced VPC Routing** provides VPC resources access to Redshift 1821 | - no traffic through internet 1822 | 1823 | ### Aurora 1824 | - MySQL and PostgreSQL-compatible **relational** database engine 1825 | - better performance than traditional RDS and cheaper 1826 | - starts with 10GB, scales in 10GB increments to 64TB (storage auto-scaling) 1827 | - compute resource up to 32vCPUs and 244GB of memory 1828 | - 2 copies of data contained in each AZ, with a minimum of 3 AZs (6 copies of data) 1829 | - handles the loss of 1830 | - up to 2 copies of data without affecting database write availability 1831 | - up to 3 copies of data without affecting database read availability 1832 | - **self-healing**: data blocks and disks are continuously scanned for errors and repaired automatically 1833 | - replica and failover 1834 | - Failover is **automatically** handled by Amazon Aurora 1835 | - If you have an Amazon Aurora Replica in the same or a different Availability Zone, when failing over, Amazon Aurora flips the canonical name record (**CNAME**) for your DB Instance to point at the healthy replica, which in turn is promoted to become the new primary 1836 | - If you are running Aurora Serverless and the DB instance or AZ becomes unavailable, Aurora will automatically recreate the DB instance in a **different AZ** 1837 | - If you do not have an Amazon Aurora Replica (i.e.
single instance) and are not running Aurora Serverless, Aurora will attempt to create a new DB Instance in the **same AZ** as the original instance 1838 | - types of replicas 1839 | - Aurora Replicas (15): automated failover available 1840 | - MySQL Read Replicas (5) 1841 | - PostgreSQL (1) 1842 | - in-region replica only 1843 | - automated failover with no data loss (with Aurora replicas only) 1844 | - DB instance(s) 1845 | - When you connect to an Aurora cluster, the host name and port that you specify point to an intermediate handler called an *endpoint* 1846 | - Using endpoints, you can map each connection to the **appropriate instance or group** of instances based on your use case 1847 | - e.g. a reader endpoint and a query endpoint 1848 | - a cluster endpoint (also known as a writer endpoint) for an Aurora DB cluster connects to the current primary DB instance for that DB cluster 1849 | - backups 1850 | - automated backups enabled; does not impact performance 1851 | - snapshots can be taken; no impact on performance 1852 | - snapshots can be shared with other AWS accounts 1853 | - Amazon Aurora Global Database is designed for globally distributed applications, allowing a single Amazon Aurora database to span multiple AWS regions 1854 | - Aurora Serverless 1855 | - on-demand, auto-scaling configuration for SQL-compatible editions of Aurora 1856 | - automatically starts up, shuts down and scales based on needs 1857 | - option for infrequent, intermittent, or unpredictable workloads 1858 | 1859 | ### ElastiCache 1860 | - a web service that makes it easy to deploy, operate, and scale an **in-memory cache** in the cloud 1861 | - a way to speed up database performance by caching most commonly used content 1862 | - e.g. top 10 most popular products on a retailing site 1863 | - *faster* than read-replicas 1864 | - ElastiCache is only a **key-value** store and cannot therefore store relational data 1865 | - **Memcached** vs. 
**Redis** 1866 | - Memcached is simpler and easier to set up; Redis is more sophisticated 1867 | - Redis is multi-AZ 1868 | - can do backups and restores of Redis 1869 | - can store session data for your web applications and thereby provides distributed session data management 1870 | - You can manage HTTP session data from the web servers using an In-Memory Key/Value store such as Redis and Memcached 1871 | 1872 | ![ElastiCache](pic/elasticache_2.png) 1873 | 1874 | ### DMS: Database Migration Service 1875 | - a cloud service to migrate databases 1876 | - essentially a server in AWS that runs replication software 1877 | - can migrate data into AWS, between on-premise instances, or between combinations of cloud and on-premise setups 1878 | - **source database remains operational** 1879 | - can pre-create target tables manually, or use **AWS Schema Conversion Tool** (SCT) 1880 | - types of migration 1881 | - **homogeneous** (e.g. Oracle to Oracle) - don't need SCT 1882 | - **heterogeneous** (e.g. 
SQL Server to Aurora) - need SCT 1883 | ![DMS source target](pic/DMS_source_target.png) 1884 | 1885 | ### EMR: Elastic MapReduce 1886 | - cloud big data platform 1887 | - central component is the **cluster**, a collection of EC2 instances 1888 | - each instance is a node; each node has a role, referred to as the node type 1889 | - EMR installs different software components on each node type 1890 | - master node: every cluster must have a master node; log data is stored on the master node by default 1891 | - core node: runs tasks and stores data 1892 | - task node (optional): runs tasks and does not store data 1893 | - security: periodically archive the log files on the master node to S3 (in case the master node is terminated) 1894 | - log archiving must be configured when setting up the cluster; it cannot be enabled after setup 1895 | - log analysis can be automatically provided by Amazon EMR 1896 | 1897 | 1898 | 1899 | 1900 | 1901 | ## Service and Applications 1902 | 1903 | ### CloudFormation 1904 | - a way of scripting your cloud environment 1905 | - template is stored as a text file whose format complies with the JavaScript Object Notation (**JSON**) or **YAML** standard 1906 | - uses templates and stacks to **provision resources** 1907 | - major sections: 1908 | - Format Version 1909 | - Description 1910 | - Metadata 1911 | - Parameters 1912 | - Mappings 1913 | - Conditions 1914 | - Transform 1915 | - Resources (required) 1916 | - Outputs 1917 | - Quick Start has a collection of CloudFormation templates (e.g.
set up a SharePoint server) 1918 | - more detailed and powerful than Elastic Beanstalk, but takes time to learn and configure 1919 | - promotes resilience as you can relaunch your instance from a template 1920 | - when deploying across multiple regions, use mappings to specify the base AMI 1921 | - AMI IDs differ across regions 1922 | - AMI IDs are difficult for users to specify, so using parameters to control/modify AMI IDs is not practical in this case 1923 | - CloudFormation **Drift Detection** 1924 | - detects changes made to AWS resources outside the CloudFormation templates 1925 | - only checks property values that are explicitly set by stack templates (does not determine drift for property values left at their defaults; but you can explicitly set these values in the template, even to the same value as the default) 1926 | - can associate the `CreationPolicy` attribute with a resource to prevent its status from reaching create complete until AWS CloudFormation receives a specified number of success signals or the timeout period is exceeded 1927 | - To signal a resource, you can use the `cfn-signal` helper script or SignalResource API 1928 | 1929 | ### CloudFormation vs.
Elastic Beanstalk 1930 | 1931 | - Elastic Beanstalk provides a **platform** to deploy resources; CloudFormation is a **provisioning mechanism** for a broad range of AWS resources 1932 | - Elastic Beanstalk is a platform where you deploy the applications; CloudFormation is where you define a stack of resources 1933 | - Elastic Beanstalk is good for relatively narrow use cases of PaaS applications; CloudFormation is good for broad use of defining infrastructure as code 1934 | 1935 | ### SQS 1936 | 1937 | - a web service to give you access to a message queue 1938 | - distributed queue system that enables web service applications to quickly and reliably queue messages 1939 | - **pull-based**, not push-based 1940 | - a message can be up to **256KB** in size (larger payloads are saved to S3) 1941 | - messages can be kept in the queue from 1 minute to **14** days; the default retention period is 4 days 1942 | - SQS guarantees that your messages will be processed at least once 1943 | - can **decouple** the components of an application so they run independently 1944 | - any component of a distributed application can store messages in a fail-safe queue 1945 | - a queue is a temporary repository for messages that are awaiting processing 1946 | - acts as a buffer between the component producing and saving data, and the component receiving the data for processing 1947 | - resolves issues that arise if the producer is faster than the consumer, or if the production is only intermittent 1948 | - 2 types of queues 1949 | - **standard queue** 1950 | - nearly-unlimited number of transactions per second 1951 | - guarantees a message is delivered at least once 1952 | - provides best-effort ordering, so messages are generally delivered in the same order they are sent 1953 | - messages may arrive out of order and/or be delivered more than once 1954 | - **FIFO queue** (first-in-first-out) 1955 | - order is strictly preserved 1956 | - a message is delivered only once and remains available until a consumer processes
it and deletes it 1957 | - duplicates are not introduced into the queue 1958 | - supports message groups that allow multiple ordered message groups within a single queue 1959 | - limited to 300 transactions per second (TPS) without batching 1960 | - queue **identifiers** 1961 | - standard and FIFO 1962 | - Message ID 1963 | - Receipt Handle 1964 | - FIFO additional 1965 | - Message Deduplication ID: The token used for deduplication of sent messages. If a message with a particular message deduplication ID is sent successfully, any messages sent with the same message deduplication ID are accepted successfully but aren't delivered during the 5-minute deduplication interval 1966 | - Message Group ID: The tag that specifies that a message belongs to a specific message group. Messages that belong to the same message group are always processed one by one, in a strict order relative to the message group 1967 | - Sequence Number: The large, non-consecutive number that Amazon SQS assigns to each message 1968 | - **visibility timeout** is the amount of time that the message is invisible in the SQS queue after a reader picks up that message 1969 | - maximum is 12 hours 1970 | - if the job is processed before the visibility timeout expires, the consumer deletes the message from the queue 1971 | - if the job is not processed within that time, the message will become visible again and could be delivered a second time - a potential cause of duplication (in exam questions) 1972 | - Amazon SQS **long polling** is a way to retrieve messages from your Amazon SQS queues 1973 | - short polling returns immediately (even if the queue is empty) 1974 | - long polling doesn't return a response until a message arrives in the message queue, or the long poll times out 1975 | - long polling is an efficient and cost-saving option if the queue tends to be empty 1976 | - **dead-letter** queue 1977 | - other queues can target this queue for messages that can't be processed (consumed) successfully 1978 | - isolate problematic messages to
determine why their processing doesn't succeed 1979 | - useful for debugging your application or messaging system 1980 | - **message-oriented API** 1981 | - need to implement your own application-level tracking, especially if you have multiple queues 1982 | 1983 | ### AWS OpsWorks 1984 | 1985 | - a configuration management service that provides managed instances of **Chef and Puppet** 1986 | - Chef and Puppet are **automation platforms** that allow you to use code to automate the configurations of your servers 1987 | - OpsWorks lets you use Chef and Puppet to automate how servers are configured, deployed, and managed across your Amazon EC2 instances or on-premises compute environments 1988 | - 3 offerings 1989 | - AWS OpsWorks for Chef Automate 1990 | - AWS OpsWorks for Puppet Enterprise 1991 | - AWS OpsWorks Stacks 1992 | - can model your application as a stack containing different layers 1993 | - best way to establish **stack-based** architecture (e.g. separate stacks for dev and prod) 1994 | 1995 | ### AWS Step Functions 1996 | 1997 | - **serverless** orchestration for modern applications 1998 | - manages a workflow by breaking it into multiple steps, adding flow logic, and tracking the inputs and outputs between the steps 1999 | 2000 | ### SWF: Simple Workflow Service 2001 | 2002 | - a web service to coordinate work across distributed application components 2003 | - enables applications for a wide range of use cases to be designed as a coordination of tasks 2004 | - **tasks** represent invocations of various processing steps in an application 2005 | - executable code, web service calls, human actions, scripts 2006 | - combines digital environment with **human actions** 2007 | - e.g.
place an order on Amazon -> pick up product from warehouse 2008 | - workflow executions can last up to 1 year 2009 | - **task-oriented API** 2010 | - ensures a task is assigned only once and is never duplicated 2011 | - keeps track of all the tasks and events in an application 2012 | - SWF actors 2013 | - Workflow Starters: an application to initiate a workflow 2014 | - Deciders: control the flow of activity tasks 2015 | - Activity Workers: carry out the activity tasks 2016 | 2017 | ### SNS: Simple Notification Service 2018 | - set up, operate and send notifications 2019 | - push cloud notifications 2020 | - to all operating systems 2021 | - via SMS 2022 | - by email 2023 | - to any HTTP endpoint 2024 | - instantaneous push-based delivery 2025 | - can group multiple recipients using topics 2026 | - a topic is an "access point" that lets recipients subscribe to copies of the same notification 2027 | - stored redundantly across multiple AZs 2028 | - simple APIs and easy integration with applications 2029 | - inexpensive, pay-as-you-go model with no upfront cost 2030 | - web-based AWS management console offers the simplicity of a point-and-click interface 2031 | - SNS vs. SQS: 2032 | - both are messaging services 2033 | - SNS: **push-based**; SQS: pull/poll-based 2034 | 2035 | ### Elastic Transcoder 2036 | - media transcoder in the cloud 2037 | - converts media files from their original source format into formats that play on target devices (e.g. smartphones, PCs, etc.)
2038 | - provides transcoding presets for popular output formats; don't need to guess which setting works best 2039 | - pay based on the minutes transcoded and the resolution 2040 | 2041 | 2042 | ### API Gateway 2043 | - fully managed service for developers to publish, maintain, monitor and secure APIs at any scale 2044 | - essentially a doorway to your applications 2045 | - can do the following 2046 | - expose **HTTPS endpoints (only)** to define a restful API 2047 | - serverlessly connect to services like Lambda 2048 | - send each API endpoint to a different target 2049 | - connect to CloudWatch to log all requests for monitoring 2050 | - maintain multiple versions of your API 2051 | - run efficiently with low cost 2052 | - scale effortlessly and automatically 2053 | - track and control usage by API key 2054 | - **throttle requests to prevent attacks** 2055 | - how to configure API gateway 2056 | - define an API (container) 2057 | - define resources and nested resources (URL paths) 2058 | - for each resource: 2059 | - select supported HTTP methods 2060 | - set security 2061 | - choose targets (e.g. 
EC2, Lambda, DynamoDB) 2062 | - set request and response transformations 2063 | - deploy API to a stage 2064 | - use API Gateway domain by default 2065 | - can use custom domain 2066 | - now supports AWS Certificate Manager: free SSL/TLS certs 2067 | - AWS Certificate Manager: encrypt traffic in transit, not at rest 2068 | - control access to your Amazon API Gateway API with IAM permissions 2069 | - To create, deploy, and manage an API in API Gateway, you must grant the API developer permissions to perform the required actions supported by the API management component of API Gateway 2070 | - To call a deployed API or to refresh the API caching, you must grant the API caller permissions to perform required IAM actions supported by the API execution component of API Gateway 2071 | - can enable API **caching** to cache your endpoint's response 2072 | - can reduce the number of calls made to your endpoint and improve latency of the requests 2073 | - when enabled, API Gateway caches responses from your endpoint for a specific time-to-live (TTL) period in seconds 2074 | - API Gateway then responds to the request by looking up the endpoint response from the cache instead of making a request to the endpoint 2075 | - same-origin policy 2076 | - a web browser permits scripts contained in a first web page to access data in a second web page, but only if both web pages have the same origin 2077 | - this is done to prevent cross-site scripting (XSS) attacks 2078 | - enforced by web browsers 2079 | - ignored by tools like Postman and curl 2080 | - **Cross-origin resource sharing (CORS)** is one way the server can relax the same-origin policy 2081 | - allows restricted resources (e.g. fonts) on a web page to be requested from another domain outside the first resource's domain 2082 | - if you see an error of "Origin policy cannot be read at the remote resource", it means you need to enable CORS on API Gateway 2083 | - CORS is enforced by the client (e.g.
browser) 2084 | 2085 | ### AWS X-Ray 2086 | 2087 | - trace and analyze user requests as they travel through your Amazon API Gateway APIs to the underlying services 2088 | - gives you an end-to-end view of an entire request, so you can analyze latencies in your APIs and their backend services 2089 | - examine Lambda (serverless) architecture 2090 | 2091 | ### Kinesis 2092 | 2093 | - streaming data 2094 | - data generated continuously by many data sources, typically sent as data records simultaneously and in small sizes 2095 | - e.g. purchases from online stores 2096 | - Kinesis is a platform to send your streaming data to 2097 | - 4 types of Kinesis 2098 | - **Kinesis Data Streams** 2099 | - storage of 24 hours by default, up to 7 days 2100 | - can **perform custom processing** of data 2101 | - consists of **shards** 2102 | - 5 transactions per second for reads 2103 | - data read rate up to 2 MB per second 2104 | - up to 1000 records per second for writes 2105 | - data write rate up to 1 MB per second (including partition keys) 2106 | - data capacity of stream is a function of the number of shards -> sum of the shards' capacity 2107 | - cannot upload data directly to Redshift (need additional applications; or use Kinesis Firehose instead) 2108 | - **Kinesis Video Streams** 2109 | - **Kinesis Firehose** 2110 | - **no data persistence**: need to process data immediately 2111 | - easiest way to **load streaming data into data stores and analytics tools** 2112 | - can use Amazon Kinesis Data Firehose in conjunction with Amazon Kinesis Data Streams if you need to implement real-time processing of streaming big data 2113 | - nearly real-time (slightly slower than Kinesis Data Streams) 2114 | - optional: Lambda function to analyze on the fly 2115 | - only supports S3, Redshift, Elasticsearch, HTTP endpoint (does NOT support DynamoDB) 2116 | - **Kinesis Analytics** 2117 | - works with both Streams and Firehose 2118 | - analyze data on the fly inside either Kinesis service
(instead of using a Lambda function) 2119 | - then saves data in downstream service. 2120 | - auto scaling 2121 | - can use AWS application auto scaling 2122 | - can also use Amazon Kinesis Scaling Utilities 2123 | 2124 | ### On-premise service 2125 | - Database migration service (DMS) 2126 | - allows you to move databases to and from AWS 2127 | - supports both homogeneous and heterogeneous migration 2128 | - Server Migration Service (SMS) 2129 | - incremental replication of your on-premise servers 2130 | - can be used as a backup tool, multi-site strategy, and a disaster recovery tool 2131 | - AWS Application Discovery Service 2132 | - helps enterprise to plan migration projects by gathering info about its on-premise data center 2133 | - how it works: 2134 | 1. install AWS Application Discovery Agentless Connector as a virtual appliance on VMware 2135 | 1. it then builds a server utilization map and dependency map of the on-premise environment 2136 | 1. the collected data is retained in encrypted format; can be exported as csv file to estimate total cost of ownership (TCO) 2137 | - this data is also available in AWS Migration Hub, where you can migrate the discovered servers and track migration progress 2138 | - VM Import/Export 2139 | - migrate existing applications into EC2 2140 | - can be used to create a DR strategy on AWS or use AWS as a second site 2141 | - can use it to export your AWS VMs to your on-premise data center 2142 | - download Amazon Linux 2 as an ISO 2143 | 2144 | 2145 | ### SAM: Serverless Application Model 2146 | - SAM: CloudFormation extension optimized for **deploying serverless applications** 2147 | - can define functions, APIs, tables 2148 | - supports anything CloudFormation supports 2149 | - can run serverless applications locally (in Docker) 2150 | - can package and deploy using CodeDeploy 2151 | ![SAM](pic/SAM.png) 2152 | 2153 | ### ECS: Elastic Container Service 2154 | 2155 | - a **container** is a package that contains an 
application, libraries, runtime and tools required to run it 2156 | - a container runs on a container engine like **Docker** 2157 | - kind of like a virtual machine, but different 2158 | - provides the isolation benefits of virtualization, but with less overhead and faster starts than VMs 2159 | - containerized applications are portable and offer a consistent environment 2160 | ![container](pic/Container.png) 2161 | - ECS is a managed container **orchestration** service 2162 | - can create clusters to manage fleets of container deployments 2163 | - can manage EC2 or Fargate instances 2164 | - Fargate: serverless container engine; eliminates the need to manage servers; each workload runs in its own kernel; works with both ECS and EKS 2165 | - choose EC2 for compliance requirements, broader customization, GPUs 2166 | - schedules containers for optimal placement 2167 | - define rules for CPU and memory requirement 2168 | - monitor resource utilization 2169 | - deploy, update, roll back containers 2170 | - integrates with VPC, security groups, EBS volumes, ELB 2171 | - supports CloudTrail and CloudWatch 2172 | - free service (you pay only for the underlying resources) 2173 | - ECS components 2174 | ![ECS](pic/ECS_components.png) 2175 | - **EKS: Elastic Kubernetes Service** 2176 | - provisions and scales the Kubernetes control plane, including the API servers and backend persistence layer, across multiple AWS availability zones for high availability and fault tolerance 2177 | - portable, extensible, and open-source platform for managing containerized workloads and services 2178 | - K8s is open-source software to deploy and manage containerized applications at scale 2179 | - same toolset on-premises and in cloud 2180 | - containers are grouped in pods 2181 | - supports both EC2 and Fargate 2182 | - use case: 2183 | - already using K8s 2184 | - best for cloud-agnostic and/or open-source platform 2185 | - want to migrate to AWS 2186 | - ECR: managed Docker container registry 2187 | - store, manage, and deploy images
2188 | - integrated with ECS and EKS 2189 | - works with on-premises deployments 2190 | - highly available 2191 | - integrated with IAM 2192 | - pay for storage and data transfer 2193 | - ECS security: task role vs. ECS instance role 2194 | ![ECS security](pic/ECS_security.png) 2195 | 2196 | ### AWS Polly 2197 | 2198 | - lexicon is specific to a region 2199 | - can use SSML tags to control the speech generated (e.g. add pause between designated words) 2200 | 2201 | ### AWS Glue 2202 | 2203 | - serverless data preparation service that makes it easy for data engineers, extract, transform, and load (ETL) developers, data analysts, and data scientists to extract, clean, enrich, normalize, and load data 2204 | - You can use AWS Glue to organize, cleanse, validate, and format data for storage in a data warehouse or data lake 2205 | - tracks data that has already been processed during a previous run of an ETL job by persisting state information from the job run. This persisted state information is called a **job bookmark** 2206 | - Enable: Causes the job to update the state after a run to keep track of previously processed data 2207 | - Disable: Job bookmarks are not used, and the job always processes the entire dataset 2208 | - Pause: Process incremental data since the last successful run or the data in the range identified by the following sub-options, without updating the state of last bookmark 2209 | 2210 | ### AWS Trusted Advisor 2211 | 2212 | - an online tool that provides you **real-time guidance to help you provision your resources following AWS best practices** 2213 | - inspects your AWS environment and makes recommendations for saving money, improving system performance and reliability, or closing security gaps 2214 | - list of checks in the following 5 [categories](https://aws.amazon.com/premiumsupport/technology/trusted-advisor/best-practice-checklist/) 2215 | - cost optimization 2216 | - security 2217 | - fault tolerance 2218 | - performance 2219 | - service 
limits 2220 | 2221 | ### AWS Inspector 2222 | 2223 | - an automated security assessment service that helps improve the **security** and **compliance** of applications deployed on AWS 2224 | - automatically assesses applications for exposure, vulnerabilities, and deviations from best practices 2225 | 2226 | ### CI/CD: Continuous Integration 2227 | 2228 | - **CodeCommit** 2229 | - a secure, highly scalable, managed source control service that hosts private Git repositories 2230 | - **CodeBuild** 2231 | - You just specify the location of your source code, choose your build settings, and CodeBuild will run build scripts for compiling, testing, and packaging your code 2232 | - AWS CodeBuild runs your builds in preconfigured build environments that contain the operating system, programming language runtime, and build tools (e.g., Apache Maven, Gradle, npm) required to complete the task 2233 | - **CodeDeploy** 2234 | - CodeDeploy is a deployment service that automates application deployments to Amazon EC2 instances, on-premises instances, serverless Lambda functions, or Amazon ECS services 2235 | - built with AWS SAM 2236 | - application can be tested locally by invoking the Lambda function and event sources locally. No need to use separate CodeDeploy resource for development and production 2237 | - 3 platforms: 2238 | - Lambda 2239 | - *Canary*: Traffic is shifted in two increments. You can choose from predefined canary options that specify the percentage of traffic shifted to your updated Lambda function version in the first increment and the interval, in minutes, before the remaining traffic is shifted in the second increment. 2240 | - *Linear*: Traffic is shifted in equal increments with an equal number of minutes between each increment. You can choose from predefined linear options that specify the percentage of traffic shifted in each increment and the number of minutes between each increment. 
2241 | - *All-at-once*: All traffic is shifted from the original Lambda function to the updated Lambda function version at once. 2242 | - ECS 2243 | - EC2/on-premise 2244 | - can use the following tools to monitor CodePipeline: 2245 | - CloudWatch events 2246 | - CloudTrail 2247 | - Console and CLI 2248 | - to automatically trigger a pipeline based on changes in an S3 bucket, use a CloudWatch Events rule and CloudTrail 2249 | - should disable periodic check so that event-based trigger is enabled 2250 | - should not use a webhook, as it is for triggering the pipeline when the source is a GitHub repository 2251 | - **CodePipeline** 2252 | - CodePipeline is a continuous delivery service that automates the building, testing, and deployment of your software into production 2253 | 2254 | ![CDCI](pic/CDCI.png) 2255 | 2256 | ## MISC 2257 | 2258 | - **RAID** (Redundant Array of Independent Disks) is just a data storage virtualization technology that combines multiple storage devices to achieve higher performance or data durability 2259 | - **Stateless** installation: the scalable components are disposable, and configuration is stored away from the disposable components 2260 | - **Blue-green** deployment: 2261 | - two servers are maintained: a "blue" server and a "green" server. At any given time, only one server is handling requests 2262 | - Changes are installed on the non-live server, which is then tested through the private network to verify the changes work as expected. Once verified, the non-live server is swapped with the live server, effectively making the deployed changes live 2263 | - Common **port** numbers 2264 | 2265 | ![port](pic/port.png) 2266 | 2267 | - CIDR networks: 2268 | - calculate the number of assignable IP addresses: subtract the prefix length from 32; raise 2 to that power and subtract 2 2269 | - e.g.
for `151.0.0.0/27`: 32-27=5; 2^5=32; 32-2=30 2270 | - Two adjacent /25 networks equal a /24 network (and so on) 2271 | - `192.0.2.240/29` and `192.0.2.248/29` can form a supernet `192.0.2.240/28` 2272 | - `192.0.2.240/29` and `192.0.2.8/29` can NOT form this supernet 2273 | 2274 | 2275 | -------------------------------------------------------------------------------- /pic/AD_compatibility.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/AD_compatibility.png -------------------------------------------------------------------------------- /pic/CDCI.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/CDCI.png -------------------------------------------------------------------------------- /pic/CMK_sym.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/CMK_sym.png -------------------------------------------------------------------------------- /pic/CMKs.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/CMKs.png -------------------------------------------------------------------------------- /pic/Cognito_pool.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/Cognito_pool.png -------------------------------------------------------------------------------- /pic/Container.png: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/Container.png -------------------------------------------------------------------------------- /pic/DMS_source_target.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/DMS_source_target.png -------------------------------------------------------------------------------- /pic/EBS_TYPES.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/EBS_TYPES.png -------------------------------------------------------------------------------- /pic/ECS_components.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/ECS_components.png -------------------------------------------------------------------------------- /pic/ECS_security.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/ECS_security.png -------------------------------------------------------------------------------- /pic/HA_bastion_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/HA_bastion_1.png -------------------------------------------------------------------------------- /pic/HA_bastion_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/HA_bastion_2.png 
-------------------------------------------------------------------------------- /pic/HA_example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/HA_example.png -------------------------------------------------------------------------------- /pic/NAT.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/NAT.png -------------------------------------------------------------------------------- /pic/NAT_comparison_J.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/NAT_comparison_J.png -------------------------------------------------------------------------------- /pic/SAM.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/SAM.png -------------------------------------------------------------------------------- /pic/Set_dx.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/Set_dx.png -------------------------------------------------------------------------------- /pic/VPC_pic.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/VPC_pic.png -------------------------------------------------------------------------------- /pic/WAF_CloudFront.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/WAF_CloudFront.png -------------------------------------------------------------------------------- /pic/bastion.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/bastion.png -------------------------------------------------------------------------------- /pic/cross_zone_lb.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/cross_zone_lb.png -------------------------------------------------------------------------------- /pic/dx.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/dx.png -------------------------------------------------------------------------------- /pic/elasticache_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/elasticache_2.png -------------------------------------------------------------------------------- /pic/endpoint.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/endpoint.png -------------------------------------------------------------------------------- /pic/port.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/port.png 
-------------------------------------------------------------------------------- /pic/private_link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/private_link.png -------------------------------------------------------------------------------- /pic/s3_tier.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/s3_tier.png -------------------------------------------------------------------------------- /pic/serverless.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/serverless.png -------------------------------------------------------------------------------- /pic/site-to-site-vpn.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/site-to-site-vpn.png -------------------------------------------------------------------------------- /pic/transit_gateway.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/MarkFu/AWS_SAA_study_material/97e24c8ccbb640b5bbe607646d476cfded5709cd/pic/transit_gateway.png --------------------------------------------------------------------------------