# AWS Certified Developer - Associate Study notes

These are my study notes for the AWS Certified Developer - Associate certification. I have already passed the Solutions Architect - Associate exam, so the notes might not cover a topic if I feel I already know it well enough.

Table of Contents
=================

* [AWS Services 10,000 foot overview](#AWS-Services)

* [IAM](#iam)

* [EC2](#EC2)

* [S3](#S3)

* [DynamoDB](#DynamoDB)

* [SQS](#Simple-Queue-Service-SQS)

* [SNS](#SNS)

* [SWF](#SWF)

* [Elastic Beanstalk](#Elastic-Beanstalk)

* [CloudFormation](#CloudFormation)

* [AWS Shared Responsibility](#AWS-Shared-Responsibility)

* [Route 53](#Route-53)

* [VPC](#VPC)

* [CloudFront](#CloudFront)

* [Lambda](#Lambda)

[Exam Blueprint](http://awstrainingandcertification.s3.amazonaws.com/production/AWS_certified_developer_associate_blueprint.pdf)

# AWS-Services

EC2 - Elastic Compute Cloud virtual machines

Lightsail - Provisioning service - Very hands off

Elastic Container Service - Running containers such as Docker at scale

Lambda - Serverless functions

Elastic Beanstalk - Easier route for developers to get up and running with their cloud

ElastiCache - Caches common queries in front of DB servers

S3 - Key-value object storage kept in buckets

EFS - NFS file system that can be mounted on multiple instances

Glacier - Archival storage

Snowball - Hardware appliance to transfer data between on-prem and AWS

Storage Gateway - Virtual appliances that live on-prem and replicate to AWS

RDS - MySQL, MSSQL, Aurora, PostgreSQL

DynamoDB - NoSQL

RedShift - Data warehousing

AWS
Migration Hub - Dashboard that lets you track your application migration

Application Discovery Service - Tracks your applications' dependencies

Database Migration Service - Migrate DBs to AWS

Server Migration Service

VPC - Virtual Private Cloud

CloudFront - Content Delivery Network (CDN) that caches content to make it available quicker to the end user.

Route53 - Amazon's DNS service

API Gateway - Creating APIs for your own services

Direct Connect - Dedicated network link between yourself and AWS

CodeStar - Project management for code. Collaboration tool

CodeCommit - Source control service

CodeBuild - Compiles and tests your code

CodeDeploy - Automates your application deployment

CodePipeline - Continuous delivery service

X-Ray - Used to debug your serverless application

Cloud9 - Online IDE

CloudWatch - Monitoring

CloudFormation - Infrastructure as Code

CloudTrail - API logging

Config - Monitor AWS account config

OpsWorks - Config management using Chef or Puppet

Service Catalog - Managing IT services approved for use

Systems Manager - Patch maintenance

Trusted Advisor - Gives advice on security, cost

Elastic Transcoder - Video transcoding. Sizing videos for various devices

Lex - Powers Alexa

Polly - Text to speech

Rekognition - Analyse images and video

Amazon Translate - Language translation

Amazon Transcribe - Automatic speech recognition

Athena - Run SQL queries against S3 buckets

Elastic Map Reduce - Managed software framework used to process large data sets in a distributed computing environment. Used for data analysis, web indexing, data warehousing, machine learning, financial analysis, scientific simulation etc.
EMR supports workloads based on Hadoop, Apache Spark, Presto and Apache HBase.

CloudSearch

ElasticSearch Service

Kinesis - Ingesting large amounts of data

Kinesis Video Streams - Ingesting lots of video streams

Data Pipeline - Moving data between AWS services

IAM - Identity and Access Management

Cognito - Mobile device authentication using federated accounts (Facebook etc.)

Guard Duty -

Inspector - Analyse instance security using an agent

Macie - Scans S3 buckets for personal ID numbers

Certificate Manager - Free SSL certs

CloudHSM - Hardware security module which stores keys

Directory Services - Connect AWS to on-site AD

WAF - Web application firewall (L7 firewall)

Shield - DDoS mitigation

Artifact - AWS compliance reports

SNS - Simple Notification Service

SQS - Simple Queue Service - A web service that gives you access to message queues that store messages waiting to be processed. With Amazon SQS, you can quickly build message queuing applications that can run on any computer. Amazon SQS can help you build a distributed application with decoupled components, working closely with the Amazon Elastic Compute Cloud (Amazon EC2) and other AWS infrastructure web services.

SWF - Simple Workflow Service

Simple Email Service - Sending emails to customers

WorkMail - Managed email and calendaring (comparable to Office365)

Workspaces - VDI

# IAM

[IAM FAQ](https://aws.amazon.com/iam/faqs/)

### Users

Users can be given access to the management console, programmatic access, or both. Access via the management console enables a password for the account. Enabling programmatic access creates an access key ID and secret access key.
This can be used to access t

### Groups

Groups allow you to apply policies to multiple users. It is recommended to apply policies to groups even if the group has only one user.

* Users can be members of multiple groups
* Groups cannot be nested

### Policies

Policies are JSON documents that contain permissions to AWS services. ie

### Roles

### Secret

## Security Token Service (STS)

* Grants users limited and temporary access to AWS resources.
* 3 sources:
    1. Federation (often Active Directory)
        * Uses SAML
        * SSO allows users to log in to the AWS Console without assigning IAM credentials
    2. Federation with mobile app
        * Use Facebook/Amazon/Google or another OpenID provider
    3. Cross account access
        * Lets users from one AWS account access resources in another

### Active Directory Federation

# EC2

[EC2 FAQ](https://aws.amazon.com/ec2/faqs/)

Access instance metadata at http://169.254.169.254/latest/meta-data/

* Scripts can be run from the user data section when creating an instance

## Load Balancers

* There are 3 types of load balancer: Application, Network and Classic. Application routes HTTP/HTTPS (L7) traffic. Network routes TCP (L4) traffic. Classic is the previous generation and can route both L4 and L7 traffic.

1. Application - TLS termination
2. Network - Extreme performance and static IP
3. Classic (also referred to as Elastic Load Balancer, ELB) -

* Sticky sessions are a mechanism to route requests to the same target in a target group. This is useful for servers that maintain state information in order to provide a continuous experience to clients.

# S3

[S3 FAQ](https://aws.amazon.com/s3/faqs/)

### Storage classes

* Buckets can contain objects of different storage classes

1. S3 Standard
2.
   S3 Standard-Infrequent Access - for data that is accessed less frequently but requires rapid access when needed. Availability drops to 99.9% and there is a data retrieval charge of $0.01 / GB.
3. S3 One Zone-Infrequent Access - offers similar performance to other S3 classes but stores data redundantly within a single Availability Zone rather than across Availability Zones.
4. Glacier - used for archiving data.

* No limit on the number of objects in a bucket
* Largest object size is 5TB
* Smallest object size is 0 bytes
* Largest upload in a single PUT is 5GB. (Objects larger than 100MB should be uploaded with multipart upload)
* A bucket cannot contain a bucket

* Need to delete large amounts of

* S3 provides read-after-write consistency for PUTs of new objects.
* S3 offers eventual consistency for overwrite PUTs and DELETEs.

* If you expect more than 300 PUT/LIST/DELETE requests per second or more than 800 GET requests per second, raise a support request with AWS to prepare for the workload.

* Event notifications can be sent in response to actions such as PUTs, POSTs, COPYs or DELETEs. Messages can be sent through SNS, SQS or Lambda.

* CORS (Cross-Origin Resource Sharing) provides a way for client web applications loaded in one domain to interact with resources in a different domain.

### Encryption

* Server-Side
    1. KMS-Managed Encryption keys
    2. Amazon S3-Managed Encryption keys
    3. Customer-Provided Encryption keys

* Client-Side
    1. AWS KMS-managed customer master key
    2. Client-side master key

# DynamoDB

[DynamoDB FAQ](https://aws.amazon.com/dynamodb/faqs/)

NoSQL database

Stored on SSD

Spread across 3 geographically distinct data centres

Consistency models

1. Eventually consistent reads (default). Offers best read performance.
   Consistency across all copies of data is usually reached within a second.
2. Strongly consistent reads. Returns a result that reflects all writes that received a successful response prior to the read.

## Indexes

Two types of primary key are available:

* Single attribute - partition key (customer no, driver license etc)
* Composite - partition key & sort key (customer no & date range)

Local Secondary Index

* Same partition key but different sort key
* Can only be created when creating a table

Global Secondary Index

* Different partition key and different sort key
* Can be created at table creation or added later

## Streams

There are four options for streams, only one of which can be selected:

1. Keys only - only the key attributes of the modified item
2. New image - the entire item, as it appears after it was modified
3. Old image - the entire item, as it appeared before it was modified
4. New and old images - both the new and the old images of the item

* Max 24 hour storage
* Can have Lambda triggered from streams

* If a new item is added to the table, the stream captures an image of the entire item, including all attributes
* If an item is updated, the stream captures the before and after image of any attributes that were modified
* If an item is deleted, the stream captures an image of the item before deletion

## Query vs Scan

* Query - a query finds items in a table using only primary key attribute values.
* Scan - a scan operation examines every item in the table. By default a scan returns all of the data attributes for every item. You can use the ProjectionExpression parameter so that Scan only returns some of the attributes.
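The difference between the two operations shows up in the request parameters. The sketch below only builds the parameter dictionaries; the table name `Orders`, the key name `CustomerId` and the helper function names are hypothetical, chosen for illustration. It assumes the low-level DynamoDB API shape (`KeyConditionExpression`, `ExpressionAttributeValues`, `ProjectionExpression`).

```python
def build_query_params(table, key_name, key_value):
    """Parameters for a Query: only items with this partition key are read."""
    return {
        "TableName": table,
        "KeyConditionExpression": "#pk = :pk",
        # Placeholders keep attribute names and values out of the expression.
        "ExpressionAttributeNames": {"#pk": key_name},
        "ExpressionAttributeValues": {":pk": {"S": key_value}},
    }


def build_scan_params(table, attributes=None):
    """Parameters for a Scan: every item is examined.

    The projection only trims what is returned, not what is read.
    """
    params = {"TableName": table}
    if attributes:
        params["ProjectionExpression"] = ", ".join(attributes)
    return params


# Hypothetical table and attribute names.
query_params = build_query_params("Orders", "CustomerId", "C-1001")
scan_params = build_scan_params("Orders", ["CustomerId", "OrderDate"])
print(query_params)
print(scan_params)
```

Assuming boto3 is installed and credentials are configured, these dictionaries could be passed as `boto3.client("dynamodb").query(**query_params)` and `boto3.client("dynamodb").scan(**scan_params)`.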
* Query is more efficient than Scan.
* Use BatchGetItem to retrieve many items more efficiently than issuing individual reads.

### Provisioned Throughput

DynamoDB is priced on storage size and its 'Provisioned Throughput'. Provisioned throughput is made up of read capacity units and write capacity units.

* Reads are rounded up to the next 4KB. One read capacity unit gives 1 strongly consistent read per second, or 2 eventually consistent reads (default) per second.
* Writes are rounded up to the next 1KB. One write capacity unit gives 1 write per second.

Formula: (size of read rounded up to a 4KB chunk / 4KB) * no of items read per second = read throughput. Divide by 2 if eventually consistent.

For example, reading 10 items per second of 6KB each: 6KB rounds up to 8KB, 8KB / 4KB = 2, and 2 * 10 = 20 read capacity units for strongly consistent reads, or 10 for eventually consistent reads.

If you exceed your provisioned throughput you will get an HTTP status code 400, ProvisionedThroughputExceededException.

# Simple Queue Service SQS

[SQS FAQ](https://aws.amazon.com/sqs/faqs/)

[SQS tutorial](https://aws.amazon.com/getting-started/tutorials/send-messages-distributed-applications/)

SQS is a pull based messaging service.

Allows the 'decoupling' of components of an application.

FIFO queues are not supported in all regions. Currently only: US East (Ohio), US East (N. Virginia), US West (Oregon), and EU (Ireland) regions.

The maximum amount of time that a message can live in an SQS queue is 14 days. The retention period can be configured to be anywhere between 1 minute and 14 days. The default is 4 days. Once the message retention limit is reached, your messages are automatically deleted.

SQS messages must be between 1 and 256 KB in size. Billed in 64KB chunks.

SQS supports two types of pull based polling:

**Short polling** - SQS returns a response immediately, even if there is no message in the queue.

**Long polling** - doesn't return a response until a message arrives in the message queue, or the long poll times out.
Can be cheaper than short polling as it can reduce the number of empty receives.

In almost all cases, long polling is preferable to short polling. One case where you might want to use short polling is if your application uses a single thread to poll multiple queues.

When a consumer receives a message from the SQS queue, it stays in the SQS queue. The message must be deleted by the consumer once the message has been fully processed. To prevent other consumers from receiving the message, SQS sets a Visibility Timeout, which is the period of time during which SQS prevents other consuming components from receiving and processing the message.

First 1 million requests are free, then $0.50 for every million after.

# Simple Notification Service (SNS)

[SNS FAQ](https://aws.amazon.com/sns/faqs/)

[SNS tutorial](https://aws.amazon.com/getting-started/tutorials/filter-messages-published-to-topics/)

After a message has been published to a topic it can't be deleted (recalled).

SNS is a messaging service that 'pushes' messages to clients.

Message protocols:

* Application
* SMS text message
* Email
* Email-JSON
* AWS SQS
* HTTP
* HTTPS

SNS can be used with SQS to fan messages out to multiple queues.

SNS uses Topics to send messages. To receive messages published to a topic you have to subscribe. Once a message is published, SNS attempts to deliver it to every endpoint that is subscribed.

Messages can be customised by protocol type.

Messages are stored redundantly across multiple AZs.

# Simple Workflow Service (SWF)

[SWF FAQ](https://aws.amazon.com/swf/faqs/)

* Workers are programs that interact with SWF to get tasks, process received tasks and return the results.
* Decider is a program that controls the coordination of tasks.
* Tasks are assigned only once and never duplicated.

Domains - workflow and activity types and the workflow execution itself are all scoped to a domain. Domains isolate a set of types, executions, and task lists from others within the same account. You can register a domain using the console or the SWF API (using JSON).

## SWF vs SQS

* SWF presents a task oriented API whereas SQS offers a message oriented API.
* SWF ensures that a task is assigned only once. With SQS you need to handle duplicated messages and may also need to ensure that a message is processed only once.
* SWF keeps track of all tasks and events in an application. With SQS you need to implement your own application-level tracking.

# Elastic Beanstalk

[Elastic Beanstalk FAQ](https://aws.amazon.com/elasticbeanstalk/faqs/)

* Can have multiple versions of your applications (Dev/Test)
* Your applications can be split into tiers. Frontend/backend etc
* Able to update the application
* Can update your configuration, ie change the instance type behind the app
* Updates can be 1 instance at a time, a % of instances, or immutable

### Languages

* Apache Tomcat for Java
* Apache HTTP Server for PHP
* Apache HTTP Server for Python
* Nginx or Apache HTTP Server for Node.js
* Passenger or Puma for Ruby
* Microsoft IIS 7.5, 8.0 and 8.5 for .NET
* Java SE
* Docker
* Go

# CloudFormation

[CloudFormation FAQ](https://aws.amazon.com/cloudformation/faqs/)

A CloudFormation template is made up of the following sections:

* __Resources__ (required) - specifies the stack resources and their properties, such as an EC2 instance or an S3 bucket. You can refer to resources in the Resources and Outputs sections of the template.

* __Metadata__ (optional) - objects that provide additional information about the template.
* __Parameters__ (optional) - specifies values that you can pass in to your template at runtime (when you create or update a stack). You can refer to parameters in the Resources and Outputs sections of the template.

* __Mappings__ (optional) - a mapping of keys and associated values that you can use to specify conditional parameter values, similar to a lookup table. You can match a key to a corresponding value by using the Fn::FindInMap intrinsic function in the Resources and Outputs sections.

* __Conditions__ (optional) - defines conditions that control whether certain resources are created or whether certain resource properties are assigned a value during stack creation or update. For example, you could conditionally create a resource that depends on whether the stack is for a production or test environment.

* __Transform__ (optional) - for serverless applications (also referred to as Lambda-based applications), specifies the version of the AWS Serverless Application Model (AWS SAM) to use.

* __Outputs__ (optional) - describes the values that are returned whenever you view your stack's properties. For example, you can declare an output for an S3 bucket name and then call the `aws cloudformation describe-stacks` AWS CLI command to view the name.

The only required section in a CloudFormation template is the Resources section.

* Automatic rollback on error is enabled by default.
* Use the Fn::GetAtt function to output data
* Stacks can wait for applications to be provisioned using a WaitCondition

# AWS Shared Responsibility

IaaS - Customer manages OS and above including security and patches. AWS manages hypervisor and below including physical infrastructure.

SaaS - AWS manages everything except user credentials.

# Route-53

[Route53 FAQ](https://aws.amazon.com/route53/faqs/)

* Limit of 50 domain names - speak to AWS support to adjust.
* CNAME (Canonical Name) can be used to resolve one domain name to another
* A Record (Address Record) resolves a domain name to an IP address
* Alias records are an AWS / Route 53 specific term, similar to CNAME with the key distinction that CNAMEs can't be used on the zone apex (root domain), i.e. CNAMEs could be used against sofa.furniture.com, but not against furniture.com. For the apex you'd need to use either an A Record or an Alias record.

* In the exam always choose Alias over CNAME. Amazon don't charge to resolve Alias records and they can be used to map a naked domain (zone apex) to an ELB.

## Routing Policies

* Simple - default policy, used when there is only one resource.
* Weighted - send a specified proportion of traffic to certain resources, e.g. send 70% of requests to us-east-1 and 30% to eu-west-1.
* Latency - used to send traffic to the lowest latency region. This requires you to create an A record for each region you want the latency to be evaluated against.
* Failover - allows you to have an active/passive design, using health checks to assess whether to send traffic to the primary or secondary resource. A health check can use CloudWatch alarms, other health checks or simply a TCP connection to an IP or domain name.
* Geolocation - used to send traffic to a particular region based on source location, e.g. customers in a Eurozone country always get routed to a server with prices in Euros.
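For the weighted policy, each record's share of traffic is its weight divided by the sum of all weights for that record set. A minimal sketch of that arithmetic follows; the function name is hypothetical and the sketch does not model how Route 53 treats records whose weights are all zero.

```python
def traffic_share(weights):
    """Return each record's expected fraction of traffic: weight / sum of weights."""
    total = sum(weights.values())
    if total <= 0:
        # Zero-weight handling is deliberately left out of this sketch.
        raise ValueError("at least one weight must be positive")
    return {name: weight / total for name, weight in weights.items()}


# The 70/30 split from the Weighted bullet above.
shares = traffic_share({"us-east-1": 70, "eu-west-1": 30})
print(shares)
```

Weights do not need to add up to 100; `{"us-east-1": 3, "eu-west-1": 1}` would give a 75% / 25% split.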
# VPC

[VPC FAQ](https://aws.amazon.com/vpc/faqs/)

* By default all traffic between subnets is allowed

* /16 is the largest CIDR block available

* Subnets have a 1 to 1 mapping to an Availability Zone

* 1 Internet Gateway per VPC

* You cannot change the IP range of a VPC

* Elastic IP addresses (EIPs) are public IP addresses that

* Elastic Network Interface

# CloudFront

* Content Delivery Network (CDN). Provides content quicker to customers by caching it in edge locations. For example, a customer watches a video from S3. The S3 bucket is in Ireland but the user is in Sydney, so the first time the content flows Ireland -> Sydney. CloudFront caches it locally near Sydney, so the second time it's accessed the content flows CloudFront Sydney -> Sydney.

* Edge locations can be used for writes as well as reads.

* Objects are cached for the life of their TTL. TTL can be 0 seconds to 365 days. Default is 24 hours.

* Origin can be S3, EC2, ELB, Route53 and non AWS servers, ie on-prem

* Restrict viewer access by signed URLs or signed cookies

* Restrict content based on geo location (whitelist and blacklist)

# Lambda

* Compute service that allows you to run code without provisioning and managing servers. Under the hood are EC2 instances managed by AWS.

* Lambda is stateless and event driven.

* Increasing the memory setting also increases the CPU allocated to the function. Max memory limit is 3008 MB. Max execution timeout is 300 seconds.

* Temporary objects downloaded by Lambda are stored in the /tmp directory.

* Aliases can be used to manage different versions of a Lambda function. You can change the version behind an alias.

* Use AWS Lambda Environment Variables to pass operational parameters to your function.
* Lambda optimization tips: avoid recursion, keep the deployment package size to a minimum, install only the dependencies that are required, and keep your function logic outside the handler. (source: https://docs.aws.amazon.com/lambda/latest/dg/best-practices.html)
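Two of the points above, environment variables and keeping logic outside the handler, can be sketched together. This is a minimal illustration and not taken from the AWS documentation: the variable name `GREETING`, the function `build_message` and the event shape are all hypothetical.

```python
import os

# Read once when the execution environment starts, then reused by later
# invocations. GREETING is a hypothetical environment variable name.
GREETING = os.environ.get("GREETING", "Hello")


def build_message(name, greeting):
    """Core logic, free of Lambda specifics so it can be tested on its own."""
    return f"{greeting}, {name}!"


def handler(event, context):
    """Lambda entry point: unpack the event and delegate to the core logic."""
    name = event.get("name", "world")  # hypothetical event field
    return {"message": build_message(name, GREETING)}
```

Because `build_message` does not touch the event or context objects, it can be called directly in a unit test without deploying anything.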