├── combined.pdf ├── 022-architecture-solutions.md ├── 005-route53.md ├── 010-cloudfront.md ├── 015-serverless-solutions.md ├── 007-s3-basics.md ├── 008-s3-advanced.md ├── 006-architecture-use-cases.md ├── 009-s3-security.md ├── 001-ec2-fundamentals.md ├── 014-serverless.md ├── 002-instance-storage.md ├── 020-iam.md ├── 011-storage-extras.md ├── 013-containers.md ├── 012-sqs-sns-kinesis.md ├── 023-other-services.md └── 025-disaster-recovery-migrations.md /combined.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/tarasowski/aws-certified-solutions-architect-exam-saa-c003/HEAD/combined.pdf -------------------------------------------------------------------------------- /022-architecture-solutions.md: -------------------------------------------------------------------------------- 1 | # More Solutions 2 | 3 | **Event processing on Lambda** 4 | 5 | 1. **SQS to Lambda with Dead-Letter Queue (DLQ):** This architecture involves using Amazon Simple Queue Service (SQS) to decouple and buffer events. Messages are then polled from the SQS queue by a Lambda function. If the processing fails, messages are moved to a DLQ to prevent loss and enable debugging and reprocessing. 6 | 7 | 2. **SQS FIFO to Lambda with DLQ:** Similar to the first architecture, but it uses SQS FIFO (First-In-First-Out) queues for ordered message processing, ensuring the order of messages is preserved. Again, a DLQ is utilized for handling failed message processing. 8 | 9 | 3. **SNS to Lambda with Internal Retry and DLQ:** Events are published to an Amazon Simple Notification Service (SNS) topic. Subscribed Lambda functions process these events with internal retry mechanisms. If retries fail, messages are sent to a DLQ at the Lambda service level for further investigation and processing. 10 | 11 | 4. **Fan-Out Pattern with SNS and SQS:** This pattern involves using the SNS fan-out feature to publish messages to multiple SQS queues. Each queue can have its own processing logic, allowing for parallel and scalable event processing. 12 | 13 | **S3 Event Notifications:** 14 | 15 | This architecture utilizes Amazon S3 event notifications to trigger actions in response to object operations in S3 buckets. Notifications can be sent to Amazon SNS, SQS, Lambda functions, or Amazon EventBridge rules, enabling integration with a wide range of AWS services for further processing or automation. 16 | 17 | **Intercept API Calls:** 18 | 19 | API calls to Amazon DynamoDB are intercepted using AWS CloudTrail, which logs these events. The logs are then forwarded to Amazon EventBridge for processing and triggering additional actions or workflows. 20 | 21 | **Caching Strategies:** 22 | 23 | Amazon CloudFront is used as a content delivery network (CDN) with caching capabilities at edge locations. This helps improve latency and reduce load on backend servers. API Gateway routes requests to Lambda functions, which can utilize Redis or RDS for caching data and improving performance. 24 | 25 | **Blocking an IP Address:** 26 | 27 | Multiple layers of defense are employed to block IP addresses. Network ACLs (NACLs) at the subnet level and security groups at the instance level are used to control traffic. Web Application Firewall (WAF) can be installed on Application Load Balancer (ALB) or CloudFront to block malicious traffic based on custom rules or geographic restrictions. 
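As a rough illustration of these defense layers, the sketch below (Python/boto3) adds a deny rule to a network ACL and creates a WAF IP set that a web ACL rule on the ALB or CloudFront distribution could then reference. The NACL ID, blocked CIDR, and names are placeholders for illustration, not values from these notes.

```python
import boto3

BLOCKED_CIDR = "203.0.113.45/32"  # placeholder attacker IP

# Subnet layer: explicit deny entry in the network ACL (lower rule numbers are evaluated first).
ec2 = boto3.client("ec2")
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",  # hypothetical NACL ID
    RuleNumber=90,
    Protocol="-1",                          # all protocols
    RuleAction="deny",
    Egress=False,                           # inbound rule
    CidrBlock=BLOCKED_CIDR,
)

# Edge layer: an IP set for AWS WAF; a web ACL rule attached to the ALB or
# CloudFront distribution would block requests whose source IP matches this set.
wafv2 = boto3.client("wafv2")
wafv2.create_ip_set(
    Name="blocked-ips",
    Scope="REGIONAL",                       # use "CLOUDFRONT" (in us-east-1) for a CloudFront web ACL
    IPAddressVersion="IPV4",
    Addresses=[BLOCKED_CIDR],
    Description="Addresses blocked at the WAF layer",
)
```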
28 | 29 | **High Performance Computing (HPC):** 30 | 31 | AWS offers various services and features optimized for high-performance computing workloads. This includes EC2 instances with CPU or GPU optimizations, enhanced networking options like Elastic Fabric Adapter (EFA), and storage solutions such as Amazon EBS, instance store, Amazon S3, Amazon EFS, and Amazon FSx. AWS Batch and AWS ParallelCluster are used for multi-node parallel jobs and cluster management. 32 | 33 | **Highly Available EC2 Instance:** 34 | 35 | To ensure high availability of EC2 instances, Elastic IP addresses can be attached to instances for static addressing. Standby instances and Auto Scaling groups with CloudWatch alarms and Lambda functions can be configured for failover. Auto Scaling groups ensure availability across multiple Availability Zones (AZs) while maintaining the desired number of instances. 36 | -------------------------------------------------------------------------------- /005-route53.md: -------------------------------------------------------------------------------- 1 | # Route 53 2 | 3 | ## What is DNS? 4 | - DNS translates human-friendly hostnames into machine IP addresses. 5 | - DNS uses a hierarchical naming structure. 6 | - ".com" is a top-level domain. 7 | - "example.com" is a domain. 8 | - "www.example.com" is a subdomain. 9 | - DNS records include: A, AAAA, CNAME, NS. 10 | 11 | ## Route 53 12 | - Scalable, highly available managed authoritative DNS. 13 | - Authoritative means the customer can update the DNS records. 14 | - Route 53 also functions as a domain registrar. 15 | - It's the only AWS service with a 100% availability SLA. 16 | - The reference to "53" is from the traditional DNS port. 17 | 18 | ## Route 53 Records 19 | - Determines how to route traffic for a domain. 20 | - A record maps a hostname to an IPv4 address. 21 | - AAAA record maps a hostname to an IPv6 address. 22 | - CNAME record maps a hostname to another hostname. 23 | - CNAME targets must have an A or AAAA record. 24 | - A hostname uniquely identifies a device on a network. 25 | - Cannot create a CNAME record for the top node of the DNS namespace, e.g., "example.com," but possible for "www.example.com." 26 | - CNAME acts as an alias for "www.example.com" to "example.com." 27 | - NS records define name servers for the hosted zone. 28 | - CNAME stands for canonical name, indicating an alias pointing to the canonical name. 29 | 30 | ## Route 53 - Hosted Zones 31 | - A container for records defining traffic routing for a domain and its subdomains. 32 | - Public hosted zones route internet traffic, e.g., "app1.mypublicdomain.com." 33 | - Private hosted zones route traffic within one or more VPCs, e.g., "app1.company.internal." 34 | - Costs $0.50 per month per hosted zone. 35 | 36 | ## Route 53 TTL 37 | - Caches results for the TTL of the record to reduce DNS queries frequency. 38 | 39 | ## CNAME vs. Alias 40 | - CNAME points a hostname to another hostname but not for the root domain. 41 | - Alias points to an AWS resource, applicable for both root and non-root domains, and is free. 42 | 43 | ## Route 53 Alias Records Targets 44 | - Elastic Load Balancer 45 | - CloudFront Distributions 46 | - API Gateway 47 | - Elastic Beanstalk 48 | - S3 websites 49 | - VPC interface 50 | - Cannot set an alias record for an EC2 DNS name. 51 | 52 | ## Route 53 - Routing Policy 53 | - Responds to DNS queries but doesn't route traffic. 
54 | - Simple routing directs traffic to a single resource or randomly chooses among multiple values in the same record. 55 | - Weighted routing controls the percentage of requests to each resource, useful for load balancing. 56 | - Latency-based routing redirects to the resource with the least latency. 57 | - Failover routing requires association with a health check. 58 | - Geolocation routing routes based on user location. 59 | - Geoproximity routing routes based on the geographic location of users and resources, shifted by a configurable bias value. 60 | - IP-based routing routes based on the client's IP address. 61 | - Multi-value routing policy balances traffic among multiple resources, supporting up to 8 healthy records. 62 | 63 | ## Route 53 - Health Checks 64 | - HTTP health checks are for public resources. 65 | - Automatic DNS failover is possible. 66 | - Health checks generate CloudWatch metrics. 67 | - Health checks originate from worldwide locations. 68 | - Supports HTTP, HTTPS, or TCP protocols. 69 | - Only 2xx and 3xx status codes are considered passed. 70 | - Calculated health checks combine results from multiple checks. 71 | - For private hosted zones, health checkers sit outside the VPC and can't reach private endpoints (VPC or on-premises resources) directly; instead, create a CloudWatch metric and alarm, and have the health check monitor that alarm. 72 | -------------------------------------------------------------------------------- /010-cloudfront.md: -------------------------------------------------------------------------------- 1 | # Cloudfront & Global Accelerator 2 | 3 | # CloudFront 4 | 5 | CloudFront is a Content Delivery Network (CDN) service provided by AWS that enhances read performance by caching content at edge locations distributed globally. Here's an overview of its features: 6 | 7 | - **CDN Service:** CloudFront acts as a global CDN, improving the delivery of web content to end users by caching it at edge locations closer to them. 8 | - **Global Presence:** With 216 points of presence (PoPs) worldwide, CloudFront ensures low-latency content delivery to users regardless of their geographic location. 9 | - **DDoS Protection:** CloudFront provides DDoS protection by leveraging AWS Shield and AWS Web Application Firewall (WAF), safeguarding your applications and content from malicious attacks. 10 | - **S3 Bucket as Origin:** CloudFront can use an S3 bucket as its origin, with Origin Access Identity (OAI) ensuring secure access to the bucket content. 11 | - **Ingress for Uploads:** CloudFront can also be used as an ingress point to upload files to S3, offering an efficient way to transfer data to AWS infrastructure. 12 | - **Custom Origin:** Besides S3, CloudFront supports custom origins such as Application Load Balancers (ALB), EC2 instances, S3 websites, or any HTTP backend, providing flexibility in content delivery. 13 | - **Caching:** Files are cached at edge locations based on a Time-to-Live (TTL) setting, typically for a day or as configured. 14 | 15 | # ALB as Origin 16 | 17 | When using an Application Load Balancer (ALB) as the origin for CloudFront, consider the following: 18 | 19 | - **Accessing HTTP Backends:** CloudFront can access any HTTP backend, including applications hosted behind ALBs. 20 | - **EC2 Instances Can Stay Private:** With an ALB as the origin, the EC2 instances behind it can remain private; only when EC2 instances are used directly as the origin must they be publicly accessible. 21 | - **Public Load Balancer:** The ALB must be configured to be publicly accessible to allow CloudFront to connect to it. 
22 | - **Edge Location Connectivity:** Ensure that the public IP addresses of CloudFront edge locations are allowed to connect to the ALB to facilitate content delivery. 23 | 24 | # GEO Restrictions 25 | 26 | CloudFront offers GEO restrictions to control access based on geographic location. Here's how it works: 27 | 28 | - **Country Restriction:** You can restrict access based on the country of the viewer, determined using a third-party IP list. 29 | - **Allow List:** Specify countries allowed to access content. 30 | - **Block List:** Specify countries restricted from accessing content. 31 | - **Copyright Protection:** GEO restrictions can be used to enforce copyright policies by limiting content access to authorized regions. 32 | 33 | By leveraging CloudFront's capabilities such as global caching, integration with ALB, and GEO restrictions, you can optimize content delivery, enhance security, and enforce access controls for your applications and content delivery workflows. 34 | 35 | 36 | # Price Classes 37 | 38 | Price Classes in CloudFront allow you to control the geographic regions where your content is distributed and affect the pricing for content delivery. Here's an overview: 39 | 40 | - **Edge Locations:** CloudFront's network spans globally, with edge locations strategically located around the world to improve content delivery performance. 41 | - **Pricing Variations:** Pricing may vary depending on the region where content is delivered from. 42 | - **Data Transfer Costs:** Typically, the more data transferred out from edge locations, the lower the unit cost. 43 | - **Price Class Options:** 44 | - **All:** This includes all CloudFront edge locations globally, ensuring maximum coverage but potentially higher costs. 45 | - **100:** Covers edge locations in major regions such as the USA and Europe, offering a balance between coverage and cost. 46 | - **200:** Extends coverage to additional regions beyond the major ones, providing broader global reach at potentially higher costs. 47 | 48 | You can optimize costs by selecting the appropriate price class based on your content delivery needs and target audience locations. 49 | 50 | # Cache Invalidation 51 | 52 | Cache Invalidation in CloudFront allows you to remove outdated or stale content from edge caches to ensure that users receive the most up-to-date content. Here are some key points: 53 | 54 | - **TTL-based Invalidation:** Content is typically invalidated based on Time-to-Live (TTL) settings. When the TTL expires, CloudFront checks the origin for updated content. 55 | - **Manual Invalidation:** You can initiate a manual cache invalidation to force CloudFront to fetch updated content from the origin. 56 | - **File Path Invalidation:** Specify specific file paths or patterns (e.g., `/path/*`) to invalidate content associated with those paths. 57 | - **Wildcard Invalidation:** Use wildcard characters (e.g., `*`) to invalidate all files in the distribution, forcing a refresh of the entire cache. 58 | 59 | By leveraging cache invalidation, you can ensure that users receive the latest content from your CloudFront distribution, maintaining a seamless and up-to-date user experience. 60 | 61 | 62 | # AWS Global Accelerator 63 | 64 | AWS Global Accelerator is a networking service that allows you to improve the performance and availability of your applications by directing user traffic to the nearest AWS edge location. 
Here's an overview of its features: 65 | 66 | - **Global Application Deployment:** Even if your application is deployed in only one AWS region, Global Accelerator leverages anycast IP addressing to direct users to the nearest edge location. 67 | - **Anycast IP:** Anycast IP addressing allows multiple servers to share the same IP address, and users are routed to the nearest server. This optimization helps reduce latency and improve application performance. 68 | - **AWS Internal Network:** Global Accelerator utilizes the AWS internal network to efficiently route traffic to your application, ensuring low-latency communication. 69 | - **Compatible Services:** It works seamlessly with Elastic IP addresses, EC2 instances, Application Load Balancers (ALBs), Network Load Balancers (NLBs), and can handle both public and private endpoints. 70 | - **Health Checks:** Global Accelerator performs health checks on your endpoints to ensure that traffic is only routed to healthy resources. 71 | - **Automatic DDoS Protection:** It provides automatic DDoS protection through AWS Shield, safeguarding your application from malicious attacks. 72 | 73 | # AWS Global Accelerator vs. CloudFront 74 | 75 | While both AWS Global Accelerator and CloudFront aim to improve application performance and availability, they serve different use cases: 76 | 77 | ### CloudFront: 78 | - **Content Caching:** CloudFront caches content at edge locations, serving cached content to users to reduce latency and improve scalability. 79 | - **Edge-based Delivery:** Content is served from the edge locations closest to users, enhancing the delivery speed of static and dynamic content. 80 | 81 | ### Global Accelerator: 82 | - **Direct-to-Origin Traffic:** Global Accelerator does not cache content; instead, it routes traffic directly to the backend servers without caching, improving performance for TCP or UDP-based applications. 83 | - **Non-HTTP Use Cases:** It is well-suited for use cases such as gaming (UDP), IoT (MQTT), or voice over IP (VoIP), where real-time data transmission and low-latency communication are critical. 84 | 85 | In summary, while CloudFront is optimized for content caching and edge-based delivery of HTTP content, Global Accelerator focuses on directing traffic to backend applications hosted in AWS regions, making it ideal for non-HTTP use cases and scenarios where direct-to-origin traffic routing is required. 86 | -------------------------------------------------------------------------------- /015-serverless-solutions.md: -------------------------------------------------------------------------------- 1 | # Serverless solutions diagrams 2 | 3 | # MyTodoList 4 | Title: Building a Scalable and Secure Serverless Todo List Application 5 | 6 | **Introduction:** 7 | In today's fast-paced digital world, the need for efficient and secure task management solutions is ever-growing. To address this demand, we propose the development of "MyTodoList," a serverless application leveraging AWS services to provide a seamless and robust task management experience. 8 | 9 | **Architecture Overview:** 10 | MyTodoList will expose a REST API over HTTPS, allowing users to interact with their todo lists. The application will be built using a serverless approach, utilizing AWS Lambda for compute, Amazon DynamoDB for database storage, Amazon S3 for file storage, and Amazon API Gateway to manage API endpoints. 
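To make the request path concrete, here is a minimal sketch of a Lambda handler sitting behind API Gateway and writing to DynamoDB. The table name, the `TODO_TABLE` environment variable, and the Cognito-authorizer claim lookup are assumptions for illustration, not details from the original diagram.

```python
import json
import os
import uuid

import boto3

# Hypothetical table name, injected via an environment variable (TODO_TABLE).
table = boto3.resource("dynamodb").Table(os.environ["TODO_TABLE"])

def handler(event, context):
    """Handle a POST /todos request proxied by API Gateway; store one todo item."""
    body = json.loads(event.get("body") or "{}")
    item = {
        # With a Cognito User Pool authorizer, the caller's identity arrives in the claims.
        "userId": event["requestContext"]["authorizer"]["claims"]["sub"],
        "todoId": str(uuid.uuid4()),
        "text": body.get("text", ""),
        "done": False,
    }
    table.put_item(Item=item)
    return {"statusCode": 201, "body": json.dumps(item)}
```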
11 | 12 | **Authentication:** 13 | To ensure secure access to the application, authentication will be handled through Amazon Cognito User Pools. Users will authenticate via Cognito, which will provide temporary access keys allowing access to the S3 bucket where user data is stored. 14 | 15 | **Data Flow:** 16 | 1. **Client Interaction:** Users interact with the application through a client-side interface. 17 | 2. **API Gateway:** Requests from clients are directed to API Gateway, which serves as the entry point to the application. 18 | 3. **AWS Lambda:** API Gateway triggers Lambda functions, which handle the business logic of the application. 19 | 4. **DynamoDB:** Lambda functions interact with DynamoDB to store and retrieve todo list data efficiently. 20 | 5. **Amazon S3:** User files, such as attachments or images, are stored securely in S3 buckets. 21 | 6. **Amazon Cognito:** Authentication and authorization are managed through Cognito User Pools, providing a secure authentication layer for the application. 22 | 23 | **Scalability and Performance Optimization:** 24 | - **DynamoDB Accelerator (DAX):** To enhance read throughput and reduce latency, DAX can be implemented as a caching layer for DynamoDB. This will help optimize the performance of read-heavy operations, resulting in a smoother user experience while also reducing the costs associated with DynamoDB provisioned throughput. 25 | - **API Gateway Caching:** To further improve performance and reduce the load on backend resources, responses from API Gateway can be cached. This will allow frequently accessed data to be served quickly without invoking Lambda functions or accessing DynamoDB, thus improving overall response times. 26 | 27 | **Conclusion:** 28 | MyTodoList provides a scalable, secure, and efficient solution for managing todo lists. By leveraging serverless architecture and AWS services such as Lambda, DynamoDB, S3, and Cognito, the application ensures reliability, scalability, and cost-effectiveness. With features like authentication via Cognito, direct interaction with S3, and performance optimizations like DAX and API Gateway caching, MyTodoList delivers a seamless user experience while meeting the demands of modern task management applications. 29 | 30 | # MyBlog.com 31 | Title: Building a Globally Scalable and Secure Blog Platform on AWS 32 | 33 | **Introduction:** 34 | In the digital landscape, creating a blog platform that not only scales globally but also ensures security and reliability is paramount. Enter "MyBlog.com," a dynamic platform hosted on AWS designed to deliver content efficiently while maintaining robust security measures. 35 | 36 | **Architecture Overview:** 37 | MyBlog.com utilizes AWS services to host static files, manage dynamic content, send personalized emails, and handle image uploads. The architecture leverages Amazon S3 for static content storage, Amazon CloudFront for global content delivery and caching, Amazon DynamoDB for dynamic data storage, AWS Lambda for serverless compute, and Amazon SES for email delivery. 38 | 39 | **Scalability and Global Reach:** 40 | - **Static Content Hosting:** All static files are hosted on Amazon S3, ensuring high availability and durability. 41 | - **Global Content Delivery:** CloudFront, a global content delivery network (CDN), is utilized to distribute content worldwide with low latency. CloudFront caches content from S3 and ensures rapid delivery to users across the globe. 
42 | - **Caching:** By enabling caching at both the CloudFront and S3 levels, MyBlog.com optimizes content delivery and reduces latency for users accessing the platform from different regions. 43 | 44 | **Security Measures:** 45 | - **Access Control:** S3 bucket policies are configured to allow access only from CloudFront, ensuring that static content is not publicly accessible. 46 | - **Cross-Origin Resource Sharing (CORS):** CORS headers are added to allow secure cross-origin requests, enabling MyBlog.com to securely serve content to users from different domains. 47 | 48 | **User Engagement:** 49 | - **Welcome Emails:** Upon user registration, a welcome email is sent automatically. This is achieved by triggering a Lambda function via DynamoDB Streams, which then utilizes SES to send personalized welcome emails to new users. 50 | 51 | **Dynamic Content Management:** 52 | - **API Gateway and Lambda:** For dynamic content management, API Gateway is utilized to expose public APIs. These APIs trigger Lambda functions, which interact with DynamoDB to retrieve and update blog content. To enhance read performance, DynamoDB Accelerator (DAX) can be incorporated as a caching layer, providing faster access to frequently accessed data. 53 | 54 | **Image Management:** 55 | - **Image Uploads:** Images can be uploaded either directly to S3 or via CloudFront global distribution with Transfer Acceleration for faster uploads. Upon image upload, a Lambda function is triggered to process and handle the image, ensuring seamless integration with the blog platform. 56 | 57 | **Conclusion:** 58 | MyBlog.com offers a scalable, secure, and globally accessible platform for content delivery and user engagement. By leveraging AWS services such as S3, CloudFront, DynamoDB, Lambda, and SES, the architecture ensures high performance, reliability, and security while meeting the demands of a modern blog platform. With features like global content delivery, automated email notifications, and efficient image management, MyBlog.com delivers an exceptional user experience while maintaining stringent security measures. 59 | 60 | # Microservice Architecture 61 | Title: Enhancing Software Update Distribution in a Microservice Architecture with CloudFront 62 | 63 | **Introduction:** 64 | In a microservice architecture, efficient software update distribution is critical for maintaining system integrity and functionality. By leveraging AWS CloudFront, a content delivery network (CDN), we can optimize software update distribution, improve scalability, and reduce costs without significant architectural changes. 65 | 66 | **Architecture Overview:** 67 | The microservice architecture consists of multiple independent services communicating via REST APIs, ensuring loose coupling and scalability. For software update distribution, an application running on EC2 instances periodically distributes updates to clients. By integrating CloudFront in front of the existing load balancers, we can enhance the distribution of static software update files across the network. 68 | 69 | **CloudFront Integration:** 70 | - **Architecture Integration:** CloudFront is seamlessly integrated into the existing architecture by placing it in front of the load balancers responsible for distributing software updates. 71 | - **Static File Caching:** CloudFront caches the static software update files at the edge locations, reducing latency and enhancing download speeds for clients worldwide. 
72 | - **Scalability and Cost Efficiency:** CloudFront's serverless nature ensures automatic scalability based on demand, alleviating the need for manual scaling of EC2 instances. This not only improves scalability but also significantly reduces costs associated with maintaining and scaling EC2 instances. 73 | - **Availability and Network Bandwidth:** By offloading software update distribution to CloudFront, availability is enhanced, and network bandwidth costs are minimized, as CloudFront efficiently delivers content from edge locations closer to the users. 74 | 75 | **Benefits:** 76 | - **Improved Scalability:** CloudFront's automatic scaling capabilities eliminate the need for manual scaling of EC2 instances, ensuring seamless distribution of software updates even during peak demand periods. 77 | - **Cost Savings:** By reducing reliance on EC2 instances for software update distribution, significant cost savings are achieved, both in terms of infrastructure and network bandwidth usage. 78 | - **Enhanced Availability:** CloudFront's global network of edge locations improves availability and reduces latency, ensuring faster and more reliable software update distribution to users worldwide. 79 | - **Simplified Management:** The integration of CloudFront requires minimal changes to the existing architecture, providing a straightforward and cost-effective solution for improving scalability and performance. 80 | 81 | **Conclusion:** 82 | Integrating AWS CloudFront into the microservice architecture for software update distribution offers numerous benefits, including improved scalability, cost savings, enhanced availability, and simplified management. By leveraging CloudFront's caching capabilities and global edge network, the distribution of static software update files becomes more efficient, scalable, and cost-effective, making it an ideal solution for optimizing software update distribution in microservice architectures. 83 | -------------------------------------------------------------------------------- /007-s3-basics.md: -------------------------------------------------------------------------------- 1 | # S3 2 | 3 | # S3 Basics 4 | 5 | Amazon S3 (Simple Storage Service) provides a highly scalable, durable, and secure storage solution, capable of growing to meet the needs of businesses of any size. Here are the key concepts and features: 6 | 7 | ### Key Features: 8 | - **Infinitely Scalable Storage:** S3 can grow as needed to store an unlimited amount of data. 9 | - **Backup and Storage:** Ideal for backing up data and storing large amounts of data. 10 | - **Disaster Recovery:** Allows for disaster recovery across different regions to ensure data availability and durability. 11 | - **Data Lakes:** Can be used to build data lakes for analytics and big data processing. 12 | 13 | ### Data Storage in S3: 14 | - **Buckets:** Data in S3 is stored in containers called buckets. Each bucket name must be globally unique. 15 | - **Objects:** Files stored in S3 are referred to as objects. Each object consists of the file data and metadata. 16 | - **Naming Conventions:** Bucket names must not contain uppercase letters or underscores. 17 | - **Keys:** Each object in S3 is identified by a unique key, which is essentially the full path to the object. The key is a combination of a prefix (similar to a directory path) and the object name. 18 | 19 | ### Object Details: 20 | - **Object Content:** The actual data stored in the object is known as the object value or body. 
21 | - **Maximum Size:** A single object can be up to 5 TB in size. A single PUT is limited to 5 GB, so objects larger than 5 GB must be uploaded with multi-part upload. 22 | - **Versioning:** If versioning is enabled on a bucket, each object has a version ID, allowing you to keep multiple versions of the same object. 23 | 24 | ### Access and Security: 25 | - **Presigned URLs:** When you want to share an object with someone, you can generate a presigned URL. This URL includes temporary credentials and is only valid for a limited period, ensuring that access is controlled and secure. 26 | 27 | ### Additional Considerations: 28 | - **No Directories:** While the key structure might suggest directories, S3 does not have a true directory hierarchy. The key's prefix can be used to simulate directories for organizational purposes. 29 | - **Multipart Upload:** Recommended for files larger than 100 MB and required for files larger than 5 GB, multipart upload lets you upload parts of a file in parallel, improving efficiency and resilience. 30 | 31 | Amazon S3 is a powerful tool for managing and storing data at scale, with features designed to ensure data durability, availability, and security. 32 | 33 | # Amazon S3 Security 34 | 35 | ### Types of Policies: 36 | - **User-based Policies:** Managed through AWS IAM (Identity and Access Management) policies. These policies define permissions for users or roles. 37 | - **Resource-based Policies:** Include bucket policies and object access control lists (ACLs). 38 | - **Bucket Policies:** Commonly used to control access to an entire bucket. 39 | - **Object ACLs:** Fine-grained control over individual objects. 40 | 41 | ### Key Concepts: 42 | - **Principal:** The AWS account or user to which the policy is applied. 43 | - **Cross-Account Access:** To allow another AWS account to access your bucket, you must create a bucket policy that grants the necessary permissions. 44 | - **Public Access Settings:** Even if a bucket policy allows public access, the "Block all public access" setting in the S3 console must be disabled for the policy to take effect. 45 | - **Policy Scope:** A policy whose resource is `arn:aws:s3:::bucket-name/*` applies to all objects within the specified bucket (S3 ARNs do not include a region). 46 | 47 | # S3 - Static Website Hosting 48 | 49 | - To host a static website on S3, the bucket must be made public, and the "Block all public access" setting must be disabled. 50 | 51 | # S3 - Versioning 52 | 53 | ### How Versioning Works: 54 | - **Bucket-Level Setting:** Versioning is enabled at the bucket level. 55 | - **Version Creation:** Uploading the same key multiple times creates new versions (e.g., version 1, 2, 3). 56 | - **Delete Marker:** Deleting a file adds a delete marker rather than removing the file. 57 | - **Easy Rollback:** Allows rolling back to previous versions of a file. 58 | - **Version Null:** Files uploaded before versioning was enabled have a version ID of null. 59 | - **Restoring and Permanent Deletion:** To roll back a delete, remove the delete marker; the previous version of the object becomes current again. Deleting a specific version ID permanently removes that version from the bucket. 60 | 61 | Amazon S3 provides robust security features, flexible website hosting options, and advanced versioning capabilities to manage and protect your data effectively. 62 | 63 | # S3 - Replication 64 | 65 | ### Types of Replication: 66 | - **CRR (Cross-Region Replication):** Replicates objects across different AWS regions. 67 | - **SRR (Same-Region Replication):** Replicates objects within the same AWS region. 
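The requirements and rules below spell out the preconditions; as a hedged sketch (bucket names and the IAM role ARN are placeholders), a cross-region replication rule could be configured like this once versioning is enabled on both buckets:

```python
import boto3

s3 = boto3.client("s3")

# Versioning must already be enabled on BOTH buckets, and the role must allow
# S3 to read from the source bucket and replicate into the destination bucket.
s3.put_bucket_replication(
    Bucket="source-bucket",
    ReplicationConfiguration={
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",  # placeholder
        "Rules": [
            {
                "ID": "replicate-everything",
                "Status": "Enabled",
                "Priority": 1,
                "Filter": {"Prefix": ""},                           # empty prefix = whole bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},   # default: delete markers not replicated
                "Destination": {"Bucket": "arn:aws:s3:::destination-bucket"},
            }
        ],
    },
)
```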
68 | 69 | ### Key Requirements: 70 | - **Enable Versioning:** Versioning must be enabled on both the source and target buckets. 71 | - **Asynchronous Replication:** Replication occurs asynchronously, meaning there may be a delay before the replicated objects appear in the target bucket. 72 | - **IAM Permissions:** Proper IAM permissions must be granted to S3 to perform replication. 73 | 74 | ### Replication Details: 75 | - **New Objects:** Only new objects added to the source bucket are replicated automatically. 76 | - **Existing Objects:** To replicate existing objects, use the S3 Batch Replication feature. 77 | - **Delete Operations:** 78 | - Delete markers can be replicated from the source to the target bucket if configured. 79 | - By default, delete markers are not replicated, but this can be activated if needed. 80 | - Original objects that are deleted in the source bucket are not replicated to the target bucket. 81 | 82 | ### Replication Rules: 83 | - **No Chaining:** Replication does not chain. For example, if Bucket 1 replicates to Bucket 2, and Bucket 2 replicates to Bucket 3, changes in Bucket 1 will not automatically propagate to Bucket 3. 84 | - **Demo Replication Rules:** Create replication rules to specify how replication should occur, including which objects to replicate and where to replicate them. 85 | 86 | ### Versioning and Replication: 87 | - **Version IDs:** The version IDs of objects are replicated along with the objects. 88 | - **Delete Markers:** By default, delete markers are not replicated, but this can be configured. 89 | 90 | ### Additional Notes: 91 | - **Replication Activation:** Replication functionality only works if versioning is enabled on the buckets involved. 92 | 93 | Amazon S3 replication provides a robust solution for copying objects within or across regions, ensuring data availability, redundancy, and compliance with regulatory requirements. 94 | 95 | # S3 Storage Classes 96 | 97 | Amazon S3 offers a variety of storage classes designed to accommodate different use cases, access patterns, and cost requirements. Below are the main storage classes available in S3: 98 | 99 | ### Storage Classes: 100 | 101 | 1. **Standard General Purpose:** 102 | - **Description:** For frequently accessed data. 103 | - **Features:** Low latency, high throughput, can sustain 2 concurrent facility failures. 104 | - **Use Cases:** Big data analytics, mobile gaming, content distribution. 105 | - **Durability:** 99.999999999% (11 nines). 106 | - **Availability:** 99.99%. 107 | 108 | 2. **Standard Infrequent Access (IA):** 109 | - **Description:** For infrequently accessed data that still requires rapid access. 110 | - **Features:** Lower cost than Standard, but with slightly higher retrieval costs. 111 | - **Use Cases:** Disaster recovery, backups. 112 | - **Durability:** 99.999999999% (11 nines). 113 | - **Availability:** 99.9%. 114 | 115 | 3. **One-Zone Infrequent Access:** 116 | - **Description:** High durability within a single Availability Zone. 117 | - **Features:** Lower cost, but data is lost if the AZ is destroyed. 118 | - **Use Cases:** Secondary backups of on-premise data, data that can be easily recreated. 119 | - **Durability:** 99.999999999% (11 nines) in a single AZ. 120 | - **Availability:** 99.5%. 121 | 122 | 4. **Glacier Instant Retrieval:** 123 | - **Description:** For archived data that requires milliseconds retrieval time. 124 | - **Features:** Great for data accessed once a quarter, minimum storage duration of 90 days. 
125 | - **Use Cases:** Archiving data with occasional access. 126 | - **Durability:** 99.999999999% (11 nines). 127 | 128 | 5. **Glacier Flexible Retrieval:** 129 | - **Description:** For long-term archive with flexible retrieval options. 130 | - **Features:** 131 | - Expedited retrieval: 1 to 5 minutes. 132 | - Standard retrieval: 3 to 5 hours. 133 | - Bulk retrieval: 5 to 12 hours. 134 | - Minimum storage duration: 90 days. 135 | - **Use Cases:** Archiving data with less frequent access needs. 136 | - **Durability:** 99.999999999% (11 nines). 137 | 138 | 6. **Glacier Deep Archive:** 139 | - **Description:** Lowest-cost storage class for archiving data that rarely needs to be accessed. 140 | - **Features:** 141 | - Standard retrieval: 12 hours. 142 | - Bulk retrieval: 48 hours. 143 | - Minimum storage duration: 180 days. 144 | - **Use Cases:** Long-term data archiving. 145 | - **Durability:** 99.999999999% (11 nines). 146 | 147 | 7. **Intelligent Tiering:** 148 | - **Description:** Automatically moves objects between different access tiers based on changing access patterns. 149 | - **Features:** No retrieval charges in S3 Intelligent-Tiering. 150 | - **Frequent Access Tier:** Default tier for frequently accessed data. 151 | - **Infrequent Access Tier:** For data not accessed for 30 days. 152 | - **Archive Instant Access Tier:** For data not accessed for 90 days. 153 | - **Archive Access Tier:** Optional tier for data not accessed for 90-700+ days. 154 | - **Deep Archive Access Tier:** Optional tier for data not accessed for 180-700+ days. 155 | - **Use Cases:** Data with unpredictable access patterns. 156 | 157 | ### Moving Objects Between Storage Classes: 158 | 159 | - **Lifecycle Rules:** You can set up lifecycle rules to automatically transition objects between different storage classes based on specified conditions. 160 | - **Manual Transfer:** Objects can be manually moved between storage classes when created or using the S3 lifecycle configuration. 161 | 162 | ### Definitions: 163 | 164 | - **Durability:** Measures how reliably data can be stored, typically resulting in an average loss of an object once in 10,000 years. 165 | - **Availability:** Measures how readily available a service is for use. 166 | 167 | Amazon S3's variety of storage classes and flexible management options ensure that you can optimize your storage costs and performance according to your specific needs. 168 | -------------------------------------------------------------------------------- /008-s3-advanced.md: -------------------------------------------------------------------------------- 1 | # S3 Advanced 2 | 3 | ## S3 Lifecycle Rules 4 | 5 | S3 Lifecycle rules allow you to automate the management of your objects to optimize storage costs. Here are the key components and functionalities: 6 | 7 | ### Transition Actions: 8 | - **Description:** Automatically move objects to a different storage class after a specified number of days since creation. 9 | - **Example:** Move objects to a cheaper storage class (e.g., from Standard to Standard-IA) 60 days after creation. 10 | 11 | ### Expiration Actions: 12 | - **Description:** Automatically delete objects after they have been stored for a specified period. 13 | - **Example:** Configure objects to be deleted 365 days after their creation date to manage storage costs and comply with data retention policies. 14 | 15 | ### Scope of Lifecycle Rules: 16 | - **Tags:** Apply lifecycle rules to objects with specific tags. 
17 | - **Buckets:** Apply lifecycle rules to all objects within a bucket. 18 | - **Object Names:** Apply lifecycle rules to objects with specific name patterns (prefixes). 19 | 20 | ### Versioning: 21 | - **Requirement:** To recover deleted or overwritten objects (and manage their noncurrent versions with lifecycle rules), versioning must be enabled on the bucket. 22 | - **Benefits:** Versioning allows you to retain and restore previous versions of objects, adding an additional layer of data protection. 23 | 24 | ### S3 Analytics: 25 | - **Description:** Provides insights and recommendations on how to optimize storage costs by analyzing access patterns and usage. 26 | - **Functionality:** Generates reports that help identify objects that are candidates for transitioning to more cost-effective storage classes. 27 | 28 | ## Summary 29 | 30 | Amazon S3 provides advanced features like Lifecycle rules and S3 Analytics to help manage storage costs and efficiency: 31 | 32 | - **Lifecycle Rules:** Automate transitions and deletions based on object age, tags, buckets, or object names. 33 | - **Versioning:** Essential for retaining noncurrent versions so that deleted or overwritten objects can be recovered. 34 | - **S3 Analytics:** Offers reports and recommendations to optimize storage costs by suggesting transitions based on usage patterns. 35 | 36 | These advanced features enable efficient and cost-effective storage management in S3. 37 | 38 | # S3 Requester Pays 39 | 40 | In traditional S3 setups, the bucket owner bears all the costs associated with storage and operations within the bucket. However, with the introduction of Requester Pays, the dynamics change: 41 | 42 | ### Key Points: 43 | 44 | - **Cost Responsibility:** With Requester Pays, the individual or entity making the data download request pays for the associated data transfer costs, such as data egress charges. 45 | 46 | - **Bucket Owner Responsibilities:** The bucket owner continues to cover all storage costs and other charges associated with the bucket itself. 47 | 48 | - **Authentication Requirement:** To initiate a Requester Pays download, the requester must be authenticated with AWS, ensuring accountability and security. 49 | 50 | Requester Pays is particularly useful in scenarios where data access is initiated by parties external to the bucket owner, such as sharing data with external collaborators or providing access to publicly available datasets. It helps distribute the cost burden more equitably among users accessing the data. 51 | 52 | # S3 Event Notifications 53 | 54 | S3 Event Notifications enable you to automate workflows and trigger actions in response to specific events that occur within your S3 bucket. Here are the key components and functionalities: 55 | 56 | ### Event Types: 57 | - **`s3:ObjectCreated:*`:** Triggered when a new object is created in the bucket. 58 | - **Filtering:** You can filter events based on criteria such as file extensions (*.jpg) to only trigger actions for specific types of objects. 59 | 60 | ### Use Cases: 61 | - **Thumbnail Generation:** Automatically generate thumbnails of images upon upload to S3. 62 | - **Workflow Automation:** Trigger downstream processes such as data processing, analysis, or archival based on object creation events. 63 | 64 | ### Delivery Speed: 65 | - **Real-time Delivery:** Events are typically delivered within seconds of the triggering action, ensuring timely responsiveness to changes in the S3 bucket. 
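As a sketch of wiring such a notification (bucket name, Lambda ARN, and suffix filter are hypothetical), the configuration below triggers a thumbnail function for every uploaded .jpg; the permission side of this is covered in the resource access policy notes that follow.

```python
import boto3

s3 = boto3.client("s3")

# The Lambda function's resource policy must already allow s3.amazonaws.com to
# invoke it for events from this bucket (see the access-policy notes below).
s3.put_bucket_notification_configuration(
    Bucket="my-upload-bucket",
    NotificationConfiguration={
        "LambdaFunctionConfigurations": [
            {
                "Id": "thumbnail-on-jpg-upload",
                "LambdaFunctionArn": "arn:aws:lambda:eu-central-1:111122223333:function:make-thumbnail",
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {"Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}},
            }
        ]
    },
)
```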
66 | 67 | ### Resource Access Policy: 68 | - **SNS Resource Access Policy:** To send event notifications to other AWS services like SNS (Simple Notification Service), SQS (Simple Queue Service), or Lambda, you need to attach an appropriate resource access policy to the target service (the SNS topic, SQS queue, or Lambda function) so that S3 is allowed to publish to it. 69 | - **Access Policy vs. IAM Roles:** Instead of using IAM roles, S3 event notifications rely on access policies. You modify the access policy on the target (e.g., SNS, SQS, Lambda) to grant permissions for S3 to send events to those services. 70 | 71 | ### Integration Options: 72 | - **SNS, SQS, Lambda:** Common targets for S3 event notifications. You can configure S3 to send notifications directly to these services, triggering custom actions or workflows. 73 | - **EventBridge (formerly CloudWatch Events):** Alternatively, you can route all S3 events to EventBridge and set up rules within EventBridge to trigger actions across a wide range of AWS services, providing more centralized event management and routing capabilities. 74 | 75 | S3 Event Notifications offer a powerful mechanism for automating processes and reacting to changes within your S3 buckets, enabling seamless integration and workflow automation across your AWS environment. 76 | 77 | # S3 Performance 78 | 79 | Amazon S3 offers robust performance capabilities to handle a high volume of requests efficiently. Here are some key aspects of S3's performance: 80 | 81 | ### Scalability: 82 | - **High Request Rate:** S3 can handle a high number of requests, typically responding within 100-200 milliseconds. 83 | - **Concurrency:** Supports a large number of concurrent requests, allowing for high throughput operations. 84 | 85 | ### Request Limits: 86 | - **PUT Requests:** Up to 3500 PUT requests per second per prefix in a bucket. 87 | - **GET Requests:** Up to 5500 GET requests per second per prefix in a bucket. 88 | - *Note:* The prefix is the part of the object key between the bucket name and the object name. For example, for the key "folder1/sub1/file", the prefix is "folder1/sub1/", and the request limits above apply to each prefix independently. 89 | 90 | ### Best Practices for Large Files: 91 | - **Multi-part Upload:** Recommended for files larger than 100 MB and required for files larger than 5 GB. 92 | - *Benefits:* Improves reliability, efficiency, and speed of uploads for large files by breaking them into smaller parts. 93 | 94 | S3's performance capabilities make it suitable for a wide range of use cases, from small object storage to handling large-scale data transfers and storage needs. By leveraging multi-part upload and understanding request limits, you can optimize the performance of your S3 operations effectively. 95 | 96 | # S3 Transfer Acceleration 97 | 98 | S3 Transfer Acceleration enhances data transfer speed by leveraging AWS edge locations as intermediate points. Here's how it works: 99 | 100 | - **Increased Speed:** Files are initially transferred to the nearest AWS edge location, which then forwards the data to the designated S3 bucket in the target region. 101 | - **Multi-part Upload Compatibility:** S3 Transfer Acceleration is compatible with multi-part uploads, allowing for large file uploads to be accelerated as well. 102 | - **Edge Locations:** With over 200 edge locations globally, data can be quickly transferred to and from these points, significantly reducing latency and improving transfer speeds compared to traditional transfer methods. 
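A brief sketch of using Transfer Acceleration from the AWS SDK (bucket and file names are hypothetical): enable acceleration on the bucket once, then point the client at the accelerate endpoint.

```python
import boto3
from botocore.config import Config

s3 = boto3.client("s3")

# One-time: turn on Transfer Acceleration for the bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-large-uploads",
    AccelerateConfiguration={"Status": "Enabled"},
)

# Route requests through the accelerate endpoint; the high-level transfer API
# (which performs multi-part uploads for large files) benefits as well.
accelerated = boto3.client("s3", config=Config(s3={"use_accelerate_endpoint": True}))
accelerated.upload_file("dataset.tar.gz", "my-large-uploads", "ingest/dataset.tar.gz")
```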
103 | 104 | This feature is particularly useful for scenarios where data needs to be transferred quickly across regions or when dealing with large files that can benefit from accelerated upload speeds. 105 | 106 | # S3 Byte-Range Fetches 107 | 108 | S3 Byte-Range Fetches enable the parallelization of GET requests by requesting specific byte ranges of a file. Here are its key features: 109 | 110 | - **Improved Resilience:** By fetching data in parallel and requesting specific byte ranges, byte-range fetches provide better resilience in case of network failures or interruptions during the download process. 111 | - **Speed Optimization:** Byte-range fetches can be used to speed up downloads by fetching different parts of the file in parallel, maximizing bandwidth usage and reducing overall download time. 112 | - **Partial Data Retrieval:** Additionally, byte-range fetches allow for retrieving only specific portions of a file, such as the head or tail, rather than downloading the entire file. This can be useful for scenarios where only a subset of the data is required. 113 | 114 | By leveraging byte-range fetches, users can optimize download performance, increase resilience, and efficiently retrieve partial data from S3 objects, enhancing overall data access capabilities within the S3 ecosystem. 115 | 116 | # S3 Select & Glacier Select 117 | 118 | S3 Select and Glacier Select offer powerful capabilities for querying and filtering data stored in Amazon S3 and Glacier using SQL expressions. Here's what you need to know: 119 | 120 | - **SQL Queries:** These services allow you to perform server-side filtering using simple SQL statements, enabling you to filter data by rows and columns. 121 | - **Reduced Network Transfer:** By performing filtering on the server-side, only the selected data is transferred over the network, reducing both network transfer costs and client-side CPU usage. 122 | - **Efficiency:** Server-side processing reduces the need to transfer and process large volumes of data locally, leading to faster query execution times and improved efficiency. 123 | 124 | These features are particularly useful for applications that require querying large datasets stored in S3 or Glacier, allowing for efficient and cost-effective data retrieval and analysis. 125 | 126 | # S3 Batch Operations 127 | 128 | S3 Batch Operations provide a way to perform bulk operations on existing S3 objects, enabling various management and optimization tasks: 129 | 130 | - **Bulk Operations:** Perform actions such as encryption of unencrypted objects, modification of ACLs, restoration of objects from S3 Glacier, modification of object metadata, and invocation of Lambda functions for custom actions on each object. 131 | - **Efficiency:** S3 Batch Operations allow you to process large numbers of objects in parallel, improving efficiency and reducing the time required to perform bulk tasks. 132 | - **Integration with S3 Inventory and Select:** Utilize S3 Inventory to get a list of objects and S3 Select to filter objects before applying batch operations, enhancing flexibility and control over data management tasks. 133 | 134 | These capabilities streamline data management workflows and enable efficient batch processing of objects within Amazon S3, enhancing overall operational efficiency and scalability. 
135 | 136 | # S3 Storage Lens 137 | 138 | S3 Storage Lens provides comprehensive insights and analytics to understand, analyze, and optimize storage across your entire AWS organization: 139 | 140 | - **Default and Custom Dashboards:** Access default dashboards or create custom dashboards tailored to your specific needs and preferences. 141 | - **Anomaly Detection:** Discover anomalies and unusual patterns in your storage usage, enabling proactive management and optimization of resources. 142 | - **Metrics:** Storage Lens provides a wide range of metrics, including general insights about storage, storage bytes, object count, cost optimization metrics, data protection metrics, access management metrics, and event metrics. 143 | 144 | By leveraging S3 Storage Lens, organizations can gain valuable insights into their storage usage, optimize costs, improve data protection, and enhance overall storage management practices across their AWS environment. 145 | -------------------------------------------------------------------------------- /006-architecture-use-cases.md: -------------------------------------------------------------------------------- 1 | # Classic Solutions Architecture Decisions 2 | 3 | # WhatsTheTime.com Example (stateless) 4 | 5 | In the bustling world of online services, ensuring seamless user experiences while managing dynamic traffic loads is paramount. Enter WhatsTheTime.com, a hypothetical service aiming to provide accurate timekeeping information to users worldwide. To achieve this goal with optimal efficiency and reliability, WhatsTheTime.com adopts an AWS architecture, meticulously designed to handle the challenges of scalability, availability, and performance. 6 | 7 | At the heart of this architecture lies the use of Elastic IP addresses, offering a stable and consistent point of access for users. This choice ensures that regardless of fluctuations in underlying infrastructure, clients can reliably connect to the service. However, the AWS platform imposes a restriction of 5 Elastic IPs per region, prompting careful consideration of resource allocation and scalability planning from the outset. 8 | 9 | Recognizing the need to scale dynamically in response to fluctuating demand, the architects behind WhatsTheTime.com integrate a load balancer into the system. This load balancer serves as a gateway, intelligently distributing incoming traffic across a fleet of EC2 instances. By doing so, it not only optimizes resource utilization but also mitigates the risk of overloading individual instances, thus safeguarding against performance bottlenecks and downtime. 10 | 11 | Yet, the introduction of a load balancer alone is not sufficient to address the complexities of dynamic scaling. To truly embrace the elasticity offered by cloud computing, WhatsTheTime.com harnesses the power of AWS Auto Scaling. By configuring Auto Scaling groups behind the load balancer, the system gains the ability to autonomously adjust the number of EC2 instances based on predefined criteria, such as CPU utilization or network traffic. This capability empowers WhatsTheTime.com to gracefully handle sudden surges in user activity without manual intervention, ensuring a responsive and reliable experience for all users. 12 | 13 | But resilience goes beyond mere scalability; it encompasses the ability to withstand and recover from failures gracefully. To this end, the architects adopt a multi-AZ deployment strategy. 
Load balancers are architected to span multiple availability zones, thereby distributing traffic across geographically distinct data centers. Similarly, Auto Scaling groups are configured to launch instances across multiple availability zones, reducing the risk of service disruption in the event of a localized outage. This redundant infrastructure not only enhances fault tolerance but also instills confidence in users, knowing that WhatsTheTime.com is built to withstand unforeseen challenges. 14 | 15 | In conclusion, the architecture of WhatsTheTime.com on AWS exemplifies a harmonious blend of scalability, availability, and resilience. By leveraging Elastic IPs, load balancers, Auto Scaling, and multi-AZ deployments, the service achieves its mission of delivering accurate timekeeping information to users worldwide, all while adapting dynamically to evolving demands and ensuring uninterrupted access. In the ever-evolving landscape of online services, such architectural foresight proves indispensable, laying the foundation for a robust and reliable user experience. 16 | 17 | # MyClothes.com Example (stateful) 18 | 19 | In the realm of e-commerce, where user sessions are pivotal and data integrity is paramount, MyClothes.com emerges as a prime example of leveraging AWS architecture to seamlessly blend stateful functionality with scalability and security. Through a carefully orchestrated ensemble of services, MyClothes.com not only delivers a personalized shopping experience but also ensures robustness and resilience in the face of dynamic demand. 20 | 21 | At the core of MyClothes.com's architecture lies a commitment to multi-AZ deployments for both load balancing and Auto Scaling groups. This strategic choice not only distributes traffic across geographically dispersed data centers but also safeguards against single points of failure, bolstering the platform's availability and fault tolerance. 22 | 23 | To maintain session continuity and enhance user experience, MyClothes.com adopts Elastic Load Balancer (ELB) stickiness. By ensuring that each user request is directed to the same instance, session affinity is preserved, enabling seamless interactions and uninterrupted shopping sessions. This meticulous attention to detail underscores MyClothes.com's dedication to providing a cohesive and intuitive user experience. 24 | 25 | However, the challenges of stateful session management extend beyond mere load balancing. Recognizing the need to securely store and manage user session data, MyClothes.com transitions from traditional cookie-based approaches to a more robust solution. By utilizing Elasticache, an in-memory caching service, in conjunction with session IDs, MyClothes.com achieves a higher level of security and scalability. User session data is securely stored and retrieved from the Elasticache cluster, mitigating the risk of unauthorized access or tampering while optimizing performance and scalability. 26 | 27 | But MyClothes.com's commitment to data integrity extends beyond session management. By incorporating Amazon RDS into the architecture, the platform ensures persistent storage of user data, including shopping cart contents and preferences. Leveraging RDS's scalability features, such as read replicas, MyClothes.com can effortlessly scale read operations to meet growing demand, providing users with real-time access to their data without compromising performance. 
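To make the session-ID approach described above concrete, here is a minimal sketch of storing cart data in an ElastiCache (Redis) cluster keyed by a random session ID; the endpoint, TTL, and key layout are assumptions for illustration only.

```python
import json
import uuid

import redis

# Hypothetical ElastiCache (Redis) endpoint; in practice this comes from
# configuration, and the cluster sits in the same VPC as the web tier.
cache = redis.Redis(host="myclothes-sessions.abc123.euc1.cache.amazonaws.com", port=6379)

def create_session(user_id: str, cart: list) -> str:
    """Store the shopping cart under a random session ID with a 30-minute TTL."""
    session_id = str(uuid.uuid4())
    cache.setex(f"session:{session_id}", 1800, json.dumps({"user": user_id, "cart": cart}))
    return session_id  # only the ID is returned to the browser as a cookie

def load_session(session_id: str):
    """Fetch the session payload, or None if it has expired or never existed."""
    raw = cache.get(f"session:{session_id}")
    return json.loads(raw) if raw else None
```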
28 | 29 | Moreover, by strategically configuring Elasticache to offload read-heavy workloads from RDS, MyClothes.com optimizes resource utilization and minimizes database latency, further enhancing the platform's responsiveness and scalability. Additionally, by deploying RDS and Elasticache in multi-AZ configurations, MyClothes.com fortifies its infrastructure against potential AZ failures, ensuring continuity of service and data integrity. 30 | 31 | Innovation remains at the forefront of MyClothes.com's architectural decisions. While Elasticache serves as a robust solution for session management, the platform remains open to alternative technologies. DynamoDB, with its seamless scalability and low-latency performance, stands as a viable alternative to Elasticache, offering MyClothes.com flexibility and choice in designing its stateful architecture. 32 | 33 | In summary, MyClothes.com's AWS architecture exemplifies a delicate balance between stateful functionality and scalability, underpinned by a commitment to security, reliability, and innovation. By harnessing the capabilities of Elasticache, RDS, and DynamoDB, MyClothes.com delivers a personalized and responsive shopping experience while ensuring the integrity and security of user data. In an era where user expectations are ever-evolving, MyClothes.com stands as a beacon of excellence, setting the standard for stateful e-commerce architectures on the AWS platform. 34 | 35 | # MyWordpress.com (stateful) 36 | 37 | In the realm of content management and blogging, MyWordpress.com stands as a beacon of creativity and expression, leveraging AWS architecture to seamlessly blend stateful functionality with scalability and reliability. With a focus on ensuring data integrity, high availability, and optimal performance, MyWordpress.com adopts a strategic approach to architecture design, addressing the challenges of storing dynamic content, such as images, while maintaining consistency across multiple instances. 38 | 39 | Central to MyWordpress.com's architecture is the adoption of multi-AZ RDS with Aurora and Read Replicas. By leveraging Amazon Aurora's high-performance, scalable database engine, MyWordpress.com ensures that its data is replicated across multiple availability zones, providing resilience against hardware failures and minimizing downtime. This strategic choice not only enhances data durability but also supports read scalability, enabling MyWordpress.com to handle increasing traffic and complex queries with ease. 40 | 41 | However, the challenges of stateful content management extend beyond database replication. MyWordpress.com recognizes the need to efficiently store and access dynamic content, such as images, across multiple EC2 instances. While connecting EC2 instances to EBS volumes initially seems like a viable solution, the inherent limitations become apparent when considering load balancing. With each EC2 instance potentially accessing different EBS volumes, inconsistencies may arise, leading to missing images or data discrepancies. 42 | 43 | To overcome this challenge, MyWordpress.com embraces Amazon EFS (Elastic File System) as a centralized storage solution for dynamic content. By mounting EFS to multiple EC2 instances, MyWordpress.com ensures that all instances have access to a shared file system, eliminating the risk of data inconsistencies and simplifying content management. 
However, it's essential to note that mounting EFS requires Elastic Network Interfaces (ENIs), necessitating careful network configuration to ensure seamless integration with EC2 instances. 44 | 45 | In summary, MyWordpress.com's AWS architecture exemplifies a thoughtful balance between stateful content management and scalability, prioritizing data integrity, high availability, and performance. Through the adoption of multi-AZ RDS with Aurora and Read Replicas, MyWordpress.com ensures robust database replication and scalability, while leveraging Amazon EFS for centralized storage of dynamic content. By addressing the complexities of stateful content management with innovative solutions, MyWordpress.com sets the standard for reliable and scalable WordPress hosting on the AWS platform, empowering users to create and share their stories with confidence. 46 | 47 | # Instantiating applications quickly 48 | 49 | Summary: 50 | Instantiating applications quickly involves efficient methods for installing and deploying applications. Key points include: 51 | 52 | 1. **Golden AMI**: A pre-configured Amazon Machine Image (AMI) that can be used to launch instances quickly. 53 | 2. **User Data**: Can be used to bootstrap applications during instance launch, although this method may be slower compared to using a Golden AMI. 54 | 3. **Hybrid Approach**: Combining Golden AMI and User Data for optimal deployment speed and customization. 55 | 4. **RDS Snapshots**: Restoring databases from snapshots is preferred over manual inserts for faster deployment and consistency. 56 | 5. **EBS Volumes**: Can be restored from snapshots to expedite the deployment process. 57 | 58 | # Beanstalk Overview 59 | 60 | Summary: 61 | Beanstalk Overview: 62 | 63 | 1. **Managed Service**: AWS Elastic Beanstalk is a managed service that automates various tasks including capacity provisioning, load balancing, scaling, and app health monitoring. 64 | 2. **Developer Perspective**: Developers only need to focus on shipping their code, while Beanstalk handles the underlying infrastructure. 65 | 3. **Configuration Control**: While Beanstalk manages many aspects, developers still retain full control over configuration. 66 | 4. **Cost Structure**: The service itself is free, but users pay for the resources utilized. 67 | 5. **Environment Tiers**: Beanstalk offers two environment tiers: web server and worker. 68 | 6. **Deployment Process**: Involves creating an application, uploading a version of the code, and launching an environment. 69 | 7. **Web Server Environment**: Utilizes Elastic Load Balancer (ELB) to EC2 instances, where these instances serve as web servers. 70 | 8. **Worker Environment**: Utilizes Amazon Simple Queue Service (SQS) to EC2 instances, where instances function as workers. 71 | 9. **High Availability**: Options include deploying a single instance or setting up high availability with load balancers. RDS databases can also be connected for additional functionality. 72 | -------------------------------------------------------------------------------- /009-s3-security.md: -------------------------------------------------------------------------------- 1 | # S3 Security 2 | 3 | # S3 Object Encryption 4 | 5 | S3 offers several options for encrypting your data to ensure its confidentiality and integrity: 6 | 7 | ### Server-Side Encryption (SSE-S3): 8 | - **Description:** Encrypts objects using keys managed by AWS. 9 | - **Encryption Standard:** AES-256 encryption. 
10 | - **Default Encryption:** Enabled by default for both buckets and objects. 11 | - **Usage:** Specify the header `x-amz-server-side-encryption: AES256` when uploading objects to enable SSE-S3 encryption. 12 | 13 | ### Server-Side Encryption with AWS Key Management Service (SSE-KMS): 14 | - **Description:** Uses keys managed in AWS Key Management Service (KMS) to encrypt objects. 15 | - **Logging:** Every use of the KMS key is logged in AWS CloudTrail. 16 | - **Usage:** Specify the header `x-amz-server-side-encryption: aws:kms` when uploading objects. Access to the underlying KMS key is required to read the encrypted objects. 17 | 18 | ### Server-Side Encryption with Customer-Provided Keys (SSE-C): 19 | - **Description:** Allows you to use keys managed outside of AWS to encrypt objects. 20 | - **Key Management:** S3 does not store the encryption key provided by the customer. 21 | - **Usage:** Encryption key must be provided in the HTTP headers for each request. HTTPS must be used for encryption in transit. 22 | 23 | ### Client-Side Encryption: 24 | - **Description:** The client encrypts data before sending it to S3, using keys that never leave the client's control. 25 | - **Key Management:** Customers fully manage encryption keys and the data encryption process. 26 | - **Usage:** Encrypted data is sent to S3, and clients must decrypt data themselves when retrieving it. 27 | 28 | ### Encryption In-Transit (TLS): 29 | - **Description:** Data is encrypted during transit using HTTPS endpoints. 30 | - **Usage:** HTTPS must be used, especially when using SSE-C, to prevent data exposure over unencrypted channels. 31 | 32 | ### Default Encryption vs. Bucket Policy: 33 | - **Default Encryption:** All objects are encrypted by default. 34 | - **Bucket Policy:** You can enforce encryption by specifying encryption headers in the bucket policy. Bucket policies are evaluated before default encryption. 35 | 36 | By leveraging S3's encryption options, you can ensure that your data remains secure both at rest and in transit, meeting compliance requirements and protecting sensitive information from unauthorized access. 37 | 38 | # S3 - CORS (Cross-Origin Resource Sharing) 39 | 40 | Cross-Origin Resource Sharing (CORS) is a mechanism that allows web applications running in one domain to request resources from another domain. Here's what you need to know about CORS in the context of Amazon S3: 41 | 42 | ### CORS Basics: 43 | - **Same Origin:** Requests from the same origin (e.g., tarasowski.de/app1 and tarasowski.de/app2) do not require CORS headers. 44 | - **Different Origin:** Requests from different origins (e.g., tarasowski.de and google.com) require CORS headers to be enabled on the server to allow cross-origin requests. 45 | 46 | ### Preflight Requests: 47 | - When making a cross-origin request, the browser first sends an OPTIONS preflight request to the server to check if cross-origin resource sharing is allowed. 48 | - The preflight request includes CORS headers such as `Origin` to indicate the origin of the request. 49 | 50 | ### CORS Configuration in S3: 51 | - To allow cross-origin requests to your S3 bucket, you need to enable CORS headers on the bucket. 52 | - The CORS configuration specifies which origins are allowed to access the resources in the bucket and what HTTP methods are allowed.
53 | 54 | ### Example CORS Configuration: 55 | ```xml 56 | <CORSConfiguration> 57 |   <CORSRule> 58 |     <AllowedOrigin>*</AllowedOrigin> 59 |     <AllowedMethod>GET</AllowedMethod> 60 |     <MaxAgeSeconds>3000</MaxAgeSeconds> 61 |     <AllowedHeader>Authorization</AllowedHeader> 62 |   </CORSRule> 63 | </CORSConfiguration> 64 | ``` 65 | 66 | ### Activating CORS in S3: 67 | - CORS settings can be configured directly from the S3 Management Console or programmatically using AWS SDKs or CLI. 68 | - Once configured, S3 will include the appropriate CORS headers in responses to cross-origin requests, allowing the browser to proceed with the request if the CORS policy permits it. 69 | 70 | By correctly configuring CORS headers in your S3 bucket, you can ensure that cross-origin requests are handled securely and effectively, enabling seamless integration with web applications across different domains. 71 | 72 | # S3 MFA Delete 73 | 74 | S3 Multi-Factor Authentication (MFA) Delete adds an extra layer of security by requiring users to generate a code using their MFA device before performing critical operations on S3, such as permanently deleting objects or suspending versioning on the bucket. Here's what you need to know: 75 | 76 | - **Purpose:** Helps prevent accidental or unauthorized deletions of objects or changes to bucket settings by requiring additional authentication. 77 | - **Operations Requiring MFA Delete:** 78 | - Permanent deletion of objects. 79 | - Suspending versioning on the bucket. 80 | - **Prerequisites:** MFA Delete requires versioning to be enabled on the bucket. 81 | - **Owner Access:** Only the bucket owner (root account) can enable or disable MFA Delete for the bucket. 82 | 83 | # S3 Access Logs 84 | 85 | S3 Access Logs enable you to monitor and track all requests made to your S3 bucket by storing detailed log files in another designated bucket. Here's what you need to know: 86 | 87 | - **Logging:** Each request made to the bucket is logged as a file in the designated logging bucket. 88 | - **Same Region Requirement:** The logging bucket must be in the same AWS region as the source bucket to avoid issues and ensure efficient logging. 89 | - **Avoid Logging Loop:** It's crucial to avoid configuring the same bucket as the logging destination to prevent a logging loop. 90 | - **Log Analysis:** Log files are stored in a structured format and can be easily analyzed using services like Amazon Athena to gain insights into bucket usage, access patterns, and potential security issues. 91 | 92 | # S3 Pre-signed URLs 93 | 94 | S3 Pre-signed URLs provide temporary access to specific objects in your S3 bucket, allowing users to perform specific actions without requiring permanent credentials or direct access to the bucket. Here's what you need to know: 95 | 96 | - **Generation Methods:** Pre-signed URLs can be generated using the S3 console, CLI, or SDKs. 97 | - **Expiration:** URLs have a limited lifespan, ranging from minutes to hours, depending on how they are generated. 98 | - **Use Cases:** Pre-signed URLs are useful for scenarios such as: 99 | - Allowing logged-in users to download premium content. 100 | - Temporarily granting a user the ability to upload a file to a specific location in the bucket. 101 | - **Shareability:** Pre-signed URLs can be easily shared from the S3 console or generated programmatically using CLI commands or SDK functions. 102 | 103 | By leveraging MFA Delete, Access Logs, and Pre-signed URLs, you can enhance the security, monitoring, and access control capabilities of your S3 buckets, ensuring that your data remains protected and accessible as needed.
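As a rough illustration of the SDK route, the following boto3 sketch generates a time-limited download link; the bucket and key names are made up, and credentials are assumed to come from an IAM role or a local AWS profile.

```python
import boto3

s3 = boto3.client("s3")

# Illustrative bucket and key -- the URL grants temporary GET access
# without making the object public or sharing long-lived credentials.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-premium-content", "Key": "videos/lesson-1.mp4"},
    ExpiresIn=3600,  # the link stops working after one hour
)
print(url)
```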
104 | 105 | # S3 Glacier Vault Lock 106 | 107 | S3 Glacier Vault Lock enables the adoption of a WORM (Write Once Read Many) model for Glacier vaults, providing immutable storage for your data. Here's what you need to know: 108 | 109 | - **WORM Model:** The WORM model ensures that once data is written to the vault, it cannot be modified or deleted. 110 | - **Vault Lock Policy:** Create a vault lock policy to define the lock configuration for the vault, including retention settings. 111 | - **Immutable Policy:** Once the vault lock policy is applied, it cannot be modified or deleted, ensuring that the data remains immutable for the specified retention period. 112 | - **Vault-Level Lock:** Vault lock policies are applied at the vault level, governing data retention and immutability for every archive in the vault. 113 | 114 | # S3 Object Lock 115 | 116 | S3 Object Lock provides a mechanism to enforce retention periods and legal holds on S3 objects, ensuring data integrity and compliance. Here's what you need to know: 117 | 118 | - **Versioning Requirement:** Object Lock requires versioning to be enabled on the bucket. 119 | - **Retention Policies:** 120 | - **Compliance Mode:** Prevents object deletion for a specified retention period. Once set, retention cannot be shortened. 121 | - **Governance Mode:** Allows specified users to change retention settings and delete objects before the retention period expires. 122 | - **Retention Period:** Specify a retention period for objects, ensuring they cannot be deleted until the period elapses. 123 | - **Legal Hold:** Protect objects indefinitely by applying a legal hold, preventing them from being deleted until the hold is removed. 124 | - **IAM Permissions:** Users with appropriate IAM permissions can manage legal holds and retention settings, ensuring proper governance and compliance. 125 | 126 | By leveraging S3 Glacier Vault Lock and S3 Object Lock, you can enforce data immutability, retention policies, and legal holds, ensuring that your data remains secure, compliant, and tamper-proof. 127 | 128 | # S3 Access Points 129 | 130 | S3 Access Points provide a way to enforce granular access controls and permissions for specific prefixes within your S3 buckets. Here's how you can utilize them: 131 | 132 | - **Access Point Creation:** Create separate access points for different departments or use cases, such as a "finance" access point for the `/finance` prefix and a "sales" access point for the `/sales` prefix. 133 | - **Permission Configuration:** Configure permissions for each access point to control read and write access to the designated prefixes. 134 | - **IAM Integration:** Users with IAM permissions can connect to specific access points based on their access requirements, ensuring least privilege access. 135 | - **DNS Name:** Each access point is assigned its own DNS name, making it easy to identify and connect to. 136 | 137 | # S3 Access Point VPC Origin 138 | 139 | S3 Access Point VPC Origin allows you to restrict access to an access point so that it can only be accessed from within a specified VPC. Here's how you can set it up: 140 | 141 | - **VPC Configuration:** Define the VPC from which the access point should be accessible. 142 | - **VPC Endpoint Creation:** Create a VPC endpoint (gateway) to access the access point from within the VPC. 143 | - **Endpoint Policy:** Ensure that the VPC endpoint policy allows access to both the target bucket and the specific access point, allowing traffic to flow between the VPC and the access point securely.
144 | 145 | By leveraging S3 Access Points and VPC Origin configurations, you can enforce fine-grained access controls and restrict access to your S3 data based on specific criteria, enhancing security and compliance within your AWS environment. 146 | 147 | # S3 Object Lambda 148 | 149 | S3 Object Lambda introduces a powerful capability where you can dynamically process and manipulate data stored in S3 buckets on the fly. Here's how it works: 150 | 151 | - **Bucket Setup:** You have an S3 bucket configured with access points. 152 | - **Access Points:** Each access point is associated with a Lambda function. 153 | - **Data Processing:** When a client retrieves data from the bucket via an access point, the request is intercepted by the associated Lambda function. 154 | - **Data Manipulation:** The Lambda function performs dynamic data manipulation or processing before returning the modified data to the client. 155 | - **Use Cases:** 156 | - Redacting Personally Identifiable Information (PII) from files. 157 | - Converting data formats (e.g., XML to JSON) in real-time. 158 | - Resizing and watermarking images on the fly. 159 | - **Flexibility:** Object Lambda provides flexibility in data processing, enabling you to tailor the manipulation logic according to your specific requirements. 160 | 161 | By leveraging S3 Object Lambda, you can implement dynamic data transformations and processing directly within your S3 infrastructure, offering enhanced flexibility and efficiency for various use cases, including data security, format conversion, and image processing. 162 | -------------------------------------------------------------------------------- /001-ec2-fundamentals.md: -------------------------------------------------------------------------------- 1 | # EC2 Fundamentals 2 | 3 | # Introduction EC2 4 | 5 | Amazon Web Services (AWS) offers a wide range of EC2 instance types tailored to different workload requirements. Each instance type is categorized based on its class, generation, and size, providing users with flexibility and scalability. The m5.2xlarge instance type, for example, falls under the "general" class, fifth generation, and is of extra-large size within its class. 6 | 7 | **More Information:** 8 | 9 | 1. **Instance Class:** 10 | - **General Purpose (m):** Designed to provide a balance between compute, memory, and networking resources. Suitable for a variety of workloads, including web servers, small databases, and development environments. 11 | - **Compute Optimized (c):** Optimized for tasks requiring high computational power. Ideal for batch processing, media transcoding, dedicated game servers, and high-performance computing (HPC) applications. 12 | - **Memory Optimized (r):** Focused on delivering high memory capacity for memory-intensive applications such as large-scale databases, in-memory databases, and applications requiring real-time processing. 13 | - **Accelerated Computing (e.g., p, g, f, inf):** Designed to leverage specialized hardware accelerators such as GPUs, FPGAs, and inference chips for tasks like machine learning inference, graphics rendering, and video encoding. 14 | - **Storage Optimized (i, d, h, o, u, z):** Optimized for storage-intensive workloads, providing high disk I/O performance and storage capacity. Ideal for applications such as high-frequency online transaction processing (OLTP), NoSQL databases, and caching databases. 15 | 16 | 2. 
**Generation:** 17 | - The generation number indicates the iteration and improvements made over time by AWS in terms of performance, efficiency, and features. Higher generation numbers often signify advancements in hardware, networking, and virtualization technologies. 18 | 19 | 3. **Size:** 20 | - The size designation within an instance class denotes the specific resource allocation, including CPU cores, RAM, storage, and network capacity. Larger sizes typically offer higher performance and scalability but may incur higher costs. 21 | 22 | Understanding the characteristics and capabilities of each EC2 instance type helps users select the most suitable option for their specific application requirements, ensuring optimal performance, cost-effectiveness, and scalability on the AWS cloud platform. 23 | 24 | 25 | # Security Groups 26 | 27 | Security groups are essential components of AWS's network security model. They act as virtual firewalls for EC2 instances, controlling inbound and outbound traffic based on defined rules. These rules are primarily allow rules, specifying which traffic is permitted. Security groups can reference each other or use IP addresses for defining access permissions. 28 | 29 | Inbound traffic flows from the internet to the EC2 instances, while outbound traffic goes from the instances to the internet. By default, outbound traffic is allowed to the World Wide Web. Security groups can be attached to multiple instances, providing consistent security settings across them. 30 | 31 | It's important to note that security groups are tied to specific Virtual Private Clouds (VPCs) within AWS and operate on a per-region basis. They resemble firewalls placed outside of EC2 instances, protecting them from unauthorized access and malicious activity. 32 | 33 | In the event of a timeout, often indicative of a security group issue, all inbound traffic is blocked while all outbound traffic remains authorized. Specific rules, such as allowing SSH (port 22) for Linux instances and RDP (port 3389) for Windows instances, are common practices to enable necessary access while maintaining security protocols. 34 | 35 | To extend this summary, we can delve into the flexibility and scalability of security groups within AWS. They offer granular control over network traffic, enabling administrators to tailor access permissions based on individual instance requirements or application needs. Additionally, security groups integrate seamlessly with other AWS services, facilitating secure communication between different components of a cloud infrastructure. 36 | 37 | Furthermore, AWS provides additional layers of security beyond security groups, such as Network Access Control Lists (NACLs) and AWS Identity and Access Management (IAM), allowing for a comprehensive security strategy to protect cloud environments. Understanding and effectively configuring security groups is fundamental to ensuring the integrity and confidentiality of data hosted on AWS EC2 instances. 38 | 39 | Security groups operate in a stateful manner, which means they automatically track the state of connections and allow return traffic for permitted inbound connections. This simplifies network administration by eliminating the need to define explicit rules for return traffic. For example, if an inbound rule allows traffic on port 80 for a web server, the security group automatically allows return traffic from that web server on established connections. 
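As a small illustration of these rules, the sketch below creates a security group and opens HTTP to the world and SSH to a single trusted range using boto3; the VPC ID and CIDR values are placeholders, not recommendations.

```python
import boto3

ec2 = boto3.client("ec2")

# Illustrative VPC ID -- replace with your own.
sg = ec2.create_security_group(
    GroupName="web-server-sg",
    Description="Allow HTTP from anywhere and SSH from a trusted range",
    VpcId="vpc-0123456789abcdef0",
)

ec2.authorize_security_group_ingress(
    GroupId=sg["GroupId"],
    IpPermissions=[
        {"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},        # inbound HTTP from anywhere
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "203.0.113.0/24"}]},   # inbound SSH, restricted to one range
    ],
)
# No explicit rule is needed for the responses: because security groups are stateful,
# return traffic for these allowed inbound connections is permitted automatically.
```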
40 | 41 | # EC2 Instance Purchasing Options 42 | 43 | The purchasing options for EC2 instances provide flexibility and cost-effectiveness tailored to various workload requirements: 44 | 45 | 1. **On-Demand Instances**: Pay for compute capacity with no long-term commitment; Linux and Windows instances are billed by the second, while other operating systems are billed by the hour. 46 | 47 | 2. **Reserved Instances**: 48 | - Offers significant discounts (up to 72%) compared to On-Demand pricing, ideal for long-term workloads. 49 | - Reservation period options of 1 or 3 years, with upfront or no upfront payment choices. 50 | - Convertible Reserved Instances allow flexibility to change instance types. 51 | 52 | 3. **Savings Plans**: 53 | - Commit to a specific amount of usage for 1 or 3 years, suitable for steady, long-term workloads. 54 | - Locked to instance family and region, supporting different operating systems. 55 | 56 | 4. **Spot Instances**: 57 | - Provides steep discounts (up to 90%) compared to On-Demand pricing, suitable for short-term workloads. 58 | - Can be interrupted by AWS if the current spot price exceeds your maximum price. 59 | - Ideal for batch jobs, image processing, but not recommended for critical or persistent workloads like databases. 60 | 61 | 5. **Dedicated Hosts**: 62 | - Allows booking an entire physical server, offering control over instance placement. 63 | - Suitable for applications with specific licensing requirements. 64 | - Options for On-Demand or Reserved Instances for 1 or 3 years. 65 | 66 | 6. **Dedicated Instances**: 67 | - Provides hardware dedicated solely to your account, ensuring isolation from other AWS customers (instances from your own account may still share the hardware). 68 | 69 | 7. **Capacity Reservations**: 70 | - Reserves capacity in a specific Availability Zone (AZ) for any duration, without billing discounts. 71 | - Useful for short-term workloads requiring uninterrupted access in specific AZs. 72 | 73 | Additional features include: 74 | 75 | - **EC2 Spot Instance Requests**: 76 | - Offers significant cost savings (up to 90% compared to On-Demand) by defining a maximum spot price. 77 | - Ideal for one-time or persistent requests, suitable for batch jobs and big data analysis. 78 | 79 | - **Spot Fleets**: 80 | - Comprises a mix of Spot Instances and On-Demand Instances. 81 | - Designed to meet target capacity with price constraints, allowing flexibility in launch pools, instance types, operating systems, and AZs. 82 | - Various strategies available to optimize allocation, such as LowestPrice, Diversified, CapacityOptimized, and PriceCapacityOptimized. 83 | 84 | # Public vs. Private IP 85 | 86 | Private vs. public IP addresses play crucial roles in networking, particularly within cloud environments like AWS. 87 | 88 | Private IPs are used for communication within a private network, typically within a company or a cloud-based environment like AWS. By attaching an internet gateway to a private network in AWS, instances within that network gain access to the internet. This allows them to communicate with resources outside of the private network, such as accessing web servers or external databases. It's like having a bridge between your private network and the vast expanse of the internet. 89 | 90 | On the other hand, public IPs are used for communication over the internet. When you start an EC2 instance in AWS, it's assigned a public IP address. However, this IP address can change when you stop and start the instance unless you use an Elastic IP (EIP).
Elastic IPs provide a static, fixed IP address for your instance, ensuring consistent accessibility even if the instance is stopped and restarted. 91 | 92 | While Elastic IPs provide stability, they have limitations. AWS allows a default maximum of five Elastic IPs per region per account and charges for them. It's generally advised to avoid using Elastic IPs due to their associated costs and architectural implications. Instead, best practices include using dynamic public IPs and associating them with DNS names for easier management. Alternatively, employing load balancers can help distribute traffic efficiently without relying on individual instance public IPs. 93 | 94 | In summary, private IPs facilitate internal communication within a network, while public IPs enable communication over the internet. Elastic IPs provide static public IPs for AWS instances, but their usage should be carefully considered due to cost and architectural concerns. Utilizing dynamic public IPs with DNS or load balancers is often recommended for better scalability and cost-effectiveness. 95 | 96 | # Placement Groups 97 | 98 | Placement groups in Amazon EC2 offer control over instance placement strategies. They provide several types: 99 | 100 | 1. **Cluster Placement Group**: Designed for applications that require low-latency communication, it clusters instances within a single Availability Zone (AZ) to minimize network latency. While offering high performance, it also poses a higher risk due to potential correlated failures. 101 | 102 | 2. **Spread Placement Group**: This type spreads instances across distinct underlying hardware to minimize the risk of simultaneous failures. Limited to a maximum of seven instances per group per AZ, it's suitable for critical applications that require high availability. 103 | 104 | 3. **Partition Placement Group**: Ideal for distributed applications like Hadoop, Cassandra, or Kafka, it spreads instances across multiple partitions within an AZ. Each partition represents a distinct rack, enabling the scaling of hundreds of EC2 instances. It offers both performance and fault tolerance benefits. 105 | 106 | By leveraging placement groups, users can tailor instance placement to their specific workload requirements, balancing performance, availability, and fault tolerance effectively. 107 | 108 | # Elastic Network Interface (ENI) 109 | 110 | An Elastic Network Interface (ENI) serves as a virtual network card within an Amazon Virtual Private Cloud (VPC), acting as a logical component to facilitate connectivity for instances. It enables instances within a VPC to communicate with other AWS services like Amazon S3, RDS, or other instances within the same VPC by providing the necessary network interface. ENIs play a crucial role in instance failover scenarios, allowing them to detach from one instance and attach to another seamlessly, ensuring continuous connectivity and functionality. They essentially act as the bridge that allows instances to interact with the broader AWS ecosystem and maintain network connectivity even in dynamic environments. 111 | 112 | You can think of an Elastic Network Interface (ENI) as similar to a physical network interface card (NIC) in a traditional computer. Just like a NIC facilitates network connectivity for a physical machine, an ENI serves as the virtual equivalent within an Amazon Virtual Private Cloud (VPC).
It provides the necessary networking capabilities for instances in the VPC to communicate with each other, as well as with other AWS services and resources outside the VPC. So, in essence, an ENI functions as a virtual network card, enabling instances to send and receive network traffic within the AWS environment. 113 | 114 | # EC2 Hibernate 115 | 116 | EC2 Hibernate is a feature that allows users to preserve the in-memory state of their instances while stopping them. This enables much faster boot times when compared to traditional stop and start methods. When an instance is hibernated, its RAM state is written to a file in the root EBS volume, which must be encrypted. This allows users to load the RAM and resume the instance without it ever being fully stopped. However, hibernated instances can't remain in this state for more than 60 days, and all data stored in RAM is ultimately stored on the EBS disk. 117 | 118 | EC2 Hibernate offers significant advantages for users who need to quickly stop and start instances while preserving their current state. By saving the RAM state to the encrypted root EBS volume, users can achieve faster reboots without needing to reload data from external sources. This feature is particularly useful for applications with large datasets or complex configurations that would benefit from reduced downtime and faster recovery times. 119 | -------------------------------------------------------------------------------- /014-serverless.md: -------------------------------------------------------------------------------- 1 | # Serverless 2 | 3 | # AWS Lambda 4 | **AWS Lambda** is a serverless compute service that allows you to run code without provisioning or managing servers. Here are some key points about Lambda: 5 | 6 | 1. **Pay-Per-Use Model**: 7 | - You are charged based on the number of requests and the duration of your code's execution. 8 | - Pay-per-request and compute time increments, with no charge when your code is not running. 9 | 10 | 2. **Resource Configuration**: 11 | - Supports up to 10GB of memory per function, with memory increments of 1MB. 12 | - Allows custom runtime APIs, enabling the execution of code written in any programming language. 13 | - Supports Lambda container images, but requires implementing a custom runtime for Lambda. 14 | 15 | 3. **Use Cases**: 16 | - Well-suited for various tasks such as data processing, real-time file processing, API backend services, and more. 17 | - Offers flexibility for running code in response to events triggered by other AWS services or HTTP requests. 18 | 19 | 4. **Limits per Region**: 20 | - Defines various limits per region, including memory, execution duration, environment variables, disk space, concurrency, and function size. 21 | - Allows the use of the `/tmp` directory to load additional files during function startup. 22 | 23 | 5. **AWS Lambda SnapStart**: 24 | - Enhances Lambda function performance by up to 10x for Java 11 and above. 25 | - Achieved by invoking Lambda functions from a pre-initialized state, reducing cold start times. 26 | - Takes snapshots of Lambda function states, allowing new invocations to start from these snapshots. 27 | 28 | 29 | # AWS Lambda@Edge / Cloudfront Functions 30 | 31 | **AWS Lambda@Edge / CloudFront Functions** allow you to customize content delivery through Amazon CloudFront, providing powerful capabilities to modify viewer requests and responses. 
Here's a breakdown of both CloudFront Functions and Lambda@Edge: 32 | 33 | ### CloudFront Functions: 34 | - **Lightweight Functions**: 35 | - Written in JavaScript. 36 | - Used to modify viewer requests and viewer responses. 37 | - **Request and Response Phases**: 38 | - Operate in two phases: viewer request and viewer response. 39 | - Viewer Request: After CloudFront receives a request from a viewer. 40 | - Viewer Response: Before CloudFront forwards the response to the viewer. 41 | - **Native Integration**: 42 | - A native feature of CloudFront, allowing you to manage code entirely within CloudFront. 43 | - **Limitations**: 44 | - Maximum memory of 2MB. 45 | - Maximum execution time of less than 1ms. 46 | - No network access. 47 | - Package size limited to 10KB. 48 | - No access to the request body. 49 | - **Use Cases**: 50 | - Cache key normalization. 51 | - Header manipulation. 52 | - URL rewrites or redirects. 53 | - Request authentication and authorization. 54 | 55 | ### Lambda@Edge: 56 | - **Programming Languages**: 57 | - Supports Node.js and Python. 58 | - **Scalability**: 59 | - Scales to thousands of requests per second. 60 | - **Execution Phases**: 61 | - Operates in four phases: viewer request, origin request, origin response, and viewer response. 62 | - **Regional Deployment**: 63 | - Functions must be authored in the us-east-1 region; CloudFront then replicates them to its edge locations. 64 | - **Limits**: 65 | - Maximum execution time ranging from 5 to 10 seconds. 66 | - Memory limits range from 128MB to 10GB. 67 | - Package size limits range from 1MB to 50MB. 68 | - **Use Cases**: 69 | - Tasks requiring longer execution times. 70 | - Utilizing CPU or memory-intensive operations. 71 | - Using third-party libraries. 72 | - Network access to external services. 73 | - File system access or access to the body of HTTP requests. 74 | 75 | Both CloudFront Functions and Lambda@Edge provide powerful capabilities for customizing content delivery, allowing you to tailor your CDN behavior to specific requirements and enhance the performance and security of your applications. 76 | 77 | 78 | # AWS Lambda in VPC 79 | 80 | **AWS Lambda in VPC:** 81 | 82 | By default, Lambda functions are executed outside of your Virtual Private Cloud (VPC), in an AWS-owned VPC. Consequently, they lack direct access to resources within your VPC, such as RDS, ElastiCache, or internal ELBs. 83 | 84 | **Launching Lambda in VPC:** 85 | - To enable Lambda to access resources in your VPC, you must specify: 86 | - VPC ID 87 | - Subnets 88 | - Security groups 89 | - Lambda creates an Elastic Network Interface (ENI) within your specified subnets. 90 | - Example flow: Lambda function -> Private subnet -> ENI -> RDS inside the VPC 91 | 92 | **Lambda with RDS Proxy:** 93 | - RDS Proxy enhances scalability by pooling and sharing database connections. 94 | - Improves availability by reducing failover time by up to 66% and preserving connections. 95 | - Enhances security by enforcing IAM authentication and storing credentials in Secrets Manager. 96 | - Lambda functions utilizing RDS Proxy must be deployed within your VPC, as RDS Proxy is never publicly accessible. 97 | 98 | Integrating Lambda with your VPC and utilizing RDS Proxy can significantly enhance the performance, scalability, and security of your serverless applications, particularly when interacting with relational databases like RDS.
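A minimal sketch of attaching an existing function to a VPC with boto3 follows; the function name, subnet IDs, and security group ID are placeholders for your own resources.

```python
import boto3

lam = boto3.client("lambda")

# Attaching the function to private subnets lets it reach VPC-only resources
# such as an RDS instance or an RDS Proxy endpoint. IDs are illustrative.
lam.update_function_configuration(
    FunctionName="order-processor",
    VpcConfig={
        "SubnetIds": ["subnet-0aaa1111bbbb2222c", "subnet-0ddd3333eeee4444f"],
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
    },
)
# Lambda creates ENIs in these subnets, which is exactly the
# "Lambda function -> private subnet -> ENI -> RDS" flow described above.
```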
99 | 100 | 101 | **RDS Invoking Lambda & Event Notification:** 102 | 103 | **Invoke Lambda from within your DB Instance:** 104 | - Supported by RDS for PostgreSQL and Aurora MySQL. 105 | - Use cases include sending welcome emails or performing other automated tasks based on database events. 106 | - Requires allowing outbound traffic from your DB instance to your Lambda function. 107 | - Provides a seamless way to trigger Lambda functions directly from database events, enhancing automation and real-time responsiveness. 108 | 109 | **RDS Event Notifications (not invoking Lambda):** 110 | - Provides event notifications for various RDS events, such as DB instance creation, deletion, or snapshot creation. 111 | - Does not provide information about the data itself; it's focused on notifying about administrative events. 112 | - Offers near real-time notifications, typically delivered within up to 5 minutes. 113 | - Useful for monitoring and managing RDS resources, allowing you to stay informed about changes and events related to your database instances and snapshots. 114 | 115 | # DynamoDB 116 | **DynamoDB:** 117 | 118 | - Provides single-digit millisecond performance for both read and write operations, making it suitable for applications requiring low-latency access to data. 119 | - Offers auto-scaling capabilities to manage throughput capacity automatically based on workload demands. 120 | - Supports standard and infrequent access classes, allowing you to optimize costs based on the access patterns of your data. 121 | - Attributes within items can be nullable, providing flexibility in data modeling. 122 | - Maximum item size is 400 KB, ensuring efficient storage and retrieval of data. 123 | - Supports various data types including scalar types, document types, and set types, catering to diverse data modeling needs. 124 | - Offers a flexible schema that can rapidly evolve, making it suitable for agile development and better suited than traditional relational databases like RDS for certain use cases. 125 | 126 | **Read/Write Capacity Modes:** 127 | 128 | - **Provisioned Mode:** In this mode, you provision and pay for the desired Read Capacity Units (RCUs) and Write Capacity Units (WCUs) upfront. DynamoDB automatically adjusts capacity in response to your traffic patterns within the provisioned limits. This mode is suitable for predictable workloads where you can estimate your throughput requirements. 129 | 130 | - **On-Demand Mode:** In this mode, there is no need to provision or manage capacity. You simply pay for the read and write requests your application makes. This mode is ideal for workloads with unpredictable or highly variable traffic patterns, as it allows you to scale seamlessly without worrying about capacity planning. 131 | 132 | # DynamoDB Advanced Features 133 | 134 | **DynamoDB Advanced Features:** 135 | 136 | **DynamoDB Accelerator (DAX):** 137 | - Fully managed in-memory cache for DynamoDB tables. 138 | - Helps alleviate read latency by caching frequently accessed data. 139 | - Provides microseconds latency for cached data, improving application performance. 140 | - No need to modify application logic to leverage DAX. 141 | - Default TTL (Time-to-Live) for cached items is 5 minutes. 142 | - Ideal for individual object caching or caching query and scan results. 143 | 144 | **Stream Processing:** 145 | - Utilizes DynamoDB Streams or Kinesis Data Streams for real-time data processing. 146 | - Commonly used for real-time user analytics and cross-region applications. 
147 | - DynamoDB Streams offer 24 hours retention for a limited number of consumers. 148 | - Kinesis Data Streams offer up to 1 year of retention. 149 | 150 | **Global Tables:** 151 | - Replicates DynamoDB tables across multiple AWS regions. 152 | - Supports two-way replication, allowing bidirectional data synchronization. 153 | - Enables active-active replication, allowing your application to read and write to any replicated table. 154 | 155 | **Time-to-Live (TTL):** 156 | - Automatically deletes items from a table after a specified expiration timestamp. 157 | - Useful for scenarios like web session handling, where sessions need to be kept for a certain duration and then automatically removed. 158 | 159 | **Backups for Disaster Recovery:** 160 | - Offers continuous backups using Point-in-Time Recovery (PITR) for the last 35 days. 161 | - Allows point-in-time recovery to any time within the backup window, creating a new table with the recovered data. 162 | - Supports on-demand backups for long-term retention until explicitly deleted, with no impact on performance or latency. 163 | - Supports cross-region copying of backups. 164 | 165 | **S3 Integration:** 166 | - Enables exporting DynamoDB table data to S3, provided PITR is enabled. 167 | - Supports exporting data for the last 35 days, allowing for data analysis. 168 | - Export formats include DynamoDB JSON or Ion format. 169 | - Supports importing data from S3 using CSV, DynamoDB JSON, or Ion format, without consuming any write capacity. Import errors are logged in CloudWatch Logs. 170 | 171 | # API Gateway 172 | **API Gateway:** 173 | 174 | API Gateway enables the creation of RESTful APIs with various features: 175 | 176 | - It serves as a proxy for requests to AWS Lambda, removing the need to manage infrastructure. 177 | - Supports WebSocket protocol for real-time, bidirectional communication. 178 | - Offers versioning capabilities to manage different versions of APIs. 179 | - Facilitates handling of different environments such as production and development. 180 | - Provides robust security features for authentication and authorization. 181 | - Allows the creation of API keys for access control. 182 | - Supports Swagger for API documentation. 183 | - Provides caching of API responses for improved performance. 184 | - Allows for the exposure of AWS Lambda functions, HTTP endpoints, or other AWS services like SQS. 185 | - Supports exposing any AWS service to the outside world. 186 | - Offers HTTP APIs, a simpler version of REST APIs. 187 | - Endpoint types include edge-optimized (default for global clients), regional (for clients within the same region), or private APIs for use inside your VPC. 188 | 189 | Security options include: 190 | - IAM roles for internal applications. 191 | - Cognito for user authentication. 192 | - Custom authorization logic. 193 | - Custom domain names with HTTPS security through integration with AWS Certificate Manager (ACM). Note that if using an edge-optimized endpoint, the certificate must be in the `us-east-1` region. For region endpoints, the certificate must be in the API Gateway region. 194 | - Setup of CNAME or A-alias record in Route 53 for custom domain names. 195 | 196 | 197 | # Step Functions 198 | 199 | Step Functions allow the creation of serverless visual workflows to orchestrate AWS Lambda functions and other AWS services. Key features include: 200 | 201 | - Building complex workflows within AWS by defining sequences, parallel tasks, conditions, timeouts, and error handling. 
202 | - Integration with various AWS services including EC2, ECS, and API Gateway, as well as on-premises systems through AWS services. 203 | - Implementation of human approval features for workflows that require human intervention or decision-making. 204 | - Use cases include order fulfillment, data processing, web applications, and any workflow that requires coordination between multiple tasks or services. 205 | 206 | 207 | # AWS Cognito 208 | 209 | AWS Cognito provides user identity management for web and mobile applications. It consists of two main components: 210 | 211 | 1. **User Pool:** 212 | - A serverless user database where user identities are stored. 213 | - Supports integration with social identity providers like Facebook, Google, etc. 214 | - Allows users to sign in to applications using username/password or social login. 215 | - Often integrated with application load balancers to verify user logins. 216 | 217 | 2. **Identity Pool:** 218 | - Provides temporary credentials to users for accessing AWS resources directly. 219 | - Allows web or mobile applications to access AWS services like S3 or DynamoDB on behalf of the user. 220 | - Useful when applications need to access AWS resources without going through an application load balancer. 221 | - Can be authenticated via user pools or other identity providers like Google. 222 | 223 | AWS Cognito enables secure authentication and authorization mechanisms for applications, allowing users to interact with resources securely. 224 | -------------------------------------------------------------------------------- /002-instance-storage.md: -------------------------------------------------------------------------------- 1 | # Instance Storage EC2 2 | 3 | # EBS 4 | 5 | Elastic Block Store (EBS) is a fundamental component of AWS storage solutions, offering persistent block-level storage volumes for EC2 instances. These volumes can be easily attached to running instances and reattached as needed, providing flexibility in managing storage resources. Each EBS volume is bound to a specific Availability Zone (AZ) and functions akin to a network USB stick or network drive, communicating over the network. 6 | 7 | One of the key features of EBS is its ability to support failover scenarios by allowing volumes to be detached from one instance and attached to another. This facilitates high availability and disaster recovery strategies. Additionally, EBS volumes can be snapshot, creating backups that can be copied across AZs and even regions, providing data redundancy and enabling efficient disaster recovery. 8 | 9 | EBS Snapshots offer further functionalities such as the ability to create backups without detaching volumes, a recycle bin for accidental deletions, and the option to archive snapshots for cost optimization. Fast Snapshot Restore (FSR) ensures minimal latency upon first use, and retention rules allow for the management of snapshot lifecycle. 10 | 11 | Overall, EBS and its snapshot capabilities provide robust storage solutions with features tailored for scalability, reliability, and cost-effectiveness in various deployment scenarios. 12 | 13 | # AMI 14 | 15 | An Amazon Machine Image (AMI) serves as a foundational component for customizing Amazon EC2 instances. It encapsulates a pre-configured environment, including operating system, software packages, and configuration settings, enabling faster boot times and consistent deployments. 
AMIs can be tailored to specific needs, allowing users to add their configurations and software packages before creating instances. 16 | 17 | AMIs are region-specific but can be copied across regions, facilitating global deployments. Users can launch EC2 instances directly from the AWS Marketplace using existing AMIs or create their own custom AMIs. After starting an instance, users can further customize it to their requirements before stopping it. 18 | 19 | Building an AMI not only captures the instance's configuration but also creates snapshots of attached Elastic Block Store (EBS) volumes, ensuring data persistence and integrity. Additionally, users can launch instances from shared or public AMIs created by others, fostering collaboration and efficiency in deploying standardized environments. 20 | 21 | # EC2 instance store 22 | Summary: 23 | EC2 Instance Store provides high-performance, hardware-based disk storage for Amazon EC2 instances. Unlike EBS (Elastic Block Store), which offers network-based storage with limited performance, EC2 Instance Store delivers better I/O performance, making it suitable for applications requiring high-speed data access. However, there are some trade-offs; EC2 Instance Store volumes are ephemeral, meaning data is lost when the associated instance is stopped or terminated. This makes it ideal for temporary data such as buffers, caches, and scratch data, but it requires users to manage their own backups and replication since AWS does not provide built-in data protection for instance store volumes. 24 | 25 | Extension: 26 | The EC2 Instance Store is a powerful tool for applications demanding superior disk performance. Its direct attachment to the underlying hardware ensures low-latency access to data, making it particularly advantageous for tasks like real-time data processing, high-performance computing (HPC), and big data analytics where rapid I/O operations are critical. 27 | 28 | However, the ephemeral nature of EC2 Instance Store volumes necessitates careful planning in architectural design. While it excels in scenarios where data persistence is not a primary concern, such as temporary workloads or transient data processing tasks, it requires users to implement robust backup and replication strategies to safeguard valuable data. Leveraging services like Amazon S3 for durable storage or implementing automated snapshotting mechanisms can mitigate the risk of data loss. 29 | 30 | Moreover, the performance benefits of EC2 Instance Store extend beyond traditional disk operations. Its high-speed I/O capabilities make it an ideal choice for applications requiring intensive read/write operations, such as databases, content delivery networks (CDNs), and in-memory caching systems. By harnessing the full potential of EC2 Instance Store, users can optimize the performance of their applications and deliver enhanced user experiences. 31 | 32 | In conclusion, while EC2 Instance Store offers unparalleled performance advantages, its ephemeral nature and lack of built-in data protection necessitate careful consideration and proactive management. By understanding its strengths and limitations, users can harness its capabilities effectively to meet the demanding requirements of modern cloud-based applications. 33 | 34 | 35 | # EBS Volume Types 36 | 37 | EBS Volume Types provide various options to suit different storage needs in AWS. There are four main types: 38 | 39 | 1. **gp2/gp3 (SSD)**: These are general-purpose SSD volumes suitable for a wide range of workloads. 
The newer gp3 offers a baseline of 3,000 IOPS and 125 MiB/s throughput, scalable up to 16,000 IOPS and 1,000 MiB/s. 40 | 41 | 2. **io1/io2 Block Express (SSD)**: These are the highest performance SSD volumes, ideal for critical business applications requiring sustained IOPS performance, such as databases. io1 supports up to 64,000 IOPS, while io2 Block Express scales up to 256,000 IOPS. 42 | 43 | 3. **st1/sc1 (HDD)**: Designed for scenarios where cost-effectiveness is a priority and workloads are less performance-sensitive. ST1 is throughput-optimized for large, sequential workloads like big data processing or data warehousing, while SC1 (Cold HDD) is the lowest-cost option for infrequently accessed data. 44 | 45 | Some key considerations: 46 | 47 | - Only gp2, gp3, io1, and io2 volumes can be used as root volumes for EC2 instances. 48 | - HDD volumes cannot be used as boot volumes and have a maximum size limit of 16 TB. 49 | - For workloads requiring over 32,000 IOPS, EC2 Nitro instances are necessary to fully utilize the performance potential. 50 | 51 | Extensions: 52 | 53 | When choosing between EBS volume types, it's crucial to consider factors such as workload requirements, performance needs, and budget constraints. For instance, for applications demanding high IOPS and throughput consistently, io1/io2 volumes are the best choice, albeit at a higher cost compared to gp2/gp3 volumes. However, for workloads with less demanding performance requirements and where cost optimization is paramount, st1/sc1 HDD volumes can offer significant savings. Additionally, understanding the scalability options and the performance characteristics of each volume type can help in designing architectures that meet both current and future needs effectively. 54 | 55 | # EBS Multi-Attach 56 | EBS Multi-Attach is a feature that allows the same Elastic Block Store (EBS) volume to be attached to multiple EC2 instances within the same Availability Zone (AZ). This capability is supported for EBS volume types io1 and io2, providing high-performance storage for applications requiring low-latency access. 57 | 58 | In this setup, each attached instance has both read and write access to the shared EBS volume, making it suitable for scenarios where multiple instances need simultaneous access to a shared dataset, such as in clustered environments or distributed systems. However, it's essential to note that applications running on Linux must be designed to manage concurrent writes effectively, as simultaneous writes from multiple instances can lead to data corruption or conflicts. 59 | 60 | 61 | 62 | EBS Multi-Attach supports up to 16 EC2 instances attached to the same EBS volume at a given time. To ensure data consistency and reliability, applications utilizing Multi-Attach must be cluster-aware and utilize file systems compatible with concurrent access, such as clustered file systems like OCFS2 or GFS2, rather than traditional file systems like XFS or ext4, which are not designed for shared access scenarios. 63 | 64 | Extending on this, ensuring proper synchronization mechanisms and concurrency controls within the application layer becomes paramount when leveraging EBS Multi-Attach. Additionally, monitoring and managing I/O operations, particularly during peak usage periods, is crucial to maintaining performance and preventing bottlenecks or contention issues. Furthermore, integrating with AWS services like CloudWatch for monitoring and AWS Identity and Access Management (IAM) for access control adds layers of security and visibility to the multi-attached EBS setup.
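For reference, a short boto3 sketch of provisioning a Multi-Attach-enabled io2 volume is shown below; the AZ, size, IOPS, and IDs are illustrative values, not recommendations.

```python
import boto3

ec2 = boto3.client("ec2")

# Creates a Provisioned IOPS volume that several Nitro-based instances
# in the same AZ can attach simultaneously. Values are illustrative.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=100,              # GiB
    VolumeType="io2",
    Iops=10000,
    MultiAttachEnabled=True,
)

# Each instance then attaches the same volume (device name is illustrative):
# ec2.attach_volume(VolumeId=volume["VolumeId"],
#                   InstanceId="i-0123456789abcdef0", Device="/dev/sdf")
```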
65 | 66 | # EBS Encryption 67 | 68 | EBS encryption ensures data security by encrypting data at rest, in transit, and during snapshots. It leverages AES-256 keys from AWS Key Management Service (KMS). All volumes and snapshots are encrypted, ensuring comprehensive protection. Encryption can be applied to unencrypted volumes by creating a snapshot, copying it with encryption enabled, and then creating a new encrypted volume from the encrypted snapshot. 69 | 70 | Extension: 71 | 72 | AWS Key Management Service (KMS) plays a crucial role in EBS encryption by providing a highly secure and scalable way to manage encryption keys. KMS allows fine-grained control over access to keys and provides auditing capabilities to monitor key usage. 73 | 74 | When encrypting an unencrypted EBS volume, it's essential to understand the process thoroughly. Creating a snapshot of the volume is the first step, preserving its data while preparing for encryption. Copying the snapshot with encryption enabled ensures that the data remains secure throughout the process. This encrypted snapshot serves as the foundation for creating a new EBS volume, which inherits the encryption properties of its source snapshot. 75 | 76 | This approach not only secures existing data but also provides a seamless transition to encrypted volumes without disrupting operations. Once the encrypted volume is created, it can be attached to the original instance, maintaining data integrity and security across the AWS environment. This method offers a robust solution for organizations aiming to enhance their data protection strategies in the cloud. 77 | 78 | 79 | # Amazon EFS - Elastic File System 80 | 81 | Amazon Elastic File System (EFS) offers managed NFS (Network File System) storage that can be easily mounted on multiple EC2 instances. It seamlessly integrates with EC2 instances across multiple Availability Zones (AZs), ensuring high availability. It's suitable for various use cases such as content management, web serving, and data sharing, but it's only compatible with Linux-based AMIs. 82 | 83 | Encryption at rest is provided using Key Management Service (KMS), ensuring data security. EFS automatically scales to accommodate workload demands, supporting thousands of concurrent NFS clients and up to 10 GB/s throughput. It offers two performance modes: General Purpose for latency-sensitive, typical workloads and Max I/O for highly parallel, big data applications; throughput is configured separately through throughput modes. 84 | 85 | Storage tiers are available, including Standard for frequently accessed data, Infrequent Access (IA) for less frequently accessed data at a lower cost, and Archive for rarely accessed data, offering significant cost savings. Lifecycle policies can be implemented to automatically move files between storage tiers based on access patterns. 86 | 87 | For cost optimization, EFS provides options like using single AZ (EFS One Zone-IA) for non-critical workloads and leveraging Bursting Throughput, where throughput scales with the amount of data stored. Elastic Throughput, which scales automatically, is recommended for unpredictable workloads. 88 | 89 | Throughput modes include Bursting, Elastic, and Provisioned, catering to different workload requirements. Security groups need to be configured for EFS, and billing is based on actual usage, ensuring cost efficiency. Overall, Amazon EFS provides a scalable, flexible, and cost-effective solution for managing file storage in AWS environments.
90 | 91 | Extension: 92 | Additionally, Amazon EFS simplifies the management of file storage by abstracting the underlying infrastructure complexities, allowing users to focus on their applications. Its multi-AZ architecture ensures data availability and durability, making it suitable for mission-critical workloads. The ability to dynamically scale resources based on demand enables businesses to handle sudden spikes in workload without manual intervention. Furthermore, the integration with AWS KMS ensures compliance with regulatory requirements and enhances data security. Overall, Amazon EFS is a versatile solution that empowers organizations to efficiently manage their file storage needs in the cloud. 93 | 94 | # EBS vs. EFS 95 | 96 | 97 | | Feature | EFS | EBS | 98 | |----------------------|----------------------------------------------|----------------------------------------------| 99 | | Type | Network file system | Block-level storage | 100 | | Compatibility | Linux instances only (POSIX compliant) | Compatible with all EC2 instances | 101 | | Multi-instance access | Yes, can be mounted by hundreds of instances across AZs | Attached to individual EC2 instances | 102 | | AZ Locking | No, accessible across multiple AZs | Yes, locked at the AZ level | 103 | | Volume Types | N/A | gp2, gp3, io1 | 104 | | Scalability | Highly scalable, suitable for applications with fluctuating demand | Limited scalability, tied to instance size | 105 | | Cost | Higher price point | Lower price point | 106 | | Storage Tiers | Yes, for cost savings based on usage patterns | N/A | 107 | | Migration Across AZs | N/A | Requires snapshot and restore | 108 | | Default Behavior on Instance Termination | N/A | Root volumes terminated by default | 109 | 110 | This table outlines the key differences between EFS and EBS, including their compatibility, scalability, pricing, and other features. 111 | -------------------------------------------------------------------------------- /020-iam.md: -------------------------------------------------------------------------------- 1 | # IAM 2 | 3 | **# AWS Organizations: Centralized Account Management** 4 | 5 | AWS Organizations offers centralized management for AWS accounts, providing various benefits and controls for organizations. Here's an overview: 6 | 7 | - **Account Structure:** 8 | - The main organization account serves as the management account, while others are member accounts. 9 | - Billing is consolidated across all accounts. 10 | 11 | - **Cost Optimization:** 12 | - Aggregated usage enables pricing benefits like volume discounts for services such as EC2 and S3. 13 | 14 | - **Automation and Organization:** 15 | - API allows for automated creation of accounts. 16 | - Accounts can be organized by business units, environments, or projects. 17 | 18 | - **Security Enhancements:** 19 | - Each account has its own Virtual Private Cloud (VPC) for better isolation. 20 | - All actions are logged in CloudTrail for auditing purposes. 21 | 22 | - **Service Control Policies (SCPs):** 23 | - SCPs function similarly to IAM policies but at the organizational level. 24 | - SCPs are attached at the root level, allowing you to define what actions are allowed or denied across member accounts. 25 | - Blocklist and allowlist can be implemented to control access to specific services. 26 | 27 | - **Policy Enforcement:** 28 | - SCPs enable fine-grained control over services accessed within member accounts, ensuring better security posture. 
29 | 30 | - **Backup and Tag Policies:** 31 | - Backup and tag policies can be applied at the member account level for consistent management. 32 | 33 | - **Organizational Units (OUs):** 34 | - Accounts are organized into OUs, allowing for the application of SCPs at different levels. 35 | - SCP inheritance ensures that restrictions apply hierarchically, even if an account is managed by a different team. 36 | 37 | AWS Organizations offers a comprehensive suite of features for managing multiple AWS accounts, enabling organizations to enforce security policies, optimize costs, and streamline management tasks effectively. 38 | 39 | **# IAM Conditions: Fine-Grained Access Control** 40 | 41 | IAM Conditions allow for fine-grained access control within AWS Identity and Access Management (IAM) policies. Here's a breakdown of commonly used conditions: 42 | 43 | - **aws:SourceIp:** Restricts API calls based on the client's IP address. 44 | - **aws:RequestRegion:** Limits access to specific AWS regions. 45 | - **ec2:ResourceTag:** Restricts access based on resource tags, such as those applied to EC2 instances. 46 | - **aws:MultiFactorAuthPresent:** Requires multi-factor authentication (MFA) for access. 47 | - **S3 Bucket Policies:** Conditions can be applied within S3 bucket policies (bucketname) and object policies (/*) to control access at both the bucket and object levels. 48 | 49 | IAM Conditions provide granular control over access permissions, allowing organizations to enforce security policies based on various factors such as IP address, region, resource tags, and MFA status. This enhances security by ensuring that access is granted only under specified conditions. 50 | 51 | 52 | **# IAM Resource-based vs. IAM Role-based Access** 53 | 54 | IAM offers both resource-based and role-based access control mechanisms, each with its own use cases and considerations: 55 | 56 | - **Cross-Account Access:** 57 | - **Resource-based Policy:** Attaching a policy directly to a resource (e.g., S3 bucket policy). 58 | - **IAM Role:** Using a role as a proxy to access resources in another account. 59 | 60 | - **Role Assumption:** 61 | - When a user, application, or service assumes a role, they adopt the permissions assigned to that role, relinquishing their original permissions. 62 | - With resource-based policies, the principal retains their original permissions. 63 | 64 | - **Example Scenario:** 65 | - A user in Account A needs to scan a DynamoDB table in Account A and dump it into an S3 bucket in Account B. 66 | - Supported services for cross-account access include S3, SNS, and SQS. 67 | 68 | **EventBridge:** 69 | - When an EventBridge rule runs, it requires permissions on the target. 70 | - **Resource-based Policy:** Lambda, SNS, SQS, S3 buckets, API Gateway. 71 | - **IAM Role:** Kinesis stream, Systems Manager Run Command, ECS tasks. 72 | 73 | **IAM Permissions Boundaries:** 74 | - Supported for users and roles (not groups). 75 | - Advanced feature utilizing a managed policy to set the maximum permissions an IAM entity can have. 76 | - When a permissions boundary is set, additional permissions cannot be attached, ensuring a more restricted access scope. 77 | - Can be used in conjunction with AWS Organizations SCPs. 78 | - Use cases include allowing developers to manage their permissions while preventing privilege escalation. 79 | 80 | IAM resource-based and role-based access controls offer flexibility and security in managing access to AWS resources, catering to various use cases and security requirements. 
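A minimal sketch of the cross-account pattern above, using a resource-based S3 bucket policy so that the principal in Account A keeps its DynamoDB permissions while writing into Account B's bucket. The account ID, role name, and bucket name are hypothetical placeholders.

```python
import json
import boto3

BUCKET = "analytics-dump-account-b"  # hypothetical bucket in Account B

# Resource-based policy attached to the bucket in Account B,
# granting a principal from Account A permission to write objects.
bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccountAWriter",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:role/table-export"},  # placeholder
            "Action": ["s3:PutObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        }
    ],
}

# Applied by an administrator in Account B.
s3 = boto3.client("s3")
s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(bucket_policy))
```

Had the same access been granted through an IAM role in Account B instead, the user would have to assume that role and temporarily give up its Account A permissions, which is exactly the distinction the section draws.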
81 | 82 | # IAM Policy Evaluation Logic 83 | 84 | - **Deny Evaluation**: Deny permissions take precedence over allow permissions in IAM policy evaluation. 85 | 86 | - **Organization SCP**: Service Control Policies (SCPs) at the organization level can further restrict permissions across member accounts. 87 | 88 | - **Resource-based Policies**: These policies are attached directly to the AWS resource and define who can access the resource and what actions they can perform. 89 | 90 | - **Identity-based Policies**: IAM policies attached to IAM identities (users, groups, roles) which define their permissions. 91 | 92 | - **IAM Permissions Boundaries**: A feature that sets the maximum permissions an IAM entity can have. Policies cannot grant more permissions than the boundaries set. 93 | 94 | - **Final Decision: Allow**: If there is no explicit deny or explicit allow, access to the resource or action is denied by default. 95 | 96 | - **Session Policies**: Policies that are applied temporarily during a session. They are used to grant temporary permissions for a specific operation. 97 | 98 | **Highlights**: 99 | - Understanding the precedence of deny over allow is crucial for effective permission management. 100 | - Organization SCPs provide centralized control over permissions across multiple accounts. 101 | - Resource-based policies offer granular control over access to specific AWS resources. 102 | - Identity-based policies define permissions for IAM entities like users, groups, and roles. 103 | - IAM permissions boundaries prevent policies from granting excessive permissions. 104 | - In the absence of explicit allow or deny, access is denied by default. 105 | - Session policies can be used to grant temporary permissions for specific operations. 106 | 107 | # IAM Identity Center (Single Sign-On) 108 | 109 | - **One Login for All AWS Accounts**: Centralized authentication system allowing users to access multiple AWS accounts with a single set of credentials. 110 | 111 | - **Integration with Business Apps**: Seamless integration with third-party business applications such as Salesforce, Box, and Microsoft. 112 | 113 | - **SAML 2.0-enabled**: Support for Security Assertion Markup Language (SAML) 2.0 for secure authentication and authorization. 114 | 115 | - **EC2 Windows Instances**: Ability to authenticate and authorize access to Windows instances running on EC2. 116 | 117 | **Fine-grained Permissions and Assignments**: 118 | - **Multi-account Permissions**: 119 | - Manage access across AWS accounts within your AWS organization. 120 | - Permissions Sets: Collections of IAM policies assigned to users and groups to define AWS access. 121 | 122 | - **App Assignments**: 123 | - SSO access to SAML 2.0-enabled business apps like Salesforce, Box, and Microsoft. 124 | - Provision required URLs, certificates, and metadata for seamless integration. 125 | 126 | - **Attribution-based Access Control (ABAC)**: 127 | - Fine-grained permissions based on user attributes stored in IAM. 128 | - Attributes like cost center enable precise control over access. 129 | - Use Cases: Define permissions once, then modify AWS access by changing user attributes. 130 | 131 | **Highlights**: 132 | - IAM Identity Center simplifies access management by providing single sign-on across AWS accounts. 133 | - Seamless integration with popular business applications enhances user experience. 134 | - SAML 2.0 support ensures secure authentication and authorization processes. 
135 | - Fine-grained permissions enable precise control over access to resources and applications. 136 | - Attribution-based access control allows for dynamic adjustment of permissions based on user attributes like cost center. 137 | 138 | # AWS Directory Services 139 | 140 | - **Introduction:** 141 | - AWS Directory Services is a suite of services that enables you to integrate AWS resources with your existing on-premises Microsoft Active Directory or to set up and operate a new directory in the AWS Cloud. 142 | 143 | - **AWS Managed Microsoft AD:** 144 | - Provides a fully managed Active Directory service, allowing you to create your own AD in AWS. 145 | - Enables you to manage users locally, supporting multi-factor authentication (MFA). 146 | - Useful for scenarios where you need an AD in the cloud without the overhead of managing the infrastructure. 147 | 148 | - **AD Connector:** 149 | - Acts as a directory gateway proxy, allowing you to redirect directory requests from AWS resources to your on-premises AD. 150 | - Supports multi-factor authentication (MFA), enhancing security for directory access. 151 | - Ideal for hybrid environments where you want to leverage your existing on-premises AD infrastructure alongside AWS services. 152 | 153 | - **Simple AD:** 154 | - Offers an AD-compatible managed directory in AWS. 155 | - Provides basic AD functionality, allowing you to join EC2 instances to a domain, authenticate users, and manage group policies. 156 | - Suitable for scenarios where you require simple AD functionality in the AWS Cloud. 157 | 158 | - **Compatibility:** 159 | - Compatible with any Windows Server with Active Directory Domain Services (AD DS), providing seamless integration with existing Microsoft environments. 160 | 161 | - **Flexibility and Scalability:** 162 | - Allows for the flexibility to choose the appropriate directory service based on your specific requirements, whether it's a fully managed AD in AWS or integration with your on-premises infrastructure. 163 | - Scales seamlessly to accommodate growing organizational needs, ensuring that directory services remain responsive and reliable. 164 | 165 | - **Security:** 166 | - Supports multi-factor authentication (MFA) across various services, enhancing security posture and protecting against unauthorized access. 167 | - Enables you to implement fine-grained access controls, ensuring that only authorized users have access to directory resources. 168 | 169 | - **Cost-Effectiveness:** 170 | - Offers a pay-as-you-go pricing model, allowing you to pay only for the resources you consume without any upfront investments in hardware or infrastructure. 171 | - Helps reduce operational overhead by offloading the management of directory services to AWS, freeing up resources to focus on core business activities. 172 | 173 | - **Integration:** 174 | - Seamlessly integrates with other AWS services, such as Amazon EC2, Amazon RDS, and AWS Single Sign-On (SSO), providing a unified authentication and authorization experience across the AWS Cloud. 175 | 176 | - **Ease of Management:** 177 | - Provides a centralized management console for configuring and managing directory services, simplifying administrative tasks and reducing the complexity of managing distributed environments. 178 | 179 | - **Reliability and Availability:** 180 | - Offers high availability and durability, leveraging AWS's global infrastructure to ensure that directory services remain accessible and resilient to failures. 
181 | 182 | - **Compliance:** 183 | - Helps organizations meet regulatory compliance requirements, such as GDPR, HIPAA, and SOC, by providing built-in security features and audit logs for monitoring and reporting purposes. 184 | 185 | - **Continuous Innovation:** 186 | - Benefits from AWS's continuous innovation and updates, ensuring that directory services remain up-to-date with the latest security patches and feature enhancements. 187 | 188 | AWS Directory Services provides a comprehensive solution for managing directory services in the AWS Cloud, offering flexibility, scalability, security, and cost-effectiveness for organizations of all sizes. Whether you need to extend your existing on-premises AD infrastructure to the cloud or set up a new directory in AWS, AWS Directory Services has you covered. 189 | 190 | # AWS Control Tower 191 | 192 | - **Introduction:** 193 | - AWS Control Tower provides an easy way to set up and govern a secure and compliant multi-account AWS environment based on best practices. 194 | 195 | - **AWS Organizations Integration:** 196 | - Utilizes AWS Organizations to create and manage accounts within your AWS environment, enabling centralized governance and management. 197 | 198 | - **Benefits:** 199 | - **Automated Setup:** 200 | - Allows for the automated setup of your environment with just a few clicks, streamlining the deployment process. 201 | - **Policy Management:** 202 | - Automates ongoing policy management using guardrails, ensuring compliance with organizational standards and best practices. 203 | - **Policy Violation Detection and Remediation:** 204 | - Detects policy violations and automatically remediates them, reducing the risk of non-compliance and security breaches. 205 | - **Compliance Monitoring:** 206 | - Provides an interactive dashboard to monitor compliance across your environment, offering insights into the state of your infrastructure. 207 | 208 | - **Guardrails:** 209 | - Provide ongoing governance for your Control Tower environment, enforcing policies and best practices. 210 | - **Preventive Guardrails:** 211 | - Utilize Service Control Policies (SCPs) to enforce preventive measures, such as restricting regions across all accounts, minimizing potential security risks. 212 | - **Detective Guardrails:** 213 | - Leverage AWS Config to implement detective guardrails, identifying issues like untagged resources and helping maintain visibility and control over your environment. 214 | 215 | AWS Control Tower simplifies the process of setting up and managing a secure and compliant multi-account AWS environment by automating key tasks, providing governance through guardrails, and offering comprehensive monitoring capabilities. By integrating with AWS Organizations and leveraging automation, Control Tower enables organizations to maintain a robust and well-governed infrastructure in alignment with industry best practices. 216 | -------------------------------------------------------------------------------- /011-storage-extras.md: -------------------------------------------------------------------------------- 1 | # AWS Storage Extras 2 | 3 | # AWS Snow Family 4 | 5 | The AWS Snow Family offers a suite of devices designed for securely transferring large amounts of data to and from AWS. Here's an overview of its offerings: 6 | 7 | ## AWS Snowball Edge 8 | 9 | Snowball Edge is a rugged, portable device designed for moving terabytes or petabytes of data in and out of AWS. 
It comes in different variants: 10 | 11 | - **Storage Optimized:** Offers up to 80TB of HDD storage capacity. 12 | - **Compute Optimized:** Provides up to 42TB of HDD storage capacity along with compute capabilities. 13 | - **Use Cases:** Commonly used for large-scale data cloud migration, edge computing, and storage in remote locations. 14 | 15 | ## AWS Snowcone 16 | 17 | Snowcone is a smaller and lighter version of the Snow Family, designed for edge computing, storage, and data transfer in challenging environments. Key features include: 18 | 19 | - **Compact Design:** Weighs only 4.5 pounds (2 kilograms), making it highly portable. 20 | - **Storage Capacity:** Offers up to 8TB of HDD storage or 14TB of SSD storage. 21 | - **Versatility:** Suitable for use cases where Snowball devices may not fit, and users need to provide their own power source and cables. 22 | - **Data Transfer:** Data can be sent back to AWS offline or connected to the internet to use AWS DataSync for data transfer. 23 | 24 | ## AWS Snowmobile 25 | 26 | Snowmobile is a massive, ruggedized shipping container capable of securely transferring exabytes of data to AWS. Key features include: 27 | 28 | - **Enormous Capacity:** Offers up to 100PB of storage capacity, making it suitable for large-scale data migration projects. 29 | - **Efficient Transfer:** Designed to handle data transfers at a scale that exceeds traditional network-based methods, especially useful for transferring petabytes of data. 30 | - **Use Cases:** Ideal for organizations with extremely large datasets that would take impractical amounts of time to transfer over the internet. 31 | 32 | In summary, the AWS Snow Family provides a range of solutions for securely and efficiently transferring large amounts of data to and from AWS, catering to diverse use cases and requirements. Whether it's moving data from remote locations, edge computing, or massive data migration projects, there's a Snow device suited to meet your needs. 33 | 34 | # Edge Computing 35 | 36 | Edge computing refers to the practice of processing data closer to the source of generation, typically at or near the edge of the network, rather than relying solely on centralized cloud services. Here's an overview of edge computing with a focus on AWS Snow Family devices: 37 | 38 | ## Edge Locations 39 | 40 | - **Remote and Disconnected:** Edge locations are often situated far from cloud data centers and may lack reliable internet connectivity. 41 | - **Processing Needs:** Despite limited connectivity, there's a demand for processing capabilities at these edge locations to analyze and act on data in near real-time. 42 | 43 | ## AWS Snow Family Devices 44 | 45 | ### Snowcone and Snowcone SSD 46 | 47 | - **Compact and Portable:** Snowcone devices are lightweight and portable, making them ideal for deployment in remote or challenging environments. 48 | - **Storage Options:** Available with HDD or SSD storage configurations, providing flexibility based on storage requirements. 49 | - **Use Cases:** Suitable for edge computing scenarios where internet access may be limited or unreliable. 50 | 51 | ### Snowball Edge 52 | 53 | - **Compute and Storage Optimized:** Snowball Edge devices come in two variants: 54 | - **Compute Optimized:** Designed for compute-intensive workloads, providing processing power along with storage capabilities. 55 | - **Storage Optimized:** Emphasizes storage capacity, making it suitable for data-intensive applications. 
56 | 57 | ### Edge Computing Capabilities 58 | 59 | - **EC2 Instances and Lambda Functions:** All Snow Family devices can run EC2 instances and AWS Lambda functions, enabling you to execute code at the edge. 60 | - **AWS Greengrass IoT:** Integration with AWS Greengrass extends edge computing capabilities, allowing you to deploy and manage applications that seamlessly interact with local resources and cloud services. 61 | 62 | ## Long-Term Deployment Options 63 | 64 | - **Extended Duration:** Snow Family devices offer long-term deployment options ranging from one to three years, ensuring continuous operation in remote locations without frequent maintenance or replacement. 65 | 66 | ## Benefits of Edge Computing 67 | 68 | - **Remote Processing:** Edge computing enables organizations to perform data processing and analysis at the edge, reducing latency and ensuring timely decision-making even in disconnected environments. 69 | - **Resilience:** By decentralizing processing capabilities, edge computing enhances resilience by reducing dependence on centralized infrastructure and mitigating the impact of network outages or latency issues. 70 | 71 | In summary, edge computing with AWS Snow Family devices empowers organizations to extend their computing capabilities to remote and disconnected locations, enabling efficient data processing and analysis at the edge of the network. 72 | 73 | # AWS OpsHub 74 | 75 | AWS OpsHub is a software tool designed to streamline the management of AWS Snow Family devices. Here's an overview of its features: 76 | 77 | - **Simplified Management:** Instead of relying solely on the command-line interface (CLI), users can utilize OpsHub for a more user-friendly management experience. 78 | - **Installation:** OpsHub is installed on your computer or laptop, providing a convenient interface for managing Snow Family devices. 79 | - **Data Management:** OpsHub facilitates importing data into Amazon S3 or exporting data from Amazon S3, simplifying the transfer of large datasets to and from Snow Family devices. 80 | 81 | # Snowball into Glacier 82 | 83 | When transferring data from Snow Family devices to Amazon Glacier, it's important to note that you cannot directly import data into Glacier. Instead, you can follow these steps: 84 | 85 | 1. **Import into S3:** First, import the data from the Snowball device into Amazon S3, where it will be stored temporarily. 86 | 87 | 2. **Lifecycle Policy:** Once the data is in S3, create a lifecycle policy that specifies the conditions under which objects should be transitioned to Glacier storage. 88 | 89 | 3. **Transition to Glacier:** The lifecycle policy will automatically move the objects from S3 to Glacier storage based on the specified criteria, such as time-based rules or object age. 90 | 91 | By leveraging AWS OpsHub for Snow Family device management and implementing a lifecycle policy in Amazon S3, you can efficiently transfer data from Snowball devices to Glacier storage while automating the process of transitioning objects to long-term archival storage. 92 | 93 | # Amazon FSx 94 | 95 | Amazon FSx provides fully managed third-party file systems on AWS, offering high performance and scalability. Here's an overview of its offerings: 96 | 97 | ## FSx for Lustre 98 | 99 | - **Lustre File System:** Lustre is a high-performance parallel file system used in large-scale computing environments. 100 | - **Integration with S3:** FSx for Lustre allows read and write access to Amazon S3, enabling seamless integration with cloud storage. 
101 | - **Scratch File System:** Ideal for temporary storage with high burst performance, suitable for short-term processing tasks and cost optimization. 102 | - **Persistent File System:** Offers long-term storage with data replication across multiple Availability Zones (AZs), suitable for long-term processing and sensitive data. 103 | 104 | ## FSx for Windows File Server 105 | 106 | - **Windows File System:** FSx for Windows File Server provides fully managed Windows file shares, supporting the SMB protocol and Windows NTFS. 107 | - **Active Directory Integration:** Supports integration with Active Directory for user authentication and access control. 108 | - **Cross-Platform Mounting:** File shares can be mounted on Linux EC2 instances, providing flexibility in mixed-platform environments. 109 | - **Multi-AZ Deployment:** File systems can be deployed across multiple AZs for high availability. 110 | - **Backup to S3:** Offers backup capabilities to Amazon S3 for data protection and disaster recovery. 111 | 112 | ## FSx for NetApp ONTAP 113 | 114 | - **NAS Workloads Migration:** FSx for NetApp ONTAP allows seamless migration of workloads that rely on NAS to AWS. 115 | - **Cross-Platform Compatibility:** Supports Linux, Windows, macOS, and VMware environments. 116 | - **Autoscaling Storage:** Storage capacity automatically scales based on demand, ensuring efficient resource utilization. 117 | - **Snapshots and Replication:** Provides snapshotting and replication features for data protection and disaster recovery. 118 | - **Data Deduplication:** Offers data deduplication capabilities to optimize storage utilization. 119 | - **Point-in-Time Cloning:** Enables instantaneous point-in-time cloning of file systems for efficient data management. 120 | 121 | ## FSx for OpenZFS 122 | 123 | - **NFS Protocol Compatibility:** FSx for OpenZFS is compatible only with the NFS protocol. 124 | - **OpenZFS Support:** Designed to run on ZFS, an advanced file system known for its reliability and data integrity features. 125 | - **Cross-Platform Compatibility:** Works with various operating systems such as Linux, Windows, and macOS. 126 | - **Point-in-Time Cloning:** Supports point-in-time cloning for efficient data management and replication. 127 | 128 | In summary, Amazon FSx offers a range of fully managed file system solutions tailored for different use cases, providing high performance, scalability, and compatibility with various operating systems and protocols. Whether you need Lustre for high-performance computing, Windows File Server for SMB shares, NetApp ONTAP for NAS workloads, or OpenZFS for NFS compatibility, FSx has you covered with both scratch and persistent file system options. 129 | 130 | # AWS Storage Gateway 131 | 132 | AWS Storage Gateway provides a hybrid cloud storage solution, bridging the gap between on-premises environments and the AWS cloud. Here's an overview of its offerings: 133 | 134 | ## Hybrid Cloud Adoption 135 | 136 | - **Hybrid Cloud Strategy:** AWS promotes a hybrid cloud approach where organizations maintain part of their data and applications on-premises while leveraging cloud services for scalability and flexibility. 137 | - **Bridge Between On-Premises and Cloud:** Storage Gateway serves as a bridge, allowing seamless integration and data transfer between on-premises infrastructure and AWS cloud storage services. 
138 | 139 | ## S3 File Gateway 140 | 141 | - **Exposing S3 Data On-Premises:** S3 File Gateway enables organizations to expose Amazon S3 objects to their on-premises environments using standard network protocols like NFS or SMB. 142 | - **Data Caching:** Data is cached locally within the file gateway, ensuring low-latency access to frequently used data. 143 | - **Active Directory Integration:** Integration with Active Directory enables seamless authentication and access control. 144 | 145 | ## FSx File Gateway 146 | 147 | - **Native Access to FSx for Windows File Server:** FSx File Gateway provides native access to Amazon FSx for Windows File Server, allowing on-premises applications to interact with FSx file shares. 148 | - **Local Cache:** Frequently accessed data is cached locally within the file gateway for improved performance. 149 | 150 | ## Volume Gateway 151 | 152 | - **Block Storage with iSCSI Protocol:** Volume Gateway offers block storage using the iSCSI protocol, with data stored in Amazon S3. 153 | - **Backed by EBS Snapshots:** Data is backed by Amazon EBS snapshots, facilitating data restoration and recovery on-premises. 154 | 155 | ## Tape Gateway 156 | 157 | - **Physical Tape Integration:** Tape Gateway enables organizations to archive data using physical tapes, with data backed up to Amazon S3 and Glacier. 158 | - **Backup for Existing Tape Data:** Provides a seamless backup solution for existing tape data, ensuring data durability and cost-effectiveness. 159 | 160 | ## Deployment Options 161 | 162 | - **Software Installation:** Storage Gateway software can be installed on corporate data centers, providing a software-defined solution for hybrid cloud storage. 163 | - **Hardware Appliance:** Alternatively, organizations can opt for a hardware appliance for physical installation, simplifying deployment and management. 164 | 165 | In summary, AWS Storage Gateway offers a range of solutions for integrating on-premises environments with AWS cloud storage services, facilitating hybrid cloud adoption and enabling seamless data transfer and management between on-premises infrastructure and the cloud. Whether you need to expose S3 data on-premises, integrate with FSx for Windows File Server, leverage block storage with Volume Gateway, or archive data with Tape Gateway, Storage Gateway provides versatile options to meet your hybrid cloud storage needs. 166 | 167 | # AWS Transfer Family 168 | 169 | AWS Transfer Family is a fully managed service that facilitates file transfers into and out of Amazon S3 or Amazon EFS using standard file transfer protocols like FTP, FTPS, and SFTP. Here's an overview: 170 | 171 | - **Managed File Transfers:** AWS Transfer Family simplifies the process of transferring files to and from cloud storage, offering support for FTP, FTPS, and SFTP protocols. 172 | - **High Availability:** The service is highly available and operates across multiple Availability Zones (AZs) to ensure reliability and redundancy. 173 | - **Pay-Per-Use Pricing:** Users are charged based on provisioned endpoints and data transfer volume, with pricing determined per hour and per gigabyte (GB) of data transferred. 174 | - **Integration with S3 and EFS:** AWS Transfer Family securely transfers files to Amazon S3 or Amazon EFS, providing a seamless workflow for storing and accessing data in the cloud. 
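A short sketch of provisioning a Transfer Family SFTP endpoint backed by S3 with boto3. The role ARN, bucket, and user name are illustrative placeholders, and SSH key management is omitted for brevity.

```python
import boto3

transfer = boto3.client("transfer")

# Create a managed SFTP endpoint with service-managed identities.
server = transfer.create_server(
    Protocols=["SFTP"],
    IdentityProviderType="SERVICE_MANAGED",
    EndpointType="PUBLIC",
)

# Map a user to an S3 prefix; the IAM role must allow access to the bucket.
transfer.create_user(
    ServerId=server["ServerId"],
    UserName="partner-upload",                           # placeholder
    Role="arn:aws:iam::111111111111:role/transfer-s3",   # placeholder
    HomeDirectory="/my-landing-bucket/partner-upload",   # placeholder bucket/prefix
)
```

Billing then follows the pay-per-use model described above: per hour the endpoint is provisioned plus per GB of data transferred.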
175 | 176 | # AWS DataSync 177 | 178 | AWS DataSync is a data transfer service designed to efficiently move large amounts of data to and from AWS, whether it's between on-premises systems, other cloud providers, or different AWS storage services. Here are its key features: 179 | 180 | - **Data Movement Capabilities:** DataSync facilitates data movement between various sources, including on-premises environments, other cloud providers, and AWS storage services. 181 | - **Agent-Based Transfer:** For on-premises data transfers, DataSync requires the installation of an agent, which securely transfers data to AWS. 182 | - **Supported Destinations:** DataSync supports synchronization with Amazon S3 (including all storage classes, including Glacier), Amazon EFS, and Amazon FSx (including Windows File Server, Lustre, NetApp, and OpenZFS). 183 | - **Scheduled Replication:** Replication tasks can be scheduled to run hourly, daily, or weekly, providing flexibility in data synchronization. 184 | - **Preservation of Metadata:** File permissions and metadata are preserved during the data transfer process, ensuring data integrity and consistency. 185 | - **High-Speed Transfers:** Each DataSync agent task can utilize up to 10 gigabits per second (Gbps) of network bandwidth, enabling fast and efficient data transfers. 186 | - **Workflow Overview:** On-premises data is transferred to AWS using the DataSync agent, which then synchronizes the data with the specified AWS storage service, such as S3, EFS, or FSx. 187 | - **Cron Job Support:** DataSync operates on a schedule-based model with cron jobs, allowing users to define when data replication tasks should occur. 188 | 189 | In summary, AWS Transfer Family and AWS DataSync provide comprehensive solutions for securely transferring and synchronizing data to and from AWS, offering support for various protocols, destinations, and scheduling options to meet diverse data transfer requirements. 190 | -------------------------------------------------------------------------------- /013-containers.md: -------------------------------------------------------------------------------- 1 | # Containers 2 | 3 | # Containers Introduction 4 | 5 | **Container Section** 6 | 7 | - **Use Case**: 8 | - Microservice architecture is a common use case. 9 | 10 | - **Running Docker Images**: 11 | - Docker agents need to be running. 12 | - Multiple Docker containers of the same application can run simultaneously. 13 | 14 | - **Image Storage**: 15 | - Images can be stored in various repositories like DockerHub, Amazon ECR, or private repositories. 16 | - Both public and private repositories are available. 17 | 18 | - **Differences from VM**: 19 | - Docker operates differently from traditional virtual machines (VMs). 20 | - Docker utilizes a Docker daemon, whereas VMs use a hypervisor. 21 | 22 | - **AWS Services**: 23 | - Amazon ECS (Elastic Container Service) is an AWS service for managing containerized applications. 24 | - Amazon EKS (Elastic Kubernetes Service) is a managed Kubernetes service. 25 | - AWS Fargate is a serverless compute engine for containers. 26 | - Amazon ECR (Elastic Container Registry) is a fully managed Docker container registry. 27 | 28 | In summary, containers are widely used for microservice architecture, with Docker being a popular choice. AWS offers several services for container management, including ECS, EKS, Fargate, and ECR, providing flexibility and scalability for containerized applications. 
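To make the image-storage point above concrete, here is a hedged boto3 sketch that creates a private Amazon ECR repository and retrieves the temporary credentials Docker needs to push an image to it. The repository name is a placeholder, and the `docker` CLI steps are shown only as comments.

```python
import base64
import boto3

ecr = boto3.client("ecr")

# Create a private repository for the application image.
repo = ecr.create_repository(
    repositoryName="my-app",  # placeholder name
    imageScanningConfiguration={"scanOnPush": True},
)
print("Repository URI:", repo["repository"]["repositoryUri"])

# Retrieve a temporary authorization token for `docker login`.
auth = ecr.get_authorization_token()["authorizationData"][0]
user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")

# With these credentials the image can be pushed, e.g.:
#   docker login -u AWS -p <password> <registry endpoint>
#   docker tag my-app:latest <repositoryUri>:latest
#   docker push <repositoryUri>:latest
```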
29 | 30 | Here's the breakdown of Amazon ECS: 31 | 32 | **Amazon ECS (Elastic Container Service)** 33 | 34 | - **Launch Types**: 35 | - **EC2 Launch Type**: 36 | - Requires provisioning and maintaining infrastructure (EC2 instances). 37 | - Cluster consists of multiple EC2 instances, each running an ECS agent. 38 | - ECS tasks start and stop Docker containers on these instances. 39 | - **Fargate Launch Type**: 40 | - Serverless approach; no need to provision infrastructure. 41 | - Task definitions are created, and AWS runs the tasks based on CPU/RAM requirements. 42 | - Scaling is achieved by increasing the number of tasks. 43 | 44 | - **IAM Roles for ECS**: 45 | - **EC2 Instance Profile (EC2 Launch Type)**: 46 | - Used by the ECS agent running on EC2 instances. 47 | - Allows ECS agent to make API calls to ECS service, send container logs to CloudWatch, and pull Docker images from ECR. 48 | - Can reference sensitive data in Secrets Manager or SSM Parameter Store. 49 | - **ECS Task Role**: 50 | - Allows each ECS task to have a specific role. 51 | - Different roles can be used for different ECS services. 52 | - Task role permissions are defined in task definitions. 53 | 54 | - **Load Balancer Integrations**: 55 | - **ALB (Application Load Balancer)**: 56 | - Supported and works for most use cases. 57 | - **NLB (Network Load Balancer)**: 58 | - Recommended for high throughput/high-performance use cases or when paired with AWS PrivateLink. 59 | - **Classic Load Balancer**: 60 | - Not supported. 61 | 62 | - **Data Volumes (EFS)**: 63 | - EFS can be mounted onto ECS tasks. 64 | - Works for both EC2 and Fargate launch types. 65 | - Tasks running in any Availability Zone share the same data in EFS. 66 | - Fargate + EFS combination provides serverless shared storage. 67 | - Use cases include persistent multi-AZ shared storage for containers. 68 | - Note: S3 cannot be mounted as a file system. 69 | 70 | Amazon ECS offers flexibility in managing containers, supporting both EC2 and Fargate launch types, with various integrations for load balancing and data volumes. Additionally, IAM roles provide fine-grained access control for ECS tasks. 71 | 72 | Here's the breakdown of ECS Cluster and ECS Service: 73 | 74 | **ECS Cluster:** 75 | - Supports both Fargate and EC2 (Auto Scaling Groups). 76 | - For EC2, desired capacity for Auto Scaling Groups can be set to 1, ensuring a single instance is always running and registered in the cluster. 77 | 78 | **ECS Service:** 79 | - Before creating a service, a task definition needs to be created. 80 | - Task definitions can choose between Fargate and EC2 instance modes, allowing tasks to be started on Fargate or EC2 instances. 81 | - An IAM role needs to be assigned to a task if API calls to other AWS services are required. 82 | - Container settings include port mapping, environment variables, etc. 83 | - Launching a task definition as a service involves specifying its desired count, placement constraints, and task placement strategies. 84 | 85 | Sure, let's break down the key components in Amazon ECS (Elastic Container Service) and their differences: 86 | 87 | 1. **Task**: 88 | - A task is the smallest unit of work in ECS. 89 | - It represents a set of containerized applications that should be run together. 90 | - A task definition specifies which Docker images to use, how many containers are in the task, and how they interact. 91 | - Tasks can be launched as part of a service or manually through the ECS console or API. 
92 | - A task can consist of one or more containers, which are treated as a single logical unit. 93 | 94 | 2. **Service**: 95 | - A service in ECS manages and maintains a specified number of instances of a task definition. 96 | - It ensures that the desired number of tasks (instances) are running and restarts them if they fail or stop. 97 | - Services allow for load balancing and scaling of tasks across multiple EC2 instances or Fargate containers. 98 | - They provide a way to scale containers horizontally and distribute traffic across them. 99 | - Services can be configured to use a variety of load balancers for distributing traffic. 100 | 101 | 3. **Cluster**: 102 | - An ECS cluster is a logical grouping of container instances or Fargate tasks. 103 | - It acts as the foundation for ECS, providing the infrastructure where tasks are scheduled and run. 104 | - Clusters can contain EC2 instances (which run the ECS agent) or Fargate tasks. 105 | - Multiple services can be deployed within a cluster, each with its own set of tasks. 106 | 107 | 4. **Container Instance**: 108 | - A container instance is an EC2 instance (in EC2 launch type) or an isolated compute environment (in Fargate launch type) that runs containers. 109 | - It's part of an ECS cluster and can run multiple tasks concurrently. 110 | - Container instances must have the ECS agent running to register with the ECS cluster and receive task definitions. 111 | 112 | 5. **Task Definition**: 113 | - A task definition is a blueprint for a task. 114 | - It defines which Docker images to use, how many containers are in the task, and how they interact. 115 | - Task definitions also specify resource requirements, container definitions (like CPU, memory, networking), logging configuration, etc. 116 | - Task definitions are versioned, allowing multiple revisions to be stored and used. 117 | 118 | In summary, tasks represent individual workloads, services manage and maintain a specified number of tasks, clusters provide the infrastructure for running tasks, container instances execute tasks, and task definitions define the configuration for tasks. Each component plays a critical role in orchestrating containerized applications within ECS. 119 | 120 | # ECS Auto Scaling 121 | ECS Auto Scaling provides dynamic scaling capabilities to ensure that your ECS tasks can handle varying levels of workload demand efficiently. Here's a breakdown of ECS Auto Scaling and related concepts: 122 | 123 | 1. **Auto Scaling Policies**: 124 | - Automatically adjusts the number of ECS tasks based on specified criteria like CPU utilization, memory usage, or custom CloudWatch metrics. 125 | - Scaling policies can be configured using target tracking, step scaling, or scheduled scaling. 126 | 127 | 2. **Target Tracking Scaling**: 128 | - Scales ECS tasks based on a target value for a specific CloudWatch metric, such as CPU utilization or request count. 129 | - Automatically adjusts the number of tasks to maintain the target value. 130 | 131 | 3. **Step Scaling**: 132 | - Scales ECS tasks based on CloudWatch alarms and scaling adjustment steps. 133 | - Allows more granular control over scaling actions by defining specific thresholds and actions. 134 | 135 | 4. **Scheduled Scaling**: 136 | - Allows you to schedule changes to the number of ECS tasks at specific times. 137 | - Useful for predictable changes in workload demand, such as during peak hours or scheduled maintenance windows. 138 | 139 | 5. 
**Fargate Auto Scaling**: 140 | - Simplifies auto-scaling for Fargate tasks by automatically provisioning and scaling the underlying infrastructure based on resource requirements. 141 | - Provides a serverless experience without the need to manage EC2 instances. 142 | 143 | 6. **EC2 Instance Auto Scaling**: 144 | - Scales the EC2 instances within the ECS cluster based on criteria like CPU utilization or custom CloudWatch metrics. 145 | - Managed by Auto Scaling Groups (ASGs), which automatically add or remove EC2 instances to meet the desired capacity. 146 | 147 | 7. **ECS Cluster Capacity Providers**: 148 | - Automatically provisions and scales EC2 instances within ECS clusters to ensure sufficient capacity for running tasks. 149 | - Paired with Auto Scaling Groups to add or remove EC2 instances dynamically based on workload demand. 150 | 151 | 8. **Event-Based Task Invocation**: 152 | - ECS tasks can be invoked based on events from other AWS services, such as S3 uploads or messages in SQS queues. 153 | - EventBridge (formerly CloudWatch Events) can trigger ECS tasks based on predefined rules, allowing for event-driven scaling and task execution. 154 | 155 | 9. **Intercepting Stopped Tasks**: 156 | - EventBridge can be used to intercept events when ECS tasks are stopped. 157 | - This can trigger actions like sending notifications or executing cleanup tasks, providing visibility and control over task lifecycle events. 158 | 159 | By leveraging ECS Auto Scaling and related features, you can ensure that your ECS tasks can dynamically adapt to changing workload conditions, improving efficiency and reliability in your containerized environment. 160 | 161 | # Amazon ECR 162 | Amazon Elastic Container Registry (ECR) is a fully managed Docker container registry service provided by AWS. Here's an overview of its key features and functionalities: 163 | 164 | 1. **Private and Public Registries**: 165 | - ECR supports both private and public container registries. 166 | - Private registries are secured and accessible only to authorized users within your AWS account. 167 | - Public registries, such as gallery.ecr.aws, provide a curated collection of publicly available container images. 168 | 169 | 2. **Integration with IAM**: 170 | - IAM roles are used to control access to ECR resources. 171 | - Policies can be defined to grant or restrict permissions for users and services to push, pull, or manage container images. 172 | 173 | 3. **Secure Storage**: 174 | - Container images are securely stored within ECR repositories. 175 | - Images are durably stored in Amazon S3, ensuring high availability and durability. 176 | 177 | 4. **Image Lifecycle Management**: 178 | - Supports versioning of container images, allowing you to maintain multiple versions of the same image. 179 | - Images can be tagged with labels for organization and identification purposes. 180 | 181 | 5. **Image Scanning**: 182 | - ECR provides built-in image vulnerability scanning capabilities. 183 | - Automatically scans container images for known security vulnerabilities and issues. 184 | 185 | 6. **Container Image Push and Pull**: 186 | - Docker CLI and other container management tools can be used to push container images to ECR repositories. 187 | - Authorized users and services can pull images from ECR repositories to deploy containers in ECS, EKS, or other container orchestration platforms. 188 | 189 | 7. 
**Image Replication**: 190 | - Supports cross-region replication of container images to improve availability and reduce latency for distributed applications. 191 | 192 | 8. **Integration with AWS Services**: 193 | - Seamlessly integrates with AWS services like ECS and EKS, allowing you to deploy containerized applications using images stored in ECR. 194 | - Provides native support for AWS Identity and Access Management (IAM) for fine-grained access control. 195 | 196 | 9. **Private Network Access**: 197 | - Supports VPC endpoints for secure and private access to ECR within your AWS Virtual Private Cloud (VPC). 198 | 199 | Amazon ECR simplifies the process of storing, managing, and deploying container images, providing a secure and reliable registry solution for your containerized applications. 200 | 201 | # EKS 202 | Amazon Elastic Kubernetes Service (EKS) is a managed Kubernetes service offered by AWS. Here's an overview of its key features and functionalities: 203 | 204 | 1. **Kubernetes Management**: 205 | - Provides a fully managed Kubernetes control plane, allowing you to deploy, scale, and manage containerized applications using Kubernetes. 206 | - Offers compatibility with the Kubernetes API, enabling seamless integration with existing Kubernetes tools and workflows. 207 | 208 | 2. **Launch Types**: 209 | - Supports both Fargate and EC2 launch types. 210 | - Fargate launch type eliminates the need to manage underlying EC2 instances, providing a serverless Kubernetes experience. 211 | - EC2 launch type allows you to manage and customize the underlying EC2 instances that host your Kubernetes nodes. 212 | 213 | 3. **Node Types**: 214 | - **Managed Node Groups**: EKS automatically creates and manages EC2 instances (nodes) for you. These nodes are part of Auto Scaling Groups (ASGs) managed by EKS. 215 | - **Self-managed Nodes**: You can create and manage your own EC2 instances and register them with the EKS cluster. These nodes can be provisioned using prebuilt Amazon EKS-optimized Amazon Machine Images (AMIs) and can be part of ASGs managed by you. 216 | 217 | 4. **Networking**: 218 | - Integrates with Amazon VPC to provide network isolation for Kubernetes clusters. 219 | - Supports Kubernetes Network Policies for fine-grained network access control between pods. 220 | 221 | 5. **Integration with AWS Services**: 222 | - Seamlessly integrates with other AWS services such as ECR, IAM, CloudWatch, and CloudFormation. 223 | - Enables you to leverage AWS security features and services for authentication, authorization, monitoring, and logging. 224 | 225 | 6. **Load Balancer Integration**: 226 | - Automatically provisions AWS Elastic Load Balancers (ELBs) or Network Load Balancers (NLBs) to expose Kubernetes services to external traffic. 227 | - Supports integration with AWS Application Load Balancers (ALBs) through Ingress resources. 228 | 229 | 7. **Scaling and High Availability**: 230 | - Provides built-in support for horizontal scaling and high availability of Kubernetes applications. 231 | - Utilizes Auto Scaling Groups (ASGs) to automatically scale EC2 instances based on CPU/memory utilization or other metrics. 232 | 233 | 8. **Attach Data Volumes**: 234 | - Supports attaching persistent storage volumes to Kubernetes pods using StorageClasses and Container Storage Interface (CSI) compliant drivers. 235 | - Integrates with various AWS storage services such as Amazon EBS, Amazon EFS, FSx for Lustre, and FSx for NetApp ONTAP. 
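A rough sketch of creating an EKS cluster with a managed node group, as described in the launch-type and node-type points above. The IAM role ARNs, subnet IDs, and Kubernetes version are placeholders; both calls are asynchronous, and the cluster must be active before the node group is created.

```python
import boto3

eks = boto3.client("eks")

# Managed Kubernetes control plane.
eks.create_cluster(
    name="demo-cluster",
    version="1.29",                                              # placeholder version
    roleArn="arn:aws:iam::111111111111:role/eks-cluster-role",   # placeholder
    resourcesVpcConfig={"subnetIds": ["subnet-aaa", "subnet-bbb"]},
)
eks.get_waiter("cluster_active").wait(name="demo-cluster")

# Managed node group: EKS provisions and manages the EC2 nodes
# (backed by an Auto Scaling Group under the hood).
eks.create_nodegroup(
    clusterName="demo-cluster",
    nodegroupName="demo-nodes",
    nodeRole="arn:aws:iam::111111111111:role/eks-node-role",     # placeholder
    subnets=["subnet-aaa", "subnet-bbb"],
    instanceTypes=["t3.large"],
    scalingConfig={"minSize": 1, "maxSize": 3, "desiredSize": 2},
)
```

For a Fargate-based cluster, a Fargate profile (`create_fargate_profile`) plays the analogous role instead of a managed node group.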
236 | 237 | Amazon EKS simplifies the process of deploying and managing Kubernetes clusters on AWS, providing a scalable, reliable, and secure platform for running containerized applications. 238 | 239 | # AWS App Runner Service 240 | 241 | **AWS App Runner** is a fully managed service designed to simplify the deployment of web applications and APIs at scale. Here are its key features and capabilities: 242 | 243 | 1. **Fully Managed Service**: 244 | - Requires no prior infrastructure experience, making it accessible to developers of all levels. 245 | - Handles the entire deployment process, from provisioning resources to managing scaling and availability. 246 | 247 | 2. **Source Code or Container Image Deployment**: 248 | - Supports deploying applications directly from your source code or container images. 249 | - Offers flexibility in how you package and deploy your applications. 250 | 251 | 3. **Automatic Build and Deployment**: 252 | - Automates the build and deployment process, reducing the need for manual intervention. 253 | - Allows developers to focus on coding while App Runner takes care of the deployment pipeline. 254 | 255 | 4. **Automatic Scaling**: 256 | - Scales resources automatically based on application traffic and load. 257 | - Ensures that your applications are highly available and can handle varying levels of demand. 258 | 259 | 5. **Load Balancer and Encryption**: 260 | - Provides built-in load balancing capabilities to distribute traffic across multiple instances of your application. 261 | - Ensures data security through encryption mechanisms to protect sensitive information in transit and at rest. 262 | 263 | 6. **VPC Access Support**: 264 | - Allows applications deployed on App Runner to securely access resources within your Virtual Private Cloud (VPC). 265 | - Provides network isolation and control over inbound and outbound traffic flow. 266 | 267 | 7. **Integration with AWS Services**: 268 | - Enables seamless integration with other AWS services such as databases, caching solutions, and message queues. 269 | - Allows you to build complex architectures by connecting your application to various AWS resources. 270 | 271 | 8. **Use Cases**: 272 | - Well-suited for deploying various types of web applications, APIs, and microservices. 273 | - Ideal for scenarios requiring rapid production deployments, where simplicity, scalability, and reliability are paramount. 274 | 275 | Overall, AWS App Runner simplifies the process of deploying and managing web applications and APIs, allowing developers to focus on building great software without worrying about infrastructure management. 276 | -------------------------------------------------------------------------------- /012-sqs-sns-kinesis.md: -------------------------------------------------------------------------------- 1 | # Decoupling Applications: SQS, SNS, Kinesis 2 | 3 | # Introduction to Messaging 4 | 5 | In the realm of distributed systems, messaging plays a crucial role in facilitating communication between various components or services. Here's a brief overview: 6 | 7 | - **Need for Data Sharing:** Services often need to share data with each other, either synchronously or asynchronously, to enable effective coordination and collaboration within a system. 8 | 9 | - **Sync vs. Async Communication:** Messaging can be categorized into synchronous communication, where applications communicate directly with each other, and asynchronous communication, where messages are exchanged through intermediary components like queues. 
10 | 11 | - **Decoupling Applications:** Asynchronous messaging, in particular, enables the decoupling of applications by allowing them to communicate indirectly through messages. This decoupling enhances system flexibility, scalability, and resilience. 12 | 13 | # What is SQS (Simple Queue Service) 14 | 15 | Amazon SQS is a fully managed message queuing service provided by AWS. Here are some key features and characteristics: 16 | 17 | - **Producer-Consumer Model:** Producers send messages to SQS queues, and consumers poll these queues to retrieve and process messages. 18 | 19 | - **Decoupling Applications:** SQS is designed to decouple the components of distributed applications, allowing them to operate independently and asynchronously. 20 | 21 | - **Scalability:** SQS offers unlimited throughput and supports an unlimited number of messages in the queue, making it suitable for high-volume and distributed systems. 22 | 23 | - **Message Retention:** Messages in SQS queues can remain in the queue for a configurable duration, with a default minimum retention period of 4 days and a maximum of 14 days. 24 | 25 | - **Low Latency:** SQS provides low-latency message delivery, typically around 10 milliseconds for both message publishing and retrieval. 26 | 27 | - **Message Size:** Each message in SQS can be up to 256 KB in size, accommodating various types of payloads. 28 | 29 | - **Message Delivery Guarantees:** SQS provides at-least-once message delivery, meaning that messages may be delivered more than once but are never lost. However, there's no strict ordering guarantee (best-effort ordering). 30 | 31 | - **Consumer Scalability:** Multiple consumers can simultaneously poll an SQS queue to process messages, allowing for horizontal scalability. 32 | 33 | - **Auto Scaling Integration:** SQS can be integrated with AWS Auto Scaling to dynamically adjust the number of consumers based on queue metrics, such as the approximate number of messages. 34 | 35 | - **Encryption:** SQS ensures encryption of messages in transit (in-flight encryption) and also supports client-side encryption for added security. 36 | 37 | - **Access Policies:** Access to SQS queues is managed through access policies, allowing you to specify who can send messages to and receive messages from a queue. This is in addition to IAM-based authentication and authorization. 38 | 39 | In summary, Amazon SQS provides a reliable, scalable, and fully managed messaging service that enables asynchronous communication between distributed components or services within AWS and beyond. Its features, such as scalability, low latency, and message retention, make it well-suited for building loosely coupled and resilient distributed systems. 40 | 41 | # Message Visibility Timeout 42 | 43 | - **Concept:** When a consumer polls a message from an SQS queue, that message becomes temporarily invisible to other consumers. This invisibility period is known as the message visibility timeout. 44 | 45 | - **Default Timeout:** By default, the message visibility timeout is set to 30 seconds. During this period, the message is expected to be processed and deleted by the consumer. 46 | 47 | - **Handling Message Processing Time:** If a message requires more time for processing than the default timeout allows, you can adjust the message visibility timeout accordingly. 48 | 49 | - **Impact of Timeout Setting:** It's crucial to strike a balance with the timeout setting. 
Setting it too high may cause delays in processing messages, while setting it too low may result in messages being returned to the queue prematurely. 50 | 51 | # SQS Long Polling 52 | 53 | - **Purpose:** SQS Long Polling enhances the efficiency of message retrieval by allowing consumers to wait for messages to arrive if the queue is currently empty. 54 | 55 | - **Configuration:** Consumers can specify a longer polling duration (1 to 20 seconds) when requesting messages from the queue. 56 | 57 | - **Advantages:** Long polling reduces the number of empty responses from the queue, leading to lower costs and improved performance compared to short polling. 58 | 59 | - **Queue-Level Setting:** Long polling can be enabled at the queue level, ensuring consistent behavior across all consumers. 60 | 61 | # FIFO (First-In-First-Out) Queues 62 | 63 | - **Ordering:** FIFO queues preserve the order in which messages are sent and received. Messages are processed in the exact order they are sent into the queue. 64 | 65 | - **Throughput:** FIFO queues have limited throughput compared to standard queues, with a maximum rate of 300 messages per second (without batching) and 3000 messages per second (with batching). 66 | 67 | - **Exactly-Once Processing:** FIFO queues offer exactly-once message processing, ensuring that duplicates are removed and each message is processed only once. 68 | 69 | - **Naming Convention:** FIFO queues are identified by the `.fifo` suffix in their names. 70 | 71 | - **Content-Based Deduplication:** FIFO queues support an option for content-based deduplication, where messages with identical content are automatically filtered to prevent duplicates. 72 | 73 | # SQS + Auto Scaling Group 74 | 75 | - **Scaling Based on Queue Size:** SQS can be integrated with AWS Auto Scaling to dynamically scale the number of consumers (instances) based on the number of messages in the queue. 76 | 77 | - **Metric Monitoring:** The `ApproximateNumberOfMessages` metric from SQS can trigger alarms in CloudWatch, which, in turn, can initiate scaling actions on the Auto Scaling group. 78 | 79 | - **Buffering for Database Writes:** SQS acts as a buffer between application components, such as frontend services and database writes. Requests are first sent to SQS, dequeued by the Auto Scaling group, and then processed and inserted into the database. 80 | 81 | - **Decoupling Applications:** This architecture decouples different tiers of an application, ensuring smoother operation, improved fault tolerance, and easier scalability. 82 | 83 | 84 | # Amazon SNS (Simple Notification Service) 85 | 86 | Amazon SNS is a fully managed messaging service provided by AWS, facilitating the pub/sub (publish/subscribe) messaging pattern. Here's an overview: 87 | 88 | - **Pub/Sub Model:** With SNS, an event producer (publisher) sends messages to a topic, and multiple event consumers (subscribers) can receive and process these messages. 89 | 90 | - **Scalability:** SNS allows for highly scalable and distributed message publication to potentially thousands or millions of subscribers. 91 | 92 | - **Subscription Limits:** Each topic can support up to 12 million subscriptions, and AWS accounts can create up to 100,000 topics. 93 | 94 | - **Supported Protocols:** Subscribers can receive messages via various protocols, including email, SMS, HTTP/HTTPS, SQS, Lambda, Kinesis Firehose, and more. 
95 | 96 | - **Direct Publish for Mobile Apps:** SNS provides direct publish capabilities for mobile applications, enabling the delivery of push notifications to mobile devices via SDKs. 97 | 98 | - **Encryption and Access Policies:** SNS ensures encryption of messages in transit and at rest. Access policies control who can publish messages to topics. 99 | 100 | # SNS + SQS: Fan Out Pattern 101 | 102 | - **Pattern Description:** The Fan Out pattern involves publishing a message to an SNS topic and delivering that message to all subscribed SQS queues. 103 | 104 | - **Decoupled Architecture:** This pattern enables fully decoupled communication between publishers and subscribers, ensuring no data loss and allowing SQS to provide message persistence. 105 | 106 | - **Benefits:** By leveraging SQS for message delivery, the Fan Out pattern enhances fault tolerance and scalability in distributed systems. 107 | 108 | # S3 Events to Multiple Queues 109 | 110 | - **Limitations:** While S3 events can trigger multiple actions, such as invoking Lambda functions or sending messages to SQS, each event type and prefix combination can only have one S3 event rule. 111 | 112 | - **Solution:** To achieve multiple queue subscriptions from S3 events, you can route the events through SNS and then distribute them to the desired SQS queues via subscriptions. 113 | 114 | # SNS FIFO (First-In-First-Out) 115 | 116 | - **Similarity to SQS FIFO:** SNS FIFO queues offer similar features to SQS FIFO queues, including deduplication and message ordering based on message group ID. 117 | 118 | - **Naming Convention:** FIFO topics must have names ending with `.fifo`, similar to FIFO queues in SQS. 119 | 120 | # Message Filtering 121 | 122 | - **Consumer-Specific Filtering:** SNS allows for message filtering based on consumer-specific criteria using filter policies defined in JSON format. 123 | 124 | - **Customization:** By applying filter policies, subscribers can receive only the messages that meet specific conditions, improving message relevance and reducing unnecessary processing. 125 | 126 | # Amazon Kinesis 127 | 128 | Amazon Kinesis is a platform provided by AWS for collecting, processing, and analyzing streaming data in real-time. It consists of several components: 129 | 130 | ## Kinesis Data Streams 131 | 132 | - **Purpose:** Capture, process, and store data streams in real-time. 133 | - **Shards:** Data Streams are composed of multiple shards, which serve as the unit of capacity provisioning. You must provision shards in advance based on expected workload. 134 | - **Producers:** Data producers, such as applications, clients, SDKs, or the Kinesis Agent, write records into Data Streams. Each record includes a partition key and a data blob. 135 | - **Consumers:** Data Streams can be consumed by various services including SDKs, AWS Lambda, Kinesis Firehose, and Kinesis Analytics. 136 | - **Record Attributes:** Each record includes a partition key, sequence number, and data blob. 137 | - **Retention:** Data retention can be set between 1 to 365 days. 138 | - **Replayability:** Data can be replayed from the stream. 139 | - **Capacity Modes:** Supports provisioned mode where you pay per provisioned shard per hour, and on-demand mode with default capacity. 140 | - **Security:** Deployed within a region, supports encryption and VPC integration. 141 | - **Monitoring:** Activity can be monitored using AWS CloudTrail. 
142 | - **Reading Data:** Consumers can read records from the beginning (TRIM_HORIZON) or from a specific point (AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP). 143 | 144 | ## Kinesis Data Firehose 145 | 146 | - **Purpose:** Load data streams into AWS data stores. 147 | - **Fully Managed:** Automatically scales to handle varying workloads and manages resources. 148 | - **Direct Data Delivery:** Streams data directly to S3, Redshift, Elasticsearch, or Splunk without needing intermediate storage. 149 | - **Serverless:** No need to manage resources or servers. 150 | - **Data Transformation:** Supports data transformation using AWS Lambda before delivery to destinations. 151 | - **Security:** Encrypted data delivery and supports VPC endpoint policies. 152 | 153 | ## Kinesis Data Analytics 154 | 155 | - **Purpose:** Analyze streaming data using SQL queries. 156 | - **Real-time Insights:** Perform analytics on live streaming data with SQL. 157 | - **Integration:** Easily integrates with Kinesis Data Streams and Kinesis Data Firehose. 158 | - **Automatic Scaling:** Automatically scales based on query complexity and volume of data. 159 | - **Output Options:** Send results to various destinations including S3, Redshift, Elasticsearch, or Lambda functions. 160 | - **Real-time Monitoring:** Monitor and visualize queries and performance metrics in real-time. 161 | 162 | # Kinesis Data Firehose 163 | 164 | Kinesis Data Firehose is a fully managed service that makes it easy to load streaming data into AWS data stores and analytics services. Here are the key features and capabilities: 165 | 166 | - **Data Collection:** Firehose can ingest data from various sources, acting as a receiver for data producers. 167 | - **Record Size:** Supports records up to 1 MB in size. 168 | - **Data Transformation:** Allows for data transformation using AWS Lambda functions before delivering it to destinations. This enables data enrichment, filtering, and format conversion. 169 | - **Destination Options:** Data can be seamlessly written to destinations without writing any code. Supported destinations include Amazon S3, Amazon Redshift (via S3), Amazon OpenSearch Service, and various third-party services like Datadog, Splunk, and New Relic. Additionally, it supports HTTP endpoints for custom destinations. 170 | - **Handling Failed Data:** Firehose can automatically write all or failed data into Amazon S3 for troubleshooting and reprocessing. 171 | - **Near-Real-Time Delivery:** Offers near-real-time data delivery with a buffer interval ranging from 0 to 900 seconds. You can specify a minimum buffer size of 1 MB. Data is delivered within a few seconds of being ingested. 172 | - **Automatic Storage Management:** Firehose automatically manages the storage of data and doesn't support data replay since it doesn't store the data internally. 173 | - **Data Transformation:** Supports transforming record format into Parquet or ORC for efficient storage and analytics. 174 | - **Data Prefixing:** Allows adding prefixes to the delivered S3 objects for better organization and management. 175 | - **Buffer Management:** Utilizes buffering to accumulate data before delivering it to the target destination. Data is delivered either when the buffer size reaches a specified threshold (e.g., 5 MB) or when the buffer interval expires (e.g., 5 minutes). 176 | - **Compression:** Supports compression formats like gzip and snappy to reduce storage costs and improve data transfer efficiency. 
177 | - **Visibility Delay:** Data may take some time to become visible in the destination, because delivery waits until the buffer size threshold is reached or the buffer interval elapses. 178 | 179 | Kinesis Data Firehose simplifies the process of loading streaming data into AWS services and third-party destinations, providing flexibility, scalability, and reliability without the need for managing infrastructure or writing custom code. 180 | 181 | # Data Ordering: Kinesis vs. SQS FIFO 182 | - **Data Ordering in Kinesis:** Achieved by using a partition key, ensuring that records with the same key are sent to the same shard. This allows for ordered processing of data within each shard. However, different shards may contain data in a different order. Each shard can be consumed by only one consumer, providing strong ordering guarantees within that shard. 183 | 184 | - **Data Ordering in SQS FIFO:** Similar to Kinesis, ordering is achieved using a `group_id`. Messages with the same `group_id` are processed in order, while messages in different groups can be processed independently. SQS FIFO provides ordered message processing within a message group. However, unlike Kinesis, where each shard can only be consumed by one consumer, SQS FIFO allows multiple consumers to process messages from different groups concurrently. 185 | 186 | In both cases, the use of partition keys (in Kinesis) and `group_id` (in SQS FIFO) ensures that related messages are processed in order, providing deterministic message ordering within their respective systems. 187 | 188 | | Feature | SQS | SNS | Kinesis | 189 | |------------------|----------------------------------------------|----------------------------------------------|----------------------------------------------| 190 | | Data Movement | Consumer pulls data | Pushes data to subscribers | Standard: Pull data, Enhanced Fan-Out: Push data | 191 | | Data Persistence | Data is deleted after being consumed | Data is not persisted | Data expires after a certain period of time | 192 | | Scalability | Can have as many workers as needed | Up to 12 million subscribers | Provisioned mode: Fixed capacity, On-demand mode: Autoscaling | 193 | | Throughput | No need to provision throughput | No need to provision throughput | Standard: 2 MB/s per shard (shared), Enhanced Fan-Out: 2 MB/s per shard per consumer | 194 | | Ordering | FIFO provides ordering guarantee | Not inherently ordered | Ordered at the shard level | 195 | | Integration | - | Integrates with SQS for fan-out | - | 196 | | Message Delay | Individual message delay capability | - | - | 197 | | Topic Limit | - | Up to 100k topics | - | 198 | | Use Case | - | Pub/Sub model | Real-time big data, analytics, etc. | 199 | 200 | 201 | 202 | 203 | # Amazon MQ 204 | 205 | - **Purpose**: 206 | - Traditional applications running on-premises may use open protocols such as MQTT, AMQP, STOMP, OpenWire. 207 | - Instead of re-engineering the app to use SQS and SNS when migrating to the cloud, Amazon MQ can be used. 208 | 209 | - **Service Type**: 210 | - Managed message broker service. 211 | 212 | - **Supported Protocols**: 213 | - Supports protocols like MQTT, AMQP, STOMP, OpenWire. 214 | 215 | - **Scalability**: 216 | - Doesn't scale as much as SQS/SNS. 217 | 218 | - **Deployment**: 219 | - Runs on servers. 220 | - Can run in multi-AZ with failover. 221 | 222 | - **Features**: 223 | - Provides both queue features like SQS and topic features like SNS. 
224 | 225 | - **Failover**: 226 | - Failover is achieved via EFS (Elastic File System), which serves as shared storage for the broker data. 227 | - EFS can be mounted across multiple AZs, so a standby broker can take over with the same data. 228 | 229 | Amazon MQ is essentially a managed message broker service designed to facilitate the migration of traditional applications running on-premises to the cloud without the need for extensive re-engineering. It supports various open protocols commonly used in traditional setups and offers features similar to both SQS and SNS, making it a convenient choice for such migration scenarios. However, it's important to note that it may not scale as much as SQS and SNS, and it runs on servers rather than being fully serverless. 230 | -------------------------------------------------------------------------------- /023-other-services.md: -------------------------------------------------------------------------------- 1 | # Other Services 2 | 3 | # CloudFormation 4 | 5 | 6 | 7 | Summary: 8 | CloudFormation is a declarative method for defining your AWS infrastructure as code, allowing you to specify resources and their configurations in a structured format. It ensures that resources are created in the correct order as specified, simplifying the provisioning process. Each resource within the CloudFormation stack is tagged to identify associated costs, aiding in cost management and allocation. Additionally, CloudFormation supports cost-saving strategies such as deleting an entire environment at 5pm and recreating it at 8am the next morning, facilitating automated workflow management. 9 | 10 | Key Points: 11 | 1. **Declarative Infrastructure:** CloudFormation enables the declaration of infrastructure resources and their configurations in code, offering a clear and concise representation of the desired state. 12 | 2. **Ordered Creation:** Resources are provisioned in the order specified, ensuring dependencies are met and avoiding potential deployment errors. 13 | 3. **Cost Tagging:** Each resource is tagged to identify its associated costs, aiding in cost management and allocation. 14 | 4. **Workflow Automation:** Strategies like scheduling stack deletion and re-creation at specific times automate routine tasks, enhancing workflow efficiency. 15 | 5. **Declarative Programming:** CloudFormation follows a declarative programming paradigm, where the focus is on specifying what needs to be done rather than how it should be accomplished. 16 | 6. **Automated Diagram Generation:** CloudFormation offers automated generation of diagrams for your infrastructure template, aiding in visualization and understanding of the architecture. 17 | 7. **Custom Resources:** Custom resources can be utilized when there are no built-in AWS resources available for a specific task or requirement, allowing for greater flexibility and extensibility. 18 | 8. **Application Composer:** Application Composer can be used to generate CloudFormation templates, streamlining the process of creating complex architectures. 19 | 9. **Infrastructure as Code (IaC) Reusability:** CloudFormation enables the repetition of infrastructure as code across different environments and regions, promoting consistency and scalability. 20 | 21 | Extended Points: 22 | - **Cost Management:** By tagging each resource, CloudFormation helps in tracking and managing costs associated with various components of the infrastructure. This facilitates better cost allocation and optimization efforts. 
23 | - **Workflow Optimization:** The ability to schedule deletion and re-creation of resources at specific times allows for optimized resource utilization, ensuring that resources are available when needed while minimizing costs during idle periods. 24 | - **Infrastructure Consistency:** CloudFormation's support for infrastructure as code (IaC) promotes consistency across environments and regions, reducing the risk of configuration drift and simplifying management and troubleshooting tasks. 25 | - **Visualization:** Automated diagram generation provides a visual representation of the infrastructure defined in the CloudFormation template, aiding in architectural review, documentation, and communication among team members. 26 | 27 | By leveraging CloudFormation, organizations can efficiently manage and automate the deployment and maintenance of their AWS infrastructure, leading to improved agility, cost-effectiveness, and reliability. 28 | 29 | # CloudFormation - Service Role 30 | 31 | 32 | 33 | Summary: 34 | CloudFormation's service role, often referred to as a dedicated role, allows CloudFormation to perform operations such as creating, deleting, and updating AWS resources on behalf of the user. This role is granted the necessary permissions to manage resources, ensuring that CloudFormation can execute these operations without requiring additional permissions from the user. By utilizing a dedicated service role, CloudFormation follows the principle of least privilege, granting only the necessary permissions to perform its tasks while maintaining security and governance. 35 | 36 | Key Points: 37 | 1. **Service Role Functionality:** CloudFormation's service role enables it to execute operations like creating, deleting, and updating AWS resources as instructed by the user. 38 | 2. **Resource Management:** The service role possesses the permissions required to manage resources on behalf of the user, including updating and deleting them. 39 | 3. **Least Privilege Principle:** By using a dedicated service role, CloudFormation adheres to the principle of least privilege, ensuring that it only has access to the resources and actions necessary to fulfill its intended tasks. 40 | 4. **Permission Delegation:** While users may not have direct permissions to perform certain operations, CloudFormation can execute them on their behalf, simplifying resource management and maintaining security best practices. 41 | 42 | Utilizing CloudFormation's service role enhances the security posture of AWS environments by restricting access to resources to only those operations necessary for infrastructure management, thereby reducing the risk of unauthorized actions and potential security breaches. 43 | 44 | # Amazon SES 45 | 46 | 47 | 48 | Summary: 49 | Amazon SES (Simple Email Service) offers features like DKIM (DomainKeys Identified Mail) and SPF (Sender Policy Framework) to verify the authenticity of email senders. It supports both transactional and bulk email sending, providing a reliable platform for businesses to deliver their messages efficiently and securely. 50 | 51 | Key Points: 52 | 1. **Email Authentication:** Amazon SES supports DKIM and SPF, which are essential for verifying the authenticity of email senders and preventing email spoofing and phishing attacks. 53 | 2. 
**Transactional Email:** SES facilitates the sending of transactional emails, such as order confirmations and account notifications, ensuring timely and reliable delivery of critical messages to recipients. 54 | 3. **Bulk Email:** Businesses can use Amazon SES for sending bulk emails, such as newsletters and marketing campaigns, with features to manage large volumes of emails efficiently and maintain deliverability rates. 55 | 56 | By leveraging Amazon SES, businesses can establish trust with their email recipients through proper authentication mechanisms while effectively delivering both transactional and bulk emails, enhancing communication and engagement with their audience. 57 | 58 | # Amazon Pinpoint 59 | 60 | 61 | Summary: 62 | Amazon Pinpoint is a comprehensive platform for managing inbound and outbound marketing communication messages. It supports various channels including email, SMS, push notifications, voice, and in-app messaging, providing a versatile solution for engaging with customers. Pinpoint allows businesses to create and manage SMS messages and campaigns within their applications, providing flexibility and control over messaging strategies. Additionally, features like delivery scheduling, highly-targeted segments, and full campaign management capabilities enable businesses to deliver personalized and effective communication to their audience. 63 | 64 | Key Points: 65 | 1. **Multi-channel Communication:** Amazon Pinpoint facilitates inbound and outbound marketing communication through various channels including email, SMS, push notifications, voice, and in-app messaging. 66 | 2. **SMS Messaging and Campaigns:** Pinpoint allows businesses to create and manage SMS messages and campaigns directly within their applications, streamlining the process of reaching customers via text messaging. 67 | 3. **Delivery Schedule:** Businesses can schedule the delivery of messages and campaigns according to specific timeframes and preferences, ensuring timely and effective communication with their audience. 68 | 4. **Highly-targeted Segments:** Pinpoint offers advanced segmentation capabilities, enabling businesses to target specific customer segments based on various criteria such as demographics, behaviors, and preferences. 69 | 5. **Full Campaign Management:** With Pinpoint, businesses have access to comprehensive campaign management features, empowering them to create, track, and optimize marketing campaigns across multiple channels. 70 | 71 | By utilizing Amazon Pinpoint, businesses can orchestrate personalized and highly-targeted marketing campaigns across multiple communication channels, driving engagement and fostering stronger connections with their audience. 72 | 73 | # SSM Session Manager 74 | 75 | 76 | 77 | Summary: 78 | SSM Session Manager provides a secure way to establish shell connections to EC2 instances and on-premises servers without the need for SSH access, bastion hosts, or SSH keys. Unlike traditional SSH connections, it doesn't require port 22 to be open. Session Manager supports Linux, macOS, and Windows platforms, allowing users to initiate secure shell sessions directly from the AWS Management Console. It's important to note that Session Manager is distinct from EC2 Instance Connect; while EC2 Instance Connect still relies on SSH over port 22, Session Manager eliminates this requirement entirely, enabling hassle-free connections directly from the console. 
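As a quick, hedged illustration of what portless access looks like in practice, here is a minimal sketch that drives Session Manager through the API with boto3. The instance ID is a placeholder, and it assumes the target instance is running the SSM Agent and has an instance profile that permits Session Manager (for example, the `AmazonSSMManagedInstanceCore` managed policy).

```python
import boto3

# Minimal sketch: open and close a Session Manager session through the SSM API.
# "i-0123456789abcdef0" is a placeholder instance ID; the instance must be running
# the SSM Agent and have an IAM instance profile that permits Session Manager.
ssm = boto3.client("ssm")

session = ssm.start_session(Target="i-0123456789abcdef0")
print(session["SessionId"])  # no SSH key, bastion host, or open port 22 involved

# End the session once finished.
ssm.terminate_session(SessionId=session["SessionId"])
```

For an interactive shell from a terminal, the usual route is `aws ssm start-session --target <instance-id>` with the Session Manager plugin installed; sessions started from the AWS Management Console need no local tooling at all.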
79 | 80 | Key Points: 81 | 1. **Secure Shell Access:** SSM Session Manager enables secure shell access to EC2 instances and on-premises servers without the need for SSH infrastructure or SSH keys. 82 | 2. **Portless Connectivity:** Unlike traditional SSH connections, Session Manager doesn't rely on port 22 being open, enhancing security by reducing attack surface. 83 | 3. **Platform Support:** Session Manager supports Linux, macOS, and Windows platforms, providing flexibility for managing different types of instances and servers. 84 | 4. **Console-based Access:** Users can initiate shell sessions directly from the AWS Management Console, simplifying the connection process and enhancing usability. 85 | 5. **Distinction from EC2 Instance Connect:** While EC2 Instance Connect requires port 22 to be open for connections, Session Manager eliminates this requirement entirely, offering a more secure and convenient alternative for managing instances and servers. 86 | 87 | By leveraging SSM Session Manager, users can securely manage their EC2 instances and on-premises servers without the complexities associated with traditional SSH access methods, enhancing security and operational efficiency. 88 | 89 | # SSM Other Services (Run Command) 90 | 91 | 92 | Summary: 93 | SSM offers additional services like Run Command, enabling users to execute commands across multiple instances without requiring SSH access. The output of these commands can be displayed in the AWS Management Console, stored in an S3 bucket, or logged in CloudWatch. Users can also set up notifications via SNS for command execution results. SSM services are integrated with IAM for access control and CloudTrail for logging, ensuring security and compliance. Run Command can also be invoked using EventBridge for automated and event-driven workflows. 94 | 95 | Key Points: 96 | 1. **Cross-Instance Command Execution:** SSM's Run Command allows users to execute commands across multiple instances without the need for SSH access, simplifying management tasks. 97 | 2. **Flexible Output Options:** Command output can be displayed in the AWS Management Console, stored in an S3 bucket for later analysis, or logged in CloudWatch for monitoring purposes. 98 | 3. **Notification Integration:** Users can set up notifications via SNS to receive alerts and notifications regarding command execution results, enabling proactive monitoring and response. 99 | 4. **Security and Compliance:** SSM services are integrated with IAM for access control, ensuring that only authorized users can execute commands, and with CloudTrail for logging command execution activities, facilitating compliance and auditing requirements. 100 | 5. **Event-Driven Automation:** Run Command can be invoked using EventBridge, allowing users to automate command execution based on events and triggers, streamlining operational workflows and enhancing efficiency. 101 | 102 | By leveraging SSM's Run Command and other services, users can efficiently manage and automate administrative tasks across their infrastructure, improving operational efficiency, and ensuring security and compliance. 103 | 104 | # System Manager - Patch Manager 105 | 106 | 107 | **System Manager - Patch Manager:** 108 | - **Automated Patching:** Patch Manager automates the process of patching managed instances, covering operating system updates, application updates, and security updates. 
109 | - **Supported Platforms:** It supports patching for EC2 instances as well as on-premises servers, across various operating systems including Linux, macOS, and Windows. 110 | 111 | **System Manager - Maintenance Windows:** 112 | - **Scheduled Actions:** Maintenance Windows allow users to define schedules for performing actions, such as OS patching, on their instances. 113 | - **Focus on OS Patching:** One of the primary uses of Maintenance Windows is to schedule OS patching activities to ensure timely updates and compliance. 114 | 115 | **System Manager Automation:** 116 | - **IAM Integration:** Users can create IAM roles to define permissions for automation tasks. 117 | - **Automation Runbooks:** Automation allows the creation of runbooks to execute commands or scripts on EC2 instances, simplifying common maintenance and deployment tasks. 118 | - **Efficiency Improvement:** It streamlines tasks such as updating software, deploying applications, and managing configurations, enhancing operational efficiency and consistency. 119 | 120 | These System Manager services collectively provide a comprehensive set of tools for managing, automating, and maintaining the infrastructure and applications running on AWS instances, ensuring they remain up-to-date, secure, and efficiently managed. 121 | 122 | # AWS Cost Explorer 123 | Here's a summary with key points highlighted for each service: 124 | 125 | **AWS Cost Explorer:** 126 | - **Savings Plan Alternative:** Cost Explorer offers Savings Plans as an alternative to Reserved Instances, providing flexibility in cost optimization strategies. 127 | - **Forecasting:** It provides a 12-month forecast of usage, aiding in budgeting and planning for future expenses. 128 | 129 | **AWS Cost Anomaly Detection:** 130 | - **Continuous Monitoring:** The service continuously monitors cost and usage patterns, utilizing machine learning to detect anomalies indicative of unusual spending. 131 | - **No Configuration Required:** There's no need to define specific thresholds or rules; the system autonomously identifies anomalies. 132 | - **Account-wide Monitoring:** It monitors the entire AWS account, providing comprehensive coverage across all services and resources. 133 | - **Notification:** Anomaly reports are sent via SNS on a weekly or monthly basis, keeping users informed about any detected irregularities in spending patterns. 134 | 135 | These services empower users to gain insights into their AWS spending, optimize costs, and detect any unusual spending patterns, thereby enabling effective cost management and budgeting. 136 | 137 | # AWS Batch 138 | Here's a summary with key points highlighted: 139 | 140 | **AWS Batch:** 141 | - **Fully Managed Batch Processing:** AWS Batch offers fully managed batch processing capabilities, allowing users to execute large-scale computing jobs on AWS infrastructure efficiently. 142 | - **Scalability:** It can handle the execution of up to 10,000 computing batch jobs on AWS, enabling users to process large workloads without worrying about scalability issues. 143 | - **Instance Management:** AWS Batch automatically launches and manages EC2 instances or Spot Instances based on workload requirements, optimizing resource utilization and cost-effectiveness. 144 | - **Docker Containerization:** Batch jobs are defined as Docker images and run on Amazon ECS (Elastic Container Service), providing flexibility and consistency in job execution environments. 
145 | - **Cost Optimization:** AWS Batch offers cost optimization features, such as the ability to use Spot Instances for cost-effective computing resources, helping users achieve efficient resource utilization and cost savings. 146 | - **Comparison with Lambda:** Unlike AWS Lambda, which has limitations such as a maximum execution duration of 15 minutes and a limited runtime environment, AWS Batch offers more flexibility in terms of runtime, storage, and execution time for batch processing jobs. 147 | 148 | AWS Batch provides users with a powerful and flexible platform for executing batch processing workloads at any scale, with features designed to optimize cost, resource utilization, and performance. 149 | 150 | # Amazon AppFlow 151 | Here's a summary with key points highlighted: 152 | 153 | **Amazon AppFlow:** 154 | - **Fully Managed Integration Service:** Amazon AppFlow is a fully managed service designed to facilitate secure data transfer between various Software-as-a-Service (SaaS) applications and AWS. 155 | - **Supported Sources:** It supports integration with popular SaaS applications such as Salesforce, SAP, Zendesk, Slack, and ServiceNow. 156 | - **Supported Destinations:** Data can be transferred to destinations including Amazon S3, Amazon Redshift, Snowflake, and back to Salesforce. 157 | - **Flexible Integration Frequency:** AppFlow allows data transfer on a schedule, in response to events, or on-demand, offering flexibility in integration timing. 158 | - **Data Transformations:** Users can perform data transformations such as filtering and validation as part of the integration process, ensuring data quality and consistency. 159 | - **Secure Transfer:** Data transfer is encrypted over the public internet and can also utilize AWS PrivateLink for enhanced security. 160 | - **API Integration:** AppFlow eliminates the need for manual integration efforts by leveraging APIs, enabling users to quickly establish data connections without writing custom integrations. 161 | 162 | Amazon AppFlow streamlines the process of integrating SaaS applications with AWS, providing a secure, flexible, and efficient solution for data transfer and synchronization between different systems. 163 | 164 | # AWS Amplify 165 | Here's a concise summary with key points highlighted: 166 | 167 | **AWS Amplify:** 168 | - **Development and Deployment Tools:** AWS Amplify is a set of tools and services designed to assist developers in building and deploying scalable full-stack web and mobile applications. 169 | - **Full-Stack Support:** It provides support for both web and mobile applications, offering a comprehensive solution for end-to-end development and deployment needs. 170 | - **Comparison to Elastic Beanstalk:** Amplify serves a similar purpose to Elastic Beanstalk but is specifically tailored for web and mobile applications, providing specialized features and workflows for these types of applications. 171 | 172 | AWS Amplify simplifies the process of developing and deploying web and mobile applications, offering a streamlined experience with tailored tools and services for each platform. 173 | -------------------------------------------------------------------------------- /025-disaster-recovery-migrations.md: -------------------------------------------------------------------------------- 1 | # Disaster Recovery & Migrations: Strategies and Best Practices 2 | 3 | Disaster recovery and migrations are critical components of ensuring business continuity and resilience in the face of adverse events. 
Here's a comprehensive overview of strategies, terms, and best practices: 4 | 5 | ## Disaster Recovery Strategies: 6 | 7 | ### RPO and RTO Definitions: 8 | - **RPO (Recovery Point Objective)**: Defines how frequently backups are taken and the acceptable amount of data loss in the event of a disaster. 9 | - **RTO (Recovery Time Objective)**: Specifies the maximum allowable downtime for applications or systems. 10 | 11 | ### Disaster Recovery Strategies: 12 | 1. **Backup and Restore**: 13 | - Involves regularly backing up data and restoring it in the event of a disaster. 14 | - Data can be transferred from on-premises to AWS, stored in S3, and restored as needed, although it typically has a high RPO and RTO. 15 | 16 | 2. **Pilot Light**: 17 | - Maintains a minimal version of the application running in AWS. 18 | - Allows for rapid scaling and failover using Route 53 in case of a disaster. 19 | 20 | 3. **Warm Standby**: 21 | - Keeps a scaled-down version of the system running in AWS, ready to be scaled up in the event of a disaster. 22 | - Utilizes Route 53 for failover and offers faster recovery than backup and restore. 23 | 24 | 4. **Hot Site / Multi-Site Approach**: 25 | - Maintains a fully operational production environment both on-premises and in AWS. 26 | - Ensures minimal RTO and RPO but comes with higher costs. 27 | 28 | ## Backup Strategies: 29 | - Utilize EBS snapshots, RDS automated backups, and regular backups to S3 or Glacier. 30 | - Use services like Snowball or Storage Gateway for transferring data from on-premises to AWS. 31 | 32 | ## High Availability: 33 | - Leverage Route 53 for DNS management. 34 | - Utilize multi-AZ configurations for RDS, Elasticache, and S3. 35 | - Establish site-to-site VPN as a recovery solution for Direct Connect. 36 | 37 | ## Replication Strategies: 38 | - Implement RDS replication (cross-region) and database replication from on-premises to RDS. 39 | - Utilize Storage Gateway for replication purposes. 40 | 41 | ## Automation: 42 | - Employ CloudFormation or Elastic Beanstalk for infrastructure re-creation. 43 | - Use CloudWatch to automate recovery or reboot EC2 instances. 44 | - Leverage AWS Lambda functions for automated tasks. 45 | 46 | ## Chaos Engineering: 47 | - Follow the example of companies like Netflix, which uses a "Simian Army" to randomly terminate EC2 instances as part of their Chaos Engineering approach. 48 | 49 | By implementing a combination of these strategies and best practices, organizations can ensure robust disaster recovery capabilities and smooth migrations to AWS, thus safeguarding their business operations and data integrity. 50 | 51 | # Database Migration Service (DMS): Seamless Database Migration to AWS 52 | 53 | The Database Migration Service (DMS) is a robust and resilient tool provided by AWS for migrating databases to and from the cloud. Here's an overview of its features and functionalities: 54 | 55 | ## Features: 56 | 57 | - **Database Migration**: DMS facilitates the migration of databases from various sources, including on-premises and EC2 instances, as well as AWS RDS, Amazon S3, and others. 58 | 59 | - **Supported Sources**: 60 | - DMS supports migration from sources such as Oracle, Microsoft SQL Server, MariaDB, PostgreSQL, Azure SQL, Amazon RDS, and Amazon S3. 61 | 62 | - **Supported Targets**: 63 | - Targets include on-premises databases, EC2 instances, Amazon RDS, Amazon Redshift, Amazon DynamoDB, Amazon S3, Amazon OpenSearch, Kinesis Data Streams, Kafka, Amazon DocumentDB, and Redis. 
64 | 65 | - **Schema Conversion Tool (SCT)**: 66 | - When the source and target databases use different engines, the Schema Conversion Tool (SCT) assists in converting the schema to ensure compatibility. 67 | - For example, it can convert Oracle to MySQL or Teradata to Amazon Redshift. 68 | 69 | ## Benefits: 70 | 71 | - **Self-Healing and Resilient**: 72 | - DMS is designed to be resilient and self-healing, ensuring minimal disruption during migration processes. 73 | 74 | - **Efficiency and Reliability**: 75 | - It offers efficient and reliable migration of large-scale databases, minimizing downtime and data loss. 76 | 77 | - **Flexibility**: 78 | - DMS provides flexibility in choosing migration sources and targets, supporting a wide range of databases and storage options. 79 | 80 | - **Ease of Use**: 81 | - With a user-friendly interface and straightforward setup process, DMS simplifies the complexities of database migration. 82 | 83 | ## Use Cases: 84 | 85 | - **Cloud Migration**: 86 | - Migrate on-premises databases to AWS cloud infrastructure, enabling scalability, flexibility, and cost-efficiency. 87 | 88 | - **Database Consolidation**: 89 | - Consolidate multiple databases into a single, centralized AWS database solution for improved management and resource utilization. 90 | 91 | - **Replication and Sync**: 92 | - Replicate data in real-time between on-premises and cloud databases, ensuring data consistency and availability. 93 | 94 | - **Data Warehousing**: 95 | - Migrate data from various sources to Amazon Redshift for analytics and data warehousing purposes, enabling powerful insights and decision-making capabilities. 96 | 97 | ## Conclusion: 98 | 99 | The Database Migration Service (DMS) offers a seamless and efficient solution for migrating databases to and from AWS, supporting a wide range of sources and targets. Whether it's cloud migration, database consolidation, or real-time data replication, DMS provides the tools and capabilities to streamline the migration process while ensuring data integrity and reliability. 100 | 101 | # RDS & Aurora MySQL Migration: Seamless Transition to Aurora 102 | 103 | Migrating from RDS MySQL to Aurora MySQL offers enhanced performance, scalability, and reliability. Here's a guide on how to efficiently migrate your databases: 104 | 105 | ## Migration from RDS MySQL to Aurora MySQL: 106 | 107 | ### 1. Snapshot Restore Method: 108 | - **Steps**: 109 | - Take a snapshot of your RDS MySQL database. 110 | - Restore the snapshot as an Aurora MySQL database. 111 | - **Benefits**: 112 | - Simple and straightforward process. 113 | - Preserves data integrity during migration. 114 | 115 | ### 2. Read Replica Promotion: 116 | - **Steps**: 117 | - Create an Aurora Read Replica from your RDS MySQL database. 118 | - Monitor the replication lag; ensure it is zero. 119 | - Promote the Aurora Read Replica to its own Aurora MySQL DB cluster. 120 | - **Benefits**: 121 | - Minimal downtime during migration. 122 | - Automatic failover and high availability with Aurora. 123 | 124 | ### 3. External MySQL to Aurora MySQL Migration: 125 | - **Steps**: 126 | - Utilize Percona XtraBackup to create a file backup stored in S3. 127 | - Create an Aurora MySQL DB cluster directly from the S3 backup. 128 | - **Benefits**: 129 | - Efficient migration process leveraging S3 storage. 130 | - Suitable for large-scale databases. 131 | 132 | ### 4. Migration Using `mysqldump` Utility: 133 | - **Steps**: 134 | - Use the `mysqldump` utility to export the MySQL database. 
135 | - Import the dump file into an Aurora MySQL database. 136 | - **Considerations**: 137 | - Slower migration process compared to utilizing S3 backups. 138 | - Suitable for smaller databases or situations where direct backup restoration is not feasible. 139 | 140 | ### 5. Database Migration Service (DMS): 141 | - **Steps**: 142 | - Utilize AWS Database Migration Service if both source and target databases are operational. 143 | - Configure DMS to efficiently migrate data between RDS MySQL and Aurora MySQL. 144 | - **Benefits**: 145 | - Offers real-time data replication with minimal downtime. 146 | - Handles schema conversion if needed. 147 | 148 | ## Conclusion: 149 | Migrating from RDS MySQL to Aurora MySQL provides numerous benefits, including improved performance and scalability. By choosing the appropriate migration method based on your specific requirements and database size, you can seamlessly transition to Aurora while ensuring data integrity and minimizing downtime. 150 | 151 | # On-Premises Strategies with AWS: Seamless Integration and Migration 152 | 153 | Integrating on-premises infrastructure with AWS enables organizations to leverage the scalability, flexibility, and reliability of cloud computing. Here are strategies for seamlessly integrating on-premises environments with AWS: 154 | 155 | ## 1. Run Amazon AMI on Your Own Infrastructure: 156 | - **Amazon Machine Images (AMIs)** can be run on your own infrastructure on-premises. 157 | - Simply download the AMI as a virtual machine (VM) in .ISO format and run it on various virtualization platforms like VMware, KVM, VirtualBox, and Hyper-V. 158 | 159 | ## 2. VM Import/Export: 160 | - **VM Import/Export** facilitates the migration of existing applications into EC2 instances. 161 | - Establish a Disaster Recovery (DR) repository strategy for on-premises environments. 162 | - Export VMs from EC2 back to on-premises if needed. 163 | 164 | ## 3. AWS Application Discovery Service: 165 | - Use the **AWS Application Discovery Service** to gather information about on-premises servers for migration planning. 166 | - Obtain insights into server utilization and dependency mappings. 167 | - Track migration progress with AWS Migration Hub. 168 | 169 | ## 4. AWS Database Migration Service (DMS): 170 | - **AWS Database Migration Service (DMS)** enables replication from on-premises to AWS, AWS to AWS, and AWS to on-premises. 171 | - Supports various database technologies such as Oracle, MySQL, and DynamoDB. 172 | 173 | ## 5. AWS Server Migration Service (SMS): 174 | - **AWS Server Migration Service (SMS)** allows incremental replication of on-premises live servers to AWS. 175 | - Simplifies the migration process by automating replication tasks. 176 | 177 | Integrating on-premises infrastructure with AWS empowers organizations to modernize their IT environments, enhance scalability, and improve operational efficiency. By utilizing these strategies and AWS services, businesses can seamlessly transition to hybrid or cloud-centric architectures while ensuring minimal disruption and maximum flexibility. 178 | 179 | # AWS Backup: Streamlined Backup Management Across AWS Services 180 | 181 | AWS Backup simplifies and automates backup management across various AWS services, eliminating the need for custom scripts and manual processes. 
Here's an overview of its features and benefits: 182 | 183 | ## Features: 184 | 185 | - **Service Coverage**: 186 | - AWS Backup supports multiple AWS services, including EC2, EBS, S3, RDS, DocumentDB, EFS, Amazon FSx (Lustre & Windows), and AWS Storage Gateway. 187 | 188 | - **Cross-Region and Cross-Account Backups**: 189 | - Backup data can be stored across different AWS regions and AWS accounts, ensuring data resilience and compliance. 190 | 191 | - **Point-in-Time Recovery (PITR)**: 192 | - PITR functionality is available for supported services, enabling recovery to specific points in time for data restoration. 193 | 194 | - **On-Demand and Scheduled Backups**: 195 | - Users can perform both ad-hoc and scheduled backups, allowing flexibility in backup management according to business needs. 196 | 197 | - **Tag-Based Backup Policies**: 198 | - Backup policies can be defined based on resource tags, streamlining backup management and ensuring consistency across resources. 199 | 200 | - **Backup Plans**: 201 | - Users can create customized backup plans specifying backup frequency, retention policies, and other parameters, facilitating centralized backup management. 202 | 203 | - **AWS Backup Vault Lock**: 204 | - Backup Vault Lock feature ensures that backups stored in AWS Backup Vault cannot be deleted or modified, adding an extra layer of defense against data loss or tampering. 205 | 206 | ## Benefits: 207 | 208 | - **Simplified Backup Management**: 209 | - AWS Backup streamlines backup processes across diverse AWS services, reducing complexity and enhancing operational efficiency. 210 | 211 | - **Automated Backup Operations**: 212 | - Automation features eliminate the need for manual intervention in backup tasks, saving time and resources. 213 | 214 | - **Data Protection and Compliance**: 215 | - By leveraging AWS Backup, organizations can ensure data protection, compliance with regulatory requirements, and resilience against data loss. 216 | 217 | - **Centralized Backup Policy Management**: 218 | - Backup plans and policies can be centrally managed, providing consistency and control over backup operations. 219 | 220 | - **Enhanced Security and Data Integrity**: 221 | - The Vault Lock feature offers an additional layer of security, safeguarding backups against accidental deletion or unauthorized access. 222 | 223 | AWS Backup empowers organizations to implement robust backup strategies, ensuring data resilience, compliance, and business continuity across their AWS environments. By leveraging its comprehensive features, businesses can effectively manage and protect their critical data assets with ease. 224 | 225 | # AWS Application Discovery Service: Streamline Cloud Migration Planning 226 | 227 | AWS Application Discovery Service is a vital tool for organizations planning to migrate their applications to the cloud. Here's an overview of its features and benefits: 228 | 229 | ## Features: 230 | 231 | - **Server Utilization Data and Dependency Mapping**: 232 | - AWS Application Discovery Service scans server utilization data and maps dependencies between servers and applications, providing insights into the existing infrastructure. 233 | 234 | - **Agentless Discovery**: 235 | - Collects VM inventory and configuration data without requiring agents, simplifying the discovery process. 236 | 237 | - **Agent-Based Discovery**: 238 | - Gathers system configuration and performance data using agents installed on servers, offering deeper insights into system metrics. 
239 | 240 | - **AWS Migration Hub Integration**: 241 | - Integrates with AWS Migration Hub to create migration plans, determining when and how to migrate applications to AWS while tracking the progress of migration projects. 242 | 243 | - **Lift-and-Shift (Rehost) Solutions**: 244 | - Together with the companion AWS Application Migration Service (MGN), offers a lift-and-shift approach to migration, allowing applications to be migrated to AWS without significant modifications. 245 | 246 | - **Automated Migration**: 247 | - AWS Application Migration Service (MGN) automatically converts physical, virtual, and cloud-based servers to run natively on AWS, minimizing the need for manual intervention and reducing migration effort. 248 | 249 | ## Benefits: 250 | 251 | - **Comprehensive Discovery**: 252 | - Provides a comprehensive view of server utilization and dependencies, enabling organizations to make informed decisions during the migration planning process. 253 | 254 | - **Simplified Deployment**: 255 | - Agentless discovery eliminates the need for complex setup procedures, facilitating quick deployment and reducing administrative overhead. 256 | 257 | - **Efficient Planning**: 258 | - Integration with AWS Migration Hub enables efficient migration planning, ensuring smooth transitions to AWS while minimizing downtime and disruption. 259 | 260 | - **Seamless Migration**: 261 | - Lift-and-shift solutions streamline the migration process, allowing applications to be migrated to AWS with minimal effort and complexity. 262 | 263 | - **Cost Optimization**: 264 | - Automated migration reduces the need for manual labor and accelerates the migration process, leading to cost savings and improved efficiency. 265 | 266 | AWS Application Discovery Service empowers organizations to efficiently plan and execute their cloud migration strategies by providing deep insights into existing infrastructure and simplifying the migration process. With its comprehensive features and seamless integration with AWS services, organizations can achieve successful and hassle-free migrations to the cloud. 267 | 268 | 269 | # Transferring Large Amounts of Data into AWS 270 | 271 | When transferring substantial data volumes into AWS, various methods offer different speeds and efficiencies. Let's explore some options using an example of transferring 200TB of data with a network speed of 100Mbps: 272 | 273 | ## Over the Internet or Site-to-Site VPN: 274 | - **Duration**: Approximately 185 days. 275 | - **Details**: This method utilizes the existing internet connection or a site-to-site VPN. However, due to limited bandwidth, the transfer time is extensive. 276 | 277 | ## Over Direct Connect (1Gbps): 278 | - **Duration**: Approximately 18.5 days. 279 | - **Details**: While Direct Connect offers faster speeds than standard internet connections, the initial setup time can delay the transfer. Once established, the transfer duration is significantly reduced. 280 | 281 | ## AWS Snowball: 282 | - **Duration**: Approximately 1 week. 283 | - **Details**: Snowball involves physically shipping storage devices to AWS data centers. With multiple Snowballs operating in parallel, the transfer time can be expedited. Additionally, AWS DMS (Database Migration Service) can assist in managing the data transfer process. 284 | 285 | ## For Ongoing Replication/Transfers: 286 | - **Site-to-Site VPN or Direct Connect with DMS or DataSync**: 287 | - **DMS**: AWS Database Migration Service supports ongoing data replication and can utilize either a site-to-site VPN or Direct Connect for efficient data transfer. 
288 | - **DataSync**: AWS DataSync offers accelerated data transfer, synchronization, and migration between on-premises storage and AWS. 289 | 290 | Selecting the appropriate transfer method depends on factors such as available bandwidth, transfer speed requirements, initial setup time, and ongoing data transfer needs. By leveraging the right combination of methods, organizations can efficiently migrate and replicate large volumes of data into AWS. 291 | 292 | # VMware Cloud on AWS: Extending VMware Environments to AWS 293 | 294 | VMware Cloud on AWS allows customers to seamlessly extend their on-premises VMware-based data centers to the AWS cloud infrastructure while continuing to leverage VMware's familiar software stack. Here's a breakdown of its key features and use cases: 295 | 296 | ## Features: 297 | 298 | - **Seamless Extension**: 299 | - Provides a seamless extension of on-premises VMware environments to AWS infrastructure, allowing customers to scale their data center capacity as needed. 300 | 301 | - **Consistent Operations**: 302 | - Maintains consistency with VMware's software-defined data center (SDDC) stack, ensuring familiar operational processes and tools across environments. 303 | 304 | - **Migration Capabilities**: 305 | - Facilitates the migration of VMware vSphere-based workloads to AWS without the need for significant refactoring or rearchitecting. 306 | 307 | - **Hybrid Cloud Deployment**: 308 | - Enables the deployment of production workloads across VMware vSphere-based private, public, and hybrid cloud environments, offering flexibility and scalability. 309 | 310 | - **Disaster Recovery Strategy**: 311 | - Provides a robust disaster recovery strategy by leveraging the AWS infrastructure for backup and replication, ensuring business continuity and data resilience. 312 | 313 | ## Use Cases: 314 | 315 | - **Migration to AWS**: 316 | - Organizations can migrate their existing VMware vSphere-based workloads to AWS seamlessly, leveraging the scalability and agility of the cloud. 317 | 318 | - **Production Workloads**: 319 | - Run critical production workloads across VMware vSphere-based private, public, and hybrid cloud environments, optimizing performance and resource utilization. 320 | 321 | - **Disaster Recovery**: 322 | - Utilize VMware Cloud on AWS as a disaster recovery solution, leveraging AWS infrastructure for backup, replication, and failover, ensuring business continuity in the event of a disaster. 323 | 324 | VMware Cloud on AWS offers organizations the flexibility to extend their VMware environments to the AWS cloud seamlessly, enabling efficient workload migration, scalable infrastructure deployment, and robust disaster recovery strategies. By leveraging this integrated solution, customers can achieve greater agility, resilience, and operational efficiency across their hybrid cloud environments. 325 | --------------------------------------------------------------------------------