├── README.md
├── assets
│   └── images
│       ├── concurrency.png
│       ├── examples
│       │   └── ramping-vus.png
│       ├── infrastructure-diagram.png
│       ├── logo.png
│       ├── lti
│       │   ├── lms-lessons.png
│       │   ├── lms-tool-launch.png
│       │   ├── lti-advantage.png
│       │   ├── lti-home.png
│       │   └── lti-resource1.png
│       ├── optimistic-locking.png
│       ├── pessimistic-vs-optimistic-locking.png
│       ├── ses-sns-topic.png
│       └── test-email.png
├── examples
│   └── performance-testing-antipattern-examples.md
└── recipes
    ├── automated-testing.md
    ├── circleci-build-guide.md
    ├── docker-image-guide.md
    ├── handling-concurrency.md
    ├── lti.md
    └── ses-bounce-handling.md

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

<div align="center">
  <img src="./assets/images/logo.png" alt="Studion Logo">

  <h3>Welcome to the Studion Platform Tech Guide repository</h3>

  <p>The central place for all common packages</p>
</div>

## 📦 Package List

1. [Prettier config](https://github.com/ExtensionEngine/prettier-config) - Studion Prettier config
2. [Infra Code Blocks](https://github.com/ExtensionEngine/infra-code-blocks) - Studion common infra components

## Recipes

1. [Docker image guide](./recipes/docker-image-guide.md)
2. [AWS SES bounce handling](./recipes/ses-bounce-handling.md)
3. [Handling concurrency using optimistic or pessimistic locking](./recipes/handling-concurrency.md)
4. [LTI - Learning Tools Interoperability Protocol](./recipes/lti.md)
5. [CircleCI Build Guide](./recipes/circleci-build-guide.md)

## Guides

1. [Automated Testing](./recipes/automated-testing.md)

## Examples

1. [Performance Testing - Antipattern Examples](./examples/performance-testing-antipattern-examples.md)

## 🙌 Want to contribute?

We are open to all kinds of contributions. If you want to:

- 🤔 Suggest an idea
- 🐛 Report an issue
- 📖 Improve documentation
- 👨‍💻 Contribute to the code

You are more than welcome.
--------------------------------------------------------------------------------
/assets/images/concurrency.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/concurrency.png
--------------------------------------------------------------------------------
/assets/images/examples/ramping-vus.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/examples/ramping-vus.png
--------------------------------------------------------------------------------
/assets/images/infrastructure-diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/infrastructure-diagram.png
--------------------------------------------------------------------------------
/assets/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/logo.png
--------------------------------------------------------------------------------
/assets/images/lti/lms-lessons.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lms-lessons.png
--------------------------------------------------------------------------------
/assets/images/lti/lms-tool-launch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lms-tool-launch.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-advantage.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-advantage.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-home.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-home.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-resource1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-resource1.png
--------------------------------------------------------------------------------
/assets/images/optimistic-locking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/optimistic-locking.png
--------------------------------------------------------------------------------
/assets/images/pessimistic-vs-optimistic-locking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/pessimistic-vs-optimistic-locking.png
--------------------------------------------------------------------------------
/assets/images/ses-sns-topic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/ses-sns-topic.png
--------------------------------------------------------------------------------
/assets/images/test-email.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/test-email.png
--------------------------------------------------------------------------------
/examples/performance-testing-antipattern-examples.md:
--------------------------------------------------------------------------------

# Performance Testing - Antipattern Examples

## Antipattern 1: Ignoring Think Time

Excluding think time between user actions can produce unrealistic performance metrics for certain types of tests, such as average, stress, and soak tests. Think time is less critical for breakpoint and spike tests, as other parameters can control these scenarios effectively. Incorporating think time is crucial when simulating real user behavior based on user scenarios. In this example, user actions are executed without any delay, which does not accurately reflect real-world conditions for this type of test.

```javascript
import http from 'k6/http';

export default function () {
  http.get('http://example.com/api/resource1');
  http.get('http://example.com/api/resource2');
  http.get('http://example.com/api/resource3');
}
```

### Solution

Introduce think time between user actions to simulate real user behavior. This example adds a random delay of 1 to 5 seconds between requests. A wider range generally produces a more realistic simulation.
```javascript
import http from 'k6/http';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';
import { sleep } from 'k6';

export default function () {
  http.get('http://example.com/api/resource1');
  sleep(randomIntBetween(1, 5));
  http.get('http://example.com/api/resource2');
  sleep(randomIntBetween(1, 5));
  http.get('http://example.com/api/resource3');
}
```

## Antipattern 2: Lack of Data Variation

Using static, hardcoded data for requests can cause caching mechanisms to produce artificially high performance metrics. In this example, the same username is used for every request, which may not represent real-world scenarios.

```javascript
import http from 'k6/http';

export default function () {
  const payload = JSON.stringify({
    username: 'username', // Static username used in every request
    password: 'password',
  });

  http.post('http://example.com/api/login', payload);
}
```

### Solution

Use dynamic data or randomization to simulate different user scenarios. This example generates a unique username for each virtual user.

```javascript
import http from 'k6/http';
import exec from 'k6/execution';

export default function () {
  const payload = JSON.stringify({
    username: `username${exec.vu.idInTest}`, // Unique per-VU identifier guarantees a unique username
    password: 'password',
  });

  http.post('http://example.com/api/login', payload);
}
```

## Antipattern 3: Not Scaling Virtual Users

Running performance tests with unrealistic numbers of virtual users, or ramping up too quickly, can lead to inaccurate results. In this example, the test starts with 1000 VUs immediately.
```javascript
import http from 'k6/http';

export const options = {
  vus: 1000,
  duration: '1m',
};

export default function () {
  http.get('http://example.com/api/resource');
}
```

### Solution

Executors control how K6 schedules VUs and iterations. The executor you choose depends on the goals of your test and the type of traffic you want to model. For example, the `ramping-vus` executor gradually increases the number of VUs over a specified duration, allowing for more realistic load testing for specific test types.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  discardResponseBodies: true,
  scenarios: {
    contacts: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '20s', target: 10 },
        { duration: '10s', target: 0 },
      ],
      gracefulRampDown: '0s',
    },
  },
};

export default function () {
  http.get('http://example.com/api/resource');
  // Injecting sleep.
  // Sleep time is 500 ms; total iteration time is sleep + time to finish the request.
  sleep(0.5);
}
```

Based upon our test scenario inputs and results:

- The configuration defines 2 stages for a total test duration of 30 seconds.
- Stage 1 ramps up VUs linearly from 0 to the target of 10 over a 20-second duration.
- From the 10 VUs at the end of stage 1, stage 2 then ramps down VUs linearly to the target of 0 over a 10-second duration.
- Each iteration of the default function is expected to take roughly 515 ms, or ~2 iterations/s per VU.
- The iteration rate directly correlates with the number of VUs: each added VU increases the rate by about 2 iterations/s, and each removed VU reduces it by about 2 iterations/s.
- The example performed ~300 iterations over the course of the test.
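As a rough sanity check of the numbers above (an illustration, not part of the original test), the expected iteration count for a linear ramp can be estimated from the average VU count of each stage:

```javascript
// Rough estimate of total iterations for the ramping-vus scenario above.
// Assumes each VU completes one iteration roughly every 515 ms.
const ITERS_PER_VU_PER_SEC = 1 / 0.515;

function stageIterations(startVUs, targetVUs, durationSec) {
  // A linear ramp means the average VU count is the midpoint of start and target.
  const avgVUs = (startVUs + targetVUs) / 2;
  return avgVUs * ITERS_PER_VU_PER_SEC * durationSec;
}

const total =
  stageIterations(0, 10, 20) + // stage 1: ramp up to 10 VUs
  stageIterations(10, 0, 10); // stage 2: ramp down to 0 VUs

console.log(Math.round(total)); // → 291, in line with the ~300 iterations observed
```

The small gap between the estimate and the observed count comes from variance in real request durations.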
#### Chart representation of the test execution

![ramping-vus execution chart](../assets/images/examples/ramping-vus.png)

## Glossary

### **VU**

- Virtual User.

### **Think Time**

- The amount of time a script pauses during test execution to replicate delays experienced by real users while using an application.

### **Iteration**

- A single execution of the default function in a K6 script.

### **Average Test**

- Assesses how the system performs under a typical load for your system or application. Typical load might be a regular day in production or an average timeframe in your daily traffic.

### **Stress Test**

- Helps you discover how the system functions under load at peak traffic.

### **Spike Test**

- Verifies whether the system survives and performs under sudden and massive rushes of utilization.

### **Breakpoint Test**

- Discovers your system's limits.

### **Soak Test**

- A variation of the average-load test. The main difference is the test duration: in a soak test, the peak load is usually an average load, but it extends over several hours or even days.
--------------------------------------------------------------------------------
/recipes/automated-testing.md:
--------------------------------------------------------------------------------

# Automated Testing

##### Table of contents

[Glossary](#glossary)

[Testing best practices](#testing-best-practices)

[Types of automated tests](#types-of-automated-tests)

[Unit Tests](#unit-tests)

[Integration Tests](#integration-tests)

[API Tests](#api-tests)

[E2E Tests](#e2e-tests)

[Performance Tests](#performance-tests)

[Visual Tests](#visual-tests)

## Glossary

**Confidence** - describes the degree to which passing tests guarantee that the app is working.
**Determinism** - describes how easy it is to determine where the problem is based on a failing test.
**Use Case** - a potential scenario in which a system receives external input and responds to it. It defines the interactions between a role (user or another system) and a system to achieve a goal.
**Combinatorial Explosion** - the fast growth in the number of combinations that need to be tested when multiple business rules are involved.

## Testing best practices

### Quality over quantity

Don't focus on achieving a specific code coverage percentage.
While code coverage can help us identify uncovered parts of the codebase, it doesn't guarantee high confidence.

Instead, focus on identifying important paths of the application, especially from the user's perspective.
A user can be a developer using a shared function, a person interacting with the UI, or a client consuming the server app's JSON API.
Write tests to cover those paths in a way that gives confidence that each path, and each separate part of the path, works as expected.
---

Flaky tests that produce inconsistent results ruin confidence in the test suite, mask real issues, and are a source of frustration. Refactoring to address flakiness is crucial and should be a priority.
To deal with flaky tests adequately, it is important to know how to identify, fix, and prevent them:
- Common characteristics of flaky tests include inconsistency, false positives and negatives, and sensitivity to dependencies, timing, ordering, and environment.
- Typical causes of these characteristics are concurrency, timing/ordering problems, external dependencies, non-deterministic assertions, test environment instability, and poorly written test logic.
- Flaky tests can be detected by rerunning tests, running them in parallel, executing them in different environments, and analyzing test results.
- To fix and prevent further occurrences of flaky tests, the following steps can be taken: isolate tests, employ setup and cleanup routines, handle concurrency, configure a stable test environment, improve error handling, simplify testing logic, and proactively deal with the typical causes listed above.

---

Be careful with tests that alter database state. We want to be able to run tests in parallel, so do not write tests that depend on each other. Each test should be independent of the rest of the test suite.

---

Test for behavior, not implementation. Focus on writing tests that follow the business logic instead of the programming logic. Avoid repeating parts of the function implementation in the actual test assertion. That leads to tight coupling of tests with the internal implementation, and the tests will have to be fixed each time the logic changes.

---

Writing quality tests is hard, and it's easy to fall into common pitfalls such as testing that the database update function actually updates the database.
Start off simple, and as the application grows in complexity, it will become easier to determine what should be tested more thoroughly. It is perfectly fine to have a small test suite that covers the critical code and the essentials. Small suites run faster, which means they will be run more often.

## Types of Automated Tests

There are different approaches to testing, and depending on the boundaries of the test, we can split them into the following categories:

- **Unit Tests**
- **Integration Tests**
- **API Tests**
- **E2E Tests**
- **Load/Performance Tests**
- **Visual Tests**

*Note that some people may call these tests by different names, but for Studion internal purposes, this should be considered the naming convention.*

### Unit Tests

These are the most isolated tests that we can write. They should take a specific function/service/helper/module and test its functionality. Unit tests will usually require mocked data, but since we're testing that a specific input produces a specific output, the mocked data set should be minimal.

Unit testing is recommended for functions that contain a lot of logic and/or branching. It is convenient to test a specific function at the lowest level, so if the logic changes, we can make minimal changes to the test suite and/or mocked data.
#### When to use
- To test a unit that implements business logic and is isolated from side effects such as database interaction or HTTP request processing.
- To test a function or class method with multiple input-output permutations.

#### When **not** to use
- To test a unit that integrates different application layers, such as the persistence layer (database) or HTTP layer (see "Integration Tests"), or that performs disk I/O or communicates with an external system.

#### Best practices
- Unit tests should execute fast (<50ms).
- Use mocks and stubs through dependency injection (method or constructor injection).

#### Antipatterns
- Mocking infrastructure parts such as database I/O - instead, invert the control by using the `AppService`, `Command`, or `Query` to integrate the unit implementing business logic with the infrastructure layer of the application.
- Monkey-patching dependencies used by the unit - instead, pass the dependencies through the constructor or method, so that you can pass mocks or stubs in the test.

### Integration Tests

With these tests, we test how multiple components of the system behave together.

#### Infrastructure

Running the tests on test infrastructure should be preferred to mocking, unlike in unit tests. Ideally, a full application instance is run, to mimic real application behavior as closely as possible.
This usually includes running the application connected to a test database, inserting fake data into it during the test setup, and asserting on the resulting state of the database. This also means integration test code should have full access to the test infrastructure for querying.
> [!NOTE]
> Regardless of whether raw queries or the ORM are used, simple queries should be used to avoid introducing business logic within tests.
However, mocking can still be used when needed, for example when expecting side effects that call third-party services.

#### Entry points

Integration test entry points can vary depending on the application use cases. These include services, controllers, or the API. These are not set in stone and should be taken into account when making a decision. For example:
- A use case that can be invoked through multiple different protocols can be tested separately from them, to avoid duplication. The tradeoff in this case is the need to write some basic tests for each of the protocols.
- A use case that will only ever be invocable through a single protocol might benefit enough from being tested through that protocol alone. E.g. an HTTP API route test might eliminate the need for a lower-level controller/service test. This would also enable testing the auth layer integration within these tests, which might not otherwise be possible depending on the technology used.

Multiple approaches can be used within the same application depending on the requirements, to provide sufficient coverage.

#### Testing surface

**TODO**: do we want to write anything about mocking the DB data/seeds?

In these tests we should cover **at least** the following:
- **authorization** - make sure only logged-in users with the correct role/permissions can access the endpoint
- **success** - if we send correct data, the endpoint should return a response that contains correct data
- **failure** - if we send incorrect data, the endpoint should handle the exception and return an appropriate error status
- **successful change** - a successful request should make the appropriate change

If the endpoint contains a lot of logic where we need to mock a lot of different inputs, it might be a good idea to cover that logic with unit tests.
Unit tests will require less overhead and will provide better performance, while at the same time decoupling logic testing from endpoint testing.

#### When to use
- To verify the API endpoint performs authentication and authorization.
- To verify user permissions for that endpoint.
- To verify that invalid input is correctly handled.
- To verify the basic business logic is handled correctly, in both the expected success and failure cases.
- To verify infrastructure-related side effects, e.g. database changes or calls to third-party services.

#### When **not** to use
- For extensive testing of business logic permutations beyond fundamental scenarios. Integration tests carry more overhead to write compared to unit tests and can easily lead to a combinatorial explosion. Instead, unit tests should be used for thorough coverage of these permutations.
- For testing third-party services. We should assume they work as expected.

#### Best practices
- Test basic functionality and keep the tests simple.
- Prefer test infrastructure over mocking.
- If the tested endpoint makes database changes, verify that the changes were actually made.
- Assert that output data is correct.

#### Antipatterns
- Aiming for a code coverage percentage number. An app with 100% code coverage can still have bugs. Instead, focus on writing meaningful, quality tests.

### API Tests

With these tests, we want to make sure our API contract is valid and the API returns the expected data. That means we write tests for the publicly available endpoints.

> [!NOTE]
> As mentioned in the Integration Tests section, the API can be the entry point to integration tests, meaning API tests are a subtype of integration tests.
However, when we talk about API tests here, we are specifically referring to public API contract tests, which don't have access to the internals of the application.

In cases where API routes are covered extensively with integration tests, API tests might not be needed, leaving more time for QA to focus on E2E tests.
However, in more complex architectures (e.g. integration-tested microservices behind an API gateway), API tests can be very useful.

#### When to use
- To make sure the API signature is valid.

#### When **not** to use
- To test application logic.

#### Best practices
- Write these tests with tools that allow us to reuse them for performance tests (K6).

#### Antipatterns

### E2E Tests

E2E tests are executed in a browser environment using tools like Playwright, Cypress, or similar frameworks. The purpose of these tests is to make sure that interacting with the application UI produces the expected result, verifying the application's functionality from a user's perspective.

Usually, these tests cover a large portion of the codebase with the least amount of code. Because of that, they can be the first tests added to an existing project that has no tests or low test coverage.

These tests should not cover all of the use cases because they are the slowest to run. If we need to test edge cases, we should try to implement those at a lower level (integration or unit tests).

#### When to use
- To validate user interactions and critical workflows in the application UI.
- For testing specific user flows.
- For making sure that critical application features are working as expected.
- For better coverage of the most common user pathways.

#### When **not** to use
- For data validation.
#### Best practices
- Tests should be atomic and simple; overly complicated tests should be thrown out.
- Focus on the most important user workflows rather than attempting exhaustive coverage.
- Each test should be able to run independently, with the environment reset to a known state before every test.
- Performance is key in these tests. We want to run tests as often as possible, and good performance will allow that.
- Flaky tests should be immediately disabled and refactored. Flaky tests will cause the team to ignore or bypass the test suite, so they should be dealt with immediately.
- Ensure consistent data states to avoid test failures due to variability in backend systems or environments.
- Run tests in parallel and isolate them from external dependencies to improve speed and reliability.
- Automate E2E tests in your CI/CD pipeline to catch regressions early in the deployment process.

#### Antipatterns
- Avoid trying to cover all use cases or edge cases in E2E tests; these are better suited for unit or integration tests.

### Performance Tests

Performance tests replicate typical user scenarios and then scale up to simulate concurrent users. They measure key performance metrics such as response time, throughput, error rate, and resource utilization. These tests help uncover bottlenecks and identify specific endpoints or processes that require optimization.

Performance tests are supposed to be run in a production-like environment since they test the performance of code **and** infrastructure. It's essential to consider real user behavior when designing and running these tests. The best practice is to create a clone of the production environment for testing purposes, avoiding potential disruption to actual users.
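To make the listed metrics concrete, the sketch below derives throughput and a p95 response time from raw request durations. The sample numbers are invented, and in practice a tool like K6 computes and reports these metrics for you.

```javascript
// Compute throughput and the 95th-percentile response time from raw samples.
// Durations are in milliseconds; this uses the nearest-rank percentile method.
function summarize(durationsMs, testDurationSec) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const p95 = sorted[Math.ceil(0.95 * sorted.length) - 1];
  return {
    throughput: durationsMs.length / testDurationSec, // requests per second
    p95, // 95% of requests completed at or below this duration
  };
}

const samples = [120, 95, 310, 101, 99, 1450, 130, 88, 105, 97];
console.log(summarize(samples, 10)); // → { throughput: 1, p95: 1450 }
```

Note how a single slow outlier dominates the p95 here while leaving the average largely intact, which is why percentiles are preferred over averages when analyzing results.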
#### When to use
- To stress test the application's infrastructure.
- To evaluate the app's behavior and performance under increasing traffic.
- To identify and address bottlenecks or resource limitations in the application.
- To ensure the application can handle anticipated peak traffic or usage patterns.

#### When **not** to use
- To verify functional requirements or application features.
- To test a specific user scenario.

#### Best practices
- Establish clear goals. Are you testing scalability, stability, or responsiveness? Without these objectives, tests risk being unfocused, resulting in meaningless data.
- Include diverse scenarios that represent different user journeys across the system, not just a single performance test/scenario.
- Use a clone of the production environment to ensure the infrastructure matches real-world conditions, including hardware, network, and database configurations.
- Schedule performance tests periodically or before major releases to catch regressions early.
- Record and analyze test outcomes to understand trends over time, identify weak points, and track improvements.
- Performance testing should not be a one-time task; it should be an ongoing process integrated into the development lifecycle.

#### Antipatterns
- Running these tests locally or in an environment that doesn't match production in terms of infrastructure performance. Tests should be developed on a local instance, but the actual measurements should be performed live.
- Ignoring data variability; ensure the test data mirrors real-world conditions, including varying user inputs and dataset sizes.
- Ignoring randomness in user behavior; ensure the tests mimic actual user behavior, including realistic click frequency, page navigation patterns, and input actions.

#### [Antipattern Examples](/examples/performance-testing-antipattern-examples.md)

### Visual Tests

The type of test where the test runner navigates to a browser page, takes a snapshot, and compares it with a reference snapshot.

Visual tests allow you to quickly cover large portions of the application, ensuring that changes in the UI are detected without writing complex test cases. The downside is that they require engineers to invest time in identifying the root cause of errors.

#### When to use
- When we want to make sure there are no changes in the UI.

#### When **not** to use
- To test a specific feature or business logic.
- To test a specific user scenario.

#### Best practices
- Ensure the UI consistently renders the same output by eliminating randomness (e.g., by always using the same seed data or controlling API responses to always return the same values).
- Add as many pages as possible but keep the tests simple.
- Consider running visual tests at the component level to isolate and detect issues earlier.
- Define acceptable thresholds for minor visual differences (e.g., pixel tolerance) to reduce noise while detecting significant regressions.

#### Antipatterns
- Avoid creating overly complicated visual tests that try to simulate user behavior. These are better suited for E2E testing.
- Visual tests should complement, not replace, other types of tests like E2E tests. Over-relying on them can leave functional gaps in coverage.
- Blindly updating snapshots without investigating failures undermines the purpose of visual testing and risks missing real issues.

--------------------------------------------------------------------------------
/recipes/circleci-build-guide.md:
--------------------------------------------------------------------------------

# CircleCI Build Guide Setup

The following page provides a "getting started" example of a CircleCI config.
This config is used on a few active Studion projects. However, there are plans to make this guide obsolete by creating a Studion orb and a CLI script that generates the config.yml file automatically. Until then, we can use this as a reference for learning how to set up CircleCI, especially since the Studion orb(s) will be based on this setup.

## The application

This guide assumes the application has the following components:
- SPA frontend
- Dockerfile for building the server
- Pulumi config for infrastructure

This guide also assumes we are hosting the app on AWS.

## `./.circleci/config.yml`

CircleCI uses the `config.yml` file to define the tasks it will perform.
The file should be placed inside the `.circleci` directory in the project root.
Knowing YAML syntax is a prerequisite for writing a CircleCI config.

For better clarity, we will look at separate config blocks and describe what they do.

### Config Init

Here we define the config version and dependencies.

Orbs are CircleCI packages that allow us to define the build process in a simple and easy way. Read more about orbs at https://circleci.com/orbs/.

For this app, we need the `aws-cli`, `aws-ecr`, `node` and `pulumi` orbs.
34 | 35 | 36 | ```yaml 37 | version: 2.1 38 | orbs: 39 | aws-cli: circleci/aws-cli@4.1.3 40 | aws-ecr: circleci/aws-ecr@9.0.4 41 | node: circleci/node@5.2.0 42 | pulumi: pulumi/pulumi@2.1.0 43 | 44 | executors: 45 | node: 46 | docker: 47 | - image: cimg/node:16.20.2 48 | base: 49 | docker: 50 | - image: cimg/base:stable-20.04 51 | ``` 52 | 53 | ### AWS Credentials 54 | 55 | In case we have multiple AWS credentials, we can define them at the beginning and 56 | reuse them where applicable. In this example, we have Studion AWS account and client 57 | AWS account credentials. 58 | 59 | ```yaml 60 | studion-aws-credentials: &studion-aws-credentials 61 | access_key: STUDION_AWS_ACCESS_KEY 62 | secret_key: STUDION_AWS_SECRET_KEY 63 | region: ${STUDION_AWS_REGION} 64 | 65 | client-aws-credentials: &client-aws-credentials 66 | access_key: CLIENT_AWS_ACCESS_KEY 67 | secret_key: CLIENT_AWS_SECRET_KEY 68 | region: ${CLIENT_AWS_REGION} 69 | ``` 70 | 71 | Note that we used YAML anchor here so we can reuse the credentials objects. 72 | Also, note that `access_key` and `secret_key` just contain the name of the env 73 | variable while `region` contains the actual value of the env variable. 74 | 75 | Environment variables are configured in CircleCI project settings 76 | within the CircleCI application. 77 | 78 | 79 | ### Job 1: Build Frontend 80 | 81 | This step pulls the code, injects secret .npmrc file, installs npm packages and 82 | runs build process. Finally, the output is persisted to workspace so we can upload 83 | it to S3 later in the build process. 
84 | 85 | ```yaml 86 | jobs: 87 | build-frontend: 88 | working_directory: ~/app 89 | executor: node 90 | steps: 91 | - checkout 92 | - run: 93 | command: echo "@fortawesome:registry=https://npm.fontawesome.com/" > ~/app/.npmrc 94 | - run: 95 | command: echo "//npm.fontawesome.com/:_authToken=${FA_TOKEN}" >> ~/app/.npmrc 96 | - node/install-packages: 97 | override-ci-command: npm ci 98 | - run: 99 | name: Build frontend 100 | command: npm run build 101 | - persist_to_workspace: 102 | root: . 103 | paths: 104 | - dist 105 | ``` 106 | 107 | In this example, we have .npmrc file that contains the auth token for Font Awesome Pro 108 | package. This is how we can construct that file so `npm install` can install all 109 | required packages. 110 | 111 | 112 | ### Job 2: Build server 113 | 114 | This step pulls the code and uses AWS ECR orb to build Docker image and push it to 115 | private AWS registry. 116 | 117 | ```yaml 118 | build-server: 119 | working_directory: ~/app 120 | executor: 121 | name: aws-ecr/default 122 | docker_layer_caching: true 123 | parameters: 124 | access_key: 125 | type: string 126 | secret_key: 127 | type: string 128 | region: 129 | type: string 130 | account_id: 131 | type: string 132 | ecr_repo: 133 | type: string 134 | resource_class: medium 135 | steps: 136 | - checkout 137 | - run: 138 | command: echo "@fortawesome:registry=https://npm.fontawesome.com/" > ~/app/.npmrc 139 | - run: 140 | command: echo "//npm.fontawesome.com/:_authToken=${FA_TOKEN}" >> ~/app/.npmrc 141 | - aws-ecr/build_and_push_image: 142 | auth: 143 | - aws-cli/setup: 144 | aws_access_key_id: << parameters.access_key >> 145 | aws_secret_access_key: << parameters.secret_key >> 146 | region: << parameters.region >> 147 | account_id: << parameters.account_id >> 148 | attach_workspace: true 149 | checkout: false 150 | extra_build_args: "--secret id=npmrc_secret,src=.npmrc --target server" 151 | region: << parameters.region >> 152 | repo: << parameters.ecr_repo >> 153 | 
repo_encryption_type: KMS 154 | tag: latest,${CIRCLE_SHA1} 155 | ``` 156 | 157 | Note that this step accepts parameters which will be passed later when we define 158 | the complete workflow. 159 | 160 | More info about AWS ECR orb can be found here: 161 | https://circleci.com/developer/orbs/orb/circleci/aws-ecr 162 | 163 | 164 | ### Job 3: Deploy infrastructure 165 | 166 | This part calls Pulumi to set up AWS resources. 167 | 168 | ```yaml 169 | deploy-aws: 170 | working_directory: ~/app 171 | executor: node 172 | parameters: 173 | access_key: 174 | type: string 175 | secret_key: 176 | type: string 177 | region: 178 | type: string 179 | account_id: 180 | type: string 181 | ecr_repo: 182 | type: string 183 | stack: 184 | type: string 185 | steps: 186 | - checkout 187 | - aws-cli/setup: 188 | aws_access_key_id: << parameters.access_key >> 189 | aws_secret_access_key: << parameters.secret_key >> 190 | region: << parameters.region >> 191 | - pulumi/login 192 | - node/install-packages: 193 | app-dir: ./infrastructure 194 | - run: 195 | name: Configure envs 196 | command: | 197 | echo 'export SERVER_IMAGE="<< parameters.account_id >>.dkr.ecr.<< parameters.region >>.amazonaws.com/<< parameters.ecr_repo >>:${CIRCLE_SHA1}"' >> "$BASH_ENV" 198 | source "$BASH_ENV" 199 | - pulumi/update: 200 | stack: "<< parameters.stack >>" 201 | working_directory: ./infrastructure 202 | skip-preview: true 203 | - pulumi/stack_output: 204 | stack: "<< parameters.stack >>" 205 | property_name: frontendBucketName 206 | env_var: S3_SITE_BUCKET 207 | working_directory: ./infrastructure 208 | - pulumi/stack_output: 209 | stack: "<< parameters.stack >>" 210 | property_name: cloudfrontId 211 | env_var: CF_DISTRIBUTION_ID 212 | working_directory: ./infrastructure 213 | - run: 214 | name: Store pulumi output as env file 215 | command: cp $BASH_ENV bash.env 216 | - persist_to_workspace: 217 | root: . 
218 | paths: 219 | - bash.env 220 | ``` 221 | 222 | Note that this step assumes that Pulumi files are located in `infrastructure` 223 | directory in project root. 224 | 225 | We export `SERVER_IMAGE` env variable which is used in Pulumi to create an ECS 226 | service with that image. Notice we're missing .env files. That is because we put 227 | all secrets in AWS SSM Parameter Store and we configured our Pulumi ECS service 228 | to pull the secrets from there. 229 | 230 | Pulumi needs to be configured so it outputs at least two variables: 231 | 232 | 1. S3 bucket name where we will upload built frontend from job 1 233 | 2. Cloudfront Distribution ID which we'll use to invalidate its cache 234 | 235 | 236 | Both variables are stored in `bash.env` file and that file is persisted to workspace 237 | because that is the easiest way of carrying those variables over to the next step. 238 | 239 | ### Job 4: Deploy Frontend 240 | 241 | This is the step where we upload the frontend dist files from job 1 to S3 bucket 242 | that was created in job 3. 243 | 244 | ```yaml 245 | deploy-frontend: 246 | working_directory: ~/app 247 | parameters: 248 | access_key: 249 | type: string 250 | secret_key: 251 | type: string 252 | region: 253 | type: string 254 | executor: base 255 | steps: 256 | - attach_workspace: 257 | at: . 
258 | - aws-cli/setup: 259 | aws_access_key_id: << parameters.access_key >> 260 | aws_secret_access_key: << parameters.secret_key >> 261 | region: << parameters.region >> 262 | - run: 263 | name: Set environment variables 264 | command: cat bash.env >> $BASH_ENV 265 | - run: 266 | name: Deploy to S3 267 | command: | 268 | aws s3 sync dist s3://${S3_SITE_BUCKET} --no-progress --delete 269 | aws cloudfront create-invalidation --distribution-id ${CF_DISTRIBUTION_ID} --paths "/*" 270 | 271 | ``` 272 | 273 | ### Workflow definition 274 | 275 | Workflow is used to orchestrate different jobs and configure job dependencies, 276 | for example: we need to wait for the infrastructure deployment before we can upload 277 | files to S3 (which is supposed to be created in that job). 278 | In this example we can see that we run this workflow only when the branch name is 279 | `develop`. 280 | 281 | ```yaml 282 | workflows: 283 | version: 2 284 | build-and-deploy-dev: 285 | when: 286 | and: 287 | - equal: [develop, << pipeline.git.branch >>] 288 | jobs: 289 | - build-frontend 290 | - build-server: 291 | <<: *studion-aws-credentials 292 | account_id: ${STUDION_AWS_ACCOUNT_ID} 293 | ecr_repo: app_server 294 | - deploy-aws: 295 | <<: *studion-aws-credentials 296 | account_id: ${STUDION_AWS_ACCOUNT_ID} 297 | ecr_repo: app_server 298 | stack: dev 299 | requires: 300 | - build-server 301 | - deploy-frontend: 302 | <<: *studion-aws-credentials 303 | requires: 304 | - build-frontend 305 | - deploy-aws 306 | ``` 307 | 308 | Note that job params are set for each job and here we can use AWS credentials which 309 | we defined at the beginning of the file. We can also see that some jobs can run in 310 | parallel, for example: frontend and backend builds don't depend on each other and 311 | that is how we can speed up the build process. 
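To see what the merge keys do: the `<<: *studion-aws-credentials` entry merges the anchored mapping defined at the top of the file into the job's parameters. Conceptually, the `build-server` entry above expands to the following (illustrative only):

```yaml
# Expanded form of `<<: *studion-aws-credentials` merged into the job params
- build-server:
    access_key: STUDION_AWS_ACCESS_KEY
    secret_key: STUDION_AWS_SECRET_KEY
    region: ${STUDION_AWS_REGION}
    account_id: ${STUDION_AWS_ACCOUNT_ID}
    ecr_repo: app_server
```

This keeps the credential wiring in one place, so switching a workflow to a different AWS account only means swapping the anchor.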
312 | 313 | ### Workflow definition part 2 314 | 315 | In the previous steps we defined job parameters that allow us to 316 | easily build different environments, for example, staging. 317 | 318 | Everything remains the same; we just need to change some variables and we can 319 | easily deploy to as many environments as we want. 320 | 321 | ```yaml 322 | build-and-deploy-stage: 323 | when: 324 | and: 325 | - equal: [stage, << pipeline.git.branch >>] 326 | jobs: 327 | - build-frontend 328 | - build-server: 329 | <<: *client-aws-credentials 330 | account_id: ${CLIENT_AWS_ACCOUNT_ID} 331 | ecr_repo: app_server 332 | - deploy-aws: 333 | <<: *client-aws-credentials 334 | account_id: ${CLIENT_AWS_ACCOUNT_ID} 335 | ecr_repo: app_server 336 | stack: stage 337 | requires: 338 | - build-server 339 | - deploy-frontend: 340 | <<: *client-aws-credentials 341 | requires: 342 | - build-frontend 343 | - deploy-aws 344 | ``` 345 | -------------------------------------------------------------------------------- /recipes/docker-image-guide.md: -------------------------------------------------------------------------------- 1 | # Docker image guide 2 | 3 | The following handbook offers best practices for creating small and secure NodeJs 4 | Docker images suitable for production use. 5 | You will find it helpful no matter what type of NodeJs application you aim to build. 6 | 7 | Examples with each step explained will be used to guide you through best practices. 8 | 9 | ## Simple NodeJs application 10 | 11 | ### The application 12 | 13 | Let's start with a simple NodeJs application.
Here is an overview of the files 14 | included in the project: 15 | 16 | ``` 17 | ├── index.js 18 | ├── package.json 19 | ├── package-lock.json 20 | ├── Dockerfile 21 | ├── .dockerignore 22 | ├── .npmrc 23 | ``` 24 | 25 | ```js 26 | // index.js 27 | const express = require("express"); 28 | const os = require("os"); 29 | 30 | const app = express(); 31 | 32 | app.use("/", (req, res) => { 33 | res.send(`Hello world from ${os.hostname()}.`); 34 | }); 35 | 36 | app.listen(3000, () => { 37 | console.log("App is listening on port 3000"); 38 | }); 39 | ``` 40 | 41 | ### The Dockerfile 42 | 43 | ```dockerfile 44 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#1-use-explicit-and-deterministic-docker-base-image-tags 45 | # https://snyk.io/blog/choosing-the-best-node-js-docker-image 46 | FROM node:20.9-bookworm-slim@sha256:c325fe5059c504933948ae6483f3402f136b96492dff640ced5dfa1f72a51716 AS base 47 | # https://docs.docker.com/build/cache/#combine-commands-together-wherever-possible 48 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#5-properly-handle-events-to-safely-terminate-a-nodejs-docker-web-application 49 | # https://github.com/Yelp/dumb-init 50 | RUN apt update && apt install -y --no-install-recommends dumb-init 51 | ENTRYPOINT ["dumb-init", "--"] 52 | 53 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS install 54 | # https://www.pathname.com/fhs/pub/fhs-2.3.html#USRSRCSOURCECODE 55 | WORKDIR /usr/src/app 56 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#3-optimize-nodejs-tooling-for-production 57 | ENV NODE_ENV production 58 | COPY package*.json . 
59 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#2-install-only-production-dependencies-in-the-nodejs-docker-image 60 | # when NODE_ENV is set to production, npm ci automatically omits dev dependencies 61 | # https://docs.npmjs.com/cli/v10/commands/npm-ci#omit 62 | # NOTE: if we don't have secrets, this is how we install npm packages, however if we 63 | # do have npmrc secret, we skip this step and proceed to the next. 64 | RUN npm ci --omit=dev 65 | # we can mount .npmrc secret file without leaving the secrets in the final built image 66 | # refer to docs https://docs.docker.com/build/building/secrets/ 67 | RUN --mount=type=secret,id=npmrc_secret,target=/usr/src/app/.npmrc,required npm ci --omit=dev 68 | 69 | FROM base AS configure 70 | WORKDIR /usr/src/app 71 | COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules 72 | # https://docs.docker.com/build/cache/#dont-include-unnecessary-files 73 | COPY --chown=node:node ./index.js . 74 | 75 | FROM configure AS run 76 | ENV NODE_ENV production 77 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#4-dont-run-containers-as-root 78 | USER node 79 | CMD [ "node", "index.js" ] 80 | ``` 81 | 82 | ### Important notes: 83 | 84 | 1. **Always** specify `.dockerignore` files to reduce security risks and image 85 | footprint size. Also, by avoiding sending unwanted files to the builder, 86 | build speed is improved. The file should at least include 87 | `node_modules`, `.git` and `.env` files. 88 | 89 | 2. [The order of Dockerfile instructions matters](https://docs.docker.com/build/guide/layers/). 90 | 91 | 3. `FROM node:20.9-bookworm-slim@sha256:c32...16 AS base` 92 | 93 | - Selecting the appropriate Docker image is crucial to achieve minimal resource 94 | utilization and minimize vulnerability risks. 95 | - It is recommended to always use official docker images even though they are 96 | not the smallest ones. 
An excellent illustration of this is the "alpine" image, 97 | which, while having a minimal footprint, has only experimental support. 98 | - Include the image's sha256 hash to ensure the same image is always downloaded. 99 | - Use explicit and deterministic Docker base image tags to improve readability 100 | and maintainability. 101 | - Currently, the official `bookworm-slim` image appears to be the most suitable 102 | choice for a Node.js runner image, given its minimal size (nearly equivalent 103 | to the Alpine version). Read more about choosing the best NodeJs Docker image 104 | [here](https://snyk.io/blog/choosing-the-best-node-js-docker-image). 105 | 106 | 4. `RUN apt update && apt install -y --no-install-recommends dumb-init` 107 | 108 | - NodeJs is not designed to be a PID 1 process, so we are using a process wrapper 109 | to handle termination signals for us instead of doing it manually inside our 110 | NodeJs application. 111 | - Notice how these two commands (apt update and apt install dumb-init) are chained. 112 | By doing so we save some image footprint size, because each Docker image step 113 | adds an additional layer, which affects the final size. That being said, it is 114 | recommended to chain RUN commands into a single command whenever possible. 115 | 116 | 5. `ENTRYPOINT ["dumb-init", "--"]` 117 | 118 | - As explained above, we are using dumb-init as the PID 1 process. 119 | 120 | 6. `FROM node:20.9-bookworm@sha256:3c...603 AS install` 121 | 122 | - For dependency installation (and later for the build phase), we use the 123 | standard `bookworm` image instead of the `slim` version because certain 124 | dependencies require additional tools for the compilation step. 125 | 126 | 7. `WORKDIR /usr/src/app` 127 | 128 | - Application code should be placed inside a `/usr/src` subdirectory. 129 | 130 | 8. 
`ENV NODE_ENV production` 131 | 132 | - If you are building your image for production, this ensures that all frameworks 133 | and libraries are using the optimal settings for performance and security. 134 | 135 | 9. `COPY package*.json .` 136 | 137 | - It's important to notice here that we are copying `package*.json` files 138 | separately from the rest of the codebase. By doing so, we are leveraging the Docker 139 | layer caching functionality mentioned in step 2. When the source code 140 | changes, we don't want to reinstall dependencies because they remain unchanged. 141 | By copying source code files after dependency installation, only the steps from 142 | the source-copy step onward are re-executed. 143 | 144 | 10. `RUN npm ci --omit=dev` 145 | 146 | - devDependencies are not essential for the application to work. By installing 147 | only production dependencies we are reducing security risks and image 148 | footprint size and also improving build speed. 149 | 150 | 11. `COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules` 151 | 152 | - From the install phase we are copying only the node_modules folder in order 153 | to keep the final Docker image minimal. 154 | - The default Docker behavior is that copied files are owned by `root`. 155 | By specifying `--chown=node:node` we ensure that the `node_modules` files 156 | are owned by the `node` user instead of `root`. 157 | `node` is the least privileged user and by selecting it, we limit the 158 | number of actions an attacker can perform in case our application gets compromised. 159 | 160 | 12. `COPY --chown=node:node ./index.js .` 161 | 162 | - Copy the rest of the codebase as described in step 9. For this example, we 163 | are copying only the `index.js` file because that is the only file we need in 164 | order to run our application. Avoid adding unnecessary files to your builds by 165 | explicitly stating the files or directories you intend to copy over.
166 | 167 | 13. `USER node` 168 | 169 | - The process should be owned by the `node` user instead of `root`. 170 | 171 | 14. `RUN --mount=type=secret,id=npmrc_secret,target=/usr/src/app/.npmrc,required npm ci --omit=dev` 172 | 173 | - The files mounted as secrets will be available during build, but they will not 174 | remain in the final image. The secret can be any file, but npmrc is most common 175 | so we use it as an example. 176 | To be able to use the secret, we must pass it either as a param to Docker build or 177 | define it in Docker compose. 178 | 179 | - Docker build example: 180 | `docker build -t ntc-lms . --secret id=npmrc_secret,src=.npmrc` 181 | 182 | - Docker compose.yaml example: 183 | ```yaml 184 | services: 185 | app: 186 | build: 187 | context: . 188 | secrets: 189 | - npmrc_secret 190 | 191 | ... 192 | 193 | secrets: 194 | npmrc_secret: 195 | file: .npmrc 196 | ``` 197 | 198 | 199 | ## Typescript NodeJs application 200 | 201 | ### The application 202 | 203 | ``` 204 | ├── src 205 | │ ├── index.ts 206 | ├── dist 207 | │ ├── index.js 208 | ├── node_modules 209 | ├── tsconfig.json 210 | ├── package.json 211 | ├── package-lock.json 212 | ├── Dockerfile 213 | ├── .dockerignore 214 | ``` 215 | 216 | ### The Dockerfile 217 | 218 | ```dockerfile 219 | FROM node:20.9-bookworm-slim@sha256:c325fe5059c504933948ae6483f3402f136b96492dff640ced5dfa1f72a51716 AS base 220 | RUN apt update && apt install -y --no-install-recommends dumb-init 221 | ENTRYPOINT ["dumb-init", "--"] 222 | 223 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS build 224 | WORKDIR /usr/src/app 225 | COPY package*.json . 226 | RUN npm ci 227 | COPY ./src tsconfig.json ./ 228 | RUN npm run build 229 | 230 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS install 231 | WORKDIR /usr/src/app 232 | ENV NODE_ENV production 233 | COPY package*.json . 
234 | RUN npm ci --omit=dev 235 | 236 | FROM base AS configure 237 | WORKDIR /usr/src/app 238 | COPY --chown=node:node --from=build /usr/src/app/dist ./dist 239 | COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules 240 | 241 | FROM configure AS run 242 | ENV NODE_ENV production 243 | USER node 244 | CMD [ "node", "dist/index.js" ] 245 | ``` 246 | 247 | ### Important notes: 248 | 249 | 1. [Use multi-stage builds](https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#8-use-multi-stage-builds). 250 | By splitting the docker image into multiple stages, we are ensuring that the 251 | final image only contains essential files which reduces image footprint size 252 | and security risks. In the given example, we begin by building a TypeScript 253 | application in the build stage. 254 | Ultimately, in the configure phase, we exclusively copy the `dist` 255 | output from the build stage to the final Docker image. 256 | Another benefit of using a multi-stage build is that the Docker builder will 257 | work out dependencies between the stages and run them using the most 258 | efficient strategy. This even allows you to run multiple builds concurrently. 259 | 260 | ## Bonus Tips 261 | 262 | ### Caching 263 | 264 | Often, Docker images are built inside CI/CD pipeline. To enhance the efficiency 265 | of CI/CD and minimize computation costs, leveraging caching is crucial. 266 | 267 | Caching depends on the platform that is being used. 268 | For detailed guidance on caching Docker image layers with CircleCI, refer to 269 | this [link](https://circleci.com/docs/docker-layer-caching). 270 | 271 | Also, on this 272 | [link](https://courses.devopsdirective.com/docker-beginner-to-pro/lessons/06-building-container-images/02-api-node-dockerfile#use-a-cache-mount-to-speed-up-dependency-installation-%EF%B8%8F) 273 | you can find how to cache npm dependencies between builds. 
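Related to the npm-dependency caching link above: with BuildKit, a cache mount can persist npm's download cache between builds so repeat installs avoid re-downloading packages. A minimal sketch (assuming the install stage runs as root, so npm's default cache directory is `/root/.npm`):

```dockerfile
# Sketch: keep npm's download cache between builds via a BuildKit cache mount.
# The mount exists only during this RUN step and is not part of the final image.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
```

Unlike a regular layer, the cache mount survives even when the `package*.json` layer is invalidated, so a changed lockfile still benefits from previously downloaded tarballs.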
274 | 275 | ## Resources 276 | 277 | - [NodeJs Docker Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#nodejs-docker-cheat-sheet) 278 | - [Choosing the best NodeJs Docker image](https://snyk.io/blog/choosing-the-best-node-js-docker-image) 279 | - [Docker guide](https://docs.docker.com/build/guide/) 280 | - [10 best practices to containerize NodeJs web applications with Docker](https://snyk.io/blog/10-best-practices-to-containerize-nodejs-web-applications-with-docker/) 281 | - [NodeJs API Dockerfile](https://courses.devopsdirective.com/docker-beginner-to-pro/lessons/06-building-container-images/02-api-node-dockerfile#nodejs-api-dockerfile) 282 | -------------------------------------------------------------------------------- /recipes/handling-concurrency.md: -------------------------------------------------------------------------------- 1 | # Handling concurrency using optimistic or pessimistic locking 2 | 3 | Concurrency issues arise when multiple entities, such as services, threads, or users, access or modify the same resource simultaneously, leading to race conditions. 4 | 5 | ![Concurrency](../assets/images/concurrency.png) 6 | 7 | Race conditions are very hard to track and debug. 8 | Typically, we develop sequential applications where concurrency concerns are minimal. 9 | Concurrency is a key factor in boosting performance; it is utilized to increase throughput or reduce execution times, and is likely unnecessary in the absence of specific performance requirements. 10 | However, concurrency issues can also occur from parallel requests to the same resources. 11 | For instance, when two users try to reserve the same inventory item at the same time, it highlights a scenario where concurrency management becomes crucial, regardless of performance considerations. 12 | 13 | Addressing concurrency within a monolithic architecture differs significantly from managing it in distributed systems. 
In this article, we will concentrate on strategies for handling concurrency in monolithic applications, reflecting a more common scenario within our company. 14 | 15 | There are multiple approaches to managing concurrency, including database isolation levels and design-centric solutions like end-to-end partitioning. In this article, we will specifically explore the concepts of optimistic and pessimistic locking. 16 | 17 | We'll also explore a practical example of optimistic locking. Let's dive into it! 18 | 19 | ## Pessimistic locking vs Optimistic locking 20 | 21 | ![Pessimistic vs Optimistic locking](../assets/images/pessimistic-vs-optimistic-locking.png) 22 | 23 | The concepts of pessimistic versus optimistic locking can be metaphorically compared to asking for permission versus apologizing afterward. In pessimistic locking, a lock is placed on the resource, requiring all consumers to ASK for permission before they can modify it. On the other hand, optimistic locking does not involve placing a lock. Instead, consumers proceed under the assumption that they are the sole users of the resource and APOLOGIZE in case of conflict. 24 | 25 | In the pessimistic lock approach, actual locks are used at the database level, while in the optimistic approach, no locks are used, resulting in higher throughput. 26 | 27 | Optimistic locking, generally speaking, is easier to implement and offers performance benefits but can become costly in scenarios where the chance of conflict is high. 28 | As the probability of conflict increases, the chances of transaction abortion also rise. Rollbacks can be costly for the system as it needs to revert all current pending changes, which might involve both table rows and index records. 29 | 30 | In situations with a high risk of conflict, it might be better to use pessimistic locking in the first place rather than doing rollbacks and subsequent retries, which can put additional load on the system.
31 | However, pessimistic locking can affect the performance and scalability of your application by maintaining locks on database rows or tables and may also lead to deadlocks if not managed carefully. 32 | 33 | **Steps for Implementing Pessimistic Locking:** 34 | 35 | 1. Retrieve and lock the resource from the database (using "SELECT FOR UPDATE") 36 | 2. Apply the necessary changes 37 | 3. Save the changes by committing the transaction, which also releases the locks. 38 | 39 | **Steps for Implementing Optimistic Locking:** 40 | 41 | ![Optimistic locking](../assets/images/optimistic-locking.png) 42 | 43 | 1. Retrieve the resource from the database 44 | 2. Apply the necessary changes 45 | 3. Save the changes to the database; an error is thrown in case of a version mismatch, otherwise, the version is incremented 46 | 47 | ## Example 48 | 49 | Let's see optimistic locking in action. 50 | We'll illustrate its application through an example where users can reserve inventory items for a specified period. 51 | Although I'm utilizing MikroORM and PostgreSQL for this demonstration, it's worth noting that optimistic locking can be implemented with nearly any database. 52 | 53 | Potentially, we could encounter a scenario where users attempt to reserve the same inventory item simultaneously for overlapping time periods, which could lead to issues. 54 | 55 | I would say that the majority of the work here lies in the aggregate design. 56 | I've structured it so that the `InventoryItem` entity acts as the aggregate root, containing `Reservations`. This design mandates that the creation of reservations must proceed through the `InventoryItem` aggregate root, which encapsulates this specific logic. 57 | Upon retrieving an inventory item from the database, it will include all current reservations for that item, enabling us to apply business logic to determine if there is any overlapping reservation that conflicts with the one we intend to create. 
If there is no conflict, we proceed with the creation. This method centralizes the reservation creation logic, thereby ensuring consistency. 59 | 60 | For optimistic locking, we need a field to determine if an entity has changed since we retrieved it from the database. I used an integer `version` field that increments with each modification. 61 | 62 | Here is the code for the `InventoryItem` aggregate root: 63 | 64 | ```ts 65 | // inventory-item.entity.ts 66 | @Entity() 67 | export class InventoryItem extends AggregateRoot { 68 | @Property({ version: true }) 69 | readonly version: number = 1; 70 | 71 | @OneToMany({ 72 | entity: () => Reservation, 73 | mappedBy: (it) => it.inventoryItem, 74 | orphanRemoval: true, 75 | eager: true, 76 | }) 77 | private _reservations = new Collection<Reservation>(this); 78 | 79 | createReservation(startDate: Date, endDate: Date, userId: number) { 80 | const overlappingReservation = 81 | this.reservations.some(/** Find overlapping reservation logic */); 82 | if (overlappingReservation) { 83 | throw new ReservationOverlapException(); 84 | } 85 | const reservation = new Reservation(this, startDate, endDate, userId); 86 | this._reservations.add(reservation); 87 | this.updatedAt = new Date(); 88 | } 89 | 90 | get reservations() { 91 | return this._reservations.getItems(); 92 | } 93 | } 94 | ``` 95 | 96 | By applying `@Property({ version: true })` we instruct MikroORM to treat this field as the version field. 97 | MikroORM will handle incrementing the version field and will throw an `OptimisticLockError` in the case of a conflict.
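Conceptually, a version-based optimistic lock boils down to a conditional `UPDATE`. The exact SQL MikroORM emits may differ, but the flush behaves roughly like this:

```sql
-- Conceptual sketch: the update only succeeds if the version is unchanged
-- since the row was read.
UPDATE inventory_item
SET updated_at = now(), version = version + 1
WHERE id = $1 AND version = $2;
-- If 0 rows are affected, another transaction modified the row first,
-- and the ORM surfaces this as an optimistic-lock error.
```

No row-level lock is held between the read and the write; the conflict is detected only at write time, which is exactly the "apologize afterward" behavior described above.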
98 | 99 | ```ts 100 | // reservation.entity.ts 101 | @Entity() 102 | export class Reservation extends BaseEntity { 103 | @Property() 104 | startDate: Date; 105 | 106 | @Property() 107 | endDate: Date; 108 | 109 | @ManyToOne({ 110 | entity: () => InventoryItem, 111 | serializer: (it) => it.id, 112 | serializedName: "inventoryItemId", 113 | fieldName: "inventory_item_id", 114 | }) 115 | inventoryItem!: InventoryItem; 116 | 117 | @Property() 118 | userId: number; 119 | 120 | constructor( 121 | inventoryItem: InventoryItem, 122 | startDate: Date, 123 | endDate: Date, 124 | userId: number 125 | ) { 126 | super(); 127 | this.inventoryItem = inventoryItem; 128 | this.startDate = startDate; 129 | this.endDate = endDate; 130 | this.userId = userId; 131 | } 132 | } 133 | ``` 134 | 135 | Now, within the `createReservation` method, all we need to do is: 136 | 137 | - Retrieve the inventory item entity from the repository 138 | - Create a reservation by invoking the inventoryItem.createReservation method 139 | - Flush the changes to the database 140 | 141 | ```ts 142 | // create-reservation.command.ts 143 | @RetryOnError(OptimisticLockError) 144 | @CreateRequestContext() 145 | async createReservation(payload: CreateReservationPayload): Promise<void> { 146 | const { 147 | userId, 148 | inventoryItemId: id, 149 | startDate, 150 | endDate, 151 | } = payload; 152 | const inventoryItem = await this.repository.findById(id); 153 | inventoryItem.createReservation(startDate, endDate, userId); 154 | await this.em.flush(); 155 | } 156 | ``` 157 | 158 | You might have noticed that in the case of an `OptimisticLockError`, I used a custom `@RetryOnError` decorator to retry the operation. This approach is adopted because users may attempt to reserve the same inventory item for different time periods, leading to an `OptimisticLockError` for one of the requests. By retrying, we ensure that the end user is not aware of multiple concurrent requests occurring simultaneously.
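`@RetryOnError` is custom application code, so its exact implementation lives in the example repository. A minimal sketch of how such a decorator could be written (the name, signature, and default retry count here are illustrative, not the repository's actual code):

```typescript
// Illustrative sketch of a retry decorator; the real implementation may differ.
// Retries the wrapped async method when it throws the given error class.
function RetryOnError(
  errorClass: new (...args: any[]) => Error,
  maxRetries = 3
) {
  return function (
    _target: object,
    _propertyKey: string,
    descriptor: PropertyDescriptor
  ): PropertyDescriptor {
    const original = descriptor.value;
    descriptor.value = async function (...args: unknown[]) {
      for (let attempt = 1; ; attempt++) {
        try {
          return await original.apply(this, args);
        } catch (error) {
          // Only retry the expected error class, and give up after maxRetries.
          if (!(error instanceof errorClass) || attempt >= maxRetries) {
            throw error;
          }
        }
      }
    };
    return descriptor;
  };
}
```

Because the whole method body is re-run on each attempt, the entity is re-fetched with its new version, so the retried reservation is validated against the latest state.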
159 | 160 | In this scenario, we could also leverage database transaction isolation levels, like SERIALIZABLE, since this transaction does not span across multiple requests. However, there is often a requirement for long-running business processes that span multiple requests. In these situations, database transactions alone are insufficient for managing concurrency throughout such an extended business transaction. For these cases, optimistic locking proves to be a highly suitable solution. 161 | 162 | You can find the full working example [here](https://github.com/ikovac/teem-clone/tree/master/apps/api/src/reservation). 163 | 164 | Also, if you are interested in implementing cross-request optimistic locking, check out the [MikroORM documentation](https://mikro-orm.io/docs/transactions#optimistic-locking). 165 | -------------------------------------------------------------------------------- /recipes/lti.md: -------------------------------------------------------------------------------- 1 | # LTI - Theory 2 | 3 | ## What is LTI? 4 | 5 | LTI (Learning Tools Interoperability) is a standard developed by the IMS Global Learning Consortium, designed to enable seamless integration of educational tools and platforms. The LTI standard allows learning management systems (LMS) or other platforms to integrate remote tools and content in a standard way without logging into each tool. 6 | 7 | LTI allows platforms to host numerous integrations. By using LTI, tools can be developed independently from the platform by different parties. 8 | 9 | **This explanation sounds a little bit overwhelming. Can you explain it to me like I am 10?** 10 | 11 | Sure! Imagine you go to a school that has its own website where you can find all your lessons, videos, and tests. Every student has their own account, and you can see your scores and progress on this website. This special website is called the Learning Management System, or LMS.
12 | 13 | Now, think about this: a new, fun quiz game web app has just come out, and your school wants to use it in class. Normally, you would have to go to the quiz game’s website, create a new account, and then log in to play. But wouldn’t it be easier if you could play the quiz game right from your school’s LMS, without making a new account? 14 | 15 | That’s exactly what the LTI standard helps with! LTI allows the quiz game to be added to your school’s LMS. This way, you can play the quiz without leaving the LMS and without creating a new account. You just log in to the LMS, and you can access everything you need with one username and password. 16 | 17 | ## Key concepts 18 | 19 | - **Platform**. A *tool platform* or, more simply, *platform* has traditionally been a Learning Management System (LMS), but it may be any kind of platform that needs to delegate bits of functionality out to a suite of *tools*. Examples of platforms are LMS systems such as Coursera, edX, Moodle, Blackboard, and Canvas. 20 | - **Tool**. The external application or service providing functionality to the *platform* is called a *tool*. 21 | 22 | 26 | 27 | ## Example: LTI from the learner's perspective 28 | 29 | Let's explore LTI from the learner's perspective. 30 | 31 | Firstly, let's introduce the tool. It's a very simple web app with just a few pages: a home page and some resource pages. 32 | 33 | ![lti-home.png](../assets/images/lti/lti-home.png) 34 | 35 | ![lti-resource1.png](../assets/images/lti/lti-resource1.png) 36 | 37 | On the LMS side, we have lessons, and when a user clicks on the lesson reading link, it launches our LTI tool, which provides the content for Resource 1. This process is called an LTI launch. 38 | 39 | ![lms-lessons.png](../assets/images/lti/lms-lessons.png) 40 | 41 | ![lms-tool-launch.png](../assets/images/lti/lms-tool-launch.png) 42 | 43 | The tool's content is embedded within an iframe, so the user doesn't even realize the content is coming from another source. 
This creates a seamless experience, making it feel like part of the LMS, and the user never leaves the LMS platform. Additionally, the user doesn't need to log in to the tool itself. There are other techniques for integrating an LTI tool, but using an iframe is the most common method. 44 | 45 | **The platform acts as the OIDC provider in the LTI process, meaning the tool doesn't need to know anything about the platform’s identity provider. For example, if the platform uses Auth0 for authentication, the tool doesn't need to know anything about Auth0 because the platform issues the ID token, thereby serving as the OIDC provider from the tool's perspective.** 46 | 47 | **LTI messages sent from the platform are *OpenID Tokens*. Messages sent from the tool are *JSON Web Tokens* (JWT), as the tool is not typically acting as an OpenID Provider.** 48 | 49 | ## LTI 1.1 vs 1.3 50 | 51 | The most popular LTI versions are 1.1 and 1.3. Version 1.1 is deprecated, so this article will focus on the latest version, 1.3, which offers additional capabilities and improvements. Some key benefits of version 1.3 compared to 1.1 include security enhancements: 52 | 53 | - LTI 1.1 uses OAuth 1.0, which has been deprecated for years due to insufficient protection. In practice, controlling access to the secret is very difficult. 54 | - LTI 1.3, on the other hand, uses OAuth 2.0 and JWT message signing protocols. 55 | - Additionally, LTI 1.3 adopts the OpenID Connect workflow for authentication with every launch. 56 | 57 | You can read about all the benefits of version 1.3 [here](https://brijendrasinghrajput.medium.com/lti-1-3-benefits-over-lti-1-1-a1a37e94bc5b). 58 | 59 | ## LTI components 60 | 61 | ![lti-advantage.png](../assets/images/lti/lti-advantage.png) 62 | 63 | The message for embedding content into the platform (LTI launch) is referred to as [LTI Core](https://www.imsglobal.org/spec/lti/v1p3). Additional services are deep linking, names and roles, and assignments and grades.
Together, these services are called LTI Advantage. 64 | 65 | - [_Deep Linking_](https://www.imsglobal.org/spec/lti-dl/v2p0/): the ability to launch a tool’s configuration panel that will return a configured resource link to the platform based on what the user has set up. The next time a user accesses the link, they will be taken to the configured activity/resource instead of the configuration panel. 66 | - [_Assignment and Grading Services_](https://www.imsglobal.org/spec/lti-ags/v2p0/) (AGS): the ability to post grades generated by the tool back to the platform’s grade book. 67 | - [_Names and Role Provisioning Services_](http://www.imsglobal.org/spec/lti-nrps/v2p0) (NRPS): the ability of the platform to provide the tool with a student list and user information. 68 | 69 | ## Implementation 70 | 71 | 1. [Creating an LTI-compatible Tool](https://github.com/ExtensionEngine/lti-tool-example) 72 | 2. Creating an LTI-compatible Platform - TBA 73 | -------------------------------------------------------------------------------- /recipes/ses-bounce-handling.md: -------------------------------------------------------------------------------- 1 | # SES Bounce handling 2 | 3 | ## Context 4 | 5 | Amazon is very strict about the rules for its email service, SES. If you have too many bounces or complaints, resulting in an unhealthy sending status, you can easily receive a service block. 6 | 7 | That's why it's important to handle bounces and complaints accordingly. 8 | 9 | **What are bounces and complaints?** 10 | 11 | Bounces occur when an email cannot be delivered to the recipient for various reasons. There are two types of bounces: hard bounces and soft bounces. 12 | 13 | 1. Hard Bounces: 14 | 15 | - Hard bounces are caused by permanent issues, such as an invalid email address, a non-existent domain, or the recipient's email server blocking the message. 16 | - **Impact**: Consistently sending emails to invalid or non-existent addresses can harm your sender reputation.
A poor sender reputation may lead to your emails being marked as spam or rejected by email service providers, impacting your overall deliverability. 17 | 18 | 2. Soft Bounces: 19 | 20 | - Soft bounces result from temporary issues, such as the recipient's mailbox being full or the email server being temporarily unavailable. 21 | - Unlike hard bounces, soft bounces provide an opportunity to retry delivering the email. 22 | 23 | Complaints arise when recipients mark your emails as spam or unwanted. It's important to promptly process this feedback, investigate the reasons for complaints, and take corrective actions, such as removing complaining recipients from your mailing list. 24 | 25 | ## Building a simple SES bounce handling example 26 | 27 | What are we going to build? 28 | 29 | ![Infrastructure diagram](../assets/images/infrastructure-diagram.png) 30 | 31 | We are going to build two bounce handlers: the first will send an alert email upon detecting a bounce, and the second will add bounced emails to a block list. 32 | The application can subsequently verify whether the recipient's email is listed in the block list. If it is, the system will refrain from sending an email to that address. 33 | 34 | You can find the complete working code [here](https://github.com/ikovac/ses-bounce-handling). 35 | 36 | We are going to use [Pulumi](https://www.pulumi.com/) to create and manage our infrastructure. 37 | 38 | ### SNS Topic 39 | 40 | Let's start with creating a simple Pulumi program. 41 | We need to create an SNS topic, which is used to receive SES notification messages. 42 | 43 | ```typescript 44 | import * as aws from "@pulumi/aws"; 45 | import * as pulumi from "@pulumi/pulumi"; 46 | 47 | export const snsTopic = new aws.sns.Topic("ses-notifications", { 48 | fifoTopic: false, 49 | namePrefix: "ses-notifications", 50 | }); 51 | ``` 52 | 53 | Once we have created the SNS topic, we have to tell SES to use that topic for sending notification messages.
To do so, navigate to the AWS Console. Go to the SES service -> Verified identities -> Select your identity -> Notifications tab -> Feedback notifications -> Edit. 54 | 55 | ![SES SNS topic](../assets/images/ses-sns-topic.png) 56 | 57 | Select your newly created SNS topic for bounce feedback and click "Save changes". 58 | 59 | ### SNS Topic Subscription 60 | 61 | Now, with SES notifications routed to the SNS topic, we can proceed to create our initial handler. This handler will be responsible for sending notification emails when bounces occur. 62 | 63 | ```typescript 64 | const config = new pulumi.Config(); 65 | 66 | export const subscription = new aws.sns.TopicSubscription( 67 | "send-email-notification-handler", 68 | { 69 | topic: snsTopic.arn, 70 | protocol: "email-json", 71 | endpoint: config.require("email"), 72 | filterPolicyScope: "MessageBody", 73 | filterPolicy: JSON.stringify({ notificationType: ["Bounce"] }), 74 | } 75 | ); 76 | ``` 77 | 78 | In the code snippet above, we created a `sns.TopicSubscription` resource that uses the `email-json` protocol. Additionally, by defining a `filterPolicy`, we ensure that emails are sent only for messages of the type `Bounce`. 79 | After deploying the current code, you will receive a subscription confirmation email. Please confirm it to start receiving email notifications. 80 | 81 | Now we can test our handler by sending a test email. 82 | Navigate to your SES identity page and click the "Send test email" button. 83 | 84 | ![SES test email](../assets/images/test-email.png) 85 | 86 | If everything goes as expected, you should receive an email notification containing bounce details 🥳. 87 | 88 | Although the current handler notifies relevant parties about a bounce, it doesn't take preventive measures for the future. Let's develop a block list handler to address this issue effectively.
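Before wiring up the block list handler, it helps to look at the shape of the message SES publishes. The sketch below extracts hard-bounced recipients from a bounce notification; the interface covers only the handful of fields used here (`notificationType`, `bounce.bounceType`, `bouncedRecipients`), and the sample payload in the comments is abridged.

```typescript
// Extract recipients of hard (permanent) bounces from an SES bounce
// notification, so that transient (soft) bounces can be retried instead
// of being block-listed.
interface SesBounceNotification {
  notificationType: "Bounce";
  bounce: {
    bounceType: "Permanent" | "Transient" | "Undetermined";
    bouncedRecipients: { emailAddress: string }[];
  };
}

function hardBouncedRecipients(rawMessage: string): string[] {
  const notification = JSON.parse(rawMessage) as SesBounceNotification;
  // Only permanent bounces should end up on the block list.
  if (notification.bounce.bounceType !== "Permanent") return [];
  return notification.bounce.bouncedRecipients.map((r) => r.emailAddress);
}
```

In the Lambda shown later in this guide, the same parsing happens per SQS record, with one extra `JSON.parse` step to unwrap the SNS envelope (`body.Message`).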
89 | 90 | ### SQS Queue 91 | 92 | ```typescript 93 | const queue = new aws.sqs.Queue("ses-notifications-queue", { 94 | fifoQueue: false, 95 | namePrefix: "ses-notifications-queue", 96 | sqsManagedSseEnabled: true, 97 | }); 98 | 99 | const allowSNSToQueueMessages = aws.iam.getPolicyDocumentOutput({ 100 | statements: [ 101 | { 102 | sid: "AllowSNSToQueueMessages", 103 | effect: "Allow", 104 | actions: ["sqs:SendMessage"], 105 | resources: [queue.arn], 106 | principals: [ 107 | { 108 | type: "*", 109 | identifiers: ["*"], 110 | }, 111 | ], 112 | conditions: [ 113 | { 114 | test: "ArnEquals", 115 | variable: "aws:SourceArn", 116 | values: [snsTopic.arn], 117 | }, 118 | ], 119 | }, 120 | ], 121 | }); 122 | 123 | const allowSNSToQueueMessagesPolicy = new aws.sqs.QueuePolicy( 124 | "allow-sns-to-queue-messages-policy", 125 | { 126 | queueUrl: queue.id, 127 | policy: allowSNSToQueueMessages.apply((policy) => policy.json), 128 | } 129 | ); 130 | 131 | const addToBlockListHandler = new aws.sns.TopicSubscription( 132 | "add-to-block-list-handler", 133 | { 134 | topic: snsTopic.arn, 135 | protocol: "sqs", 136 | endpoint: queue.arn, 137 | filterPolicyScope: "MessageBody", 138 | filterPolicy: JSON.stringify({ notificationType: ["Bounce"] }), 139 | } 140 | ); 141 | ``` 142 | 143 | The code snippet above creates an SQS queue and another `sns.TopicSubscription`, which forwards bounce messages to our SQS queue. 144 | 145 | While it is possible to directly attach a Lambda function to the SNS topic, it is advisable to introduce a queue in between. 146 | 147 | The primary advantage of incorporating an SQS (Simple Queue Service) queue between SNS and Lambda is the ability to reprocess messages. By adding the message to a dead-letter queue, we can reprocess it at a later time, something not achievable with direct SNS-to-Lambda integration. 148 | 149 | Another benefit of using SQS is cost efficiency in Lambda invocations.
This approach allows for more efficient scaling and reduced costs, as it enables the processing of messages in batches. 150 | 151 | In the snippet above, we also created the `AllowSNSToQueueMessages` policy, which allows SNS to enqueue messages. 152 | 153 | ⚠ Please take note that, with `notificationType: ["Bounce"]`, we are currently filtering only Bounce messages for the purpose of this tutorial. However, in real-world scenarios, it is advisable to handle Complaint messages as well. Additionally, it is recommended to distinguish between hard and soft bounces and implement a retry mechanism in the latter case. 154 | 155 | ### DynamoDB 156 | 157 | Let's proceed by creating a simple DynamoDB table with only an email column, which will be used to store bounced emails. 158 | 159 | ```typescript 160 | const dynamoTable = new aws.dynamodb.Table("block-list-table", { 161 | name: "blocklist", 162 | attributes: [ 163 | { 164 | name: "email", 165 | type: "S", 166 | }, 167 | ], 168 | hashKey: "email", 169 | readCapacity: 1, 170 | writeCapacity: 1, 171 | }); 172 | ``` 173 | 174 | ### AWS Lambda 175 | 176 | Before creating the Lambda function, it's essential to set up the necessary execution role. This role will provide the Lambda function with the required permissions to access DynamoDB and other services such as CloudWatch.
177 | 178 | ```typescript 179 | const assumeRolePolicy = aws.iam.getPolicyDocument({ 180 | statements: [ 181 | { 182 | effect: "Allow", 183 | actions: ["sts:AssumeRole"], 184 | principals: [ 185 | { 186 | type: "Service", 187 | identifiers: ["lambda.amazonaws.com"], 188 | }, 189 | ], 190 | }, 191 | ], 192 | }); 193 | 194 | const iamForLambda = new aws.iam.Role("lambda-execution-role", { 195 | name: "LambdaExecutionRole", 196 | assumeRolePolicy: assumeRolePolicy.then((policy) => policy.json), 197 | }); 198 | 199 | new aws.iam.RolePolicyAttachment("execution-role-policy-attachment", { 200 | role: iamForLambda.name, 201 | policyArn: 202 | "arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole", 203 | }); 204 | 205 | const allowLambdaToAccessDynamoDb = aws.iam.getPolicyDocumentOutput({ 206 | statements: [ 207 | { 208 | effect: "Allow", 209 | actions: ["dynamodb:*"], 210 | resources: [dynamoTable.arn], 211 | }, 212 | ], 213 | }); 214 | 215 | const allowLambdaToAccessDynamoDbPolicy = new aws.iam.Policy( 216 | "allow-lambda-to-access-dynamo-db-policy", 217 | { 218 | name: "AllowLambdaToAccessDynamoDb", 219 | policy: allowLambdaToAccessDynamoDb.apply((policy) => policy.json), 220 | } 221 | ); 222 | 223 | new aws.iam.RolePolicyAttachment("lambda-dynamodb-policy-attachment", { 224 | role: iamForLambda.name, 225 | policyArn: allowLambdaToAccessDynamoDbPolicy.arn, 226 | }); 227 | ``` 228 | 229 | Now we can create our AWS Lambda handler responsible for processing queued messages. 
230 | 231 | ```typescript 232 | const codeZip = new pulumi.asset.AssetArchive({ 233 | "index.mjs": new pulumi.asset.FileAsset("./lambda.mjs"), 234 | }); 235 | 236 | export const lambda = new aws.lambda.Function("add-to-block-list-lambda", { 237 | name: "add-to-block-list-handler", 238 | code: codeZip, 239 | role: iamForLambda.arn, 240 | handler: "index.handler", 241 | runtime: "nodejs20.x", 242 | environment: { 243 | variables: { 244 | TABLE_NAME: dynamoTable.name, 245 | }, 246 | }, 247 | }); 248 | ``` 249 | 250 | The Lambda code is available in the `lambda.mjs` file. 251 | 252 | ```javascript 253 | import { DynamoDBClient } from "@aws-sdk/client-dynamodb"; 254 | import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb"; 255 | 256 | const client = new DynamoDBClient({}); 257 | const docClient = DynamoDBDocumentClient.from(client); 258 | 259 | export const handler = async (event) => { 260 | const records = event.Records; 261 | const emails = records.reduce((acc, it) => { 262 | const body = JSON.parse(it.body); 263 | const message = JSON.parse(body.Message); 264 | const bouncedRecipients = message.bounce.bouncedRecipients.map( 265 | (r) => r.emailAddress 266 | ); 267 | return [...acc, ...bouncedRecipients]; 268 | }, []); 269 | 270 | const pResult = emails.map((email) => { 271 | const command = new PutCommand({ 272 | TableName: process.env.TABLE_NAME, 273 | Item: { email }, 274 | }); 275 | return docClient.send(command); 276 | }); 277 | await Promise.all(pResult); 278 | }; 279 | ``` 280 | 281 | Finally, we can map the SQS queue to our Lambda function: 282 | 283 | ```typescript 284 | new aws.lambda.EventSourceMapping("sqs-lambda-mapping", { 285 | eventSourceArn: queue.arn, 286 | functionName: lambda.arn, 287 | }); 288 | ``` 289 | 290 | By sending a test email again, we can see that `bounce@simulator.amazonses.com` has been successfully added to the blocklist table 🥳🥳🥳.
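With the table populated, the application can check the block list before sending, as described earlier. A minimal sketch of that guard is below; the in-memory `Set` stands in for the real lookup, which in the actual app would be a DynamoDB `GetCommand` against the `blocklist` table (the function name and signature are illustrative).

```typescript
// Application-side guard: skip sending when the recipient is on the
// block list. The Set is a stand-in for a DynamoDB lookup.
function sendEmailUnlessBlocked(
  recipient: string,
  blockList: Set<string>,
  send: (recipient: string) => void
): boolean {
  // Recipient previously hard-bounced: refrain from sending.
  if (blockList.has(recipient)) return false;
  send(recipient);
  return true;
}
```

This keeps known-bad addresses out of your sending volume, protecting the SES sender reputation discussed in the Context section.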
291 | 292 | You can find the complete working code [here](https://github.com/ikovac/ses-bounce-handling). 293 | 294 | ## Final words 295 | 296 | The purpose of this guide is not to offer a one-size-fits-all solution for every project but rather to present a simple demo handler. This demonstration aims to illustrate that handling bounces and complaints is not that hard. 297 | 298 | Before implementing a bounce and complaint handler, it is recommended to investigate the solution that best fits your specific project requirements. For instance, instead of storing bounced emails in DynamoDB, you may opt for an HTTP `sns.TopicSubscription` to notify your system about bounces. Subsequently, you can handle these notifications within your system accordingly. 299 | 300 | Additionally, for production use, it is highly advisable to incorporate a dead-letter queue. This ensures the ability to reprocess messages that may have failed. 301 | --------------------------------------------------------------------------------