├── README.md
├── assets
│   └── images
│       ├── concurrency.png
│       ├── examples
│       │   └── ramping-vus.png
│       ├── infrastructure-diagram.png
│       ├── logo.png
│       ├── lti
│       │   ├── lms-lessons.png
│       │   ├── lms-tool-launch.png
│       │   ├── lti-advantage.png
│       │   ├── lti-home.png
│       │   └── lti-resource1.png
│       ├── optimistic-locking.png
│       ├── pessimistic-vs-optimistic-locking.png
│       ├── ses-sns-topic.png
│       └── test-email.png
├── examples
│   └── performance-testing-antipattern-examples.md
└── recipes
    ├── automated-testing.md
    ├── circleci-build-guide.md
    ├── docker-image-guide.md
    ├── handling-concurrency.md
    ├── lti.md
    └── ses-bounce-handling.md

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------

<div align="center">
  <img src="./assets/images/logo.png" alt="Studion Logo">

  <h3>Welcome to the Studion Platform Tech Guide repository</h3>

  <p>The central place for all common packages</p>
</div>

## 📦 Package List

1. [Prettier config](https://github.com/ExtensionEngine/prettier-config) - Studion Prettier config
2. [Infra Code Blocks](https://github.com/ExtensionEngine/infra-code-blocks) - Studion common infra components

## Recipes

1. [Docker image guide](./recipes/docker-image-guide.md)
2. [AWS SES bounce handling](./recipes/ses-bounce-handling.md)
3. [Handling concurrency using optimistic or pessimistic locking](./recipes/handling-concurrency.md)
4. [LTI - Learning Tools Interoperability Protocol](./recipes/lti.md)
5. [CircleCI Build Guide](./recipes/circleci-build-guide.md)

## Guides

1. [Automated Testing](./recipes/automated-testing.md)

## Examples

1. [Performance Testing - Antipattern Examples](./examples/performance-testing-antipattern-examples.md)

## 🙌 Want to contribute?

We are open to all kinds of contributions. If you want to:

- 🤔 Suggest an idea
- 🐛 Report an issue
- 📖 Improve documentation
- 👨‍💻 Contribute to the code

You are more than welcome.
--------------------------------------------------------------------------------
/assets/images/concurrency.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/concurrency.png
--------------------------------------------------------------------------------
/assets/images/examples/ramping-vus.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/examples/ramping-vus.png
--------------------------------------------------------------------------------
/assets/images/infrastructure-diagram.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/infrastructure-diagram.png
--------------------------------------------------------------------------------
/assets/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/logo.png
--------------------------------------------------------------------------------
/assets/images/lti/lms-lessons.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lms-lessons.png
--------------------------------------------------------------------------------
/assets/images/lti/lms-tool-launch.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lms-tool-launch.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-advantage.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-advantage.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-home.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-home.png
--------------------------------------------------------------------------------
/assets/images/lti/lti-resource1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/lti/lti-resource1.png
--------------------------------------------------------------------------------
/assets/images/optimistic-locking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/optimistic-locking.png
--------------------------------------------------------------------------------
/assets/images/pessimistic-vs-optimistic-locking.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/pessimistic-vs-optimistic-locking.png
--------------------------------------------------------------------------------
/assets/images/ses-sns-topic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/ses-sns-topic.png
--------------------------------------------------------------------------------
/assets/images/test-email.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ExtensionEngine/tech-guide/0c4ed0c9b1bff4976caba468345a396757f9c61c/assets/images/test-email.png
--------------------------------------------------------------------------------
/examples/performance-testing-antipattern-examples.md:
--------------------------------------------------------------------------------

# Performance Testing - Antipattern Examples

## Antipattern 1: Ignoring Think Time

Excluding think time between user actions can produce unrealistic performance metrics for certain types of tests, such as average, stress, and soak tests. Think time is less critical for breakpoint and spike tests, as other parameters can control these scenarios effectively. Incorporating think time is crucial when simulating real user behavior based on user scenarios. In this example, user actions are executed without any delay, which does not accurately reflect real-world conditions for this type of test.

```javascript
import http from 'k6/http';

export default function () {
  http.get('http://example.com/api/resource1');
  http.get('http://example.com/api/resource2');
  http.get('http://example.com/api/resource3');
}
```

### Solution

Introduce think time between user actions to simulate real user behavior. This example adds a random delay of 1 to 5 seconds between requests. A wider range generally produces a more realistic simulation.
```javascript
import http from 'k6/http';
import { randomIntBetween } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';
import { sleep } from 'k6';

export default function () {
  http.get('http://example.com/api/resource1');
  sleep(randomIntBetween(1, 5));
  http.get('http://example.com/api/resource2');
  sleep(randomIntBetween(1, 5));
  http.get('http://example.com/api/resource3');
}
```

## Antipattern 2: Lack of Data Variation

Using static, hardcoded data for requests can cause caching mechanisms to produce artificially high performance metrics. In this example, the same username is used for every request, which may not represent real-world scenarios.

```javascript
import http from 'k6/http';

export default function () {
  const payload = JSON.stringify({
    username: 'username', // Static username used in every request
    password: 'password',
  });

  http.post('http://example.com/api/login', payload);
}
```

### Solution

Use dynamic data or randomization to simulate different user scenarios. This example generates a unique username for each virtual user.

```javascript
import http from 'k6/http';
import exec from 'k6/execution';

export default function () {
  const payload = JSON.stringify({
    username: `username${exec.vu.idInTest}`, // Unique per-VU identifier guarantees a unique username
    password: 'password',
  });

  http.post('http://example.com/api/login', payload);
}
```

## Antipattern 3: Not Scaling Virtual Users

Running performance tests with unrealistic numbers of virtual users, or ramping up too quickly, can lead to inaccurate results. In this example, the test starts with 1000 VUs immediately.
```javascript
import http from 'k6/http';

export const options = {
  vus: 1000,
  duration: '1m',
};

export default function () {
  http.get('http://example.com/api/resource');
}
```

### Solution

Executors control how K6 schedules VUs and iterations. The executor you choose depends on the goals of your test and the type of traffic you want to model. For example, the `ramping-vus` executor gradually increases the number of VUs over a specified duration, allowing for more realistic load testing for specific test types.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  discardResponseBodies: true,
  scenarios: {
    contacts: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '20s', target: 10 },
        { duration: '10s', target: 0 },
      ],
      gracefulRampDown: '0s',
    },
  },
};

export default function () {
  http.get('http://example.com/api/resource');
  // Injecting sleep.
  // Sleep time is 500 ms; total iteration time is sleep + time to finish the request.
  sleep(0.5);
}
```

Based upon our test scenario inputs and results:

- The configuration defines 2 stages for a total test duration of 30 seconds.
- Stage 1 ramps up VUs linearly from 0 to the target of 10 over a 20-second duration.
- From the 10 VUs at the end of stage 1, stage 2 then ramps down VUs linearly to the target of 0 over a 10-second duration.
- Each iteration of the default function is expected to take roughly 515 ms, or ~2 iterations/s per VU.
- The iteration rate directly correlates with the number of VUs: each added VU increases the rate by about 2 iterations/s, and each removed VU reduces it by about 2 iterations/s.
- The example performed ~300 iterations over the course of the test.
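As a rough sanity check of the numbers above (an illustration, not part of the original test), the expected iteration count for a linear ramp can be estimated from the average VU count of each stage:

```javascript
// Rough estimate of total iterations for the ramping-vus scenario above.
// Assumes each VU completes one iteration roughly every 515 ms.
const ITERS_PER_VU_PER_SEC = 1 / 0.515;

function stageIterations(startVUs, targetVUs, durationSec) {
  // A linear ramp means the average VU count is the midpoint of start and target.
  const avgVUs = (startVUs + targetVUs) / 2;
  return avgVUs * ITERS_PER_VU_PER_SEC * durationSec;
}

const total =
  stageIterations(0, 10, 20) + // stage 1: ramp up to 10 VUs
  stageIterations(10, 0, 10); // stage 2: ramp down to 0 VUs

console.log(Math.round(total)); // → 291, in line with the ~300 iterations observed
```

The small gap between the estimate and the observed count comes from variance in real request durations.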
#### Chart representation of the test execution

![ramping-vus execution chart](../assets/images/examples/ramping-vus.png)

## Glossary

### **VU**

- Virtual User.

### **Think Time**

- The amount of time a script pauses during test execution to replicate delays experienced by real users while using an application.

### **Iteration**

- A single execution of the default function in a K6 script.

### **Average Test**

- Assesses how the system performs under a typical load for your system or application. Typical load might be a regular day in production or an average timeframe in your daily traffic.

### **Stress Test**

- Helps you discover how the system functions under load at peak traffic.

### **Spike Test**

- Verifies whether the system survives and performs under sudden and massive rushes of utilization.

### **Breakpoint Test**

- Discovers your system's limits.

### **Soak Test**

- A variation of the average-load test. The main difference is the test duration: in a soak test, the peak load is usually an average load, but it extends over several hours or even days.
--------------------------------------------------------------------------------
/recipes/automated-testing.md:
--------------------------------------------------------------------------------

# Automated Testing

##### Table of contents

[Glossary](#glossary)

[Testing best practices](#testing-best-practices)

[Types of automated tests](#types-of-automated-tests)

[Unit Tests](#unit-tests)

[Integration Tests](#integration-tests)

[API Tests](#api-tests)

[E2E Tests](#e2e-tests)

[Performance Tests](#performance-tests)

[Visual Tests](#visual-tests)

## Glossary

**Confidence** - describes the degree to which passing tests guarantee that the app is working.
**Determinism** - describes how easy it is to determine where the problem is based on a failing test.
**Use Case** - a potential scenario in which a system receives external input and responds to it. It defines the interactions between a role (user or another system) and a system to achieve a goal.
**Combinatorial Explosion** - the fast growth in the number of combinations that need to be tested when multiple business rules are involved.

## Testing best practices

### Quality over quantity

Don't focus on achieving a specific code coverage percentage.
While code coverage can help us identify uncovered parts of the codebase, it doesn't guarantee high confidence.

Instead, focus on identifying important paths of the application, especially from the user's perspective.
A user can be a developer using a shared function, a person interacting with the UI, or a client consuming the server app's JSON API.
Write tests to cover those paths in a way that gives confidence that each path, and each separate part of the path, works as expected.
---

Flaky tests that produce inconsistent results ruin confidence in the test suite, mask real issues, and are a source of frustration. Refactoring to address flakiness is crucial and should be a priority.
To deal with flaky tests adequately, it is important to know how to identify, fix, and prevent them:
- Common characteristics of flaky tests include inconsistency, false positives and negatives, and sensitivity to dependencies, timing, ordering, and environment.
- Typical causes of these characteristics are concurrency, timing/ordering problems, external dependencies, non-deterministic assertions, test environment instability, and poorly written test logic.
- Flaky tests can be detected by rerunning tests, running them in parallel, executing them in different environments, and analyzing test results.
- To fix and prevent further occurrences of flaky tests, the following steps can be taken: isolate tests, employ setup and cleanup routines, handle concurrency, configure a stable test environment, improve error handling, simplify testing logic, and proactively deal with the typical causes listed above.

---

Be careful with tests that alter database state. We want to be able to run tests in parallel, so do not write tests that depend on each other. Each test should be independent of the rest of the test suite.

---

Test for behavior, not implementation. Focus on writing tests that follow the business logic instead of the programming logic. Avoid repeating parts of the function implementation in the actual test assertion. That leads to tight coupling of tests with the internal implementation, and the tests will have to be fixed each time the logic changes.

---

Writing quality tests is hard, and it's easy to fall into common pitfalls such as testing that the database update function actually updates the database.
Start off simple, and as the application grows in complexity, it will become easier to determine what should be tested more thoroughly. It is perfectly fine to have a small test suite that covers the critical code and the essentials. Small suites run faster, which means they will be run more often.

## Types of Automated Tests

There are different approaches to testing, and depending on the boundaries of the test, we can split them into the following categories:

- **Unit Tests**
- **Integration Tests**
- **API Tests**
- **E2E Tests**
- **Load/Performance Tests**
- **Visual Tests**

*Note that some people may call these tests by different names, but for Studion internal purposes, this should be considered the naming convention.*

### Unit Tests

These are the most isolated tests that we can write. They should take a specific function/service/helper/module and test its functionality. Unit tests will usually require mocked data, but since we're testing that a specific input produces a specific output, the mocked data set should be minimal.

Unit testing is recommended for functions that contain a lot of logic and/or branching. It is convenient to test a specific function at the lowest level, so if the logic changes, we can make minimal changes to the test suite and/or mocked data.
#### When to use
- To test a unit that implements business logic and is isolated from side effects such as database interaction or HTTP request processing.
- To test a function or class method with multiple input-output permutations.

#### When **not** to use
- To test a unit that integrates different application layers, such as the persistence layer (database) or HTTP layer (see "Integration Tests"), or that performs disk I/O or communicates with an external system.

#### Best practices
- Unit tests should execute fast (<50ms).
- Use mocks and stubs through dependency injection (method or constructor injection).

#### Antipatterns
- Mocking infrastructure parts such as database I/O - instead, invert the control by using the `AppService`, `Command`, or `Query` to integrate the unit implementing business logic with the infrastructure layer of the application.
- Monkey-patching dependencies used by the unit - instead, pass the dependencies through the constructor or method, so that you can pass mocks or stubs in the test.

### Integration Tests

With these tests, we test how multiple components of the system behave together.

#### Infrastructure

Running the tests on test infrastructure should be preferred to mocking, unlike in unit tests. Ideally, a full application instance is run, to mimic real application behavior as closely as possible.
This usually includes running the application connected to a test database, inserting fake data into it during the test setup, and asserting on the resulting state of the database. This also means integration test code should have full access to the test infrastructure for querying.
> [!NOTE]
> Regardless of whether raw queries or the ORM are used, simple queries should be used to avoid introducing business logic within tests.
However, mocking can still be used when needed, for example when expecting side effects that call third-party services.

#### Entry points

Integration test entry points can vary depending on the application use cases. These include services, controllers, or the API. These are not set in stone and should be taken into account when making a decision. For example:
- A use case that can be invoked through multiple different protocols can be tested separately from them, to avoid duplication. The tradeoff in this case is the need to write some basic tests for each of the protocols.
- A use case that will only ever be invocable through a single protocol might benefit enough from being tested through that protocol alone. E.g. an HTTP API route test might eliminate the need for a lower-level controller/service test. This would also enable testing the auth layer integration within these tests, which might not otherwise be possible depending on the technology used.

Multiple approaches can be used within the same application depending on the requirements, to provide sufficient coverage.

#### Testing surface

**TODO**: do we want to write anything about mocking the DB data/seeds?

In these tests we should cover **at least** the following:
- **authorization** - make sure only logged-in users with the correct role/permissions can access the endpoint
- **success** - if we send correct data, the endpoint should return a response that contains correct data
- **failure** - if we send incorrect data, the endpoint should handle the exception and return an appropriate error status
- **successful change** - a successful request should make the appropriate change

If the endpoint contains a lot of logic where we need to mock a lot of different inputs, it might be a good idea to cover that logic with unit tests.
Unit tests will require less overhead and will provide better performance, while at the same time decoupling logic testing from endpoint testing.

#### When to use
- To verify the API endpoint performs authentication and authorization.
- To verify user permissions for that endpoint.
- To verify that invalid input is correctly handled.
- To verify the basic business logic is handled correctly, in both the expected success and failure cases.
- To verify infrastructure-related side effects, e.g. database changes or calls to third-party services.

#### When **not** to use
- For extensive testing of business logic permutations beyond fundamental scenarios. Integration tests carry more overhead to write compared to unit tests and can easily lead to a combinatorial explosion. Instead, unit tests should be used for thorough coverage of these permutations.
- For testing third-party services. We should assume they work as expected.

#### Best practices
- Test basic functionality and keep the tests simple.
- Prefer test infrastructure over mocking.
- If the tested endpoint makes database changes, verify that the changes were actually made.
- Assert that output data is correct.

#### Antipatterns
- Aiming for a code coverage percentage number. An app with 100% code coverage can still have bugs. Instead, focus on writing meaningful, quality tests.

### API Tests

With these tests, we want to make sure our API contract is valid and the API returns the expected data. That means we write tests for the publicly available endpoints.

> [!NOTE]
> As mentioned in the Integration Tests section, the API can be the entry point to integration tests, meaning API tests are a subtype of integration tests.
However, when we talk about API tests here, we are specifically referring to public API contract tests, which don't have access to the internals of the application.

In cases where API routes are covered extensively with integration tests, API tests might not be needed, leaving more time for QA to focus on E2E tests.
However, in more complex architectures (e.g. integration-tested microservices behind an API gateway), API tests can be very useful.

#### When to use
- To make sure the API signature is valid.

#### When **not** to use
- To test application logic.

#### Best practices
- Write these tests with tools that allow us to reuse them for performance tests (K6).

#### Antipatterns

### E2E Tests

E2E tests are executed in a browser environment using tools like Playwright, Cypress, or similar frameworks. The purpose of these tests is to make sure that interacting with the application UI produces the expected result, verifying the application's functionality from a user's perspective.

Usually, these tests cover a large portion of the codebase with the least amount of code. Because of that, they can be the first tests added to an existing project that has no tests or low test coverage.

These tests should not cover all of the use cases because they are the slowest to run. If we need to test edge cases, we should try to implement those at a lower level (integration or unit tests).

#### When to use
- To validate user interactions and critical workflows in the application UI.
- For testing specific user flows.
- For making sure that critical application features are working as expected.
- For better coverage of the most common user pathways.

#### When **not** to use
- For data validation.
#### Best practices
- Tests should be atomic and simple; overly complicated tests should be thrown out.
- Focus on the most important user workflows rather than attempting exhaustive coverage.
- Each test should be able to run independently, with the environment reset to a known state before every test.
- Performance is key in these tests. We want to run tests as often as possible, and good performance will allow that.
- Flaky tests should be immediately disabled and refactored. Flaky tests will cause the team to ignore or bypass the test suite, so they should be dealt with immediately.
- Ensure consistent data states to avoid test failures due to variability in backend systems or environments.
- Run tests in parallel and isolate them from external dependencies to improve speed and reliability.
- Automate E2E tests in your CI/CD pipeline to catch regressions early in the deployment process.

#### Antipatterns
- Avoid trying to cover all use cases or edge cases in E2E tests; these are better suited for unit or integration tests.

### Performance Tests

Performance tests replicate typical user scenarios and then scale up to simulate concurrent users. They measure key performance metrics such as response time, throughput, error rate, and resource utilization. These tests help uncover bottlenecks and identify specific endpoints or processes that require optimization.

Performance tests are supposed to be run in a production-like environment since they test the performance of code **and** infrastructure. It's essential to consider real user behavior when designing and running these tests. The best practice is to create a clone of the production environment for testing purposes, avoiding potential disruption to actual users.
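To make the listed metrics concrete, the sketch below derives throughput and a p95 response time from raw request durations. The sample numbers are invented, and in practice a tool like K6 computes and reports these metrics for you.

```javascript
// Compute throughput and the 95th-percentile response time from raw samples.
// Durations are in milliseconds; this uses the nearest-rank percentile method.
function summarize(durationsMs, testDurationSec) {
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const p95 = sorted[Math.ceil(0.95 * sorted.length) - 1];
  return {
    throughput: durationsMs.length / testDurationSec, // requests per second
    p95, // 95% of requests completed at or below this duration
  };
}

const samples = [120, 95, 310, 101, 99, 1450, 130, 88, 105, 97];
console.log(summarize(samples, 10)); // → { throughput: 1, p95: 1450 }
```

Note how a single slow outlier dominates the p95 here while leaving the average largely intact, which is why percentiles are preferred over averages when analyzing results.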
#### When to use
- To stress test the application's infrastructure.
- To evaluate the app's behavior and performance under increasing traffic.
- To identify and address bottlenecks or resource limitations in the application.
- To ensure the application can handle anticipated peak traffic or usage patterns.

#### When **not** to use
- To verify functional requirements or application features.
- To test a specific user scenario.

#### Best practices
- Establish clear goals. Are you testing scalability, stability, or responsiveness? Without these objectives, tests risk being unfocused, resulting in meaningless data.
- Include diverse scenarios that represent different user journeys across the system, not just a single performance test/scenario.
- Use a clone of the production environment to ensure the infrastructure matches real-world conditions, including hardware, network, and database configurations.
- Schedule performance tests periodically or before major releases to catch regressions early.
- Record and analyze test outcomes to understand trends over time, identify weak points, and track improvements.
- Performance testing should not be a one-time task; it should be an ongoing process integrated into the development lifecycle.

#### Antipatterns
- Running these tests locally or in an environment that doesn't match production in terms of infrastructure performance. Tests should be developed on a local instance, but the actual measurements should be performed live.
- Ignoring data variability; ensure the test data mirrors real-world conditions, including varying user inputs and dataset sizes.
- Ignoring randomness in user behavior; ensure the tests mimic actual user behavior, including realistic click frequency, page navigation patterns, and input actions.

#### [Antipattern Examples](/examples/performance-testing-antipattern-examples.md)

### Visual Tests

The type of test where the test runner navigates to a browser page, takes a snapshot, and compares it with a reference snapshot.

Visual tests allow you to quickly cover large portions of the application, ensuring that changes in the UI are detected without writing complex test cases. The downside is that they require engineers to invest time in identifying the root cause of errors.

#### When to use
- When we want to make sure there are no changes in the UI.

#### When **not** to use
- To test a specific feature or business logic.
- To test a specific user scenario.

#### Best practices
- Ensure the UI consistently renders the same output by eliminating randomness (e.g., by always using the same seed data or controlling API responses to always return the same values).
- Add as many pages as possible but keep the tests simple.
- Consider running visual tests at the component level to isolate and detect issues earlier.
- Define acceptable thresholds for minor visual differences (e.g., pixel tolerance) to reduce noise while detecting significant regressions.

#### Antipatterns
- Avoid creating overly complicated visual tests that try to simulate user behavior. These are better suited for E2E testing.
- Visual tests should complement, not replace, other types of tests like E2E tests. Over-relying on them can leave functional gaps in coverage.
- Blindly updating snapshots without investigating failures undermines the purpose of visual testing and risks missing real issues.

--------------------------------------------------------------------------------
/recipes/circleci-build-guide.md:
--------------------------------------------------------------------------------

# CircleCI Build Guide Setup

The following page provides a "getting started" example of a CircleCI config.
This config is used on a few active Studion projects. However, there are plans to make this guide obsolete by creating a Studion orb and a CLI script that generates the config.yml file automatically. Until then, we can use this as a reference for learning how to set up CircleCI, especially since the Studion orb(s) will be based on this setup.

## The application

This guide assumes the application has the following components:
- SPA frontend
- Dockerfile for building the server
- Pulumi config for infrastructure

This guide also assumes we are hosting the app on AWS.

## `./.circleci/config.yml`

CircleCI uses the `config.yml` file to define the tasks it will perform.
The file should be placed inside the `.circleci` directory in the project root.
Knowing YAML syntax is a prerequisite for writing a CircleCI config.

For better clarity, we will look at separate config blocks and describe what they do.

### Config Init

Here we define the config version and dependencies.

Orbs are CircleCI packages that allow us to define the build process in a simple and easy way. Read more about orbs at https://circleci.com/orbs/.

For this app, we need the `aws-cli`, `aws-ecr`, `node` and `pulumi` orbs.
34 | 35 | 36 | ```yaml 37 | version: 2.1 38 | orbs: 39 | aws-cli: circleci/aws-cli@4.1.3 40 | aws-ecr: circleci/aws-ecr@9.0.4 41 | node: circleci/node@5.2.0 42 | pulumi: pulumi/pulumi@2.1.0 43 | 44 | executors: 45 | node: 46 | docker: 47 | - image: cimg/node:16.20.2 48 | base: 49 | docker: 50 | - image: cimg/base:stable-20.04 51 | ``` 52 | 53 | ### AWS Credentials 54 | 55 | In case we have multiple AWS credentials, we can define them at the beginning and 56 | reuse them where applicable. In this example, we have Studion AWS account and client 57 | AWS account credentials. 58 | 59 | ```yaml 60 | studion-aws-credentials: &studion-aws-credentials 61 | access_key: STUDION_AWS_ACCESS_KEY 62 | secret_key: STUDION_AWS_SECRET_KEY 63 | region: ${STUDION_AWS_REGION} 64 | 65 | client-aws-credentials: &client-aws-credentials 66 | access_key: CLIENT_AWS_ACCESS_KEY 67 | secret_key: CLIENT_AWS_SECRET_KEY 68 | region: ${CLIENT_AWS_REGION} 69 | ``` 70 | 71 | Note that we used YAML anchor here so we can reuse the credentials objects. 72 | Also, note that `access_key` and `secret_key` just contain the name of the env 73 | variable while `region` contains the actual value of the env variable. 74 | 75 | Environment variables are configured in CircleCI project settings 76 | within the CircleCI application. 77 | 78 | 79 | ### Job 1: Build Frontend 80 | 81 | This step pulls the code, injects secret .npmrc file, installs npm packages and 82 | runs build process. Finally, the output is persisted to workspace so we can upload 83 | it to S3 later in the build process. 
84 | 85 | ```yaml 86 | jobs: 87 | build-frontend: 88 | working_directory: ~/app 89 | executor: node 90 | steps: 91 | - checkout 92 | - run: 93 | command: echo "@fortawesome:registry=https://npm.fontawesome.com/" > ~/app/.npmrc 94 | - run: 95 | command: echo "//npm.fontawesome.com/:_authToken=${FA_TOKEN}" >> ~/app/.npmrc 96 | - node/install-packages: 97 | override-ci-command: npm ci 98 | - run: 99 | name: Build frontend 100 | command: npm run build 101 | - persist_to_workspace: 102 | root: . 103 | paths: 104 | - dist 105 | ``` 106 | 107 | In this example, we have .npmrc file that contains the auth token for Font Awesome Pro 108 | package. This is how we can construct that file so `npm install` can install all 109 | required packages. 110 | 111 | 112 | ### Job 2: Build server 113 | 114 | This step pulls the code and uses AWS ECR orb to build Docker image and push it to 115 | private AWS registry. 116 | 117 | ```yaml 118 | build-server: 119 | working_directory: ~/app 120 | executor: 121 | name: aws-ecr/default 122 | docker_layer_caching: true 123 | parameters: 124 | access_key: 125 | type: string 126 | secret_key: 127 | type: string 128 | region: 129 | type: string 130 | account_id: 131 | type: string 132 | ecr_repo: 133 | type: string 134 | resource_class: medium 135 | steps: 136 | - checkout 137 | - run: 138 | command: echo "@fortawesome:registry=https://npm.fontawesome.com/" > ~/app/.npmrc 139 | - run: 140 | command: echo "//npm.fontawesome.com/:_authToken=${FA_TOKEN}" >> ~/app/.npmrc 141 | - aws-ecr/build_and_push_image: 142 | auth: 143 | - aws-cli/setup: 144 | aws_access_key_id: << parameters.access_key >> 145 | aws_secret_access_key: << parameters.secret_key >> 146 | region: << parameters.region >> 147 | account_id: << parameters.account_id >> 148 | attach_workspace: true 149 | checkout: false 150 | extra_build_args: "--secret id=npmrc_secret,src=.npmrc --target server" 151 | region: << parameters.region >> 152 | repo: << parameters.ecr_repo >> 153 | 
repo_encryption_type: KMS 154 | tag: latest,${CIRCLE_SHA1} 155 | ``` 156 | 157 | Note that this step accepts parameters which will be passed later when we define 158 | the complete workflow. 159 | 160 | More info about AWS ECR orb can be found here: 161 | https://circleci.com/developer/orbs/orb/circleci/aws-ecr 162 | 163 | 164 | ### Job 3: Deploy infrastructure 165 | 166 | This part calls Pulumi to set up AWS resources. 167 | 168 | ```yaml 169 | deploy-aws: 170 | working_directory: ~/app 171 | executor: node 172 | parameters: 173 | access_key: 174 | type: string 175 | secret_key: 176 | type: string 177 | region: 178 | type: string 179 | account_id: 180 | type: string 181 | ecr_repo: 182 | type: string 183 | stack: 184 | type: string 185 | steps: 186 | - checkout 187 | - aws-cli/setup: 188 | aws_access_key_id: << parameters.access_key >> 189 | aws_secret_access_key: << parameters.secret_key >> 190 | region: << parameters.region >> 191 | - pulumi/login 192 | - node/install-packages: 193 | app-dir: ./infrastructure 194 | - run: 195 | name: Configure envs 196 | command: | 197 | echo 'export SERVER_IMAGE="<< parameters.account_id >>.dkr.ecr.<< parameters.region >>.amazonaws.com/<< parameters.ecr_repo >>:${CIRCLE_SHA1}"' >> "$BASH_ENV" 198 | source "$BASH_ENV" 199 | - pulumi/update: 200 | stack: "<< parameters.stack >>" 201 | working_directory: ./infrastructure 202 | skip-preview: true 203 | - pulumi/stack_output: 204 | stack: "<< parameters.stack >>" 205 | property_name: frontendBucketName 206 | env_var: S3_SITE_BUCKET 207 | working_directory: ./infrastructure 208 | - pulumi/stack_output: 209 | stack: "<< parameters.stack >>" 210 | property_name: cloudfrontId 211 | env_var: CF_DISTRIBUTION_ID 212 | working_directory: ./infrastructure 213 | - run: 214 | name: Store pulumi output as env file 215 | command: cp $BASH_ENV bash.env 216 | - persist_to_workspace: 217 | root: . 
218 | paths: 219 | - bash.env 220 | ``` 221 | 222 | Note that this step assumes that Pulumi files are located in `infrastructure` 223 | directory in project root. 224 | 225 | We export `SERVER_IMAGE` env variable which is used in Pulumi to create an ECS 226 | service with that image. Notice we're missing .env files. That is because we put 227 | all secrets in AWS SSM Parameter Store and we configured our Pulumi ECS service 228 | to pull the secrets from there. 229 | 230 | Pulumi needs to be configured so it outputs at least two variables: 231 | 232 | 1. S3 bucket name where we will upload built frontend from job 1 233 | 2. Cloudfront Distribution ID which we'll use to invalidate its cache 234 | 235 | 236 | Both variables are stored in `bash.env` file and that file is persisted to workspace 237 | because that is the easiest way of carrying those variables over to the next step. 238 | 239 | ### Job 4: Deploy Frontend 240 | 241 | This is the step where we upload the frontend dist files from job 1 to S3 bucket 242 | that was created in job 3. 243 | 244 | ```yaml 245 | deploy-frontend: 246 | working_directory: ~/app 247 | parameters: 248 | access_key: 249 | type: string 250 | secret_key: 251 | type: string 252 | region: 253 | type: string 254 | executor: base 255 | steps: 256 | - attach_workspace: 257 | at: . 
258 | - aws-cli/setup: 259 | aws_access_key_id: << parameters.access_key >> 260 | aws_secret_access_key: << parameters.secret_key >> 261 | region: << parameters.region >> 262 | - run: 263 | name: Set environment variables 264 | command: cat bash.env >> $BASH_ENV 265 | - run: 266 | name: Deploy to S3 267 | command: | 268 | aws s3 sync dist s3://${S3_SITE_BUCKET} --no-progress --delete 269 | aws cloudfront create-invalidation --distribution-id ${CF_DISTRIBUTION_ID} --paths "/*" 270 | 271 | ``` 272 | 273 | ### Workflow definition 274 | 275 | Workflow is used to orchestrate different jobs and configure job dependencies, 276 | for example: we need to wait for the infrastructure deployment before we can upload 277 | files to S3 (which is supposed to be created in that job). 278 | In this example we can see that we run this workflow only when the branch name is 279 | `develop`. 280 | 281 | ```yaml 282 | workflows: 283 | version: 2 284 | build-and-deploy-dev: 285 | when: 286 | and: 287 | - equal: [develop, << pipeline.git.branch >>] 288 | jobs: 289 | - build-frontend 290 | - build-server: 291 | <<: *studion-aws-credentials 292 | account_id: ${STUDION_AWS_ACCOUNT_ID} 293 | ecr_repo: app_server 294 | - deploy-aws: 295 | <<: *studion-aws-credentials 296 | account_id: ${STUDION_AWS_ACCOUNT_ID} 297 | ecr_repo: app_server 298 | stack: dev 299 | requires: 300 | - build-server 301 | - deploy-frontend: 302 | <<: *studion-aws-credentials 303 | requires: 304 | - build-frontend 305 | - deploy-aws 306 | ``` 307 | 308 | Note that job params are set for each job and here we can use AWS credentials which 309 | we defined at the beginning of the file. We can also see that some jobs can run in 310 | parallel, for example: frontend and backend builds don't depend on each other and 311 | that is how we can speed up the build process. 
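To see what the merge keys do: the `<<: *studion-aws-credentials` entry merges the anchored mapping defined at the top of the file into the job's parameters. Conceptually, the `build-server` entry above expands to the following (illustrative only):

```yaml
# Expanded form of `<<: *studion-aws-credentials` merged into the job params
- build-server:
    access_key: STUDION_AWS_ACCESS_KEY
    secret_key: STUDION_AWS_SECRET_KEY
    region: ${STUDION_AWS_REGION}
    account_id: ${STUDION_AWS_ACCOUNT_ID}
    ecr_repo: app_server
```

This keeps the credential wiring in one place, so switching a workflow to a different AWS account only means swapping the anchor.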
312 | 313 | ### Workflow definition part 2 314 | 315 | In the previous steps we defined job parameters that allow us to 316 | easily build different environments, for example, staging. 317 | 318 | Everything remains the same; we just need to change some variables and we can 319 | easily deploy to as many environments as we want. 320 | 321 | ```yaml 322 | build-and-deploy-stage: 323 | when: 324 | and: 325 | - equal: [stage, << pipeline.git.branch >>] 326 | jobs: 327 | - build-frontend 328 | - build-server: 329 | <<: *client-aws-credentials 330 | account_id: ${CLIENT_AWS_ACCOUNT_ID} 331 | ecr_repo: app_server 332 | - deploy-aws: 333 | <<: *client-aws-credentials 334 | account_id: ${CLIENT_AWS_ACCOUNT_ID} 335 | ecr_repo: app_server 336 | stack: stage 337 | requires: 338 | - build-server 339 | - deploy-frontend: 340 | <<: *client-aws-credentials 341 | requires: 342 | - build-frontend 343 | - deploy-aws 344 | ``` 345 | -------------------------------------------------------------------------------- /recipes/docker-image-guide.md: -------------------------------------------------------------------------------- 1 | # Docker image guide 2 | 3 | The following handbook offers best practices for creating small and secure NodeJs 4 | Docker images suitable for production use. 5 | You will find it helpful no matter what type of NodeJs application you aim to build. 6 | 7 | Examples with each step explained will be used to guide you through best practices. 8 | 9 | ## Simple NodeJs application 10 | 11 | ### The application 12 | 13 | Let's start with a simple NodeJs application.
Here is an overview of the files 14 | included in the project: 15 | 16 | ``` 17 | ├── index.js 18 | ├── package.json 19 | ├── package-lock.json 20 | ├── Dockerfile 21 | ├── .dockerignore 22 | ├── .npmrc 23 | ``` 24 | 25 | ```js 26 | // index.js 27 | const express = require("express"); 28 | const os = require("os"); 29 | 30 | const app = express(); 31 | 32 | app.use("/", (req, res) => { 33 | res.send(`Hello world from ${os.hostname()}.`); 34 | }); 35 | 36 | app.listen(3000, () => { 37 | console.log("App is listening on port 3000"); 38 | }); 39 | ``` 40 | 41 | ### The Dockerfile 42 | 43 | ```dockerfile 44 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#1-use-explicit-and-deterministic-docker-base-image-tags 45 | # https://snyk.io/blog/choosing-the-best-node-js-docker-image 46 | FROM node:20.9-bookworm-slim@sha256:c325fe5059c504933948ae6483f3402f136b96492dff640ced5dfa1f72a51716 AS base 47 | # https://docs.docker.com/build/cache/#combine-commands-together-wherever-possible 48 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#5-properly-handle-events-to-safely-terminate-a-nodejs-docker-web-application 49 | # https://github.com/Yelp/dumb-init 50 | RUN apt update && apt install -y --no-install-recommends dumb-init 51 | ENTRYPOINT ["dumb-init", "--"] 52 | 53 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS install 54 | # https://www.pathname.com/fhs/pub/fhs-2.3.html#USRSRCSOURCECODE 55 | WORKDIR /usr/src/app 56 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#3-optimize-nodejs-tooling-for-production 57 | ENV NODE_ENV production 58 | COPY package*.json . 
59 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#2-install-only-production-dependencies-in-the-nodejs-docker-image 60 | # when NODE_ENV is set to production, npm ci automatically omits dev dependencies 61 | # https://docs.npmjs.com/cli/v10/commands/npm-ci#omit 62 | # NOTE: if we don't have secrets, this is how we install npm packages, however if we 63 | # do have npmrc secret, we skip this step and proceed to the next. 64 | RUN npm ci --omit=dev 65 | # we can mount .npmrc secret file without leaving the secrets in the final built image 66 | # refer to docs https://docs.docker.com/build/building/secrets/ 67 | RUN --mount=type=secret,id=npmrc_secret,target=/usr/src/app/.npmrc,required npm ci --omit=dev 68 | 69 | FROM base AS configure 70 | WORKDIR /usr/src/app 71 | COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules 72 | # https://docs.docker.com/build/cache/#dont-include-unnecessary-files 73 | COPY --chown=node:node ./index.js . 74 | 75 | FROM configure AS run 76 | ENV NODE_ENV production 77 | # https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#4-dont-run-containers-as-root 78 | USER node 79 | CMD [ "node", "index.js" ] 80 | ``` 81 | 82 | ### Important notes: 83 | 84 | 1. **Always** specify `.dockerignore` files to reduce security risks and image 85 | footprint size. Also, by avoiding sending unwanted files to the builder, 86 | build speed is improved. The file should at least include 87 | `node_modules`, `.git` and `.env` files. 88 | 89 | 2. [The order of Dockerfile instructions matters](https://docs.docker.com/build/guide/layers/). 90 | 91 | 3. `FROM node:20.9-bookworm-slim@sha256:c32...16 AS base` 92 | 93 | - Selecting the appropriate Docker image is crucial to achieve minimal resource 94 | utilization and minimize vulnerability risks. 95 | - It is recommended to always use official docker images even though they are 96 | not the smallest ones. 
An excellent illustration of this is the "alpine" image, 97 | which, while having a minimal footprint, has only experimental support. 98 | - Include the image's sha256 hash to ensure the same image is always downloaded. 99 | - Use explicit and deterministic Docker base image tags to improve readability 100 | and maintainability. 101 | - Currently, the official `bookworm-slim` image appears to be the most suitable 102 | choice for a Node.js runner image, given its minimal size (nearly equivalent 103 | to the Alpine version). Read more about choosing the best NodeJs Docker image 104 | [here](https://snyk.io/blog/choosing-the-best-node-js-docker-image). 105 | 106 | 4. `RUN apt update && apt install -y --no-install-recommends dumb-init` 107 | 108 | - NodeJs is not designed to be a PID 1 process, so we are using a process wrapper 109 | to handle termination signals for us instead of doing it manually inside our 110 | NodeJs application. 111 | - Notice how these two commands (apt update and apt install dumb-init) are chained. 112 | By doing so we save some image footprint size, because each Docker image step 113 | adds an additional layer, which affects the final size. That being said, it is 114 | recommended to chain RUN commands into a single command whenever possible. 115 | 116 | 5. `ENTRYPOINT ["dumb-init", "--"]` 117 | 118 | - As explained above, we are using dumb-init as the PID 1 process. 119 | 120 | 6. `FROM node:20.9-bookworm@sha256:3c...603 AS install` 121 | 122 | - For dependency installation (and later for the build phase), we use the 123 | standard `bookworm` image instead of the `slim` version because certain 124 | dependencies require additional tools for the compilation step. 125 | 126 | 7. `WORKDIR /usr/src/app` 127 | 128 | - Application code should be placed inside a `/usr/src` subdirectory. 129 | 130 | 8. 
`ENV NODE_ENV production` 131 | 132 | - If you are building your image for production, this ensures that all frameworks 133 | and libraries are using the optimal settings for performance and security. 134 | 135 | 9. `COPY package*.json .` 136 | 137 | - It's important to notice here that we are copying `package*.json` files 138 | separately from the rest of the codebase. By doing so, we are leveraging the Docker 139 | layer caching functionality mentioned in step 2. When the source code 140 | changes, we don't want to reinstall dependencies because they remain unchanged. 141 | By copying source code files after dependency installation, only the steps from 142 | the source-copy step onward are re-executed. 143 | 144 | 10. `RUN npm ci --omit=dev` 145 | 146 | - devDependencies are not essential for the application to work. By installing 147 | only production dependencies we are reducing security risks and image 148 | footprint size and also improving build speed. 149 | 150 | 11. `COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules` 151 | 152 | - From the install phase we are copying only the node_modules folder in order 153 | to keep the final Docker image minimal. 154 | - The default Docker behavior is that copied files are owned by `root`. 155 | By specifying `--chown=node:node` we ensure that the `node_modules` files 156 | are owned by the `node` user instead of `root`. 157 | `node` is the least privileged user and by selecting it, we limit the 158 | number of actions an attacker can perform in case our application gets compromised. 159 | 160 | 12. `COPY --chown=node:node ./index.js .` 161 | 162 | - Copy the rest of the codebase as described in step 9. For this example, we 163 | are copying only the `index.js` file because that is the only file we need in 164 | order to run our application. Avoid adding unnecessary files to your builds by 165 | explicitly stating the files or directories you intend to copy over.
166 | 167 | 13. `USER node` 168 | 169 | - The process should be owned by the `node` user instead of `root`. 170 | 171 | 14. `RUN --mount=type=secret,id=npmrc_secret,target=/usr/src/app/.npmrc,required npm ci --omit=dev` 172 | 173 | - The files mounted as secrets will be available during build, but they will not 174 | remain in the final image. The secret can be any file, but npmrc is most common 175 | so we use it as an example. 176 | To be able to use the secret, we must pass it either as a param to Docker build or 177 | define it in Docker compose. 178 | 179 | - Docker build example: 180 | `docker build -t ntc-lms . --secret id=npmrc_secret,src=.npmrc` 181 | 182 | - Docker compose.yaml example: 183 | ```yaml 184 | services: 185 | app: 186 | build: 187 | context: . 188 | secrets: 189 | - npmrc_secret 190 | 191 | ... 192 | 193 | secrets: 194 | npmrc_secret: 195 | file: .npmrc 196 | ``` 197 | 198 | 199 | ## Typescript NodeJs application 200 | 201 | ### The application 202 | 203 | ``` 204 | ├── src 205 | │ ├── index.ts 206 | ├── dist 207 | │ ├── index.js 208 | ├── node_modules 209 | ├── tsconfig.json 210 | ├── package.json 211 | ├── package-lock.json 212 | ├── Dockerfile 213 | ├── .dockerignore 214 | ``` 215 | 216 | ### The Dockerfile 217 | 218 | ```dockerfile 219 | FROM node:20.9-bookworm-slim@sha256:c325fe5059c504933948ae6483f3402f136b96492dff640ced5dfa1f72a51716 AS base 220 | RUN apt update && apt install -y --no-install-recommends dumb-init 221 | ENTRYPOINT ["dumb-init", "--"] 222 | 223 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS build 224 | WORKDIR /usr/src/app 225 | COPY package*.json . 226 | RUN npm ci 227 | COPY ./src tsconfig.json ./ 228 | RUN npm run build 229 | 230 | FROM node:20.9-bookworm@sha256:3c48678afb1ae5ca5931bd154d8c1a92a4783555331b535bbd7e0822f9ca8603 AS install 231 | WORKDIR /usr/src/app 232 | ENV NODE_ENV production 233 | COPY package*.json . 
234 | RUN npm ci --omit=dev 235 | 236 | FROM base AS configure 237 | WORKDIR /usr/src/app 238 | COPY --chown=node:node --from=build /usr/src/app/dist ./dist 239 | COPY --chown=node:node --from=install /usr/src/app/node_modules ./node_modules 240 | 241 | FROM configure AS run 242 | ENV NODE_ENV production 243 | USER node 244 | CMD [ "node", "dist/index.js" ] 245 | ``` 246 | 247 | ### Important notes: 248 | 249 | 1. [Use multi-stage builds](https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#8-use-multi-stage-builds). 250 | By splitting the docker image into multiple stages, we are ensuring that the 251 | final image only contains essential files which reduces image footprint size 252 | and security risks. In the given example, we begin by building a TypeScript 253 | application in the build stage. 254 | Ultimately, in the configure phase, we exclusively copy the `dist` 255 | output from the build stage to the final Docker image. 256 | Another benefit of using a multi-stage build is that the Docker builder will 257 | work out dependencies between the stages and run them using the most 258 | efficient strategy. This even allows you to run multiple builds concurrently. 259 | 260 | ## Bonus Tips 261 | 262 | ### Caching 263 | 264 | Often, Docker images are built inside CI/CD pipeline. To enhance the efficiency 265 | of CI/CD and minimize computation costs, leveraging caching is crucial. 266 | 267 | Caching depends on the platform that is being used. 268 | For detailed guidance on caching Docker image layers with CircleCI, refer to 269 | this [link](https://circleci.com/docs/docker-layer-caching). 270 | 271 | Also, on this 272 | [link](https://courses.devopsdirective.com/docker-beginner-to-pro/lessons/06-building-container-images/02-api-node-dockerfile#use-a-cache-mount-to-speed-up-dependency-installation-%EF%B8%8F) 273 | you can find how to cache npm dependencies between builds. 
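Related to the npm-dependency caching link above: with BuildKit, a cache mount can persist npm's download cache between builds so repeat installs avoid re-downloading packages. A minimal sketch (assuming the install stage runs as root, so npm's default cache directory is `/root/.npm`):

```dockerfile
# Sketch: keep npm's download cache between builds via a BuildKit cache mount.
# The mount exists only during this RUN step and is not part of the final image.
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev
```

Unlike a regular layer, the cache mount survives even when the `package*.json` layer is invalidated, so a changed lockfile still benefits from previously downloaded tarballs.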
274 | 275 | ## Resources 276 | 277 | - [NodeJs Docker Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/NodeJS_Docker_Cheat_Sheet.html#nodejs-docker-cheat-sheet) 278 | - [Choosing the best NodeJs Docker image](https://snyk.io/blog/choosing-the-best-node-js-docker-image) 279 | - [Docker guide](https://docs.docker.com/build/guide/) 280 | - [10 best practices to containerize NodeJs web applications with Docker](https://snyk.io/blog/10-best-practices-to-containerize-nodejs-web-applications-with-docker/) 281 | - [NodeJs API Dockerfile](https://courses.devopsdirective.com/docker-beginner-to-pro/lessons/06-building-container-images/02-api-node-dockerfile#nodejs-api-dockerfile) 282 | -------------------------------------------------------------------------------- /recipes/handling-concurrency.md: -------------------------------------------------------------------------------- 1 | # Handling concurrency using optimistic or pessimistic locking 2 | 3 | Concurrency issues arise when multiple entities, such as services, threads, or users, access or modify the same resource simultaneously, leading to race conditions. 4 | 5 | ![Concurrency](../assets/images/concurrency.png) 6 | 7 | Race conditions are very hard to track and debug. 8 | Typically, we develop sequential applications where concurrency concerns are minimal. 9 | Concurrency is a key factor in boosting performance; it is utilized to increase throughput or reduce execution times, and is likely unnecessary in the absence of specific performance requirements. 10 | However, concurrency issues can also occur from parallel requests to the same resources. 11 | For instance, when two users try to reserve the same inventory item at the same time, it highlights a scenario where concurrency management becomes crucial, regardless of performance considerations. 12 | 13 | Addressing concurrency within a monolithic architecture differs significantly from managing it in distributed systems. 
In this article, we will concentrate on strategies for handling concurrency in monolithic applications, reflecting a more common scenario within our company. 14 | 15 | There are multiple approaches to managing concurrency, including database isolation levels and design-centric solutions like end-to-end partitioning. In this article, we will specifically explore the concepts of optimistic and pessimistic locking. 16 | 17 | We'll also explore a practical example of optimistic locking. Let's dive into it! 18 | 19 | ## Pessimistic locking vs Optimistic locking 20 | 21 | ![Pessimistic vs Optimistic locking](../assets/images/pessimistic-vs-optimistic-locking.png) 22 | 23 | The concepts of pessimistic versus optimistic locking can be metaphorically compared to asking for permission versus apologizing afterward. In pessimistic locking, a lock is placed on the resource, requiring all consumers to ASK for permission before they can modify it. On the other hand, optimistic locking does not involve placing a lock. Instead, consumers proceed under the assumption that they are the sole users of the resource and APOLOGIZE in case of conflict. 24 | 25 | In the pessimistic lock approach, actual locks are used at the database level, while in the optimistic approach, no locks are used, resulting in higher throughput. 26 | 27 | Optimistic locking, generally speaking, is easier to implement and offers performance benefits but can become costly in scenarios where the chance of conflict is high. 28 | As the probability of conflict increases, the chances of transaction abortion also rise. Rollbacks can be costly for the system as it needs to revert all current pending changes, which might involve both table rows and index records. 29 | 30 | In situations with a high risk of conflict, it might be better to use pessimistic locking in the first place rather than doing rollbacks and subsequent retries, which can put additional load on the system.
31 | However, pessimistic locking can affect the performance and scalability of your application by maintaining locks on database rows or tables and may also lead to deadlocks if not managed carefully. 32 | 33 | **Steps for Implementing Pessimistic Locking:** 34 | 35 | 1. Retrieve and lock the resource from the database (using "SELECT FOR UPDATE") 36 | 2. Apply the necessary changes 37 | 3. Save the changes by committing the transaction, which also releases the locks. 38 | 39 | **Steps for Implementing Optimistic Locking:** 40 | 41 | ![Optimistic locking](../assets/images/optimistic-locking.png) 42 | 43 | 1. Retrieve the resource from the database 44 | 2. Apply the necessary changes 45 | 3. Save the changes to the database; an error is thrown in case of a version mismatch, otherwise, the version is incremented 46 | 47 | ## Example 48 | 49 | Let's see optimistic locking in action. 50 | We'll illustrate its application through an example where users can reserve inventory items for a specified period. 51 | Although I'm utilizing MikroORM and PostgreSQL for this demonstration, it's worth noting that optimistic locking can be implemented with nearly any database. 52 | 53 | Potentially, we could encounter a scenario where users attempt to reserve the same inventory item simultaneously for overlapping time periods, which could lead to issues. 54 | 55 | I would say that the majority of the work here lies in the aggregate design. 56 | I've structured it so that the `InventoryItem` entity acts as the aggregate root, containing `Reservations`. This design mandates that the creation of reservations must proceed through the `InventoryItem` aggregate root, which encapsulates this specific logic. 57 | Upon retrieving an inventory item from the database, it will include all current reservations for that item, enabling us to apply business logic to determine if there is any overlapping reservation that conflicts with the one we intend to create. 
If there is no conflict, we proceed with the creation. This method centralizes the reservation creation logic, thereby ensuring consistency. 59 | 60 | For optimistic locking, we need a field to determine if an entity has changed since we retrieved it from the database. I used an integer `version` field that increments with each modification. 61 | 62 | Here is the code for the `InventoryItem` aggregate root: 63 | 64 | ```ts 65 | // inventory-item.entity.ts 66 | @Entity() 67 | export class InventoryItem extends AggregateRoot { 68 | @Property({ version: true }) 69 | readonly version: number = 1; 70 | 71 | @OneToMany({ 72 | entity: () => Reservation, 73 | mappedBy: (it) => it.inventoryItem, 74 | orphanRemoval: true, 75 | eager: true, 76 | }) 77 | private _reservations = new Collection<Reservation>(this); 78 | 79 | createReservation(startDate: Date, endDate: Date, userId: number) { 80 | const overlappingReservation = 81 | this.reservations.some(/** Find overlapping reservation logic */); 82 | if (overlappingReservation) { 83 | throw new ReservationOverlapException(); 84 | } 85 | const reservation = new Reservation(this, startDate, endDate, userId); 86 | this._reservations.add(reservation); 87 | this.updatedAt = new Date(); 88 | } 89 | 90 | get reservations() { 91 | return this._reservations.getItems(); 92 | } 93 | } 94 | ``` 95 | 96 | By applying `@Property({ version: true })` we instruct MikroORM to treat this field as the version field. 97 | MikroORM will handle incrementing the version field and will throw an `OptimisticLockError` in the case of a conflict.
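Conceptually, a version-based optimistic lock boils down to a conditional `UPDATE`. The exact SQL MikroORM emits may differ, but the flush behaves roughly like this:

```sql
-- Conceptual sketch: the update only succeeds if the version is unchanged
-- since the row was read.
UPDATE inventory_item
SET updated_at = now(), version = version + 1
WHERE id = $1 AND version = $2;
-- If 0 rows are affected, another transaction modified the row first,
-- and the ORM surfaces this as an optimistic-lock error.
```

No row-level lock is held between the read and the write; the conflict is detected only at write time, which is exactly the "apologize afterward" behavior described above.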
98 | 99 | ```ts 100 | // reservation.entity.ts 101 | @Entity() 102 | export class Reservation extends BaseEntity { 103 | @Property() 104 | startDate: Date; 105 | 106 | @Property() 107 | endDate: Date; 108 | 109 | @ManyToOne({ 110 | entity: () => InventoryItem, 111 | serializer: (it) => it.id, 112 | serializedName: "inventoryItemId", 113 | fieldName: "inventory_item_id", 114 | }) 115 | inventoryItem!: InventoryItem; 116 | 117 | @Property() 118 | userId: number; 119 | 120 | constructor( 121 | inventoryItem: InventoryItem, 122 | startDate: Date, 123 | endDate: Date, 124 | userId: number 125 | ) { 126 | super(); 127 | this.inventoryItem = inventoryItem; 128 | this.startDate = startDate; 129 | this.endDate = endDate; 130 | this.userId = userId; 131 | } 132 | } 133 | ``` 134 | 135 | Now, within the `createReservation` method, all we need to do is: 136 | 137 | - Retrieve the inventory item entity from the repository 138 | - Create a reservation by invoking the inventoryItem.createReservation method 139 | - Flush the changes to the database 140 | 141 | ```ts 142 | // create-reservation.command.ts 143 | @RetryOnError(OptimisticLockError) 144 | @CreateRequestContext() 145 | async createReservation(payload: CreateReservationPayload): Promise<void> { 146 | const { 147 | userId, 148 | inventoryItemId: id, 149 | startDate, 150 | endDate, 151 | } = payload; 152 | const inventoryItem = await this.repository.findById(id); 153 | inventoryItem.createReservation(startDate, endDate, userId); 154 | await this.em.flush(); 155 | } 156 | ``` 157 | 158 | You might have noticed that in the case of an `OptimisticLockError`, I used a custom `@RetryOnError` decorator to retry the operation. This approach is adopted because users may attempt to reserve the same inventory item for different time periods, leading to an `OptimisticLockError` for one of the requests. By retrying, we ensure that the end user is not aware of multiple concurrent requests occurring simultaneously.
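`@RetryOnError` is custom application code, so its exact implementation lives in the example repository. A minimal sketch of how such a decorator could be written (the name, signature, and default retry count here are illustrative, not the repository's actual code):

```typescript
// Illustrative sketch of a retry decorator; the real implementation may differ.
// Retries the wrapped async method when it throws the given error class.
function RetryOnError(
  errorClass: new (...args: any[]) => Error,
  maxRetries = 3
) {
  return function (
    _target: object,
    _propertyKey: string,
    descriptor: PropertyDescriptor
  ): PropertyDescriptor {
    const original = descriptor.value;
    descriptor.value = async function (...args: unknown[]) {
      for (let attempt = 1; ; attempt++) {
        try {
          return await original.apply(this, args);
        } catch (error) {
          // Only retry the expected error class, and give up after maxRetries.
          if (!(error instanceof errorClass) || attempt >= maxRetries) {
            throw error;
          }
        }
      }
    };
    return descriptor;
  };
}
```

Because the whole method body is re-run on each attempt, the entity is re-fetched with its new version, so the retried reservation is validated against the latest state.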
159 | 160 | In this scenario, we could also leverage database transaction isolation levels, like SERIALIZABLE, since this transaction does not span across multiple requests. However, there is often a requirement for long-running business processes that span multiple requests. In these situations, database transactions alone are insufficient for managing concurrency throughout such an extended business transaction. For these cases, optimistic locking proves to be a highly suitable solution. 161 | 162 | You can find the full working example [here](https://github.com/ikovac/teem-clone/tree/master/apps/api/src/reservation). 163 | 164 | Also, if you are interested in implementing cross-request optimistic locking, check out the [MikroORM documentation](https://mikro-orm.io/docs/transactions#optimistic-locking). 165 | -------------------------------------------------------------------------------- /recipes/lti.md: -------------------------------------------------------------------------------- 1 | # LTI - Theory 2 | 3 | ## What is LTI? 4 | 5 | LTI (Learning Tools Interoperability) is a standard developed by the IMS Global Learning Consortium, designed to enable seamless integration of educational tools and platforms. The LTI standard allows learning management systems (LMS) or other platforms to integrate remote tools and content in a standard way without logging into each tool. 6 | 7 | LTI allows platforms to host numerous integrations. By using LTI, tools can be developed independently from the platform by different parties. 8 | 9 | **This explanation sounds a little bit overwhelming. Can you explain it to me like I am 10?** 10 | 11 | Sure! Imagine you go to a school that has its own website where you can find all your lessons, videos, and tests. Every student has their own account, and you can see your scores and progress on this website. This special website is called the Learning Management System, or LMS.
12 | 13 | Now, think about this: a new, fun quiz game web app has just come out, and your school wants to use it in class. Normally, you would have to go to the quiz game’s website, create a new account, and then log in to play. But wouldn’t it be easier if you could play the quiz game right from your school’s LMS, without making a new account? 14 | 15 | That’s exactly what the LTI standard helps with! LTI allows the quiz game to be added to your school’s LMS. This way, you can play the quiz without leaving the LMS and without creating a new account. You just log in to the LMS, and you can access everything you need with one username and password. 16 | 17 | ## Key concepts 18 | 19 | - **Platform**. A *tool platform* or, more simply, *platform* has traditionally been a Learning Management System (LMS), but it may be any kind of platform that needs to delegate bits of functionality out to a suite of *tools*. Examples of platforms are LMS systems such as Coursera, edX, Moodle, Blackboard, and Canvas. 20 | - **Tool**. The external application or service providing functionality to the *platform* is called a *tool*. 21 | 22 | 26 | 27 | ## Example: LTI from the learner's perspective 28 | 29 | Let's explore LTI from the learner's perspective. 30 | 31 | Firstly, let's introduce the tool. It's a very simple web app with just a few pages: a home page and some resource pages. 32 | 33 | ![lti-home.png](../assets/images/lti/lti-home.png) 34 | 35 | ![lti-resource1.png](../assets/images/lti/lti-resource1.png) 36 | 37 | On the LMS side, we have lessons, and when a user clicks on the lesson reading link, it launches our LTI tool, which provides the content for Resource 1. This process is called an LTI launch. 38 | 39 | ![lms-lessons.png](../assets/images/lti/lms-lessons.png) 40 | 41 | ![lms-tool-launch.png](../assets/images/lti/lms-tool-launch.png) 42 | 43 | The tool's content is embedded within an iframe, so the user doesn't even realize the content is coming from another source. 
This creates a seamless experience, making it feel like part of the LMS, and the user never leaves the LMS platform. Additionally, the user doesn't need to log in to the tool itself. There are other techniques for integrating an LTI tool, but using an iframe is the most common method. 44 | 45 | **The platform acts as the OIDC provider in the LTI process, meaning the tool doesn't need to know anything about the platform’s identity provider. For example, if the platform uses Auth0 for authentication, the tool doesn't need to know anything about Auth0 because the platform issues the ID token, thereby serving as the OIDC provider from the tool's perspective.** 46 | 47 | **LTI messages sent from the platform are *OpenID Tokens*. Messages sent from the tool are *JSON Web Tokens* (JWT), as the tool is not typically acting as an OpenID Provider.** 48 | 49 | ## LTI 1.1 vs 1.3 50 | 51 | The most popular LTI versions are 1.1 and 1.3. Version 1.1 is deprecated, so this article will focus on the latest version, 1.3, which offers additional capabilities and improvements. Some key benefits of version 1.3 compared to 1.1 include security enhancements: 52 | 53 | - LTI 1.1 uses OAuth 1.0, which has been deprecated for years due to insufficient protection. In practice, controlling access to the secret is very difficult. 54 | - LTI 1.3, on the other hand, uses OAuth 2.0 and JWT message signing protocols. 55 | - Additionally, LTI 1.3 adopts the OpenID Connect workflow for authentication with every launch. 56 | 57 | You can read about all the benefits of version 1.3 [here](https://brijendrasinghrajput.medium.com/lti-1-3-benefits-over-lti-1-1-a1a37e94bc5b). 58 | 59 | ## LTI components 60 | 61 | ![lti-advantage.png](../assets/images/lti/lti-advantage.png) 62 | 63 | The message for embedding content into the platform (LTI launch) is referred to as [LTI Core](https://www.imsglobal.org/spec/lti/v1p3). Additional services are deep linking, names and roles, and assignments and grades.
Together, these services are called LTI Advantage. 64 | 65 | - [_Deep Linking_](https://www.imsglobal.org/spec/lti-dl/v2p0/): the ability to launch a tool’s configuration panel that will return a configured resource link to the platform based on what the user has set up. The next time a user accesses the link, they will be taken to the configured activity/resource instead of the configuration panel. 66 | - [_Assignment and Grading Services_](https://www.imsglobal.org/spec/lti-ags/v2p0/) (AGS): the ability to post grades generated by the tool back to the platform’s grade book. 67 | - [_Names and Role Provisioning Services_](http://www.imsglobal.org/spec/lti-nrps/v2p0) (NRPS): the ability of the platform to provide the tool with a student list and user information. 68 | 69 | ## Implementation 70 | 71 | 1. [Creating an LTI-compatible Tool](https://github.com/ExtensionEngine/lti-tool-example) 72 | 2. Creating an LTI-compatible Platform - TBA 73 | -------------------------------------------------------------------------------- /recipes/ses-bounce-handling.md: -------------------------------------------------------------------------------- 1 | # SES Bounce handling 2 | 3 | ## Context 4 | 5 | Amazon is very strict about the rules for its email service, SES. If you have too many bounces or complaints, resulting in an unhealthy sending status, you can easily receive a service block. 6 | 7 | That's why it's important to handle bounces and complaints accordingly. 8 | 9 | **What are bounces and complaints?** 10 | 11 | Bounces occur when an email cannot be delivered to the recipient for various reasons. There are two types of bounces: hard bounces and soft bounces. 12 | 13 | 1. Hard Bounces: 14 | 15 | - Hard bounces are caused by permanent issues, such as an invalid email address, a non-existent domain, or the recipient's email server blocking the message. 16 | - **Impact**: Consistently sending emails to invalid or non-existent addresses can harm your sender reputation.
A poor sender reputation may lead to your emails being marked as spam or rejected by email service providers, impacting your overall deliverability. 17 | 18 | 2. Soft Bounces: 19 | 20 | - Soft bounces result from temporary issues, such as the recipient's mailbox being full or the email server being temporarily unavailable. 21 | - Unlike hard bounces, soft bounces provide an opportunity to retry delivering the email. 22 | 23 | Complaints arise when recipients mark your emails as spam or unwanted. It's important to promptly process this feedback, investigate the reasons for complaints, and take corrective actions, such as removing complaining recipients from your mailing list. 24 | 25 | ## Building a simple SES bounce handling example 26 | 27 | What are we going to build? 28 | 29 | ![Infrastructure diagram](../assets/images/infrastructure-diagram.png) 30 | 31 | We are going to build two bounce handlers: the first will send an alert email upon detecting a bounce, and the second will add bounced emails to a block list. 32 | The application can subsequently verify whether the recipient's email is listed in the block list. If it is, the system will refrain from sending an email to that address. 33 | 34 | You can find the complete working code [here](https://github.com/ikovac/ses-bounce-handling). 35 | 36 | We are going to use [Pulumi](https://www.pulumi.com/) to create and manage our infrastructure. 37 | 38 | ### SNS Topic 39 | 40 | Let's start with creating a simple Pulumi program. 41 | We need to create an SNS topic, which is used to receive SES notification messages. 42 | 43 | ```typescript 44 | import * as aws from "@pulumi/aws"; 45 | import * as pulumi from "@pulumi/pulumi"; 46 | 47 | export const snsTopic = new aws.sns.Topic("ses-notifications", { 48 | fifoTopic: false, 49 | namePrefix: "ses-notifications", 50 | }); 51 | ``` 52 | 53 | Once we have created the SNS topic, we have to tell SES to use that topic for sending notification messages.
To do so, navigate to the AWS Console. Go to the SES service -> Verified identities -> Select your identity -> Notifications tab -> Feedback notifications -> Edit. 54 | 55 | ![SES SNS topic](../assets/images/ses-sns-topic.png) 56 | 57 | Select your newly created SNS topic for bounce feedback and click "Save changes". 58 | 59 | ### SNS Topic Subscription 60 | 61 | Now, with SES notifications routed to the SNS topic, we can proceed to create our initial handler. This handler will be responsible for sending notification emails when bounces occur. 62 | 63 | ```typescript 64 | const config = new pulumi.Config(); 65 | 66 | export const subscription = new aws.sns.TopicSubscription( 67 | "send-email-notification-handler", 68 | { 69 | topic: snsTopic.arn, 70 | protocol: "email-json", 71 | endpoint: config.require("email"), 72 | filterPolicyScope: "MessageBody", 73 | filterPolicy: JSON.stringify({ notificationType: ["Bounce"] }), 74 | } 75 | ); 76 | ``` 77 | 78 | In the code snippet above, we created a `sns.TopicSubscription` resource that uses the `email-json` protocol. Additionally, by defining a `filterPolicy`, we ensure that emails are sent only for messages of the type `Bounce`. 79 | After deploying the current code, you will receive a subscription confirmation email. Please confirm it to start receiving email notifications. 80 | 81 | Now we can test our handler by sending a test email. 82 | Navigate to your SES identity page and click the "Send test email" button. 83 | 84 | ![SES test email](../assets/images/test-email.png) 85 | 86 | If everything goes as expected, you should receive an email notification containing bounce details 🥳. 87 | 88 | Although the current handler notifies relevant parties about a bounce, it doesn't take preventive measures for the future. Let's develop a block list handler to address this issue effectively.
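Before wiring up the block list handler, it helps to look at the shape of the message SES publishes. The sketch below extracts hard-bounced recipients from a bounce notification; the interface covers only the handful of fields used here (`notificationType`, `bounce.bounceType`, `bouncedRecipients`), and the sample payload in the comments is abridged.

```typescript
// Extract recipients of hard (permanent) bounces from an SES bounce
// notification, so that transient (soft) bounces can be retried instead
// of being block-listed.
interface SesBounceNotification {
  notificationType: "Bounce";
  bounce: {
    bounceType: "Permanent" | "Transient" | "Undetermined";
    bouncedRecipients: { emailAddress: string }[];
  };
}

function hardBouncedRecipients(rawMessage: string): string[] {
  const notification = JSON.parse(rawMessage) as SesBounceNotification;
  // Only permanent bounces should end up on the block list.
  if (notification.bounce.bounceType !== "Permanent") return [];
  return notification.bounce.bouncedRecipients.map((r) => r.emailAddress);
}
```

In the Lambda shown later in this guide, the same parsing happens per SQS record, with one extra `JSON.parse` step to unwrap the SNS envelope (`body.Message`).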
89 | 90 | ### SQS Queue 91 | 92 | ```typescript 93 | const queue = new aws.sqs.Queue("ses-notifications-queue", { 94 | fifoQueue: false, 95 | namePrefix: "ses-notifications-queue", 96 | sqsManagedSseEnabled: true, 97 | }); 98 | 99 | const allowSNSToQueueMessages = aws.iam.getPolicyDocumentOutput({ 100 | statements: [ 101 | { 102 | sid: "AllowSNSToQueueMessages", 103 | effect: "Allow", 104 | actions: ["sqs:SendMessage"], 105 | resources: [queue.arn], 106 | principals: [ 107 | { 108 | type: "*", 109 | identifiers: ["*"], 110 | }, 111 | ], 112 | conditions: [ 113 | { 114 | test: "ArnEquals", 115 | variable: "aws:SourceArn", 116 | values: [snsTopic.arn], 117 | }, 118 | ], 119 | }, 120 | ], 121 | }); 122 | 123 | const allowSNSToQueueMessagesPolicy = new aws.sqs.QueuePolicy( 124 | "allow-sns-to-queue-messages-policy", 125 | { 126 | queueUrl: queue.id, 127 | policy: allowSNSToQueueMessages.apply((policy) => policy.json), 128 | } 129 | ); 130 | 131 | const addToBlockListHandler = new aws.sns.TopicSubscription( 132 | "add-to-block-list-handler", 133 | { 134 | topic: snsTopic.arn, 135 | protocol: "sqs", 136 | endpoint: queue.arn, 137 | filterPolicyScope: "MessageBody", 138 | filterPolicy: JSON.stringify({ notificationType: ["Bounce"] }), 139 | } 140 | ); 141 | ``` 142 | 143 | The code snippet above creates an SQS queue and another `sns.TopicSubscription`, which forwards bounce messages to our SQS queue. 144 | 145 | While it is possible to directly attach a Lambda function to the SNS topic, it is advisable to introduce a queue in between. 146 | 147 | The primary advantage of incorporating an SQS (Simple Queue Service) queue between SNS and Lambda is the ability to reprocess messages. By adding the message to a dead-letter queue, we can reprocess it at a later time, something not achievable with direct SNS-to-Lambda integration. 148 | 149 | Another benefit of using SQS is cost efficiency in Lambda invocations.
This approach allows for more efficient scaling and reduced costs, as it enables the processing of messages in batches. 150 | 151 | In the snippet above, we also created the `AllowSNSToQueueMessages` policy, which allows SNS to enqueue messages. 152 | 153 | ⚠ Please take note that, with `notificationType: ["Bounce"]`, we are currently filtering only Bounce messages for the purpose of this tutorial. However, in real-world scenarios, it is advisable to handle Complaint messages as well. Additionally, it is recommended to distinguish between hard and soft bounces and implement a retry mechanism in the latter case. 154 | 155 | ### DynamoDB 156 | 157 | Let's proceed by creating a simple DynamoDB table with only an email column, which will be used to store bounced emails. 158 | 159 | ```typescript 160 | const dynamoTable = new aws.dynamodb.Table("block-list-table", { 161 | name: "blocklist", 162 | attributes: [ 163 | { 164 | name: "email", 165 | type: "S", 166 | }, 167 | ], 168 | hashKey: "email", 169 | readCapacity: 1, 170 | writeCapacity: 1, 171 | }); 172 | ``` 173 | 174 | ### AWS Lambda 175 | 176 | Before creating the Lambda function, it's essential to set up the necessary execution role. This role will provide the Lambda function with the required permissions to access DynamoDB and other services such as CloudWatch.
177 | 178 | ```typescript 179 | const assumeRolePolicy = aws.iam.getPolicyDocument({ 180 | statements: [ 181 | { 182 | effect: "Allow", 183 | actions: ["sts:AssumeRole"], 184 | principals: [ 185 | { 186 | type: "Service", 187 | identifiers: ["lambda.amazonaws.com"], 188 | }, 189 | ], 190 | }, 191 | ], 192 | }); 193 | 194 | const iamForLambda = new aws.iam.Role("lambda-execution-role", { 195 | name: "LambdaExecutionRole", 196 | assumeRolePolicy: assumeRolePolicy.then((policy) => policy.json), 197 | }); 198 | 199 | new aws.iam.RolePolicyAttachment("execution-role-policy-attachment", { 200 | role: iamForLambda.name, 201 | policyArn: 202 | "arn:aws:iam::aws:policy/service-role/AWSLambdaSQSQueueExecutionRole", 203 | }); 204 | 205 | const allowLambdaToAccessDynamoDb = aws.iam.getPolicyDocumentOutput({ 206 | statements: [ 207 | { 208 | effect: "Allow", 209 | actions: ["dynamodb:*"], 210 | resources: [dynamoTable.arn], 211 | }, 212 | ], 213 | }); 214 | 215 | const allowLambdaToAccessDynamoDbPolicy = new aws.iam.Policy( 216 | "allow-lambda-to-access-dynamo-db-policy", 217 | { 218 | name: "AllowLambdaToAccessDynamoDb", 219 | policy: allowLambdaToAccessDynamoDb.apply((policy) => policy.json), 220 | } 221 | ); 222 | 223 | new aws.iam.RolePolicyAttachment("lambda-dynamodb-policy-attachment", { 224 | role: iamForLambda.name, 225 | policyArn: allowLambdaToAccessDynamoDbPolicy.arn, 226 | }); 227 | ``` 228 | 229 | Now we can create our AWS Lambda handler responsible for processing queued messages. 
230 | 231 | ```typescript 232 | const codeZip = new pulumi.asset.AssetArchive({ 233 | "index.mjs": new pulumi.asset.FileAsset("./lambda.mjs"), 234 | }); 235 | 236 | export const lambda = new aws.lambda.Function("add-to-block-list-lambda", { 237 | name: "add-to-block-list-handler", 238 | code: codeZip, 239 | role: iamForLambda.arn, 240 | handler: "index.handler", 241 | runtime: "nodejs20.x", 242 | environment: { 243 | variables: { 244 | TABLE_NAME: dynamoTable.name, 245 | }, 246 | }, 247 | }); 248 | ``` 249 | 250 | The Lambda code is available in the `lambda.mjs` file. 251 | 252 | ```javascript 253 | import { DynamoDBClient } from "@aws-sdk/client-dynamodb"; 254 | import { DynamoDBDocumentClient, PutCommand } from "@aws-sdk/lib-dynamodb"; 255 | 256 | const client = new DynamoDBClient({}); 257 | const docClient = DynamoDBDocumentClient.from(client); 258 | 259 | export const handler = async (event) => { 260 | const records = event.Records; 261 | const emails = records.reduce((acc, it) => { 262 | const body = JSON.parse(it.body); 263 | const message = JSON.parse(body.Message); 264 | const bouncedRecipients = message.bounce.bouncedRecipients.map( 265 | (r) => r.emailAddress 266 | ); 267 | return [...acc, ...bouncedRecipients]; 268 | }, []); 269 | 270 | const pResult = emails.map((email) => { 271 | const command = new PutCommand({ 272 | TableName: process.env.TABLE_NAME, 273 | Item: { email }, 274 | }); 275 | return docClient.send(command); 276 | }); 277 | await Promise.all(pResult); 278 | }; 279 | ``` 280 | 281 | Finally, we can map the SQS queue to our Lambda function: 282 | 283 | ```typescript 284 | new aws.lambda.EventSourceMapping("sqs-lambda-mapping", { 285 | eventSourceArn: queue.arn, 286 | functionName: lambda.arn, 287 | }); 288 | ``` 289 | 290 | By sending a test email again, we can see that `bounce@simulator.amazonses.com` has been successfully added to the blocklist table 🥳🥳🥳.
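With the table populated, the application can check the block list before sending, as described earlier. A minimal sketch of that guard is below; the in-memory `Set` stands in for the real lookup, which in the actual app would be a DynamoDB `GetCommand` against the `blocklist` table (the function name and signature are illustrative).

```typescript
// Application-side guard: skip sending when the recipient is on the
// block list. The Set is a stand-in for a DynamoDB lookup.
function sendEmailUnlessBlocked(
  recipient: string,
  blockList: Set<string>,
  send: (recipient: string) => void
): boolean {
  // Recipient previously hard-bounced: refrain from sending.
  if (blockList.has(recipient)) return false;
  send(recipient);
  return true;
}
```

This keeps known-bad addresses out of your sending volume, protecting the SES sender reputation discussed in the Context section.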
291 | 292 | You can find the complete working code [here](https://github.com/ikovac/ses-bounce-handling). 293 | 294 | ## Final words 295 | 296 | The purpose of this guide is not to offer a one-size-fits-all solution for every project but rather to present a simple demo handler. This demonstration aims to illustrate that handling bounces and complaints is not that hard. 297 | 298 | Before implementing a bounce and complaint handler, it is recommended to investigate the solution that best fits your specific project requirements. For instance, instead of storing bounced emails in DynamoDB, you may opt for an HTTP `sns.TopicSubscription` to notify your system about bounces. Subsequently, you can handle these notifications within your system accordingly. 299 | 300 | Additionally, for production use, it is highly advisable to incorporate a dead-letter queue. This ensures the ability to reprocess messages that may have failed. 301 | --------------------------------------------------------------------------------