├── LICENSE └── README.md /LICENSE: -------------------------------------------------------------------------------- 1 | Creative Commons Corporation (“Creative Commons”) is not a law firm and does not provide legal services or legal advice. Distribution of Creative Commons public licenses does not create a lawyer-client or other relationship. Creative Commons makes its licenses and related information available on an “as-is” basis. Creative Commons gives no warranties regarding its licenses, any material licensed under their terms and conditions, or any related information. Creative Commons disclaims all liability for damages resulting from their use to the fullest extent possible. 2 | 3 | Using Creative Commons Public Licenses 4 | 5 | Creative Commons public licenses provide a standard set of terms and conditions that creators and other rights holders may use to share original works of authorship and other material subject to copyright and certain other rights specified in the public license below. The following considerations are for informational purposes only, are not exhaustive, and do not form part of our licenses. 6 | 7 | Considerations for licensors: Our public licenses are intended for use by those authorized to give the public permission to use material in ways otherwise restricted by copyright and certain other rights. Our licenses are irrevocable. Licensors should read and understand the terms and conditions of the license they choose before applying it. Licensors should also secure all rights necessary before applying our licenses so that the public can reuse the material as expected. Licensors should clearly mark any material not subject to the license. This includes other CC-licensed material, or material used under an exception or limitation to copyright. More considerations for licensors. 
8 | 9 | Considerations for the public: By using one of our public licenses, a licensor grants the public permission to use the licensed material under specified terms and conditions. If the licensor’s permission is not necessary for any reason–for example, because of any applicable exception or limitation to copyright–then that use is not regulated by the license. Our licenses grant only permissions under copyright and certain other rights that a licensor has authority to grant. Use of the licensed material may still be restricted for other reasons, including because others have copyright or other rights in the material. A licensor may make special requests, such as asking that all changes be marked or described. Although not required by our licenses, you are encouraged to respect those requests where reasonable. More considerations for the public. 10 | 11 | Creative Commons Attribution 4.0 International Public License 12 | 13 | By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions. 14 | 15 | Section 1 – Definitions. 16 | 17 | Adapted Material means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. 
For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image. 18 | Adapter's License means the license You apply to Your Copyright and Similar Rights in Your contributions to Adapted Material in accordance with the terms and conditions of this Public License. 19 | Copyright and Similar Rights means copyright and/or similar rights closely related to copyright including, without limitation, performance, broadcast, sound recording, and Sui Generis Database Rights, without regard to how the rights are labeled or categorized. For purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright and Similar Rights. 20 | Effective Technological Measures means those measures that, in the absence of proper authority, may not be circumvented under laws fulfilling obligations under Article 11 of the WIPO Copyright Treaty adopted on December 20, 1996, and/or similar international agreements. 21 | Exceptions and Limitations means fair use, fair dealing, and/or any other exception or limitation to Copyright and Similar Rights that applies to Your use of the Licensed Material. 22 | Licensed Material means the artistic or literary work, database, or other material to which the Licensor applied this Public License. 23 | Licensed Rights means the rights granted to You subject to the terms and conditions of this Public License, which are limited to all Copyright and Similar Rights that apply to Your use of the Licensed Material and that the Licensor has authority to license. 24 | Licensor means the individual(s) or entity(ies) granting rights under this Public License. 
25 | Share means to provide material to the public by any means or process that requires permission under the Licensed Rights, such as reproduction, public display, public performance, distribution, dissemination, communication, or importation, and to make material available to the public including in ways that members of the public may access the material from a place and at a time individually chosen by them. 26 | Sui Generis Database Rights means rights other than copyright resulting from Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, as amended and/or succeeded, as well as other essentially equivalent rights anywhere in the world. 27 | You means the individual or entity exercising the Licensed Rights under this Public License. Your has a corresponding meaning. 28 | 29 | Section 2 – Scope. 30 | 31 | License grant. 32 | Subject to the terms and conditions of this Public License, the Licensor hereby grants You a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license to exercise the Licensed Rights in the Licensed Material to: 33 | reproduce and Share the Licensed Material, in whole or in part; and 34 | produce, reproduce, and Share Adapted Material. 35 | Exceptions and Limitations. For the avoidance of doubt, where Exceptions and Limitations apply to Your use, this Public License does not apply, and You do not need to comply with its terms and conditions. 36 | Term. The term of this Public License is specified in Section 6(a). 37 | Media and formats; technical modifications allowed. The Licensor authorizes You to exercise the Licensed Rights in all media and formats whether now known or hereafter created, and to make technical modifications necessary to do so. 
The Licensor waives and/or agrees not to assert any right or authority to forbid You from making technical modifications necessary to exercise the Licensed Rights, including technical modifications necessary to circumvent Effective Technological Measures. For purposes of this Public License, simply making modifications authorized by this Section 2(a)(4) never produces Adapted Material. 38 | Downstream recipients. 39 | Offer from the Licensor – Licensed Material. Every recipient of the Licensed Material automatically receives an offer from the Licensor to exercise the Licensed Rights under the terms and conditions of this Public License. 40 | No downstream restrictions. You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material. 41 | No endorsement. Nothing in this Public License constitutes or may be construed as permission to assert or imply that You are, or that Your use of the Licensed Material is, connected with, or sponsored, endorsed, or granted official status by, the Licensor or others designated to receive attribution as provided in Section 3(a)(1)(A)(i). 42 | 43 | Other rights. 44 | Moral rights, such as the right of integrity, are not licensed under this Public License, nor are publicity, privacy, and/or other similar personality rights; however, to the extent possible, the Licensor waives and/or agrees not to assert any such rights held by the Licensor to the limited extent necessary to allow You to exercise the Licensed Rights, but not otherwise. 45 | Patent and trademark rights are not licensed under this Public License. 
46 | To the extent possible, the Licensor waives any right to collect royalties from You for the exercise of the Licensed Rights, whether directly or through a collecting society under any voluntary or waivable statutory or compulsory licensing scheme. In all other cases the Licensor expressly reserves any right to collect such royalties. 47 | 48 | Section 3 – License Conditions. 49 | 50 | Your exercise of the Licensed Rights is expressly made subject to the following conditions. 51 | 52 | Attribution. 53 | 54 | If You Share the Licensed Material (including in modified form), You must: 55 | retain the following if it is supplied by the Licensor with the Licensed Material: 56 | identification of the creator(s) of the Licensed Material and any others designated to receive attribution, in any reasonable manner requested by the Licensor (including by pseudonym if designated); 57 | a copyright notice; 58 | a notice that refers to this Public License; 59 | a notice that refers to the disclaimer of warranties; 60 | a URI or hyperlink to the Licensed Material to the extent reasonably practicable; 61 | indicate if You modified the Licensed Material and retain an indication of any previous modifications; and 62 | indicate the Licensed Material is licensed under this Public License, and include the text of, or the URI or hyperlink to, this Public License. 63 | You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on the medium, means, and context in which You Share the Licensed Material. For example, it may be reasonable to satisfy the conditions by providing a URI or hyperlink to a resource that includes the required information. 64 | If requested by the Licensor, You must remove any of the information required by Section 3(a)(1)(A) to the extent reasonably practicable. 65 | If You Share Adapted Material You produce, the Adapter's License You apply must not prevent recipients of the Adapted Material from complying with this Public License. 
66 | 67 | Section 4 – Sui Generis Database Rights. 68 | 69 | Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the Licensed Material: 70 | 71 | for the avoidance of doubt, Section 2(a)(1) grants You the right to extract, reuse, reproduce, and Share all or a substantial portion of the contents of the database; 72 | if You include all or a substantial portion of the database contents in a database in which You have Sui Generis Database Rights, then the database in which You have Sui Generis Database Rights (but not its individual contents) is Adapted Material; and 73 | You must comply with the conditions in Section 3(a) if You Share all or a substantial portion of the contents of the database. 74 | 75 | For the avoidance of doubt, this Section 4 supplements and does not replace Your obligations under this Public License where the Licensed Rights include other Copyright and Similar Rights. 76 | 77 | Section 5 – Disclaimer of Warranties and Limitation of Liability. 78 | 79 | Unless otherwise separately undertaken by the Licensor, to the extent possible, the Licensor offers the Licensed Material as-is and as-available, and makes no representations or warranties of any kind concerning the Licensed Material, whether express, implied, statutory, or other. This includes, without limitation, warranties of title, merchantability, fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not apply to You. 
80 | To the extent possible, in no event will the Licensor be liable to You on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this Public License or use of the Licensed Material, even if the Licensor has been advised of the possibility of such losses, costs, expenses, or damages. Where a limitation of liability is not allowed in full or in part, this limitation may not apply to You. 81 | 82 | The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability. 83 | 84 | Section 6 – Term and Termination. 85 | 86 | This Public License applies for the term of the Copyright and Similar Rights licensed here. However, if You fail to comply with this Public License, then Your rights under this Public License terminate automatically. 87 | 88 | Where Your right to use the Licensed Material has terminated under Section 6(a), it reinstates: 89 | automatically as of the date the violation is cured, provided it is cured within 30 days of Your discovery of the violation; or 90 | upon express reinstatement by the Licensor. 91 | For the avoidance of doubt, this Section 6(b) does not affect any right the Licensor may have to seek remedies for Your violations of this Public License. 92 | For the avoidance of doubt, the Licensor may also offer the Licensed Material under separate terms or conditions or stop distributing the Licensed Material at any time; however, doing so will not terminate this Public License. 93 | Sections 1, 5, 6, 7, and 8 survive termination of this Public License. 94 | 95 | Section 7 – Other Terms and Conditions. 96 | 97 | The Licensor shall not be bound by any additional or different terms or conditions communicated by You unless expressly agreed. 
98 | Any arrangements, understandings, or agreements regarding the Licensed Material not stated herein are separate from and independent of the terms and conditions of this Public License. 99 | 100 | Section 8 – Interpretation. 101 | 102 | For the avoidance of doubt, this Public License does not, and shall not be interpreted to, reduce, limit, restrict, or impose conditions on any use of the Licensed Material that could lawfully be made without permission under this Public License. 103 | To the extent possible, if any provision of this Public License is deemed unenforceable, it shall be automatically reformed to the minimum extent necessary to make it enforceable. If the provision cannot be reformed, it shall be severed from this Public License without affecting the enforceability of the remaining terms and conditions. 104 | No term or condition of this Public License will be waived and no failure to comply consented to unless expressly agreed to by the Licensor. 105 | Nothing in this Public License constitutes or may be interpreted as a limitation upon, or waiver of, any privileges and immunities that apply to the Licensor or You, including from the legal processes of any jurisdiction or authority. 106 | 107 | Creative Commons is not a party to its public licenses. 
Notwithstanding, Creative Commons may elect to apply one of its public licenses to material it publishes and in those instances will be considered the “Licensor.” Except for the limited purpose of indicating that material is shared under a Creative Commons public license or as otherwise permitted by the Creative Commons policies published at creativecommons.org/policies, Creative Commons does not authorize the use of the trademark “Creative Commons” or any other trademark or logo of Creative Commons without its prior written consent including, without limitation, in connection with any unauthorized modifications to any of its public licenses or any other arrangements, understandings, or agreements concerning use of licensed material. For the avoidance of doubt, this paragraph does not form part of the public licenses. 108 | 109 | Creative Commons may be contacted at creativecommons.org. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Backend development best practices 2 | ================================== 3 | 4 | 5 | 6 | **Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)* 7 | 8 | - [Translations of this document](#translations-of-this-document) 9 | - [N Commandments](#n-commandments) 10 | - [General points on guidelines](#general-points-on-guidelines) 11 | - [Development environment setup in README.md](#development-environment-setup-in-readmemd) 12 | - [Data persistence](#data-persistence) 13 | - [General considerations](#general-considerations) 14 | - [SaaS, cloud-hosted or self-hosted?](#saas-cloud-hosted-or-self-hosted) 15 | - [Persistence solutions](#persistence-solutions) 16 | - [RDBMS](#rdbms) 17 | - [NoSQL](#nosql) 18 | - [Document storage](#document-storage) 19 | - [Key-value store](#key-value-store) 20 | - [Graph database](#graph-database) 21 | - [Environments](#environments) 22 | - [Local 
development environment](#local-development-environment) 23 | - [Continuous integration environment](#continuous-integration-environment) 24 | - [Testing environment](#testing-environment) 25 | - [Staging environment](#staging-environment) 26 | - [Production environment](#production-environment) 27 | - [Bill of Materials](#bill-of-materials) 28 | - [Security](#security) 29 | - [Docker](#docker) 30 | - [Credentials](#credentials) 31 | - [Secrets](#secrets) 32 | - [Login Throttling](#login-throttling) 33 | - [User Password Storage](#user-password-storage) 34 | - [Audit Log](#audit-log) 35 | - [Suspicious Action Throttling and/or blocking](#suspicious-action-throttling-andor-blocking) 36 | - [Anonymized Data](#anonymized-data) 37 | - [Temporary file storage](#temporary-file-storage) 38 | - [Dedicated vs Shared server environment](#dedicated-vs-shared-server-environment) 39 | - [Application monitoring](#application-monitoring) 40 | - [Status page](#status-page) 41 | - [Status page format](#status-page-format) 42 | - [Plain format](#plain-format) 43 | - [JSON format](#json-format) 44 | - [HTTP status codes](#http-status-codes) 45 | - [Load balancer health checks](#load-balancer-health-checks) 46 | - [Access control](#access-control) 47 | - [Checklists](#checklists) 48 | - [Responsibility checklist](#responsibility-checklist) 49 | - [Release checklist](#release-checklist) 50 | - [General questions to consider](#general-questions-to-consider) 51 | - [Generally proven useful tools](#generally-proven-useful-tools) 52 | - [License](#license) 53 | 54 | 55 | 56 | # Translations of this document 57 | 58 | These are community-provided translations of this document. If you have comments regarding a particular translation, please approach the translation's maintainer. 59 | 60 | - [Turkish](https://github.com/umutphp/backend-best-practices) translation by [umutphp](https://github.com/umutphp) 61 | 62 | # N Commandments 63 | 64 | 1. 
README.md in the root of the repo is the docs 65 | 2. Single command run 66 | 3. Single command deploy 67 | 4. Repeatable and re-creatable builds 68 | 5. Build artifacts bundle a ["Bill of Materials"](#bill-of-materials) 69 | 6. Use [UTC as the timezone](http://yellerapp.com/posts/2015-01-12-the-worst-server-setup-you-can-make.html) all around 70 | 71 | # General points on guidelines 72 | 73 | We do not want to limit ourselves to certain tech stacks or frameworks. Different problems require different solutions, and hence these guidelines are valid for various backend architectures. 74 | 75 | # Development environment setup in README.md 76 | 77 | Document all the parts of the development/server environment. Strive to use the same setup and versions on all environments, starting from developer laptops, and ending with the actual production environment. This includes the database, application server, proxy server (nginx, Apache, ...), SDK version(s), gems/libraries/modules. 78 | 79 | Automate the setup process as much as possible. For example, [Docker Compose](https://docs.docker.com/compose/) could be used both in production and development to set up a complete environment, where [Dockerfiles](https://docs.docker.com/articles/dockerfile_best-practices/) fetch all parts of the software, and contain the necessary scripting to set up the environment and all the parts of it. Consider using archived copies of the installers, in case upstream packages later become unavailable. A minimum precaution is to keep SHA-1 checksums of the packages, and to make sure that the checksums match when the packages are installed. 80 | 81 | Consider storing any relevant parts of the development environment and its dependencies in some persistent storage. If the environment can be built using Docker, one possible way to do this is to use [docker export](http://docs.docker.com/reference/commandline/cli/#export).
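The checksum precaution described in this section can be scripted as part of the setup process. The following is a minimal sketch, not a prescribed implementation: the function names `file_checksum` and `verify_package` are hypothetical, and the default algorithm is SHA-1 only because that is what the text mentions (a stronger algorithm such as `sha256` can be passed instead).

```python
import hashlib
import hmac


def file_checksum(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of the file at `path`, read in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large installers do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_package(path, expected_hex, algorithm="sha1"):
    """Return True if the file's checksum matches the recorded one."""
    actual = file_checksum(path, algorithm)
    # Hex digests are case-insensitive; normalize the recorded value.
    return hmac.compare_digest(actual, expected_hex.lower())
```

A setup script could call `verify_package` for each archived installer and abort when it returns `False`.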
82 | 83 | # Data persistence 84 | 85 | ## General considerations 86 | 87 | Independent of the persistence solution your project uses, there are general considerations that you should follow: 88 | 89 | * Have backups that are verified to work 90 | * Have scripts or other tooling for copying persistent data from one env to another, e.g. from prod to staging in order to debug something 91 | * Have plans in place for rolling out updates to the persistence solution (e.g. database server security updates) 92 | * Have plans in place for scaling up the persistence solution 93 | * Have plans or tooling for managing schema changes 94 | * Have monitoring in place to verify health of the persistence solution 95 | 96 | ## SaaS, cloud-hosted or self-hosted? 97 | 98 | An important choice regarding any solution is where to run it. 99 | 100 | * SaaS -- fast to get started, easy to scale up, some infrastructure work required to allow access from everywhere etc. 101 | * Self-hosted in the cloud -- allows tuning database more than SaaS and probably cheaper at scale in terms of hosting, but more labor-intensive 102 | * Self-hosted on own hardware -- able to tweak everything and manage physical security, but most expensive and labor intensive 103 | 104 | ## Persistence solutions 105 | 106 | This section aims to provide some guidance for selecting the type of persistence solution. The choice always needs to be tailored to the problem and none of these is a silver bullet, however. 107 | 108 | ### RDBMS 109 | 110 | Pick a relational database system such as PostgreSQL when data and transaction integrity is a major concern or when lots of data analysis is required. The [ACID compliance](https://en.wikipedia.org/wiki/ACID), aggregation and transformation functions of the RDBMS will help. 111 | 112 | ### NoSQL 113 | 114 | Pick a NoSQL database when you expect to scale horizontally and when you don't require ACID. Pick a system that fits your model. 
115 | 116 | #### Document storage 117 | 118 | Stores documents that can be easily addressed and searched for by content or by inclusion in a collection. This is made possible because the database understands the storage format. Use for just that: storing large numbers of structured documents. Notable examples: 119 | 120 | * CouchDB 121 | * ElasticSearch 122 | 123 | > Note that since 9.4, PostgreSQL can also be used to store JSON natively. 124 | 125 | #### Key-value store 126 | 127 | Stores values, or sometimes groups of key-value pairs, accessible by key. Considers the values to be simply blobs, so does not provide the query capabilities of document stores. Scalable to immense sizes. Notable examples: 128 | 129 | * Cassandra 130 | * Redis 131 | 132 | #### Graph database 133 | 134 | General graph databases store nodes and edges of a graph, providing index-free lookups of the neighbors of any node. For applications where graph-like queries like shortest path or diameter are crucial. Specialized graph databases also exist for storing e.g. [RDF triples](https://en.wikipedia.org/wiki/Resource_Description_Framework). 135 | 136 | # Environments 137 | 138 | This section describes the environments you should have, at a minimum. It might sound like a lot, [but there is a purpose for each one](http://futurice.com/blog/five-environments-you-cannot-develop-without). 139 | 140 | - [Local development](#local-development-environment) 141 | - [Continuous integration](#continuous-integration-environment) 142 | - [Testing](#testing-environment) 143 | - [Staging](#staging-environment) 144 | - [Production](#production-environment) 145 | 146 | ## Local development environment 147 | 148 | This is your local development environment. You probably should not have a shared external development environment. Instead, you should work to make it possible to run the entire system locally, by stubbing or mocking third-party services as needed. 
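One way to run the whole system locally is to hide each third-party service behind a small interface and swap in an in-memory stub during local development. The sketch below only illustrates that idea: the payment service, the class names, and the `APP_ENV` and `PAYMENT_API_URL` variables are all hypothetical and do not come from any real API.

```python
import os
from typing import Mapping, Protocol


class PaymentGateway(Protocol):
    """Interface the application codes against (hypothetical)."""

    def charge(self, customer_id: str, amount_cents: int) -> str:
        ...


class FakePaymentGateway:
    """In-memory stub used for local development; never touches the network."""

    def __init__(self) -> None:
        self.charges = []

    def charge(self, customer_id: str, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("amount must be positive")
        self.charges.append((customer_id, amount_cents))
        return f"fake-charge-{len(self.charges)}"


class HttpPaymentGateway:
    """Placeholder for the real client; the HTTP call is omitted in this sketch."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url

    def charge(self, customer_id: str, amount_cents: int) -> str:
        raise NotImplementedError("real HTTP call omitted in this sketch")


def make_gateway(env: Mapping[str, str] = os.environ) -> PaymentGateway:
    """Pick the stub locally and the real client everywhere else."""
    if env.get("APP_ENV", "local") == "local":
        return FakePaymentGateway()
    return HttpPaymentGateway(env["PAYMENT_API_URL"])
```

Because the stub records every call, local tests can also assert on what the application would have sent to the real service.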
149 | 150 | ## Continuous integration environment 151 | 152 | CI is (among other things) for making sure that your software builds and automated tests pass after every change. 153 | 154 | ## Testing environment 155 | 156 | This is a shared environment to which code is deployed as often as possible, preferably every time code is committed to the mainline branch. It can be broken from time to time, especially in the active development phase. It is an important canary environment and is as similar to production as possible. Any external integrations are set up to use staging-level versions of other services. 157 | 158 | ## Staging environment 159 | 160 | Staging is set up exactly like production. No change reaches the production environment before it has been rehearsed here. Any mysterious production issues can be debugged here. 161 | 162 | ## Production environment 163 | 164 | The big iron. Logged, monitored, cleaned up periodically, squared away and secured. 165 | 166 | # Bill of Materials 167 | 168 | This document must be included in every build artifact and shall contain the following: 169 | 170 | 1. What version(s) of an SDK and critical tools were used to produce it 171 | 1. Which dependencies have been included 172 | 1. A globally unique revision number of the build (e.g. a git SHA-1 hash) 173 | 1. The environment and variables used when building the package 174 | 1. A list of failed tests or checks 175 | 176 | 177 | # Security 178 | 179 | Be aware of possible security threats and problems. You should at least be familiar with the [OWASP Top 10 vulnerabilities](https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project), and you should monitor vulnerabilities in any third party software you use.
180 | 181 | Good generic security guidelines would be: 182 | 183 | ## Docker 184 | 185 | **Using Docker will not make your service more secure.** Generally, you should consider at least the following things if using Docker: 186 | 187 | - Don't run any untrusted binaries inside Docker containers 188 | - Create unprivileged users inside Docker containers and run binaries using an unprivileged user instead of root whenever possible 189 | - Periodically rebuild and redeploy your containers with updated libraries and dependencies 190 | - Periodically update (or rebuild) your Docker hosts with the latest security updates 191 | - Multiple containers running on the same host will by default have some level of access to other containers and the host itself. Properly secure all hosts, and run containers with a minimum set of capabilities, for example preventing network access if they don't need it. 192 | 193 | ## Credentials 194 | 195 | Never send credentials unencrypted over a public network. Always use encryption (such as TLS, e.g. via HTTPS). 196 | 197 | ## Secrets 198 | 199 | Never store secrets (passwords, keys, etc.) in the sources in version control! It is very easy to forget they are there, and the project source tends to end up in many places (developer machines, development test servers, etc.), which unnecessarily increases the risk of an important secret being compromised. Also, version control has the nasty feature of overwriting file permissions, so even if you secure your config file permissions, the next time you check out the source, the permissions would be overwritten to the default public-readable. 200 | 201 | Probably the easiest way to handle secrets is to put them in a separate file on the servers that need them, and to have version control ignore that file. You can keep e.g. a `.sample` file in version control, with fake values to illustrate what should go there in the real file. In some cases, it is not easy to include a separate configuration file from the main configuration.
If this happens, consider using environment variables, or writing the config file from a version-controlled template on deployment. 202 | 203 | ## Login Throttling 204 | 205 | Place limits on the number of login attempts allowed per client per unit of time. Lock a user account for a specific time after a given number of failed attempts (e.g. lock for 5 minutes after 20 failed login attempts). 206 | The aim of these measures is to make online brute-force attacks against usernames/passwords infeasible. 207 | 208 | ## User Password Storage 209 | 210 | > Never EVER store passwords in plaintext! 211 | 212 | Never store passwords in reversible encrypted form, unless absolutely required by the application / system. Here is a good article about what to do and what not to do: https://crackstation.net/hashing-security.htm 213 | 214 | If you do need to be able to obtain plaintext passwords from the database, here are some suggestions to follow. 215 | 216 | If passwords won't be converted back to plaintext often (e.g. a special procedure is required), keep decryption keys away from the application that accesses the database regularly. 217 | 218 | If passwords still need to be regularly decrypted, separate the decryption functionality from the main application as much as possible, e.g. a separate server accepts requests to decrypt a password, but enforces a higher level of control, like throttling, authorization, etc. 219 | 220 | Whenever possible (which should be the great majority of cases), store passwords using a good one-way hash with a good random salt. And, no, SHA-1 is not a good choice for a hashing function in this context. Hash functions that are designed with passwords in mind (such as bcrypt, scrypt or Argon2) are deliberately slower, which makes offline brute-force attacks more time consuming, hence less feasible.
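As an illustration of salted, deliberately slow password hashing, here is a minimal sketch using PBKDF2-HMAC-SHA256 from the Python standard library. It is one possible approach and not the only acceptable one. The storage format, the function names and the iteration count are assumptions made for this example, and the work factor should be tuned to your own hardware.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for this sketch; raise it as hardware gets faster.
ITERATIONS = 600_000


def hash_password(password: str, iterations: int = ITERATIONS) -> str:
    """Hash a password with a fresh random salt; returns a self-describing string."""
    salt = secrets.token_bytes(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    # Store scheme, cost and salt alongside the hash so they can change later.
    return f"pbkdf2_sha256${iterations}${salt.hex()}${derived.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and cost, then compare in constant time."""
    scheme, iterations, salt_hex, hash_hex = stored.split("$")
    if scheme != "pbkdf2_sha256":
        return False
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(derived, bytes.fromhex(hash_hex))
```

Keeping the cost parameter in the stored string makes it possible to raise the iteration count later and re-hash a password the next time its owner logs in.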
See this post for more details: http://security.stackexchange.com/questions/211/how-to-securely-hash-passwords/31846#31846 221 | 222 | ## Audit Log 223 | 224 | For applications handling sensitive data, especially where certain users are allowed a relatively wide access or control, it's good to maintain some kind of audit logging—storing a sequence of actions / events that took place in the system, together with the event/source originator (user, automation job, etc). This can be, e.g: 225 | 226 | 2012-09-13 03:00:05 Job "daily_job" performed action "delete old items". 227 | 2012-09-13 12:47:23 User "admin_user" performed action "delete item 123". 228 | 2012-09-13 12:48:12 User "admin_user" performed action "change password of user foobar". 229 | 2012-09-13 13:02:11 User "sneaky_user" performed action "view confidential page 567". 230 | ... 231 | 232 | The log may be a simple text file or stored in a database. At least these three items are good to have: an exact timestamp, the action/event originator (who did this), and the actual action/event (what was done). The exact actions to be logged depend on what is important for the application itself, of course. 233 | 234 | The audit log may be a part of the normal application log, but the emphasis here is on logging who did what and not only that a certain action was performed. If possible, the audit log should be made tamper-proof, e.g. only be accessible by a dedicated logging process or user and not directly by the application. 235 | 236 | ## Suspicious Action Throttling and/or blocking 237 | 238 | This can be seen as a generalization of the Login Throttling, this time introducing similar mechanics for arbitrary actions that are deemed "suspicious" within the context of the application. 
For example, an ERP system which allows normal users access to a substantial amount of information, but expects users to be concerned only with a small subset of that information, may limit attempts to access larger than expected datasets too quickly. E.g. prevent users from downloading list of all customers, if users are supposed to work on one or two customers at a time. Note that this is different from limiting access completely—users are still allowed to retrieve information about any customer, just not all of them at once. Depending on the system, throttling might not be enough—e.g. when one invokes an action on all resources with a single request. Then blocking might be required. Note the difference between making 1000 requests in 10 seconds to retrieve full customer information, one customer at a time, and making a single request to retrieve that information at once. 239 | 240 | What is suspicious here depends strongly on the expected use of the application. E.g. in one system, deleting 10000 records might be completely legitimate action, but not so in an another one. 241 | 242 | ## Anonymized Data 243 | 244 | Whenever large datasets are exported to third parties, data should be anonymized as much as possible, given the intended use of the data. For example, if a third party service will provide general statistical analysis on a customer database, it probably does not need to know the names, addresses or other personal information for individual customers. Even a generic customer ID number might be too revealing, depending on the data set. Take a look at this article: http://arstechnica.com/tech-policy/2009/09/your-secrets-live-online-in-databases-of-ruin/. 245 | 246 | Avoid logging personally identifiable information, for example user’s name. 247 | 248 | If your logs contain sensitive information, make sure you know how logs are protected and where they are located also in the case of cloud hosted log management systems. 
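
As an illustration of keeping personal data out of logs, the sketch below writes a keyed hash of a user's e-mail address instead of the address itself, so the same user can still be recognized across log lines. It is only a sketch under assumptions: the function name `pseudonymize`, the key `LOG_PSEUDONYM_KEY` and the logger name are hypothetical and not part of any particular framework. A keyed hash (HMAC) is used rather than a plain hash because low-entropy values such as e-mail addresses can otherwise be recovered by guessing.

```python
import hashlib
import hmac
import logging

# Hypothetical secret; in practice load it from configuration, not source code.
LOG_PSEUDONYM_KEY = b"replace-with-a-secret-from-config"


def pseudonymize(value: str, key: bytes = LOG_PSEUDONYM_KEY) -> str:
    """Return a stable, keyed hash of `value` that can be written to logs."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # A shortened digest keeps log lines readable (16 hex chars = 64 bits).
    return digest[:16]


logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orders")

user_email = "jane.doe@example.com"
# Both lines carry the same token, so they can be correlated later
# without the e-mail address ever appearing in the log.
log.info("order created for user %s", pseudonymize(user_email))
log.info("invoice sent to user %s", pseudonymize(user_email))
```

Anyone holding the key can recompute the token for a known address, so the key needs the same care as any other secret.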

If you must log sensitive information, try hashing it before logging so that you can still identify the same entity between different parts of the processing.

## Temporary file storage

Make sure you are aware of where your application is storing temporary files. If you are using publicly accessible directories (which are most probably the default) like `/tmp` and `/var/tmp`, make sure you create your files with mode 600, so that they are readable only by the user your application is running as. Alternatively, have a protected directory for storing temporary files (a directory accessible only by the application user).

## Dedicated vs Shared server environment

The security threats can be quite different depending on whether the application is going to run in a shared or a dedicated environment. Shared here means that there are other (not necessarily 3rd party) applications running on the same server. In that case, having appropriate file permissions becomes critical, otherwise application source code, data files, temporary files, logs, etc might end up accessible by unintended users. Then a security breach in a 3rd party application might result in your application being compromised.

You can never be sure what kind of environment your application will run in for its entire lifetime: it may start on a dedicated server, but as time goes by, 3rd party applications might be added to the same system. That is why it is best to plan from the very first moment as if your application runs in a shared environment, and take all precautions. Here's a non-exhaustive list of the files/directories you need to think about:

* application source code
* data directories
* temporary storage directories (often by default the system wide /tmp might be used - see above)
* configuration files
* version control directories - .git, .hg, .svn, etc.
* startup scripts (may contain initialization variables, secrets, etc)
* log files
* crash dumps
* private keys (SSL, SSH, etc)
* etc.

Sometimes, some files need to be accessible by different users (e.g. static content served by apache). In that case, take care to allow access only to what is really needed.

Keep in mind that on a UNIX/Linux filesystem, write access to a directory is permission-wise very powerful: it allows you to delete files in that directory and recreate them (which results in a modified file), even files you cannot write to. /tmp and /var/tmp are by default safe from this effect, because of the sticky bit that should be set on those.

Additionally, as mentioned in the secrets section, file permissions might not be preserved in version control, so even if you set them once, the next checkout/update/whatever may override them. A good idea is then to have a Makefile, a script, a version control hook or something similar that would set the correct permissions when updating the sources.

# Application monitoring

Monitoring the full status of a service requires that both OS-level and application-specific monitoring checks are performed. OS-level checks include, for example, CPU, disk or memory usage, running processes, open ports, etc. Application-specific checks are, however, the most important from the point of view of the running service. These can be anything from "does this URL respond and return the HTTP status 200", to checking database connectivity, data consistency, and so on.

This section describes a way to implement the application-specific checks, which would make it easier to monitor the overall application health and give full control to the application developers to determine what checks are meaningful in the context of the concrete application.
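
To make "application-specific check" concrete, here is a minimal sketch of two such checks written as plain functions: one for database connectivity and one for data consistency. Everything named here is an assumption for illustration only: the function names, the `items` table and the result shape are hypothetical, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
from typing import Callable, Dict, Tuple

CheckResult = Tuple[bool, str]  # (healthy?, human-readable detail)


def check_database(conn: sqlite3.Connection) -> CheckResult:
    """Connectivity check: can we run a trivial query?"""
    try:
        conn.execute("SELECT 1").fetchone()
        return True, "connection ok"
    except sqlite3.Error as exc:
        return False, f"connection failed: {exc}"


def check_items_present(conn: sqlite3.Connection, minimum: int = 1) -> CheckResult:
    """Consistency check: the (hypothetical) items table is not suspiciously empty."""
    try:
        (count,) = conn.execute("SELECT COUNT(*) FROM items").fetchone()
    except sqlite3.Error as exc:
        return False, f"query failed: {exc}"
    if count < minimum:
        return False, f"too few items: {count}"
    return True, f"{count} items"


# Demo with an in-memory database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items (id) VALUES (?)", [(1,), (2,), (3,)])

# A registry of named checks; adding a check means adding one entry here.
checks: Dict[str, Callable[[], CheckResult]] = {
    "database": lambda: check_database(conn),
    "items": lambda: check_items_present(conn),
}
results = {name: check() for name, check in checks.items()}
```

Keeping each check a small function that never raises makes it straightforward to run all of them and combine the results.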

In essence, the idea is to have a single endpoint (an application URL) that can give a good status overview of the entire application. This is implemented inside the application and requires work from the project team, but on the other hand, the project team is the one who can really define what is an OK state of the application and what is considered an ERROR state.

The application could implement any number of "subsystem" checks. For example,

* connection to the database is up
* data is in a consistent state (e.g. a list of items in a certain database table is meaningful)
* 3rd party services that the application integrates with are reachable
* ElasticSearch indexes are in a consistent state
* anything else that makes sense for the application

A combined overview status should be provided by the application, aggregating the information from the various subsystem checks. The idea is that an external monitoring system can track only this combined overview, so that the external monitoring does not need to be reconfigured when a new application check is added or modified. Moreover, the developers are the ones who decide how the overall status is derived from the subsystem checks (i.e. which ones are critical, which ones are not, etc).

## Status page

All status checks SHOULD be accessible under `/status` URLs as follows:

* `/status` - the overall status page (mandatory)
* `/status/subsystem1` - a status check for a specific subsystem (optional)
* ...

The main `/status` page should at a minimum give an overall status of the system, as described in the next section. This means that the main `/status` page should execute ALL subsystem checks and report the aggregated overall system status. It is up to the developers to decide how the overall system status is determined based on the subsystems. For example, an `ERROR` state of some non-critical subsystem may only generate an overall `WARN` status.

For performance reasons, some subsystem checks may be excluded from this overall `/status` page - for example, when the check causes higher resource usage, takes a longer time to complete, etc. Overall, the main status page should be light enough so that it can be polled relatively often (every 1-3 minutes) and not cause too much load on the system. Subsystem checks that are excluded from the overall status check should have their own URLs, as shown above. Naturally, monitoring those would require modifications in the monitoring system configuration. To overcome this, a different approach can be taken: the application could perform the heavy subsystem checks in a background process at a rate that is acceptable and store the status internally. This would allow the main status page to reflect these heavy checks as well (e.g. it would retrieve the last performed check status). This approach should be used, unless its implementation is too difficult.

## Status page format

We propose two alternative formats for the status pages - `plain` and `JSON`.

### Plain format

The plain format has one status per line in the form `key: value`. The key is a subsystem/check name and the value is the status value. The status value can be one of:

* `OK`
* `WARN Message`
* `ERROR Message`

where `Message` can be some meaningful text that helps to quickly identify the problem. The message is a single line with no specified length restriction, but use common sense: it should probably not be longer than 200 characters.

The main status page MUST have a key called `status` that shows the overall aggregated application status. The individual subsystem check status lines are optional. Subsystem status keys should have a suffix `_status`. Here are some examples:

When everything is ok:

```
status: OK
database_status: OK
elastic_search_status: OK
```

When some check is failing:

```
status: ERROR Database is not accessible
database_status: ERROR Connection failed
elastic_search_status: OK
```

Multiple failures at the same time:

```
status: ERROR failed subsystems: database, elasticsearch. For details see https://myapp.example.com/status
database_status: ERROR Connection failed
elastic_search_status: WARN Too few entries in index A.
```

In addition to the status lines, a status page can have non-status keys. For example, those can show some metrics (that may or may not be monitored). The additional keys must be prefixed with the subsystem name.

```
status: OK
database_status: OK
database_customers: 378
database_items: 8934748
elastic_search_status: OK
elastic_search_shards: 20
```

The overall status may naturally be based on some of the metrics:

```
status: WARN Too few items in database
database_status: WARN Too few items in database
database_customers: 378
database_items: 1
elastic_search_status: OK
elastic_search_shards: 20
```

Subsystem checks that have their own URL (`/status/subsystemX`) should follow a similar format, having a mandatory key `status` and a number of optional additional keys. Example for `/status/database`:

```
status: OK
connection_pool: 30
latency: 2
```

### JSON format

The JSON format of the status pages is often preferable, for example when tooling or integration with other systems is easier to achieve via a common data format.
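
Both the plain format above and the JSON format described in this section can be rendered from the same set of subsystem results. The following is a minimal sketch of that idea, not a reference implementation: the function names (`aggregate`, `render_plain`, `render_json`, `http_status_code`) and the aggregation rule are assumptions. The rule used here is that an `ERROR` in a critical subsystem makes the overall status `ERROR`, while any other problem only yields `WARN`; which subsystems count as critical is up to the developers.

```python
import json
from typing import Dict, Set


def aggregate(subsystems: Dict[str, str], critical: Set[str]) -> str:
    """Derive the overall status from per-subsystem status strings."""
    failed = [n for n, s in subsystems.items() if s.startswith("ERROR")]
    warned = [n for n, s in subsystems.items() if s.startswith("WARN")]
    critical_failed = sorted(n for n in failed if n in critical)
    if critical_failed:
        return "ERROR failed subsystems: " + ", ".join(critical_failed)
    if failed or warned:
        return "WARN degraded subsystems: " + ", ".join(sorted(failed + warned))
    return "OK"


def render_plain(overall: str, subsystems: Dict[str, str]) -> str:
    """One `key: value` line per status; subsystem keys get a `_status` suffix."""
    lines = [f"status: {overall}"]
    lines += [f"{name}_status: {status}" for name, status in subsystems.items()]
    return "\n".join(lines) + "\n"


def render_json(overall: str, subsystems: Dict[str, str]) -> str:
    """Root `status` key plus one nested object per subsystem."""
    body: Dict[str, object] = {"status": overall}
    for name, status in subsystems.items():
        body[name] = {"status": status}
    return json.dumps(body, indent=2)


def http_status_code(overall: str) -> int:
    """200 for OK (and, in this sketch, for WARN); 500 for ERROR."""
    return 500 if overall.startswith("ERROR") else 200


subsystems = {"database": "ERROR Connection failed", "elastic_search": "OK"}
overall = aggregate(subsystems, critical={"database"})
print(render_plain(overall, subsystems))
print(render_json(overall, subsystems))
```

Because the aggregation lives in one function, adding or reclassifying a subsystem check does not change what the external monitoring system has to poll.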

The status values follow the same format as described above - `OK`, `WARN Message` and `ERROR Message`.

The equivalent of the `status` key from the plain format is a `status` key in the root JSON object. Subsystems should use nested objects, each also having a mandatory `status` key. Here are some examples:

All is fine:

```json
{
  "status": "OK",
  "database": {
    "status": "OK"
  },
  "elastic_search": {
    "status": "OK"
  }
}
```

Status page with additional metrics:

```json
{
  "status": "OK",
  "uptime": 18234,
  "database": {
    "status": "OK",
    "connection_pool": 30
  },
  "elastic_search": {
    "status": "OK",
    "multinode": false
  }
}
```

Something failing:

```json
{
  "status": "ERROR Database is not accessible. See https://myapp.example.com/status for details.",
  "database": {
    "status": "ERROR Connection failed",
    "connection_timeout": 30
  },
  "elastic_search": {
    "status": "OK"
  }
}
```

## HTTP status codes

Whenever the overall application status is OK, the HTTP status code in the status page response MUST be set to 200 (OK). Otherwise a 5XX error code SHOULD be set. For example, code 500 (Internal Server Error) could be used. Optionally, a non-critical WARN status may still respond with 200.

## Load balancer health checks

Often the application is running behind a load balancer. Load balancers typically can monitor application servers by polling a given URL. The health check is used so that the load balancer can stop routing traffic to the failing application servers.

The overall `/status` page is a good candidate for the load balancer health check URL. However, a separate dedicated status page for a load balancer health check provides an important benefit. Such a page can be fine-tuned for when the application is considered to be healthy from the load balancer's perspective. For example, an error in a subsystem may still be considered a critical error for the overall application status, but does not necessarily need to cause the application server to be removed from the load balancer pool. A good example is a 3rd party integration status check. The load balancer health check page should only return a non-200 status code when the application instance must be considered non-operational.

The load balancer health check page should be placed at a `/status/health` URL. Depending on your load balancer, the format of that page may deviate from the overall status format described here. Some load balancers may even observe only the returned HTTP status code.

## Access control

The status pages may need proper authorization in place, especially in case they expose debugging information in status messages or application metrics. HTTP basic authentication or IP-based restrictions are usually good enough candidates to consider.

# Checklists

To avoid forgetting the most important things, here are some handy checklists for your current or upcoming projects.

## Responsibility checklist

In bigger projects, especially when multiple parties are involved, it is crucial to keep track of all the different aspects and who is responsible for them. The following table illustrates what a go-live checklist for releasing a website could look like:

| Aspect | Task | Responsible person / party | Deadline | Status |
|--- |--- |--- |--- |--- |
| Frontend | Website wireframes | e.g. Company B / Person X | e.g. 17.6. | e.g. in progress |
| Frontend | Website design | e.g. Company A / Person Z | e.g. 23.7. | e.g. waiting |
| Frontend | Website templates | | | |
| Frontend | Content creation and population | | | |
| Backend | Setup CMS | | | |
| Backend | Setup staging environment | | | |
| Backend | Setup production environment | | | |
| Backend | Migrate hosting services to client accounts | | | |
| Backend | DNS configuration | | | |
| Backend | Setup website analytics | | | |
| Backend | Integrate marketing automation | | | |
| Backend | Web font license | | | |
| Dates | Website/Product go-live time | | | |
| Dates | Publish the website | | | |

## Release checklist

When you are ready to release, remember to check off everything on your release checklist! The resulting peace of mind, repeatability and dependability are a great boon.

You *do* have one, right? If you don't, here is a good generic starting point for you:

* [ ] Deploying works the same no matter which environment you are deploying to
* [ ] All environments have well defined names, and they are referred to using those names
* [ ] All environments have the same underlying software stack
* [ ] All environment configuration is version controlled (web server config, CI build scripts etc.)
* [ ] The product has been tested from the networks from where it will be used (e.g. public Internet, customer LAN)
* [ ] The product has been tested with all of the targeted devices
* [ ] There is a simple way to find out what code is running in any given environment
* [ ] A versioning scheme has been defined
* [ ] Any version of the product should be easily mappable to a state of the code base
* [ ] Rolling back a deployment is possible
* [ ] Backups are running
* [ ] Restoring from a backup has been tested
* [ ] No secrets are stored in version control
* [ ] Logging is turned on
* [ ] There is a well defined process for accessing and searching through logs
* [ ] Logging includes exceptions and stack traces where appropriate
* [ ] Errors can be mapped to stack traces
* [ ] Release notes have been written
* [ ] Server environments are up-to-date
* [ ] A plan for updating the server environments exists
* [ ] The product has been load tested
* [ ] A method exists for replicating the state of one environment in another (e.g. copy prod to QA to reproduce an error)
* [ ] All repeating release processes have been automated

# General questions to consider

* What is the expected/required life-span of the project?
* Is the project one-off, or will there be continuous development?
* What is the release cycle for a version of the service?
* What environments (dev, test, staging, prod, ...) are going to be set up?
* How will downtime of the production service impact the value of the service?
* How mature is the technology? Are major changes that break backward compatibility to be expected?

# Generally proven useful tools

* [HTTPie](https://github.com/jakubroztocil/httpie) is a great tool for testing APIs on the command line. It's simple to pass in custom headers and cookies, and it even has session support.
* [jq](http://stedolan.github.io/jq/) is a CLI JSON processor. Massage JSON data coming in from cURL (or of course HTTPie!) at will. Another great tool for API testing or exploration.

# License

[Futurice Oy](http://www.futurice.com)
Creative Commons Attribution 4.0 International (CC BY 4.0)
--------------------------------------------------------------------------------