├── .gitignore
├── 01-getting-started.rst
├── 02-dns.rst
├── 03-users-and-directories.rst
├── 04-web-server.rst
├── 05-static-files.rst
├── 06-gunicorn.rst
├── 07-settings.rst
├── 08-postgresql.rst
├── 09-recovery1.rst
├── 10-recovery2.rst
├── CHANGES.txt
├── Makefile
├── README.rst
├── _static
│   ├── cover.png
│   ├── epub.css
│   ├── how-static-files-work-apache.png
│   ├── how-static-files-work-nginx.png
│   ├── output_of_ls.png
│   ├── putty-config.png
│   └── top.png
├── _templates
│   └── epub-cover.html
├── conf.py
├── cover.pdf
├── django_deployment_vars.example
├── fixlatex
├── index.rst
├── index.rst-latex
├── main_toctree_latex.rst
├── meta.rst
├── meta_epub.rst
├── requirements.txt
└── testscript
/.gitignore:
--------------------------------------------------------------------------------
1 | _build
2 | django_deployment_vars
3 |
--------------------------------------------------------------------------------
/02-dns.rst:
--------------------------------------------------------------------------------
1 | DNS
2 | ===
3 |
4 | Introduction to the DNS
5 | -----------------------
6 |
7 | In this book, you will find that I like to show you the code first, even
8 | if you don't understand it clearly, and then explain to you how things
9 | work. Unfortunately, I cannot do that with DNS. You need to understand
10 | it first and then write the code. **The big problem with DNS is that if
11 | you screw things up, even if you fix or revert things, it may be days
12 | before the system works again**. So you need to read carefully.
13 |
14 | When you open your browser and type http://djangodeployment.com/, the
15 | first thing your browser does is find the IP address of the machine
16 | djangodeployment.com. For this, it asks a component of the operating
17 | system called the "resolver": "What is the IP address of
18 | djangodeployment.com?" After some time (usually from a few ms to a few
19 | seconds), the resolver replies: "It's 71.19.145.109". The browser then
20 | proceeds to open a TCP connection on port 80 of that address and use
21 | HTTP to request the required information (in our case the home page of
22 | djangodeployment.com).
23 |
24 | .. note:: What about IPv6?
25 |
26 |    If your computer has an IPv6 connection to the Internet, your browser
27 |    will actually first ask the resolver for the IPv6 address of the
28 |    server. For djangodeployment.com, the resolver will eventually reply
29 |    "It's 2605:2700:0:3::4713:916d". The browser will then attempt to
30 |    connect to that IPv6 address. If there is any kind of error, such as
31 |    the resolver being unable to find an IPv6 address (many web servers
32 |    aren't yet configured to use one), or the IPv6 address not responding
33 |    (network errors are still more frequent with IPv6 than IPv4), the
34 |    browser will fall back to using the IPv4 address, as I explained above.
35 |
36 | The only thing the resolver does is ask another machine to do the actual
37 | resolving; that other machine is called a name server. Most likely you
38 | are using a name server provided by your Internet Service Provider. I
39 | will be calling that name server "your name server", although it's not
40 | exactly yours; but it's the one you are using.
41 |
42 | .. tip:: Which is my name server?
43 |
44 |    On Unix-like machines (including Mac OS X), the name server used is
45 |    stored in file ``/etc/resolv.conf``; the file is usually set up
46 |    during DHCP, but on systems with a static IP address it is often
47 |    edited manually. On Windows, you can determine the name server by
48 |    typing the command ``ipconfig /all``, where it shows as "DNS Servers";
49 |    it is set up during DHCP, but on systems with a static IP address it
50 |    is often edited manually in the network properties. Your system may
51 |    be configured to use more than one name server, in which case it
52 |    chooses one and uses another if the first one does not respond.
53 |
54 |    You might find out that the name server is your aDSL router. Actually
55 |    your aDSL router is merely a so-called "forwarding" name server,
56 |    which only transfers the query to another name server, which is the
57 |    one that does the real magic. You can find which one it is by logging
58 |    in to your router's web interface and browsing through its settings.
59 |    It is set up during the establishment of the aDSL connection.
60 |
61 |    When I say "your name server" I don't mean the forwarding name
62 |    server, but the one that does the real job.
63 |
64 | In order to find out the address that corresponds to a name, your name
65 | server puts a series of questions to other name servers on the
66 | Internet:
67 |
68 | 1. First, your name server picks one of thirteen so-called "root name
69 |    servers". The IP addresses of these thirteen name servers are
70 |    well-known (the official list is at
71 |    http://www.internic.net/domain/named.root) and generally do not
72 |    change, and your name server is preprogrammed to use them. Your name
73 |    server tells the chosen root name server something like this: "Hello,
74 |    I'd like to know the IP address of djangodeployment.com please."
75 |
76 | 2. The root name server replies: "Hi. I don't know the address of
77 |    djangodeployment.com; you should ask one of these name servers,
78 |    which are responsible for all domain names ending in '.com'" (and it
79 |    supplies a number of IP addresses, actually thirteen).
80 |
81 | 3. Your name server picks one of the .com name servers and asks it:
82 |    "Hello, I'd like to know the IP address of djangodeployment.com
83 |    please."
84 |
85 | 4. The .com name server replies: "Hi. I don't know the address of
86 |    djangodeployment.com; you should ask one of these name servers,
87 |    which are responsible for djangodeployment.com" (and it supplies a
88 |    number of IP addresses, which at the time of this writing are
89 |    three).
90 |
91 | 5. Your name server picks one of the three name servers and asks it:
92 |    "Hello, I'd like to know the IP address of djangodeployment.com
93 |    please."
94 |
95 | 6. The djangodeployment.com name server replies: "Sure,
96 |    djangodeployment.com is 71.19.145.109".
97 |
98 | After your name server gets this information, it replies to the
99 | resolver, which in turn replies to your browser.
100 |
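To make the six steps concrete, here is a toy sketch in Python. It is
not how a real name server is implemented: there is no network traffic,
and the server labels ``root``, ``com-servers`` and ``domain-servers``,
the ``SERVERS`` table and the ``resolve`` function are all made up for
illustration. It only models the chain of referrals described above.

.. code-block:: python

    # Toy model of the delegation chain; all data below is illustrative.
    ROOT = "root"

    # Each "server" maps a domain suffix either to a referral (the next
    # server to ask) or to a final answer (an IP address).
    SERVERS = {
        "root": {"com": ("referral", "com-servers")},
        "com-servers": {
            "djangodeployment.com": ("referral", "domain-servers"),
        },
        "domain-servers": {
            "djangodeployment.com": ("answer", "71.19.145.109"),
        },
    }


    def resolve(name):
        """Follow referrals from the root until a server gives an answer."""
        server = ROOT
        steps = 0
        while True:
            steps += 2  # one question plus one reply
            # Find the longest suffix of the name this server knows about.
            labels = name.split(".")
            for i in range(len(labels)):
                suffix = ".".join(labels[i:])
                if suffix in SERVERS[server]:
                    kind, value = SERVERS[server][suffix]
                    break
            else:
                raise LookupError(f"{server} knows nothing about {name}")
            if kind == "answer":
                return value, steps
            server = value  # a referral: ask the next server


    address, steps = resolve("djangodeployment.com")
    print(address, steps)

Running it prints the address together with the number of steps, which
is six for this toy chain, matching the list above.
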
101 | In this example, there were only six steps, but they could be more; for
102 | example, if you try to resolve cs.man.ac.uk, first the root servers will
103 | be asked, these will direct to the .uk name servers, which will direct
104 | to the .ac.uk name servers, and so on, for a total of 10 steps (this is
105 | not always the case; when resolving itia.civil.ntua.gr, the .gr servers
106 | refer you to the .ntua.gr servers, and these in turn refer you directly
107 | to the itia.civil.ntua.gr servers, for a total of 8 steps).
108 |
109 | All this discussion between servers takes time and network traffic, so
110 | it only happens the first time you ask to connect to the web page. The
111 | results of the DNS query are heavily cached in order to make it faster
112 | for the next times. Typically web browsers cache such results for about
113 | half an hour, or until browser restart. Most important, however, your
114 | name server caches results for much longer. In fact, the response (6)
115 | above is not exactly what I wrote; instead, it is "Sure,
116 | djangodeployment.com is 71.19.145.109, and you can cache this information
117 | for up to 8 hours". Equally important, the response (4) is "I don't know
118 | the address of djangodeployment.com; you should ask one of these three
119 | name servers, which are responsible for djangodeployment.com, and you
120 | can cache this information (i.e. the list of name servers that are
121 | responsible for djangodeployment.com) for up to two days". Caching times
122 | are configurable to various degrees and are usually from 5 minutes to 48
123 | hours, but caching for a whole week is not uncommon. Rarely does your
124 | name server need to go through the complete list of steps; most often it
125 | will have cached the name servers for the top level domain, and
126 | sometimes it will also have cached some lower stuff.
127 |
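The caching behaviour can be sketched as a small class. This is only a
model, under the assumption that each answer arrives together with a
time to live in seconds; the ``TTLCache`` name and its methods are
hypothetical and do not come from any DNS software.

.. code-block:: python

    import time


    class TTLCache:
        """Remember answers until their time to live runs out."""

        def __init__(self, clock=time.monotonic):
            self._clock = clock  # injectable, so expiry can be demonstrated
            self._entries = {}  # name -> (value, expiry time)

        def put(self, name, value, ttl):
            """Store a value that may be reused for ttl seconds."""
            self._entries[name] = (value, self._clock() + ttl)

        def get(self, name):
            """Return the cached value, or None if absent or expired."""
            entry = self._entries.get(name)
            if entry is None:
                return None
            value, expires_at = entry
            if self._clock() >= expires_at:
                del self._entries[name]  # too old; must be asked again
                return None
            return value

A name server that stored response (6) with ``put("djangodeployment.com",
"71.19.145.109", 8 * 3600)`` would keep answering from the cache for
eight hours, and only then ask the domain's name servers again.
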
128 | So here is the big problem with DNS: it's not hard to get it right (it's
129 | easier than writing a Django program), but if you make the slightest
130 | error you might be stuck with the wrong information for up to two days
131 | (or even a week). If you make an error when configuring your domain
132 | name, and a customer attempts to access your site, the error may be
133 | cached by the customer's name server for up to two days, and you can do
134 | nothing about it except fix the error and wait. There is no way to send
135 | a signal to all the name servers of the world and tell them "hey, please
136 | invalidate the cache for djangodeployment.com". Different customers or
137 | visitors of your site will experience different amounts of downtime,
138 | depending on when exactly their local name server will decide to expire
139 | its cache.
140 |
141 | Registering a domain name
142 | -------------------------
143 |
144 | You register a domain name with a registrar. Registrars are companies
145 | that provide the service of registering a domain name for you. These
146 | companies are authorized by ICANN, the organization ultimately
147 | responsible for domain names. So, before registering a domain name, you
148 | first need to select a registrar, and there are many. I'm using
149 | BookMyName.com, a French registrar which I selected more or less at
150 | random. Its web site is unpolished but it works. Another French
151 | registrar, particularly popular in the free software community, is
152 | Gandi, but it's a bit more expensive than others. The most popular
153 | registrar worldwide is GoDaddy, but it supported SOPA, and for me that's
154 | a deal breaker. Another interesting option is Namecheap; I think its
155 | software is nice and its prices are reasonable. If you don't know what
156 | to do, choose that one. There are also dozens of other options, and it's
157 | fine to choose another one. Note that I'm not affiliated with any
158 | registrar (and certainly none of the four I've mentioned).
159 |
160 | For practice, you can go and register a cheap test domain; Namecheap,
161 | for example, sells some domains for $0.88 per year. Go get one now so
162 | that you can start messing around with it. Below I use ".com" as an
163 | example, but if your domain is different ($0.88 domains certainly aren't
164 | .com) it doesn't matter, exactly the same rules apply.
165 |
166 | When you register a .com domain name at the registrar's web site, two
167 | things happen:
168 |
169 | 1. The registrar configures some name servers to be the name servers
170 |    for the domain. For example, when I registered djangodeployment.com
171 |    at the web site of bookmyname.com, bookmyname.com configured three
172 |    name servers (nsa.bookmyname.com, nsb.bookmyname.com, and
173 |    nsc.bookmyname.com) as the djangodeployment.com name servers. These
174 |    are the three servers that are involved in steps 5 and 6 of the
175 |    resolving procedure that I presented in the previous section. I am
176 |    going to call them the **domain's name servers**.
177 |
178 | 2. The registrar notifies the .com name servers that domain
179 |    djangodeployment.com is registered, and that the domain's name
180 |    servers are the three mentioned above. I am going to call the .com
181 |    name servers the **upstream name servers**. If your domain is
182 |    mydomain.co.uk, the upstream name servers are those responsible for
183 |    .co.uk.
184 |
185 |
186 | .. _adding_dns_records:
187 |
188 | Adding records to your domain
189 | -----------------------------
190 |
191 | The DNS database consists of records. Each record maps a name to a
192 | value. For example, a record says that the name djangodeployment.com
193 | corresponds to the value 71.19.145.109. Your registrar provides a web
194 | interface with which you can add, remove and edit records (in Namecheap
195 | you need to go to the Dashboard, Domain list, Manage (the domain),
196 | Advanced DNS). Go to your registrar's interface and, for the test domain
197 | you created, create the following records (remember that
198 | $SERVER_IPv4_ADDRESS and $SERVER_IPv6_ADDRESS are placeholders and you
199 | need to replace them with something else; also omit the "AAAA" records
200 | if your server doesn't have an IPv6 address):
201 |
202 | ==== ==== ===== ====================
203 | Name Type TTL   Value
204 | ==== ==== ===== ====================
205 | @    A    300   $SERVER_IPv4_ADDRESS
206 | @    AAAA 300   $SERVER_IPv6_ADDRESS
207 | www  A    300   $SERVER_IPv4_ADDRESS
208 | www  AAAA 300   $SERVER_IPv6_ADDRESS
209 | ==== ==== ===== ====================
210 |
211 | Each record has a type. There are many different types of records, but
212 | the ones you need to be aware of here are A, AAAA, and CNAME. "A" defines
213 | an IPv4 address, whereas "AAAA" defines an IPv6 address. We will deal
214 | with CNAME a bit later.
215 |
216 | When you see "@" as a name, I mean a literal "@" symbol. This is
217 | shorthand for writing the domain itself. If your domain is mydomain.com,
218 | then whether you enter "mydomain.com." (with a trailing dot) or "@" in
219 | the field for the name is exactly the same thing. Some registrars might
220 | allow only the shorthand "@", but often you can also write
221 | "mydomain.com.". Use the "@", which is more common. The first of these
222 | four records means that the domain itself resolves to
223 | $SERVER_IPv4_ADDRESS. Likewise for the second record.
224 |
225 | If your domain is mydomain.com, the next two records define the IP
226 | addresses for www.mydomain.com. In the field for the name, you can
227 | either write "www.mydomain.com." (with a trailing dot), or "www",
228 | without a trailing dot. Use the latter, which is more common. In the
229 | rest of the text, I will be using $DOMAIN and www.$DOMAIN instead of
230 | mydomain.com and www.mydomain.com, and you should understand that you
231 | need to replace "$DOMAIN" with your actual domain.
232 |
233 | These four records are normally all you need to set. In theory you can
234 | set www.$DOMAIN to point to a different server than $DOMAIN, but this is
235 | uncommon. You can also define ftp.$DOMAIN and whateverelse.$DOMAIN, but
236 | this is often not needed.
237 |
238 | The TTL, meaning "time to live", is the maximum allowed caching time.
239 | When a name server asks the domain's name server for the IPv4 address of
240 | $DOMAIN, the domain's name server will reply "$DOMAIN is 71.19.145.109,
241 | and you can cache this information for 300 seconds". Don't make it less
242 | than 300; it will increase the number of queries your visitors will
243 | make, thus making responses a bit slower; and some name servers will
244 | ignore the TTL if it's less than 300 and use 300 anyway. A common
245 | tactic is to use a large value (say 28800), and when for some reason you
246 | need to switch to another server, you reduce that to 300, wait at least
247 | 8 hours (28800 seconds), then bring the server down, change the DNS to
248 | point to the new server, then start the new server. If planned correctly
249 | and executed without problems, the switch will result in a downtime of
250 | no more than 300 seconds. After this is finished, you change the TTL to
251 | 28800 again.
252 |
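The arithmetic behind this tactic is simple enough to sketch in Python.
The function names below are hypothetical, and the sketch assumes that
every cache honours the TTL exactly, which, as noted above, is not
always true for very small values.

.. code-block:: python

    from datetime import datetime, timedelta


    def earliest_switch(ttl_lowered_at, old_ttl_seconds):
        """Earliest moment at which every cache has seen the lowered TTL.

        A cache may have fetched the record just before the TTL was
        lowered, so it can keep the old entry for the full old TTL.
        """
        return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


    def worst_case_downtime(new_ttl_seconds):
        """Longest time a visitor may still be sent to the old server."""
        return timedelta(seconds=new_ttl_seconds)


    lowered = datetime(2024, 1, 1, 9, 0)  # an arbitrary example moment
    print(earliest_switch(lowered, 28800))
    print(worst_case_downtime(300))

With a TTL of 28800 lowered at 09:00, the switch should not start before
17:00 on the same day, and after the switch a visitor can be stuck with
the old address for at most five minutes.
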
253 | You can usually leave the TTL field empty. In that case, a default
254 | TTL applies. The default TTL for the zone ("zone" is more or less the
255 | same as a domain) is normally configurable, but this may depend on the
256 | web interface of the registrar.
257 |
258 | CNAME records are a kind of alias. For example, one of the domains I'm
259 | managing is openmeteo.org, and its database is like this:
260 |
261 | ======= ===== ===== ====================================
262 | Name    Type  TTL   Value
263 | ======= ===== ===== ====================================
264 | @       A     300   83.212.168.232
265 | @       AAAA  300   2001:648:2ffc:1014:a800:ff:feb1:6047
266 | www     CNAME 300   ilissos.openmeteo.org.
267 | ilissos A     300   83.212.168.232
268 | ilissos AAAA  300   2001:648:2ffc:1014:a800:ff:feb1:6047
269 | ======= ===== ===== ====================================
270 |
271 | The machine that hosts the web service for openmeteo.org is called
272 | ilissos.openmeteo.org. When the name server is queried for
273 | www.openmeteo.org, it replies: "Hi, www.openmeteo.org is an alias; the
274 | canonical name is ilissos.openmeteo.org." So then it has to be queried
275 | again for ilissos.openmeteo.org. (However, you cannot use CNAME for the
276 | domain itself, only for other hosts within the domain.) On the right
277 | hand side of CNAMEs, you should always specify the fully qualified
278 | domain name **and end it with a dot**, such as "ilissos.openmeteo.org.",
279 | as in the example above.
280 |
281 | I used to use CNAMEs a lot, but now I avoid them, because they make
282 | first-time visits a little slower. Assume you want to visit
283 | "http://www.openmeteo.org/synoptic/irma". Then these things happen:
284 |
285 | 1. www.openmeteo.org is resolved, and it turns out to be an alias of
286 |    ilissos.openmeteo.org.
287 |
288 | 2. ilissos.openmeteo.org is resolved to an IP address.
289 |
290 | 3. The request http://www.openmeteo.org/synoptic/irma is sent to the IP
291 |    address. The web server redirects it to
292 |    http://openmeteo.org/synoptic/irma, without the www.
293 |
294 | 4. The request http://openmeteo.org/synoptic/irma is sent to the IP
295 |    address, and it is redirected to
296 |    http://openmeteo.org/synoptic/irma/, because I'm using
297 |    ``APPEND_SLASH = True`` in Django's settings.
298 |
299 | 5. The request http://openmeteo.org/synoptic/irma/ is sent to the IP
300 |    address, and this time a proper response is returned.
301 |
302 | All these steps take a small amount of time which may add up to one
303 | second or more. This is only for the first request of first time
304 | visitors, but today people have little patience, and it's a good idea
305 | for the visitor's browser to start drawing something on the screen
306 | within at most one second, otherwise you will be losing a non-negligible
307 | number of visitors. Besides, a high quality web site should not have
308 | unnecessary delays. So lately I've stopped using CNAMEs, and I've
309 | stopped redirecting between URLs with and without the leading www.
310 |
311 | Changing the domain's name servers
312 | ----------------------------------
313 |
314 | As I said, when you register the domain, the registrar configures its
315 | own name servers to act as the domain's name servers, and also tells
316 | the upstream name servers the IP addresses and/or names of the domain's
317 | name servers. While this is normally sufficient, there are cases when
318 | you will want to use other name servers instead of the registrar's name
319 | servers. For example, DigitalOcean offers name servers and a web
320 | interface to configure them, and if DigitalOcean's web interface is
321 | easier, or if it integrates well with droplets making configuration
322 | faster, you might want to use that. In such a case, you can go to the
323 | registrar's web interface and specify different name servers. The
324 | registrar will tell the upstream name servers which are your new name
325 | servers. It can't set up the new name servers themselves; you have to do
326 | that yourself (e.g. via DigitalOcean's web interface if you are
327 | using DigitalOcean's name servers).
328 |
329 | In this case, you must be aware that while, as we saw in the previous
330 | section, you can configure the TTL for the DNS records of your domain,
331 | **you cannot configure the TTL of the upstream name servers**. The
332 | upstream name servers, when queried about your domain, respond with
333 | something like "the name servers for the requested domain are such and
334 | such, and you can cache this information for 2 days". This TTL,
335 | typically 2 days, is not configurable by you, so you have to live with
336 | it. So changing name servers is a bit risky, because if you do anything
337 | wrong, different users will experience different downtimes that can last
338 | for up to 2 days.
339 |
340 | Finally, some information about the NS record, which means "name
341 | server". I haven't told you, but the DNS database (the zone file, as it
342 | is called) for djangodeployment.com also contains these records:
343 |
344 | ==== ==== ===== ===================
345 | Name Type TTL   Value
346 | ==== ==== ===== ===================
347 | @    NS   28800 nsa.bookmyname.com.
348 | @    NS   28800 nsb.bookmyname.com.
349 | @    NS   28800 nsc.bookmyname.com.
350 | ==== ==== ===== ===================
351 |
352 | (As you can see, there can be many records with the same type and name,
353 | and this is true of A and AAAA records as well—one name may map to many
354 | IP addresses, but we will not delve into that here.)
355 |
356 | I have never really understood the reason for the existence of these
357 | records **in the domain's zone file**. The upstream name servers
358 | obviously need to know that, but what's the use of querying a domain's
359 | name server about which are the domain's name servers? Obviously I
360 | already know them. However, `there is a reason`_, and these records
361 | need to be present both in the domain's name servers and upstream.
362 |
363 | .. _there is a reason: http://serverfault.com/questions/588244/what-is-the-role-of-ns-records-at-the-apex-of-a-dns-domain
364 |
365 | In any case, these NS records are virtually always configured
366 | automatically by the registrar or by the web interface of the name
367 | server provider, so usually you don't need to know more about it. What
368 | you need to know, however, is that DNS is a complicated system that
369 | easily fills several books by itself. It will work well if you are
370 | gentle with it. If you want to do something more advanced and you don't
371 | really know what you are doing, ask for help from an expert if you can't
372 | afford the downtime.
373 |
374 | .. _editing_the_hosts_file:
375 |
376 | Editing the hosts file
377 | ----------------------
378 |
379 | As I told you earlier, when your browser needs to know the IP address
380 | that corresponds to a name, it asks your operating system's resolver,
381 | and the resolver asks the name server. It is possible to bypass the
382 | asking of the name server and tell the resolver what answers to give.
383 | This is done by modifying the ``hosts`` file, which in Unixes is
384 | ``/etc/hosts``, and in Windows is
385 | ``C:\Windows\System32\drivers\etc\hosts``. Edit the file and add these
386 | lines at the end::
387 |
388 |     1.2.3.4 mysite.com
389 |     1.2.3.4 www.mysite.com
390 |
391 | Save the file, restart your browser (because, remember, it may be
392 | caching names), and then visit mysite.com. It will probably fail to
393 | connect (because 1.2.3.4 does not exist), but the thing is that
394 | mysite.com has resolved to 1.2.3.4. The resolver found it in the
395 | ``hosts`` file, so it did not ask the DNS server.
396 |
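The order in which the resolver consults its sources can be sketched in
Python. This is a simplified model, not the operating system's actual
resolver: ``parse_hosts`` and ``lookup`` are hypothetical names, and the
name server is represented by whatever function you pass in as
``ask_name_server``.

.. code-block:: python

    def parse_hosts(text):
        """Turn the contents of a hosts file into a name -> address table."""
        table = {}
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            if not line:
                continue
            address, *names = line.split()
            for name in names:
                # The first entry for a name wins in this sketch.
                table.setdefault(name.lower(), address)
        return table


    def lookup(name, hosts_table, ask_name_server):
        """Answer from the hosts table if possible, else ask the name server."""
        address = hosts_table.get(name.lower())
        if address is not None:
            return address
        return ask_name_server(name)


    hosts = parse_hosts("1.2.3.4 mysite.com\n1.2.3.4 www.mysite.com\n")
    print(lookup("mysite.com", hosts, ask_name_server=lambda name: None))

Because ``mysite.com`` is in the table, the stand-in name server is
never called, which is exactly the bypass described above.
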
397 | I often edit the ``hosts`` file, for experimenting with a temporary
398 | server without needing to change the DNS. Sometimes I want to redirect a
399 | domain to another machine, for development or testing, and I want to do
400 | this only for myself, without affecting the users of the domain. In such
401 | cases the ``hosts`` file comes in handy, and the changes made work
402 | immediately, without needing to wait for DNS caches to expire.
403 |
404 | The only thing that you must take care of is to remember to revert the
405 | ``hosts`` file to its original contents; if you forget to do so, it
406 | might cause you great headaches later (imagine wondering why the web
407 | site you are deploying is different than what it should be, and
408 | discovering, after hours of searching, that it was because of a
409 | forgotten entry in ``hosts``). What I usually do is leave the editor
410 | open and not close it until after I have reverted the file. When I don't
411 | do that, at least I make certain that the domain I'm playing with
412 | is ``example.com`` or anyway something very unlikely to ever be actually
413 | used by me.
414 |
415 | Visiting your Django project through the domain
416 | -----------------------------------------------
417 |
418 | In the previous chapter you ran Django on a server and it was reachable
419 | through http://$SERVER_IPv4_ADDRESS/. Now you should have set up your
420 | DNS and have $DOMAIN point to $SERVER_IPv4_ADDRESS. In your Django
421 | settings, change ``ALLOWED_HOSTS`` to this::
422 |
423 |     ALLOWED_HOSTS = ['$DOMAIN', 'www.$DOMAIN']
424 |
425 | Then run the Django development server as in the previous chapter:
426 |
427 | .. code-block:: bash
428 |
429 |     ./manage.py runserver 0.0.0.0:80
430 |
431 | Now you should be able to reach your Django project via http://$DOMAIN/.
432 | So we fixed the first step; we managed to reach Django through a domain
433 | instead of an IP address. Next, we will run Django as an unprivileged
434 | user, and put its files in appropriate directories.
435 |
436 | Chapter summary
437 | ---------------
438 |
439 | * Register your domain at a registrar.
440 | * Use the registrar's web interface to specify A and AAAA records for
441 |   the domain and for www.
442 | * Be careful when you play with TTLs and when changing the domain's name
443 |   servers.
444 | * If you do anything advanced with the DNS and you don't really know
445 |   what you're doing and you can't afford the downtime, ask for expert
446 |   help.
447 | * Set ``ALLOWED_HOSTS = ['$DOMAIN', 'www.$DOMAIN']``.
448 | * Optionally use your local ``hosts`` file for experimentation.
449 |
--------------------------------------------------------------------------------
/03-users-and-directories.rst:
--------------------------------------------------------------------------------
1 | .. _users_and_directories:
2 |
3 | Users and directories
4 | =====================
5 |
6 | Right now your Django project is at ``/root``, or maybe at
7 | ``/home/joe``. The first thing we are going to fix is put your Django
8 | project in a proper place.
9 |
10 | I will be using ``$DJANGO_PROJECT`` as the name of your Django
11 | project.
12 |
13 | .. _creating_user:
14 |
15 | Creating a user and group
16 | -------------------------
17 |
18 | It's a good idea to not run Django as root. We will create a user
19 | specifically for that, and we will give the user the same name as the
20 | Django project, i.e. ``$DJANGO_PROJECT``. However, in principle it can
21 | be different, and I will be using ``$DJANGO_USER`` to denote the user
22 | name, so that you can distinguish when I'm talking about the user and
23 | when about the project.
24 |
25 | Execute this command:
26 |
27 | .. code-block:: bash
28 |
29 |     adduser --system --home=/var/opt/$DJANGO_PROJECT \
30 |         --no-create-home --disabled-password --group \
31 |         --shell=/bin/bash $DJANGO_USER
32 |
33 | Here is why we use these parameters:
34 |
35 | ``--system``
36 |     This tells ``adduser`` to create a system user, as opposed to
37 |     creating a normal user. System users are intended to run programs,
38 |     whereas normal users are people. Because of this parameter,
39 |     ``adduser`` will assign a user id less than 1000, which is only a
40 |     convention for knowing that this is a system user. Otherwise there
41 |     isn't much difference.
42 |
43 | ``--home=/var/opt/$DJANGO_PROJECT``
44 |     This specifies the home directory for the user. For system users, it
45 |     doesn't really matter which directory we will choose, but by
46 |     convention we choose the one which holds the program's data. We will
47 |     talk about the ``/var/opt/$DJANGO_PROJECT`` directory later.
48 |
49 | ``--no-create-home``
50 |     We tell ``adduser`` to not create the home directory. We could allow
51 |     it to create it, but we will create it ourselves later on, for
52 |     instructive purposes.
53 |
54 | ``--disabled-password``
55 |     The password will be, well, disabled. This means that you won't be
56 |     able to become this user by using a password. However, the root user
57 |     can always become another user (e.g. with ``su``) without using a
58 |     password, so we don't need one.
59 |
60 | ``--group``
61 |     This tells ``adduser`` to not only add a new user, but to also add a
62 |     new group, having the same name as the user, and make the new user a
63 |     member of the new group. We will see further below why this is
64 |     useful. I will be using ``$DJANGO_GROUP`` to denote the new group.
65 |     In principle it could be different than ``$DJANGO_USER`` (but then
66 |     the procedure of creating the user and the group would be slightly
67 |     different), but the most important thing is that I want it to be
68 |     perfectly clear when we are talking about the user and when we are
69 |     talking about the group.
70 |
71 | ``--shell=/bin/bash``
72 |     By default, ``adduser`` uses ``/bin/false`` as the shell for system
73 |     users, which practically means they are disabled; ``/bin/false``
74 |     can't run any commands. We want the user to have the most common
75 |     shell used in GNU/Linux systems, ``/bin/bash``.
76 |
77 | .. _the_program_files:
78 |
79 | The program files
80 | -----------------
81 |
82 | Your Django project should be structured either like this::
83 |
84 |     $DJANGO_PROJECT/
85 |     |-- manage.py
86 |     |-- requirements.txt
87 |     |-- your_django_app/
88 |     `-- $DJANGO_PROJECT/
89 |
90 | or like this::
91 |
92 |     $REPOSITORY_ROOT/
93 |     |-- requirements.txt
94 |     `-- $DJANGO_PROJECT/
95 |         |-- manage.py
96 |         |-- your_django_app/
97 |         `-- $DJANGO_PROJECT/
98 |
99 | I prefer the former, but some people prefer the extra repository root
100 | directory.
101 |
102 | We are going to place your project inside ``/opt``. This is a standard
103 | directory for program files that are not part of the operating system.
104 | (The ones that are installed by the operating system go to ``/usr``.)
105 | So, clone or otherwise copy your Django project in
106 | ``/opt/$DJANGO_PROJECT`` or in ``/opt/$REPOSITORY_ROOT``. Do
107 | this **as the root user**. Create the virtualenv for your project **as
108 | the root user** as well:
109 |
110 | .. code-block:: bash
111 |
112 |     virtualenv --system-site-packages --python=/usr/bin/python3 \
113 |         /opt/$DJANGO_PROJECT/venv
114 |     /opt/$DJANGO_PROJECT/venv/bin/pip install \
115 |         -r /opt/$DJANGO_PROJECT/requirements.txt
116 |
117 | While it might seem strange that we are creating these as the root user
118 | instead of as ``$DJANGO_USER``, it is standard practice
119 | for program files to belong to the root user. If you check, you will see
120 | that ``/bin/ls`` belongs to the root user, though you may be running it
121 | as joe. In fact, it would be an error for it to belong to joe, because
122 | then joe would be able to modify it. So for security purposes it's
123 | better for program files to belong to root.
124 |
125 | This poses a problem: when ``$DJANGO_USER`` attempts to execute your
126 | Django application, it will not have permission to write
127 | the compiled Python files in the ``/opt/$DJANGO_PROJECT`` directory,
128 | because this is owned by root. So we need to pre-compile
129 | these files as root:
130 |
131 | .. code-block:: bash
132 |
133 |     /opt/$DJANGO_PROJECT/venv/bin/python -m compileall \
134 |         -x /opt/$DJANGO_PROJECT/venv/ /opt/$DJANGO_PROJECT
135 |
136 | The option ``-x /opt/$DJANGO_PROJECT/venv/`` tells compileall to exclude
137 | directory ``/opt/$DJANGO_PROJECT/venv`` from compilation. This is
138 | because the virtualenv takes care of its own compilation and we should
139 | not interfere.
140 |
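If you prefer to do the same thing from a Python script, the standard
library's ``compileall.compile_dir`` accepts an ``rx`` regular
expression that plays the role of ``-x``. The following is a sketch
under that assumption; the ``precompile`` helper is a hypothetical name
of mine, not part of Django or of the command above.

.. code-block:: python

    import compileall
    import os
    import re


    def precompile(project_dir):
        """Compile all .py files under project_dir, skipping its venv."""
        # Paths that start with "<project_dir>/venv/" are excluded, just
        # as with the -x option on the command line.
        venv_prefix = os.path.join(str(project_dir), "venv", "")
        skip_venv = re.compile(re.escape(venv_prefix))
        # quiet=1 prints only errors; the result tells whether every
        # file compiled successfully.
        return bool(
            compileall.compile_dir(str(project_dir), rx=skip_venv, quiet=1)
        )

You would call it as root, for example
``precompile("/opt/$DJANGO_PROJECT")`` with the placeholder replaced by
the real directory name.
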
141 | The data directory
142 | ------------------
143 |
144 | As I already hinted, our data directory is going to be
145 | ``/var/opt/$DJANGO_PROJECT``. It is standard policy for programs
146 | installed in ``/opt`` to put their data in ``/var/opt``. Most notably,
147 | we will store media files in there (in a later chapter). We will also
148 | store the SQLite file there. Usually in production we use a
149 | different RDBMS, but we will deal with this in a later chapter as well.
150 | So, let's now prepare the data directory:
151 |
152 | .. code-block:: bash
153 |
154 | mkdir -p /var/opt/$DJANGO_PROJECT
155 | chown $DJANGO_USER /var/opt/$DJANGO_PROJECT
156 |
157 | Besides creating the directory, we also changed its owner to
158 | ``$DJANGO_USER``. This is necessary because Django will be needing to
159 | write data in that directory, and it will be running as that user, so it
160 | needs permission to do so.
161 |
162 | .. _the_log_directory:
163 |
164 | The log directory
165 | -----------------
166 |
167 | Later we will set up our Django project to write to log files in
168 | ``/var/log/$DJANGO_PROJECT``. Let's prepare the directory.
169 |
170 | .. code-block:: bash
171 |
172 | mkdir -p /var/log/$DJANGO_PROJECT
173 | chown $DJANGO_USER /var/log/$DJANGO_PROJECT
174 |
175 | The production settings
176 | -----------------------
177 |
178 | Debian puts configuration files in ``/etc``. More specifically, the
179 | configuration for programs that are installed in ``/opt`` is supposed to
180 | go to ``/etc/opt``, which is what we will do:
181 |
182 | .. code-block:: bash
183 |
184 | mkdir /etc/opt/$DJANGO_PROJECT
185 |
186 | For the time being this directory is going to have only ``settings.py``;
187 | later it will have a bit more. Your
188 | ``/etc/opt/$DJANGO_PROJECT/settings.py`` file should be like this:
189 |
190 | .. code-block:: Python
191 |
192 | from DJANGO_PROJECT.settings import *
193 |
194 | DEBUG = True
195 | ALLOWED_HOSTS = ['$DOMAIN', 'www.$DOMAIN']
196 | DATABASES = {
197 | 'default': {
198 | 'ENGINE': 'django.db.backends.sqlite3',
199 | 'NAME': '/var/opt/$DJANGO_PROJECT/$DJANGO_PROJECT.db',
200 | }
201 | }
202 |
203 | .. note::
204 |
205 | The above is not valid Python until you replace ``$DJANGO_PROJECT``
206 | with the name of your Django project and ``$DOMAIN`` with your
207 | domain. In all examples until now you might have been able to copy
208 | and paste the code from the book and use shell variables for
209 | ``$DJANGO_PROJECT``, ``$DJANGO_USER``, ``$DJANGO_GROUP``, and so on.
210 | This is, indeed, the reason I chose this notation. However, in some
211 | places, as in this Python code, you have to actually replace it
212 | yourself. (Occasionally I use DJANGO_PROJECT without the leading
213 | dollar sign, in order to get the syntax highlighter to work.)
214 |
215 | .. note::
216 |
217 | These settings might give you the error "The SECRET_KEY setting must
218 | not be empty", or "Unknown command: 'collectstatic'", or some other
219 | error that indicates a problem with the settings. If this happens,
220 | a likely explanation is that this line at the top of your production
221 | settings isn't working correctly::
222 |
223 | from DJANGO_PROJECT.settings import *
224 |
225 | It may be that, in your Django project, ``settings`` is a directory
226 | that has no ``__init__.py`` file or an empty ``__init__.py`` file.
227 | Maybe you have to change the line to this::
228 |
229 | from DJANGO_PROJECT.settings.base import *
230 |
231 | Check what your project's settings file actually is, and import from
232 | that one.
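
If you are not sure whether your project's ``settings`` is a single module or a directory, you can ask Python. The following is a minimal sketch; ``describe_settings`` and the ``demoproject`` layout are hypothetical names used only for the demonstration, which builds a throwaway project in a temporary directory.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def describe_settings(dotted_name):
    """Return 'missing', 'module' or 'package' for the given dotted name."""
    try:
        spec = importlib.util.find_spec(dotted_name)
    except ModuleNotFoundError:
        return "missing"
    if spec is None:
        return "missing"
    # Packages (directories) have search locations; plain .py modules don't.
    return "package" if spec.submodule_search_locations is not None else "module"


# Hypothetical project: settings is a directory with an empty __init__.py.
with tempfile.TemporaryDirectory() as tmp:
    settings_dir = Path(tmp) / "demoproject" / "settings"
    settings_dir.mkdir(parents=True)
    (Path(tmp) / "demoproject" / "__init__.py").write_text("")
    (settings_dir / "__init__.py").write_text("")
    (settings_dir / "base.py").write_text("DEBUG = False\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    layout = {
        name: describe_settings(name)
        for name in (
            "demoproject.settings",
            "demoproject.settings.base",
            "demoproject.settings.production",
        )
    }
    sys.path.remove(tmp)

print(layout)
```

If ``settings`` turns out to be a package, importing ``*`` from it only gives you whatever its ``__init__.py`` defines, which is why you may need to import from ``base`` instead.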
233 |
234 | Let's now **secure the production settings**. We don't want other users
235 | of the system to be able to read the file, because it contains sensitive
236 | information. Maybe not yet, but after a few chapters it is going to have
237 | the secret key, the password to the database, the password for the email
238 | server, etc. At this point, you are wondering: what other users? I am
239 | the only person using this server, and I have created no users. Indeed,
240 | now that it's so easy and cheap to get small servers and assign a single
241 | job to them, this detail is not as important as it used to be. However,
242 | it is still a good idea to harden things a little bit. Maybe a year
243 | later you will create a normal user account on that server as an
244 | unrelated convenience for a colleague.
245 |
246 | If your Django project has a vulnerability, an attacker might be able to
247 | give commands to the system as the user the project runs as (i.e.
248 | as ``$DJANGO_USER``). Likewise, in the future you might install some
249 | other web application, and that other web application might have a
250 | vulnerability and could be attacked, and the attacker might be able to
251 | give commands as the user running that application. In that case, if we
252 | have secured our ``settings.py``, the attacker won't be able to read it.
253 | Eventually servers get compromised, so we try to set up the system in
254 | such a way as to minimize the damage. We can minimize it if we contain
255 | it, and we can contain it if the compromising of one application
256 | does not result in the compromising of other applications. This is why
257 | we want to run each application as its own user and in its own group.
258 |
259 | Here is how to make the contents of ``/etc/opt/$DJANGO_PROJECT``
260 | unreadable by other users:
261 |
262 | .. code-block:: bash
263 |
264 | chgrp $DJANGO_GROUP /etc/opt/$DJANGO_PROJECT
265 | chmod u=rwx,g=rx,o= /etc/opt/$DJANGO_PROJECT
266 |
267 | What this does is make the directory unreadable by users other than
268 | ``root`` and ``$DJANGO_USER``. The directory is owned by ``root``, and
269 | the first command above changes the group of the directory to
270 | ``$DJANGO_GROUP``. The second command changes the permissions of the
271 | directory so that:
272 |
273 | **u=rwx**
274 | The owner has permission to read (rx) and write (w) the directory
275 | (the ``u`` in ``u=rwx`` stands for "user", but actually it means the
276 | "user who owns the directory"). The owner is ``root``. Reading a
277 | directory is denoted with ``rx`` rather than simply ``r``, where the
278 | ``x`` stands for "search"; but giving a directory only one of the
279 | ``r`` and ``x`` permissions is an edge case that I've seen only once
280 | in my life. For practical purposes, when you want a directory to be
281 | readable, you must specify both ``r`` and ``x``. (This applies only
282 | to directories; for files, the ``x`` is the permission to execute the
283 | file as a program.)
284 | **g=rx**
285 | The group has permission to read the directory. More precisely, users
286 | who belong in that group have permission to read the directory. The
287 | directory's group is ``$DJANGO_GROUP``. The only user in that group
288 | is ``$DJANGO_USER``, so this adjustment applies only to that user.
289 | **o=**
290 | Other users have no permissions; they can't read or write the
291 | directory.
292 |
293 | You might have expected that it would have been easier to tell the
294 | system "I want ``root`` to be able to read and write, and
295 | ``$DJANGO_USER`` to be able to only read". Instead, we did something
296 | much more complicated: we made ``$DJANGO_USER`` belong to a
297 | ``$DJANGO_GROUP``, and we made the directory readable by that group,
298 | thus indirectly readable by the user. The reason we did it this way is
299 | an accident of history. In Unix there has traditionally been no way to
300 | say "I want ``root`` to be able to read and write, and ``$DJANGO_USER``
301 | to be able to only read". In many modern Unixes, including Linux, it is
302 | possible using Access Control Lists, but this is a feature added later,
303 | it does not work the same in all Unixes, and its syntax is harder to
304 | use. The way we use here works the same in FreeBSD, HP-UX, and all other
305 | Unixes, and it is common practice everywhere.
306 |
307 | Finally, we need to **compile** the settings file. Your settings file
308 | and the ``/etc/opt/$DJANGO_PROJECT`` directory are owned by root, and, as
309 | with the files in ``/opt``, Django won't be able to write the
310 | compiled version, so we pre-compile it as root:
311 |
312 | .. code-block:: bash
313 |
314 | /opt/$DJANGO_PROJECT/venv/bin/python -m compileall \
315 | /etc/opt/$DJANGO_PROJECT
316 |
317 | Compiled files are the reason we changed the permissions of the
318 | directory and not the permissions of ``settings.py``. When Python writes
319 | the compiled files (which also contain the sensitive information), it
320 | does not give them the permissions we want, which means we'd need to be
321 | chgrping and chmoding each time we compile. By removing read permissions
322 | from the directory, we make sure that none of the files in the directory
323 | is readable; in Unix, in order to read file
324 | ``/etc/opt/$DJANGO_PROJECT/settings.py``, you must have permission to
325 | read ``/`` (the root directory), ``/etc``, ``/etc/opt``,
326 | ``/etc/opt/$DJANGO_PROJECT``, and
327 | ``/etc/opt/$DJANGO_PROJECT/settings.py``.
328 |
329 | You can check the permissions of a directory with the ``-d`` option of
330 | ``ls``, like this:
331 |
332 | .. code-block:: bash
333 |
334 | ls -lhd /
335 | ls -lhd /etc
336 | ls -lhd /etc/opt
337 | ls -lhd /etc/opt/$DJANGO_PROJECT
338 |
339 | (In the above commands, if you don't use the ``-d`` option it will show
340 | the contents of the directory instead of the directory itself.)
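
The same check can be sketched in Python, assuming you would rather see the whole chain of directories at once. The function name ``modes_along_path`` is my own; the demonstration uses the system's temporary directory simply because it exists everywhere, and on the server you would pass ``/etc/opt/$DJANGO_PROJECT`` instead.

```python
import os
import stat
import tempfile
from pathlib import Path


def modes_along_path(path):
    """Return (path, mode string) for / and every directory down to path."""
    target = Path(path)
    chain = list(reversed(target.parents)) + [target]
    return [(str(p), stat.filemode(os.stat(p).st_mode)) for p in chain]


# Print one line per directory, similar to running "ls -ld" on each of them.
for name, mode in modes_along_path(tempfile.gettempdir()):
    print(mode, name)
```

Each mode string has the same form as the first column of ``ls -l``.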
341 |
342 | .. hint:: Unix permissions
343 |
344 | When you list a file or directory with the ``-l`` option of ``ls``,
345 | it will show you something like ``-rwxr-xr-x`` at the beginning of
346 | the line. The first character is the file type: ``-`` for a file and
347 | ``d`` for a directory (there are also some more types, but we won't
348 | bother with them). The next nine characters are the permissions:
349 | three for the user, three for the group, three for others.
350 | ``rwxr-xr-x`` means "the user has permission to read, write and
351 | search/execute, the group has permission to read and search/execute
352 | but not write, and so do others".
353 |
354 | ``rwxr-xr-x`` can also be denoted as 755. If you substitute 0 in
355 | place of a hyphen and 1 in place of r, w and x, you get 111 101 101.
356 | In octal, this is 755. Likewise, ``rwxr-x---`` becomes 750. Instead of
357 |
358 | .. code-block:: bash
359 |
360 | chmod u=rwx,g=rx,o= /etc/opt/$DJANGO_PROJECT
361 |
362 | you can type
363 |
364 | .. code-block:: bash
365 |
366 | chmod 750 /etc/opt/$DJANGO_PROJECT
367 |
368 | which means exactly the same thing. People use this latter version
369 | much more than the other one, because it is so much easier to type,
370 | and because converting permissions into octal becomes second nature
371 | with a little practice.
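
Until the conversion becomes second nature, a few lines of Python can do it for you. This is just a sketch of the substitution described above; ``symbolic_to_octal`` is a name I made up for the illustration.

```python
import stat


def symbolic_to_octal(perms):
    """Convert nine characters such as 'rwxr-x---' to octal digits like '750'."""
    if len(perms) != 9:
        raise ValueError("expected nine permission characters")
    # A hyphen becomes 0, anything else (r, w or x) becomes 1.
    bits = "".join("0" if ch == "-" else "1" for ch in perms)
    return format(int(bits, 2), "03o")


print(symbolic_to_octal("rwxr-xr-x"))
print(symbolic_to_octal("rwxr-x---"))
# The standard library can go the other way, from a numeric mode to text.
print(stat.filemode(stat.S_IFDIR | 0o750))
```

The last line uses ``stat.filemode``, which also includes the leading file type character.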
372 |
373 | Managing production vs. development settings
374 | --------------------------------------------
375 |
376 | How to manage production vs. development settings seems to be an eternal
377 | question. Many people recommend, instead of a single ``settings.py``
378 | file, a ``settings`` directory containing ``__init__.py`` and
379 | ``base.py``. ``base.py`` is the base settings, those that are the same
380 | whether in production or development or testing. The directory often
381 | contains ``local.py`` (alternatively named ``dev.py``), with common
382 | development settings, which might or might not be in the repository.
383 | There's often also ``test.py``, settings that are used when testing.
384 | Both ``local.py`` and ``test.py`` start with this line::
385 |
386 | from .base import *
387 |
388 | Then they go on to override the base settings or add more settings.
389 | When the project is set up like this, ``manage.py`` is usually modified
390 | so that, by default, it uses ``$DJANGO_PROJECT.settings.local`` instead
391 | of simply ``$DJANGO_PROJECT.settings``. For more information on this
392 | technique, see Section 5.2, "Using Multiple Settings Files", in the book
393 | Two Scoops of Django; there's also a `stackoverflow answer`_ about it.
394 |
395 | .. _stackoverflow answer: http://stackoverflow.com/questions/1626326/how-to-manage-local-vs-production-settings-in-django/15325966#15325966
396 |
397 | Now, people who use this scheme sometimes also have ``production.py`` in
398 | the settings directory of the repository. Call me a perfectionist (with
399 | deadlines), but the production settings are the administrator's job, not
400 | the developer's, and your Django project's repository is made by the
401 | developers. You might claim that you are both the developer and the
402 | administrator, since it's you who are developing the project and
403 | maintaining the deployment, but in this case you are assuming two roles,
404 | wearing a different hat each time. Production settings don't belong in
405 | the project repository any more than the nginx or PostgreSQL
406 | configuration does.
407 |
408 | The proper place to store such settings is another repository—the
409 | deployment repository. It can be as simple as holding only the
410 | production ``settings.py`` (along with ``README`` and ``.gitignore``),
411 | or as complicated as containing all your nginx, PostgreSQL, etc.,
412 | configuration for several servers, along with the "recipe" for how to
413 | set them up, written with a configuration management system such as
414 | Ansible.
415 |
416 | If you choose, however, to keep your production settings in your Django
417 | project repository, then your ``/etc/opt/$DJANGO_PROJECT/settings.py``
418 | file shall eventually be a single line::
419 |
420 | from $DJANGO_PROJECT.settings.production import *
421 |
422 | However, I don't want you to do this now. We aren't yet going to use our
423 | real production settings, because we are going step by step. Instead,
424 | create the ``/etc/opt/$DJANGO_PROJECT/settings.py`` file as I explained
425 | in the previous section.
426 |
427 | Running the Django server
428 | -------------------------
429 |
430 | .. warning::
431 |
432 | We are running Django with ``runserver`` here, which is inappropriate
433 | for production. We are doing it only temporarily, so that you
434 | understand several concepts. We will run Django correctly in the
435 | chapter about :ref:`gunicorn`.
436 |
437 | .. code-block:: bash
438 |
439 | su $DJANGO_USER
440 | source /opt/$DJANGO_PROJECT/venv/bin/activate
441 | export PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT
442 | export DJANGO_SETTINGS_MODULE=settings
443 | python /opt/$DJANGO_PROJECT/manage.py migrate
444 | python /opt/$DJANGO_PROJECT/manage.py runserver 0.0.0.0:8000
445 |
446 | You could also do that in an exceptionally long command (provided you
447 | have already done the ``migrate`` part), like this:
448 |
449 | .. code-block:: bash
450 |
451 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
452 | DJANGO_SETTINGS_MODULE=settings \
453 | su $DJANGO_USER -c \
454 | "/opt/$DJANGO_PROJECT/venv/bin/python \
455 | /opt/$DJANGO_PROJECT/manage.py runserver 0.0.0.0:8000"
456 |
457 | .. hint:: su
458 |
459 | You have probably heard of ``sudo``, which is a very useful program
460 | on Unix client machines (desktops and laptops). On the server,
461 | ``sudo`` is less common and we use ``su`` instead.
462 |
463 | ``su``, like ``sudo``, changes the user that executes a program. If
464 | you are user joe and you execute ``su -c ls``, then ``ls`` is run as
465 | root. ``su`` will ask for the root password in order to proceed.
466 |
467 | ``su alice -c ls`` means "execute ``ls`` as user alice". ``su alice``
468 | means "start a shell as user alice"; you can then type commands as
469 | user alice, and you can enter ``exit`` to "get out" of ``su``, that
470 | is, to exit the shell that runs as alice. If you are a normal user,
471 | ``su`` will ask you for alice's password. If you are root, it will
472 | become alice without questions. This should make clear how the ``su``
473 | command works when you run the Django server as explained above.
474 |
475 | ``sudo`` works very differently from ``su``. Instead of asking the
476 | password of the user you want to become, it asks for your password,
477 | and has a configuration file that describes which user is allowed to
478 | become what user and with what constraints. It is much more
479 | versatile. ``su`` does only what I described and nothing more. ``su``
480 | is guaranteed to exist in all Unix systems, whereas ``sudo`` is an
481 | add-on that must be installed. By default it is usually installed on
482 | client machines, but not on servers. ``su`` is much more commonly
483 | used on servers and shell scripts than ``sudo``.
484 |
485 | Do you understand that very clearly? If not, here are some tips:
486 |
487 | * Make sure you have a grip on virtualenv_ and `environment
488 | variables`_.
489 | * Python reads the ``PYTHONPATH`` environment variable and adds
490 | the specified directories to the Python path.
491 | * Django reads the ``DJANGO_SETTINGS_MODULE`` environment variable.
492 | Because we have set it to "settings", Django will attempt to import
493 | ``settings`` instead of the default (the default is
494 | ``$DJANGO_PROJECT.settings``, or maybe
495 | ``$DJANGO_PROJECT.settings.local``).
496 | * When Django attempts to import ``settings``, Python looks in its
497 | path. Because ``/etc/opt/$DJANGO_PROJECT`` is listed first in
498 | ``PYTHONPATH``, Python will first look there for ``settings.py``, and
499 | it will find it there.
500 | * Likewise, when at some point Django attempts to import
501 | ``your_django_app``, Python will look in
502 | ``/etc/opt/$DJANGO_PROJECT``; it won't find it there, so then it will
503 | look in ``/opt/$DJANGO_PROJECT``, since this is next in
504 | ``PYTHONPATH``, and it will find it there.
505 | * If, before running ``manage.py [whatever]``, we had changed directory
506 | to ``/opt/$DJANGO_PROJECT``, we wouldn't need to specify
507 | that directory in ``PYTHONPATH``, because Python always adds the
508 | current directory to its path. This is why, in development, you just
509 | tell it ``python manage.py [whatever]`` and it finds your project.
510 | We prefer, however, to set the ``PYTHONPATH`` and not change
511 | directory; this way our setup will be clearer and more robust.
512 |
513 | .. _virtualenv: http://djangodeployment.com/2016/11/01/virtualenv-demystified/
514 | .. _environment variables: http://djangodeployment.com/2016/11/07/what-is-the-difference-between-a-shell-variable-and-an-environment-variable/
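
The search order described in the tips above can be tried out in isolation. The sketch below mimics ``PYTHONPATH`` by putting two temporary directories at the front of ``sys.path``; the module names ``demo_settings`` and ``demo_app`` are hypothetical stand-ins for ``settings`` and ``your_django_app``.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as etc_dir, tempfile.TemporaryDirectory() as opt_dir:
    # Both directories offer demo_settings; only the second offers demo_app.
    (Path(etc_dir) / "demo_settings.py").write_text("WHERE = 'etc'\n")
    (Path(opt_dir) / "demo_settings.py").write_text("WHERE = 'opt'\n")
    (Path(opt_dir) / "demo_app.py").write_text("WHERE = 'opt'\n")

    # Similar in effect to PYTHONPATH=etc_dir:opt_dir; earlier entries win.
    sys.path[:0] = [etc_dir, opt_dir]
    importlib.invalidate_caches()
    settings_from = importlib.import_module("demo_settings").WHERE
    app_from = importlib.import_module("demo_app").WHERE
    del sys.path[:2]

print(settings_from, app_from)
```

The settings module is taken from the first directory even though the second one also has it, whereas the app is found only after Python has looked in the first directory and moved on.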
515 |
516 | Instead of using ``DJANGO_SETTINGS_MODULE``, you can also use the
517 | ``--settings`` parameter of ``manage.py``:
518 |
519 | .. code-block:: bash
520 |
521 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
522 | su $DJANGO_USER -c \
523 | "/opt/$DJANGO_PROJECT/venv/bin/python \
524 | /opt/$DJANGO_PROJECT/manage.py \
525 | runserver --settings=settings 0.0.0.0:8000"
526 |
527 | (``manage.py`` also supports a ``--pythonpath`` parameter which could be
528 | used instead of ``PYTHONPATH``, however it seems that ``--settings``
529 | doesn't work correctly together with ``--pythonpath``, at least not in
530 | Django 1.8.)
531 |
532 | If you fire up your browser and visit http://$DOMAIN:8000/, you should
533 | see your Django project in action.
534 |
535 | Chapter summary
536 | ---------------
537 |
538 | * Create a system user and group with the same name as your Django
539 | project.
540 | * Put your Django project in ``/opt``, with all files owned by root.
541 | * Put your virtualenv in ``/opt/$DJANGO_PROJECT/venv``, with all files
542 | owned by root.
543 | * Put your data files in a subdirectory of ``/var/opt`` with the same
544 | name as your Django project, owned by the system user you created. If
545 | you are using SQLite, the database file will go in there.
546 | * Put your settings file in a subdirectory of ``/etc/opt`` with the
547 | same name as your Django project, whose user is root, whose group is
548 | the system group you created, that is readable by the group and
549 | writeable by root, and whose contents belong to root.
550 | * Precompile the files in ``/opt/$DJANGO_PROJECT`` and
551 | ``/etc/opt/$DJANGO_PROJECT``.
552 | * Run ``manage.py`` as the system user you created, after setting the
553 | environment variables
554 | ``PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT`` and
555 | ``DJANGO_SETTINGS_MODULE=settings``.
556 |
--------------------------------------------------------------------------------
/04-web-server.rst:
--------------------------------------------------------------------------------
1 | The web server
2 | ==============
3 |
4 | This chapter is divided in two parts: nginx and Apache. Depending on
5 | which of the two you choose, you only need to read that part.
6 |
7 | Both nginx and Apache are excellent choices for a web server. Most
8 | people deploying Django nowadays seem to be using nginx, so, if you
9 | aren't interested in learning more about what you should choose, pick
10 | nginx. Apache is also widely used, and it is preferable in some cases.
11 | If you have any reason to prefer it, go ahead and use it.
12 |
13 | If you don't know what to do, choose nginx. If you want to know
14 | more about the pros and cons of each one, I have written `a blog post
15 | about it`_.
16 |
17 | .. _a blog post about it: http://djangodeployment.com/2016/11/15/why-nginx-is-faster-than-apache-and-why-you-neednt-necessarily-care/
18 |
19 | Installing nginx
20 | ----------------
21 |
22 | Install nginx like this::
23 |
24 | apt install nginx-light
25 |
26 | .. note::
27 |
28 | Instead of ``nginx-light``, you can use packages ``nginx-full`` or
29 | ``nginx-extras``, which have more modules available. However,
30 | ``nginx-light`` is enough in most cases.
31 |
32 | After you install, go to your web browser and visit http://$DOMAIN/. You
33 | should see nginx's welcome page.
34 |
35 | Configuring nginx to serve the domain
36 | -------------------------------------
37 |
38 | Create file ``/etc/nginx/sites-available/$DOMAIN`` with the
39 | following contents:
40 |
41 | .. code-block:: nginx
42 |
43 | server {
44 | listen 80;
45 | listen [::]:80;
46 | server_name $DOMAIN www.$DOMAIN;
47 | root /var/www/$DOMAIN;
48 | }
49 |
50 | .. note::
51 |
52 | Again, this is not a valid nginx configuration file until you replace
53 | ``$DOMAIN`` with your actual domain name.
54 |
55 | Create a symbolic link in ``sites-enabled``:
56 |
57 | .. code-block:: bash
58 |
59 | cd /etc/nginx/sites-enabled
60 | ln -s ../sites-available/$DOMAIN .
61 |
62 | .. _symboliclinks:
63 |
64 | .. hint:: Symbolic links
65 |
66 | A symbolic link looks like a file, but in fact it is a pointer to
67 | another file. The command
68 |
69 | .. code-block:: bash
70 |
71 | ln -s ../sites-available/$DOMAIN .
72 |
73 | means "create a symbolic link that points to file
74 | ``../sites-available/$DOMAIN`` and put the link in the current
75 | directory (``.``)". Two dots denote the parent directory, so when the
76 | current directory is ``/etc/nginx/sites-enabled``, ``..`` means the
77 | parent, ``/etc/nginx``, whereas ``../sites-available`` means "one up,
78 | then down into ``sites-available``". A single dot designates the
79 | current directory.
80 |
81 | The command above is exactly equivalent to this:
82 |
83 | .. code-block:: bash
84 |
85 | ln -s ../sites-available/$DOMAIN $DOMAIN
86 |
87 | which means "create a symbolic link that points to file
88 | ``../sites-available/$DOMAIN`` and give it the name ``$DOMAIN``". If the
89 | last argument of ``ln -s`` is a directory (for example, ``.``), then
90 | it creates the symbolic link in there and gives it the same name as
91 | the actual file.
92 |
93 | You can treat the symbolic link as if it was a file; you can edit it
94 | with an editor, you can open it with a Python program using
95 | ``open()``, and in these cases the actual file (the one being pointed
96 | to by the symbolic link) is opened instead.
97 |
98 | While the order of arguments in the ``ln`` command may seem strange
99 | at first, it is consistent with the order of arguments in the ``cp``
100 | command which merely copies files. Just as ``cp source destination``
101 | copies file ``source`` to file ``destination``, similarly ``ln -s``
102 | is like making a copy of the file, but instead of an actual copy, it
103 | creates a symbolic link.
104 |
105 | If you list files with ``ls -l``, it is clearly indicated
106 | which file the symbolic link points to. The permissions of the link,
107 | ``rwxrwxrwx``, may seem insecure, but they are actually irrelevant;
108 | it is the permissions of the actual file that count.
109 |
110 | Besides symbolic links there are also hard links, which are
111 | created without the ``-s`` option, but are different and rarely used.
112 | It is unlikely that you will ever create a hard link, so get used to
113 | always typing ``ln -s``, that is, with the ``-s`` option.
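
If you want to experiment with symbolic links without touching ``/etc/nginx``, here is a small Python sketch that recreates the two directories inside a temporary directory. The file name ``example.com`` is just a placeholder for ``$DOMAIN``.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    available = Path(tmp) / "sites-available"
    enabled = Path(tmp) / "sites-enabled"
    available.mkdir()
    enabled.mkdir()
    (available / "example.com").write_text("server { }\n")

    link = enabled / "example.com"
    # Like "ln -s ../sites-available/example.com ." run inside sites-enabled.
    os.symlink("../sites-available/example.com", link)

    target = os.readlink(link)      # where the link points
    is_link = link.is_symlink()
    contents = link.read_text()     # reads the real file through the link

print(target, is_link, contents.strip())
```

Note that the relative target is interpreted relative to the directory containing the link, not relative to the directory you happen to be in.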
114 |
115 | Tell nginx to re-read its configuration:
116 |
117 | .. code-block:: bash
118 |
119 | service nginx reload
120 |
121 | Finally, create directory ``/var/www/$DOMAIN``, and inside that
122 | directory create a file ``index.html`` with the following contents:
123 |
124 | .. code-block:: html
125 |
126 | <p>This is the web site for $DOMAIN.</p>
127 |
128 | Fire up your browser and visit http://$DOMAIN/, and you should
129 | see the page you created.
130 |
131 | The fact that we named the nginx configuration file (in
132 | ``/etc/nginx/sites-available``) ``$DOMAIN`` is irrelevant; any name
133 | would have worked the same, but it's a convention to name it with the
134 | domain name. In fact, strictly speaking, we needn't even have created a
135 | separate file. The only configuration file nginx needs is
136 | ``/etc/nginx/nginx.conf``. If you open that file, you will see that it
137 | contains, among others, the following line::
138 |
139 | include /etc/nginx/sites-enabled/*;
140 |
141 | So what it does is read all files in that directory and process them as
142 | if their contents had been inserted in that point of
143 | ``/etc/nginx/nginx.conf``.
144 |
145 | As we noticed, if you visit http://$DOMAIN/, you see the page you
146 | created. If, however, you visit http://$SERVER_IPv4_ADDRESS/, you should
147 | see nginx's welcome page. If the host name (the part between "http://"
148 | and the next slash) is $DOMAIN or www.$DOMAIN then nginx uses the
149 | configuration we specified above, because of the ``server_name``
150 | configuration directive which contains these two names. If we use
151 | another domain name, or the server's IP address, there is no matching
152 | ``server { ... }`` block in the nginx configuration, so nginx uses its
153 | default configuration. That default configuration is in
154 | ``/etc/nginx/sites-enabled/default``. What makes it the default is the
155 | ``default_server`` parameter in these two lines:
156 |
157 | .. code-block:: nginx
158 |
159 | listen 80 default_server;
160 | listen [::]:80 default_server;
161 |
162 | If someone arrives at my server through the wrong domain name, I don't
163 | want them to see a page that says "Welcome to nginx", so I change the
164 | default configuration to the following, which merely responds with "Not
165 | found":
166 |
167 | .. code-block:: nginx
168 |
169 | server {
170 | listen 80 default_server;
171 | listen [::]:80 default_server;
172 | return 404;
173 | }
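
To make the selection logic concrete, here is a deliberately simplified Python sketch of it. It only handles exact names; real nginx also supports wildcards and regular expressions in ``server_name``, and the function ``pick_server`` and its data structure are my own invention for the illustration.

```python
def pick_server(host_header, servers, default):
    """Return the name of the block whose server_name list contains the host."""
    host = host_header.split(":")[0].lower()  # drop an optional :port
    for block_name, names in servers.items():
        if host in names:
            return block_name
    # No server_name matched, so the default_server block handles the request.
    return default


# One configured site, standing in for the $DOMAIN file created earlier.
servers = {"example.com": ["example.com", "www.example.com"]}

print(pick_server("www.example.com", servers, "default"))
print(pick_server("203.0.113.5", servers, "default"))
```

The second call stands for a visitor who used the server's IP address; nothing matches, so the default block answers.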
174 |
175 | Configuring nginx for Django
176 | ----------------------------
177 |
178 | Change ``/etc/nginx/sites-available/$DOMAIN`` to the following
179 | (which only differs from the one we just created in that it has the
180 | ``location`` block):
181 |
182 | .. code-block:: nginx
183 |
184 | server {
185 | listen 80;
186 | listen [::]:80;
187 | server_name $DOMAIN www.$DOMAIN;
188 | root /var/www/$DOMAIN;
189 | location / {
190 | proxy_pass http://localhost:8000;
191 | }
192 | }
193 |
194 | Tell nginx to reload its configuration::
195 |
196 | service nginx reload
197 |
198 | Finally, start your Django server as we saw in the previous chapter;
199 | however, it doesn't need to listen on 0.0.0.0:8000, a mere 8000 is
200 | enough:
201 |
202 | .. code-block:: bash
203 |
204 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
205 | su $DJANGO_USER -c \
206 | "/opt/$DJANGO_PROJECT/venv/bin/python \
207 | /opt/$DJANGO_PROJECT/manage.py \
208 | runserver --settings=settings 8000"
209 |
210 | Now go to http://$DOMAIN/ and you should see your Django
211 | project in action.
212 |
213 | .. warning::
214 |
215 | We are running Django with ``runserver`` here, which is inappropriate
216 | for production. We are doing it only temporarily, so that you
217 | understand the concepts. We will run Django correctly in the chapter
218 | about :ref:`gunicorn`.
219 |
220 | Nginx receives your HTTP request. Because of the ``proxy_pass``
221 | directive, it decides to just pass on this request to another server,
222 | which in our case is localhost:8000.
223 |
224 | This works for now, but we will add some more configuration that
225 | will be necessary later. The ``location`` block actually becomes:
226 |
227 | .. code-block:: nginx
228 |
229 | location / {
230 | proxy_pass http://localhost:8000;
231 | proxy_set_header Host $http_host;
232 | proxy_redirect off;
233 | proxy_set_header X-Forwarded-For $remote_addr;
234 | proxy_set_header X-Forwarded-Proto $scheme;
235 | client_max_body_size 20m;
236 | }
237 |
238 | Here is what these configuration directives do:
239 |
240 | **proxy_set_header Host $http_host**
241 | By default, the header of the request nginx makes to the backend
242 | includes ``Host: localhost:8000``. We need to pass the real ``Host`` to
243 | Django (i.e. the one received by nginx), otherwise Django cannot
244 | check if it's in ``ALLOWED_HOSTS``.
245 | **proxy_redirect off**
246 | This tells nginx that, if the backend returns an HTTP redirect, it
247 | should leave it as is. (By default, nginx assumes the backend is
248 | stupid and tries to be smart; if the backend returns an HTTP redirect
249 | that says "redirect to http://localhost:8000/somewhere", nginx
250 | replaces it with something similar to
251 | "http://yourowndomain.com/somewhere". We prefer to configure Django
252 | properly instead.)
253 | **proxy_set_header X-Forwarded-For $remote_addr**
254 | To Django, the request is coming from nginx, and therefore the
255 | network connection appears to be from localhost, i.e. from address
256 | 127.0.0.1 (or ::1 in IPv6). Some Django apps need to know the actual
257 | IP address of the machine that runs the web browser; they might need
258 | that for access control, or to use the GeoIP database to deliver
259 | different content to different geographical areas. So we have nginx
260 | pass the actual IP address of the visitor in the ``X-Forwarded-For``
261 | header. Your Django project might not make use of this information,
262 | but it might do so in the future, and it's better to set the correct
263 | nginx configuration from now. When the time comes to use this
264 | information, you will need to configure your Django app properly; one
265 | way is to use django-ipware_.
266 |
267 | .. _django-ipware: https://github.com/un33k/django-ipware
268 |
269 | **proxy_set_header X-Forwarded-Proto $scheme**
270 | Another thing that Django does not know is whether the request has
271 | been made through HTTPS or plain HTTP; nginx knows that, but the
272 | request it subsequently makes to the Django backend is always plain
273 | HTTP. We tell nginx to pass this information with the
274 | ``X-Forwarded-Proto`` HTTP header, so that related Django
275 | functionality such as ``request.is_secure()`` works properly. You
276 | will also need to set ``SECURE_PROXY_SSL_HEADER =
277 | ('HTTP_X_FORWARDED_PROTO', 'https')`` in your ``settings.py``.
278 | **client_max_body_size 20m**
279 | This tells nginx to accept HTTP POST requests of up to 20 MB in
280 | length; if a request is larger nginx rejects it and returns a 413.
281 | Whether you really need that setting or not depends on whether you
282 | accept file uploads. If not, nginx's default, 1 MB, is probably
283 | enough, and it is better for protection against a denial-of-service
284 | attack that could attempt to make several large POST requests
285 | simultaneously.
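
To give you an idea of what "configuring your Django app properly" for
``X-Forwarded-For`` involves, here is a minimal sketch of a helper that
reads the visitor's address from a ``request.META``-style dictionary.
The name ``client_ip`` is made up for illustration; it is not part of
Django or django-ipware, which handles many more cases. The sketch
assumes that Django can only be reached through the nginx configuration
above, because otherwise a visitor could forge the header.

.. code-block:: python

    def client_ip(meta):
        """Return the visitor's address from a request.META-style dict."""
        # Django exposes the X-Forwarded-For header under this key.
        forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
        # With the nginx configuration above the header holds one address;
        # taking the first comma-separated entry also copes with a chain.
        first = forwarded.split(",")[0].strip()
        # Without the header, fall back to the address of the connection,
        # which is nginx itself (127.0.0.1 or ::1).
        return first or meta.get("REMOTE_ADDR", "")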
286 |
287 | This concludes the part of the chapter about nginx. If you chose nginx
288 | as your web server, you probably want to skip the next sections and go
289 | to the Chapter summary.
290 |
291 | Installing Apache
292 | -----------------
293 |
294 | Install Apache like this::
295 |
296 | apt install apache2
297 |
298 | After you install, go to your web browser and visit
299 | http://$DOMAIN/. You should see Apache's welcome page.
300 |
301 | Configuring Apache to serve the domain
302 | --------------------------------------
303 |
304 | Create file ``/etc/apache2/sites-available/$DOMAIN.conf`` with
305 | the following contents:
306 |
307 | .. code-block:: apache
308 |
309 |     <VirtualHost *:80>
310 |         ServerName $DOMAIN
311 |         ServerAlias www.$DOMAIN
312 |         DocumentRoot /var/www/$DOMAIN
313 |     </VirtualHost>
314 |
315 | .. note::
316 |
317 | Again, this is not a valid Apache configuration file until you replace
318 | ``$DOMAIN`` with your actual domain name, such as "example.com".
319 |
320 | Create a symbolic link in ``sites-enabled``:
321 |
322 | .. code-block:: bash
323 |
324 | cd /etc/apache2/sites-enabled
325 | ln -s ../sites-available/$DOMAIN.conf .
326 |
327 | .. hint:: Symbolic links
328 |
329 | If you don't know what symbolic links are, I have described them in
330 | :ref:`the equivalent section for nginx`.
331 |
332 | .. hint:: Use a2ensite
333 |
334 | Debian-based systems have two convenient scripts, ``a2ensite``,
335 | meaning "Apache 2 enable site", and its counterpart, ``a2dissite``,
336 | for disabling a site. The first one merely creates the symbolic link
337 | as above, the second one removes it. So the manual creation of the
338 | symbolic link above is purely educational, and it's usually better to
339 | save some typing by just entering this instead:
340 |
341 | .. code-block:: bash
342 |
343 | a2ensite $DOMAIN
344 |
345 | Tell Apache to re-read its configuration:
346 |
347 | .. code-block:: bash
348 |
349 | service apache2 reload
350 |
351 | Finally, create directory ``/var/www/$DOMAIN``, and inside
352 | that directory create a file ``index.html`` with the following
353 | contents:
354 |
355 | .. code-block:: html
356 |
357 |     <p>This is the web site for $DOMAIN.</p>
358 |
359 | Fire up your browser and visit http://$DOMAIN/, and you should
360 | see the page you created.
361 |
362 | The fact that we named the Apache configuration file (in
363 | ``/etc/apache2/sites-available``) ``$DOMAIN.conf`` is irrelevant; any
364 | name ending in ``.conf`` would have worked, but it's a convention to
365 | use the domain name. In fact, strictly speaking, we needn't even have
366 | created a separate file. The only configuration file Apache needs is
367 | ``/etc/apache2/apache2.conf``. If you open that file, you will see that
368 | it contains, among others, the following line::
369 |
370 | IncludeOptional sites-enabled/*.conf
371 |
372 | So what it does is read all ``.conf`` files in that directory and
373 | process them as if their contents had been inserted at that point of
374 | ``/etc/apache2/apache2.conf``.
375 |
376 | As we saw, if you visit http://$DOMAIN/, you see the page
377 | you created. If, however, you visit http://$SERVER_IP_ADDRESS/, you
378 | should see Apache's welcome page. If the host name (the part between
379 | "http://" and the next slash) is $DOMAIN or
380 | www.$DOMAIN, then Apache uses the configuration we specified
381 | above, because of the ``ServerName`` and ``ServerAlias`` configuration
382 | directives which contain these two names. If we use another
383 | domain name, or the server's IP address, there is no matching
384 | ``VirtualHost`` block in the Apache configuration, so Apache uses its
385 | default configuration. That default configuration is in
386 | ``/etc/apache2/sites-enabled/000-default.conf``. What makes it the
387 | default is that it is listed first; the ``IncludeOptional`` in
388 | ``/etc/apache2/apache2.conf`` reads files in alphabetical order, and
389 | ``000-default.conf`` has the ``000`` prefix to ensure it is first.
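
If you want to convince yourself about the ordering, here is a tiny
sketch. The file names are just examples of what ``sites-enabled``
might contain, and Python's plain string sorting stands in for the
alphabetical order in which Apache reads the files.

.. code-block:: python

    # Example contents of /etc/apache2/sites-enabled (hypothetical names).
    enabled = ["example.com.conf", "000-default.conf"]

    # The files are read in alphabetical order; digits sort before
    # letters, so the "000" prefix puts the default site first.
    load_order = sorted(enabled)
    default_site = load_order[0]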
390 |
391 | If someone arrives at my server through the wrong domain name, I don't
392 | want them to see a page that says "It works!", so I change the default
393 | configuration to the following, which merely responds with "Not found":
394 |
395 | .. code-block:: apache
396 |
397 |     <VirtualHost *:80>
398 |         DocumentRoot /var/www/html
399 |         Redirect 404 /
400 |     </VirtualHost>
401 |
402 |
403 | Configuring Apache for Django
404 | -----------------------------
405 |
406 | Change ``/etc/apache2/sites-available/$DOMAIN.conf`` to the
407 | following (which only differs from the one we just created in that it
408 | has the ``ProxyPass`` directive):
409 |
410 | .. code-block:: apache
411 |
412 |     <VirtualHost *:80>
413 |         ServerName $DOMAIN
414 |         ServerAlias www.$DOMAIN
415 |         DocumentRoot /var/www/$DOMAIN
416 |         ProxyPass / http://localhost:8000/
417 |     </VirtualHost>
418 |
419 | In order for this to work, we actually first need to enable Apache
420 | modules ``proxy`` and ``proxy_http``, and we will take the opportunity
421 | to also enable ``headers``, because we will need it soon after:
422 |
423 | .. code-block:: bash
424 |
425 | a2enmod proxy proxy_http headers
426 |
427 | (Similarly to ``a2ensite`` and ``a2dissite``, ``a2enmod`` and
428 | ``a2dismod`` are merely convenient ways to create and delete symbolic
429 | links that point from ``/etc/apache2/mods-enabled`` to
430 | ``/etc/apache2/mods-available``.)
431 |
432 | Tell Apache to reload its configuration::
433 |
434 | service apache2 reload
435 |
436 | Finally, start your Django server as we saw in the previous chapter;
437 | however, it doesn't need to listen on 0.0.0.0:8000, a mere 8000 is
438 | enough:
439 |
440 | .. code-block:: bash
441 |
442 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
443 | su $DJANGO_USER -c \
444 | "/opt/$DJANGO_PROJECT/venv/bin/python \
445 | /opt/$DJANGO_PROJECT/manage.py \
446 | runserver --settings=settings 8000"
447 |
448 | Now go to http://$DOMAIN/ and you should see your Django project in
449 | action.
450 |
451 | .. warning::
452 |
453 | We are running Django with ``runserver`` here, which is inappropriate
454 | for production. We are doing it only temporarily, so that you
455 | understand the concepts. We will run Django correctly in the chapter
456 | about :ref:`gunicorn`.
457 |
458 | Apache receives your HTTP request. Because of the ``ProxyPass``
459 | directive, it decides to just pass on this request to another server,
460 | which in our case is localhost:8000.
461 |
462 | This works for now, but we will add some more configuration that
463 | will be necessary later:
464 |
465 | .. code-block:: apache
466 |
467 |     <VirtualHost *:80>
468 |         ServerName $DOMAIN
469 |         ServerAlias www.$DOMAIN
470 |         DocumentRoot /var/www/$DOMAIN
471 |         ProxyPass / http://localhost:8000/
472 |         ProxyPreserveHost On
473 |         RequestHeader set X-Forwarded-Proto "http"
474 |     </VirtualHost>
475 |
476 | Here is what these configuration directives do:
477 |
478 | **ProxyPreserveHost On**
479 | By default, the header of the request Apache makes to the backend
480 | includes ``Host: localhost``. We need to pass the real ``Host`` to
481 | Django (i.e. the one received by Apache), otherwise Django cannot
482 | check if it's in ``ALLOWED_HOSTS``.
483 | **RequestHeader set X-Forwarded-Proto "http"**
484 | Another thing that Django does not know is whether the request has
485 | been made through HTTPS or plain HTTP; Apache knows that, but the
486 | request it subsequently makes to the Django backend is always plain
487 | HTTP. We tell Apache to pass this information with the
488 | ``X-Forwarded-Proto`` HTTP header, so that related Django
489 | functionality such as ``request.is_secure()`` works properly. You
490 | will also need to set ``SECURE_PROXY_SSL_HEADER =
491 | ('HTTP_X_FORWARDED_PROTO', 'https')`` in your ``settings.py``.
492 |
493 | This does not yet play a role because we have configured Apache
494 | to only serve plain HTTP. If we wanted it to also serve HTTPS, we
495 | would add a ``<VirtualHost *:443>`` block, which would contain mostly
496 | the same stuff as the ``<VirtualHost *:80>`` we have already defined.
497 | One of the differences is that ``X-Forwarded-Proto`` would be set to
498 | ``"https"``.
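
Here is a simplified sketch of how this header and
``SECURE_PROXY_SSL_HEADER`` fit together. The function below is my own
illustration, not Django's actual code; in particular, when the header
is missing, real Django falls back to the scheme of the connection
itself, whereas the sketch simply answers "not secure".

.. code-block:: python

    # The same value you would put in settings.py.
    SECURE_PROXY_SSL_HEADER = ("HTTP_X_FORWARDED_PROTO", "https")


    def is_secure(meta, proxy_header=SECURE_PROXY_SSL_HEADER):
        """Decide whether the original request used HTTPS (simplified)."""
        header_name, secure_value = proxy_header
        # Django exposes X-Forwarded-Proto as HTTP_X_FORWARDED_PROTO.
        return meta.get(header_name) == secure_value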
499 |
500 | Chapter summary
501 | ---------------
502 |
503 | * Install your web server.
504 | * Name the web server's configuration file with the domain name of your
505 | site.
506 | * Put the configuration file in ``sites-available`` and symlink it from
507 | ``sites-enabled`` (don't forget to reload the web server).
508 | * Use the ``proxy_pass`` (nginx) or ``ProxyPass`` (Apache) directive to
509 | pass the HTTP request to Django.
510 | * Configure the web server to pass HTTP request headers ``Host``,
511 | ``X-Forwarded-For``, and ``X-Forwarded-Proto`` (Apache by default
512 | passes ``X-Forwarded-For``, so there is no configuration needed for
513 | that one).
514 | * For nginx, also configure ``proxy_redirect`` and
515 | ``client_max_body_size``.
516 |
--------------------------------------------------------------------------------
/05-static-files.rst:
--------------------------------------------------------------------------------
1 | Static and media files
2 | ======================
3 |
4 | Let's quickly make static files work. You might not understand perfectly
5 | what we're doing, but it will become very clear afterwards.
6 |
7 | .. _setting_up_django:
8 |
9 | Setting up Django
10 | -----------------
11 |
12 | **First**, add these statements to
13 | ``/etc/opt/$DJANGO_PROJECT/settings.py``::
14 |
15 | STATIC_ROOT = '/var/cache/$DJANGO_PROJECT/static/'
16 | STATIC_URL = '/static/'
17 |
18 | Remember that after each change to your settings you should, in theory,
19 | recompile:
20 |
21 | .. code-block:: bash
22 |
23 | /opt/$DJANGO_PROJECT/venv/bin/python -m compileall \
24 | /etc/opt/$DJANGO_PROJECT
25 |
26 | It's not really a big deal if you forget to recompile, but we will deal
27 | with that later.
28 |
29 | **Second**, create directory ``/var/cache/$DJANGO_PROJECT/static/``:
30 |
31 | .. code-block:: bash
32 |
33 | mkdir -p /var/cache/$DJANGO_PROJECT/static
34 |
35 | The ``-p`` parameter tells ``mkdir`` to create not only the directory,
36 | but, if needed, its parents as well.
37 |
38 | **Third**, run ``collectstatic``:
39 |
40 | .. code-block:: bash
41 |
42 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
43 | /opt/$DJANGO_PROJECT/venv/bin/python \
44 | /opt/$DJANGO_PROJECT/manage.py collectstatic \
45 | --settings=settings
46 |
47 | This will copy all static files to the directory we specified in
48 | ``STATIC_ROOT``. Don't worry if you don't understand it clearly; we will
49 | explain it in a minute.
50 |
51 | Setting up nginx
52 | ----------------
53 |
54 | Change ``/etc/nginx/sites-available/$DOMAIN`` to the following,
55 | which only differs from the previous version in that the new ``location
56 | /static/ {}`` block has been added at the end:
57 |
58 | .. code-block:: nginx
59 |
60 | server {
61 | listen 80;
62 | listen [::]:80;
63 | server_name $DOMAIN www.$DOMAIN;
64 | root /var/www/$DOMAIN;
65 | location / {
66 | proxy_pass http://localhost:8000;
67 | proxy_set_header Host $http_host;
68 | proxy_redirect off;
69 | proxy_set_header X-Forwarded-For $remote_addr;
70 | proxy_set_header X-Forwarded-Proto $scheme;
71 | client_max_body_size 20m;
72 | }
73 | location /static/ {
74 | alias /var/cache/$DJANGO_PROJECT/static/;
75 | }
76 | }
77 |
78 | Don't forget to execute ``service nginx reload`` after that.
79 |
80 | Now let's try to see if it works. **Stop the Django development server**
81 | if it is running on the server. Open your browser and visit
82 | http://$DOMAIN/. nginx should give you a 502. This is expected, since
83 | the backend is not working.
84 |
85 | But now try to visit http://$DOMAIN/static/admin/img/icon_searchbox.png.
86 | If you have ``django.contrib.admin`` in ``INSTALLED_APPS``, you should
87 | get a search icon (if you don't use ``django.contrib.admin``, pick
88 | another static file that you expect to see, or browse the directory
89 | ``/var/cache/$DJANGO_PROJECT/static``).
90 |
91 | :numref:`how_static_files_work_nginx` explains how this works.
92 |
93 | .. _how_static_files_work_nginx:
94 |
95 | .. figure:: _static/how-static-files-work-nginx.png
96 |
97 | How Django static files work in production (nginx version)
98 |
99 | The only thing that remains to clear up is what exactly these
100 | ``location`` blocks mean. ``location /static/`` means that the
101 | configuration inside the block shall apply only if the path of the URL
102 | begins with ``/static/``. Likewise, ``location /`` applies if the path
103 | of the URL begins with a slash. However, all paths begin with a slash,
104 | so if the path begins with ``/static/`` both ``location`` blocks match
105 | the URL. Nginx only uses one ``location`` block. The rules with which
106 | nginx chooses the ``location`` block that shall apply are complicated
107 | and are described in the `documentation for location`_, but in this
108 | particular case, nginx chooses the longest matching prefix; so if the
109 | path begins with ``/static/``, nginx will choose ``location /static/``.
110 |
111 | .. _documentation for location: http://nginx.org/en/docs/http/ngx_http_core_module.html#location
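
The "longest matching prefix" rule can be sketched in a few lines of
Python. This only models plain prefix locations like the two we have
used; it ignores regular expression locations and the ``=`` and ``^~``
modifiers described in the documentation, and the function name is
made up for illustration.

.. code-block:: python

    def choose_location(path, prefixes):
        """Return the longest prefix in ``prefixes`` that ``path`` starts with."""
        matches = [p for p in prefixes if path.startswith(p)]
        # Several prefixes may match; the longest one is used.
        return max(matches, key=len) if matches else None


    locations = ["/", "/static/"]
    chosen = choose_location("/static/admin/img/icon_searchbox.png", locations)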
112 |
113 |
114 | Setting up Apache
115 | -----------------
116 |
117 | Change ``/etc/apache2/sites-available/$DOMAIN.conf`` to the following:
118 |
119 | .. code-block:: apache
120 |
121 |     <VirtualHost *:80>
122 |         ServerName $DOMAIN
123 |         ServerAlias www.$DOMAIN
124 |         DocumentRoot /var/www/$DOMAIN
125 |         ProxyPass /static/ !
126 |         ProxyPass / http://localhost:8000/
127 |         ProxyPreserveHost On
128 |         RequestHeader set X-Forwarded-Proto "http"
129 |         Alias /static/ /var/cache/$DJANGO_PROJECT/static/
130 |         <Directory /var/cache/$DJANGO_PROJECT/static/>
131 |             Require all granted
132 |         </Directory>
133 |     </VirtualHost>
134 |
135 | Don't forget to execute ``service apache2 reload`` after that.
136 |
137 | Now let's try to see if it works. **Stop the Django development server**
138 | if it is running on the server. Open your browser and visit
139 | http://$DOMAIN/. Apache should give you a 503. This is expected, since
140 | the backend is not working.
141 |
142 | But now try to visit http://$DOMAIN/static/admin/img/icon_searchbox.png.
143 | If you have ``django.contrib.admin`` in ``INSTALLED_APPS``, you should
144 | get a search icon (if you don't use ``django.contrib.admin``, pick
145 | another static file that you expect to see, or browse the directory
146 | ``/var/cache/$DJANGO_PROJECT/static``).
147 |
148 | :numref:`how_static_files_work_apache` explains how this works.
149 |
150 | .. _how_static_files_work_apache:
151 |
152 | .. figure:: _static/how-static-files-work-apache.png
153 |
154 | How Django static files work in production (Apache version)
155 |
156 | Now let's examine how the configuration above produces these results.
157 | The directive ``ProxyPass / http://localhost:8000/`` tells Apache that,
158 | if the URL path begins with ``/``, then it should pass the request to
159 | the backend. All URL paths begin with ``/``, so the directive always
160 | matches. But there is also the directive ``ProxyPass /static/ !``, which
161 | will match paths starting with ``/static/``. When there are many
162 | matching ``ProxyPass`` directives, the first one wins; so for path
163 | ``/static/admin/img/icon_searchbox.png``, ``ProxyPass /static/ !`` wins.
164 | The exclamation mark means "no proxy passing", so the directive means
165 | "when a URL path begins with ``/static/``, do not pass it to the
166 | backend". Since it is not going to be passed to the backend, Apache
167 | would normally combine it with the ``DocumentRoot`` and would thus try
168 | to return the file
169 | ``/var/www/$DOMAIN/static/admin/img/icon_searchbox.png``, but the
170 | ``Alias`` directive tells it to get
171 | ``/var/cache/$DJANGO_PROJECT/static/admin/img/icon_searchbox.png``
172 | instead. By default, Apache will refuse to access files in directories
173 | other than ``DocumentRoot``, and will return 403, "Forbidden", in
174 | requests to access them; so we add the directive ``Require all granted``
175 | for the static files directory, which means "everyone has permission to
176 | read the files".
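
If you prefer to see the decision as code, here is a rough sketch of
the logic just described. It is only a model of this particular
configuration, not of Apache in general; ``myproject`` and
``example.com`` are placeholders for $DJANGO_PROJECT and $DOMAIN, and
``None`` stands for the exclamation mark.

.. code-block:: python

    # The ProxyPass directives, in the order they appear in the file.
    PROXY_PASS = [("/static/", None), ("/", "http://localhost:8000/")]
    ALIASES = {"/static/": "/var/cache/myproject/static/"}
    DOCUMENT_ROOT = "/var/www/example.com"


    def resolve(path):
        """Return ("proxy", url) or ("file", filesystem_path) for a URL path."""
        for prefix, backend in PROXY_PASS:
            if path.startswith(prefix):  # the first matching directive wins
                if backend is not None:
                    return ("proxy", backend + path[len(prefix):])
                break  # "!" means: do not pass this path to the backend
        for prefix, directory in ALIASES.items():
            if path.startswith(prefix):
                return ("file", directory + path[len(prefix):])
        # Without an Alias, the path is combined with the DocumentRoot.
        return ("file", DOCUMENT_ROOT + path)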
177 |
178 | Media files
179 | -----------
180 |
181 | Media files are similar to static files, so let's go through them
182 | quickly. We will store them in ``/var/opt/$DJANGO_PROJECT/media``.
183 |
184 | .. code-block:: bash
185 |
186 | mkdir /var/opt/$DJANGO_PROJECT/media
187 | chown $DJANGO_USER /var/opt/$DJANGO_PROJECT/media
188 |
189 | One of the differences with static files is that we changed the
190 | ownership of ``/var/opt/$DJANGO_PROJECT/media`` to $DJANGO_USER. The
191 | reason is that Django needs to write there each time the user
192 | uploads a file or requests to delete a file.
193 |
194 | Add the following to ``/etc/opt/$DJANGO_PROJECT/settings.py``::
195 |
196 | MEDIA_ROOT = '/var/opt/$DJANGO_PROJECT/media/'
197 | MEDIA_URL = '/media/'
198 |
199 | For nginx, add the following to ``/etc/nginx/sites-available/$DOMAIN``:
200 |
201 | .. code-block:: nginx
202 |
203 | location /media/ {
204 | alias /var/opt/$DJANGO_PROJECT/media/;
205 | }
206 |
207 | For Apache, add the following before ``ProxyPass /``:
208 |
209 | .. code-block:: apache
210 |
211 | ProxyPass /media/ !
212 |
213 | and the following at the end of the ``VirtualHost`` block:
214 |
215 | .. code-block:: apache
216 |
217 |     Alias /media/ /var/opt/$DJANGO_PROJECT/media/
218 |     <Directory /var/opt/$DJANGO_PROJECT/media/>
219 |         Require all granted
220 |     </Directory>
221 |
222 | Recompile your settings, reload the web server, and it's ready.
223 |
224 | File locations
225 | --------------
226 |
227 | Your static and media files are now served properly by the web server
228 | instead of the Django development server, and I hope you understand
229 | clearly what we've done. Let's take a break and discuss the file
230 | locations that I've chosen:
231 |
232 | ============== =================================
233 | Program files  /opt/$DJANGO_PROJECT
234 | Virtualenv     /opt/$DJANGO_PROJECT/venv
235 | Media files    /var/opt/$DJANGO_PROJECT/media
236 | Static files   /var/cache/$DJANGO_PROJECT/static
237 | Configuration  /etc/opt/$DJANGO_PROJECT
238 | ============== =================================
239 |
240 | There are a couple more that we haven't seen yet, but the above more or
241 | less tell the whole story.
242 |
243 | Many people prefer a much simpler setup instead. They put everything
244 | related to their project in a single directory, which is that of their
245 | repository root, like this:
246 |
247 | ============== ====================================
248 | Program files  /srv/$DJANGO_PROJECT
249 | Virtualenv     /srv/$DJANGO_PROJECT/venv
250 | Media files    /srv/$DJANGO_PROJECT/media
251 | Static files   /srv/$DJANGO_PROJECT/static
252 | Configuration  /srv/$DJANGO_PROJECT/$DJANGO_PROJECT
253 | ============== ====================================
254 |
255 | Although this setup seems simpler, I prefer the other one for
256 | several reasons. The first one is purely educational. When you get too
257 | used to the simple setup, you might always configure the same
258 | ``STATIC_ROOT``, without really understanding what it does. The clean
259 | separation of directories should also have helped you get a grip on
260 | ``PYTHONPATH`` and ``DJANGO_SETTINGS_MODULE``.
261 |
262 | Separating in many directories is also cleaner and applies to many
263 | different situations. If a Django application is packaged as a ``.deb``
264 | package, or as a pip-installable package, the tweak required with the
265 | split directories scheme is minimal.
266 |
267 | Finally, separating the directories makes it easier to back up only what
268 | is needed. My backup solution (which we will see in the chapters about
269 | recovery) may exclude ``/opt`` and ``/var/cache`` from the backup.
270 | Since the static files can be regenerated, there is no need to back them
271 | up.
272 |
273 |
274 | Chapter summary
275 | ---------------
276 |
277 | * Set ``STATIC_ROOT`` to ``/var/cache/$DJANGO_PROJECT/static/``.
278 | * Set ``STATIC_URL`` to ``/static/``.
279 | * Set ``MEDIA_ROOT`` to ``/var/opt/$DJANGO_PROJECT/media/``.
280 | * Set ``MEDIA_URL`` to ``/media/``.
281 | * Run ``collectstatic``.
282 | * In nginx, set ``location /static/ { alias
283 | /var/cache/$DJANGO_PROJECT/static/; }``; likewise for media files.
284 | * In Apache, add ``ProxyPass /static/ !`` before ``ProxyPass /``, and
285 | add
286 |
287 | .. code-block:: apache
288 |
289 |     Alias /static/ /var/cache/$DJANGO_PROJECT/static/
290 |     <Directory /var/cache/$DJANGO_PROJECT/static/>
291 |         Require all granted
292 |     </Directory>
293 |
294 | Likewise for media files.
295 |
--------------------------------------------------------------------------------
/06-gunicorn.rst:
--------------------------------------------------------------------------------
1 | .. _gunicorn:
2 |
3 | Gunicorn
4 | ========
5 |
6 | Why Gunicorn?
7 | -------------
8 |
9 | We now need to replace the Django development server with a Python
10 | application server. I will explain later why we need this. For now we
11 | need to select which Python application server to use. There are three
12 | popular servers: mod_wsgi, uWSGI, and Gunicorn.
13 |
14 | mod_wsgi is for Apache only, and I prefer to use a method that can be
15 | used with either Apache or nginx. This will make it easier to change the
16 | web server, should such a need arise. I also find Gunicorn easier to
17 | set up and maintain.
18 |
19 | I used uWSGI for a couple of years and was overwhelmed by its features.
20 | Many of them duplicate features that already exist in Apache or nginx or
21 | other parts of the stack, and thus they are rarely, if ever, needed. Its
22 | documentation is a bit chaotic. The developers themselves admit it: "We
23 | try to make our best to have good documentation but it is a hard work.
24 | Sorry for that." I recall hitting problems week after week and spending
25 | hours to solve them each time.
26 |
27 | Gunicorn, on the other hand, does exactly what you want and no more. It
28 | is simple and works fine. So I recommend it unless in your particular
29 | case there is a compelling reason to use one of the others, and so far I
30 | haven't met any such compelling reason.
31 |
32 | Installing and running Gunicorn
33 | -------------------------------
34 |
35 | We will install Gunicorn with ``pip`` rather than with ``apt``, because
36 | the packaged Gunicorn (both in Debian 8 and Ubuntu 16.04) supports only
37 | Python 2.
38 |
39 | .. code-block:: bash
40 |
41 | /opt/$DJANGO_PROJECT/venv/bin/pip install gunicorn
42 |
43 | Now run Django with Gunicorn:
44 |
45 | .. code-block:: bash
46 |
47 | su $DJANGO_USER
48 | source /opt/$DJANGO_PROJECT/venv/bin/activate
49 | export PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT
50 | export DJANGO_SETTINGS_MODULE=settings
51 | gunicorn $DJANGO_PROJECT.wsgi:application
52 |
53 | You can also write it as one long command, like this:
54 |
55 | .. code-block:: bash
56 |
57 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
58 | DJANGO_SETTINGS_MODULE=settings \
59 | su $DJANGO_USER -c "/opt/$DJANGO_PROJECT/venv/bin/gunicorn \
60 | $DJANGO_PROJECT.wsgi:application"
61 |
62 | Either of the two versions above will start Gunicorn, which will be
63 | listening at port 8000, like the Django development server did. Visit
64 | http://$DOMAIN/, and you should see your Django project in action.
65 |
66 | What actually happens here is that ``gunicorn``, a Python program, does
67 | something like ``from $DJANGO_PROJECT.wsgi import application``. It uses
68 | ``$DJANGO_PROJECT.wsgi`` and ``application`` because we told it so in
69 | the command line. Open the file
70 | ``/opt/$DJANGO_PROJECT/$DJANGO_PROJECT/wsgi.py`` to see that
71 | ``application`` is defined there. In fact, ``application`` is a Python
72 | callable. Now each time Gunicorn receives an HTTP request, it calls
73 | ``application()`` in a standardized way that is specified by the WSGI
74 | specification. The fact that the interface of this function is
75 | standardized is what permits you to choose between many different Python
76 | application servers such as Gunicorn, uWSGI, or mod_wsgi, and why each
77 | of these can interact with many Python application frameworks like
78 | Django or Flask.
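
To make "calls ``application()`` in a standardized way" a bit more
concrete, here is a toy WSGI application together with a helper that
calls it roughly the way an application server does. The helper
``call_like_a_server`` is invented for this sketch, and the
``environ`` dictionary it builds is far smaller than what Gunicorn
really passes; the ``application`` here merely stands in for the one
Django defines in ``wsgi.py``.

.. code-block:: python

    def application(environ, start_response):
        """A tiny WSGI application: a callable with this exact signature."""
        body = b"Hello from " + environ["PATH_INFO"].encode("utf-8")
        headers = [("Content-Type", "text/plain"),
                   ("Content-Length", str(len(body)))]
        start_response("200 OK", headers)
        return [body]  # an iterable of byte strings


    def call_like_a_server(app, path):
        """Call ``app`` once, as a WSGI server would for one request."""
        captured = {}

        def start_response(status, headers):
            # The server remembers the status and headers to send them.
            captured["status"] = status
            captured["headers"] = headers

        environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
        chunks = app(environ, start_response)
        return captured["status"], b"".join(chunks)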
79 |
80 | The reason we aren't using the Django development server is that it is
81 | meant for, well, development. It has some neat features for development,
82 | such as that it serves static files, and that it automatically restarts
83 | itself whenever the project files change. It is, however, totally
84 | inadequate for production; for example, it might leave files or
85 | connections open, and it does not support processing many requests at
86 | the same time, which you really want. Gunicorn, on the other hand, does
87 | the multi-processing part correctly, leaving to Django only the things
88 | that Django can do well.
89 |
90 | Gunicorn is actually a web server, like Apache and nginx. However, it
91 | does only one thing and does it well: it runs Python WSGI-compliant
92 | applications. It cannot serve static files and there are many other
93 | features Apache and nginx have that Gunicorn does not. This is why we
94 | put Apache or nginx in front of Gunicorn and proxy-pass requests to it.
95 | The accurate name for Gunicorn, uWSGI, and mod_wsgi would be
96 | "specialized web servers that run Python WSGI-compliant applications",
97 | but this is too long, which is why I've been using the vaguer "Python
98 | application servers" instead.
99 |
100 | Gunicorn has many parameters that can configure its behaviour. Most of
101 | them work fine with their default values. Still, we need to modify a
102 | few. Let's run it again, but this time with a few parameters:
103 |
104 | .. code-block:: bash
105 |
106 | su $DJANGO_USER
107 | source /opt/$DJANGO_PROJECT/venv/bin/activate
108 | export PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT
109 | export DJANGO_SETTINGS_MODULE=settings
110 | gunicorn --workers=4 \
111 | --log-file=/var/log/$DJANGO_PROJECT/gunicorn.log \
112 | --bind=127.0.0.1:8000 --bind=[::1]:8000 \
113 | $DJANGO_PROJECT.wsgi:application
114 |
115 | Here is what these parameters mean:
116 |
117 | ``--workers=4``
118 | Gunicorn starts a number of processes called "workers", and each
119 | process, each worker that is, serves one request at a time. To serve
120 | five concurrent requests, five workers are needed; if there are more
121 | concurrent requests than workers, they will be queued. You probably
122 | need two to five workers per processor core. Four workers are a good
123 | starting point for a single-core machine. The reason you don't want
124 | to increase this too much is that your Django project's RAM
125 | consumption is approximately proportional to the number of workers,
126 | as each worker is effectively a distinct instance of the Django
127 | project. If you are short on RAM, you might want to consider
128 | decreasing the number of workers. If you get many concurrent
129 | requests and your CPU is underused (usually meaning your Django
130 | projects do a lot of disk/database access) and you can spare the RAM,
131 | you can increase the number of workers.
132 |
133 | .. tip:: Check your CPU and RAM usage
134 |
135 | If your server gets busy, the Linux ``top`` command will show you
136 | useful information about the amount of free RAM, the RAM consumed
137 | by your Django project (and other system processes), and the CPU
138 | usage for various processes. You can read more about it in
139 | :ref:`top_memory` and :ref:`top_cpu`.
140 |
141 | ``--log-file=/var/log/$DJANGO_PROJECT/gunicorn.log``
142 | I believe this is self-explanatory.
143 |
144 | ``--bind=127.0.0.1:8000``
145 | This tells Gunicorn to listen on port 8000 of the local network
146 | interface. This is the default, but we specify it here for two
147 | reasons:
148 |
149 | 1. It's such an important setting that you need to see it to know
150 | what you've done. Besides, you could be running many applications
151 | on the same server, and one could be listening on 8000, another
152 | on 8001, and so on. So, for uniformity, always specify this.
153 | 2. We specify ``--bind`` twice (see below), to also listen on IPv6.
154 | The second time would override the default anyway.
155 |
156 | ``--bind=[::1]:8000``
157 | This tells Gunicorn to also listen on port 8000 of the local IPv6
158 | network interface. This must be specified if IPv6 is enabled on the
159 | virtual server. If it is not specified, things may or may not work, and
160 | the system may be a bit slower even if things work.
161 |
162 | The reason is that the front-end web server, Apache or nginx, has
163 | been told to forward the requests to http://localhost:8000/. It will
164 | ask the resolver what "localhost" means. If the system is
165 | IPv6-enabled, the resolver will reply with two results, ``::1``,
166 | which is the IPv6 address for the localhost, and ``127.0.0.1``. The
167 | web server might then decide to try the IPv6 version first. If
168 | Gunicorn has not been configured to listen to that address, then
169 | nothing will be listening at port 8000 of ::1, so the connection will
170 | be refused. The web server will then probably try the IPv4 version,
171 | which will work, but it will have made a useless attempt first.
172 |
173 | I could make some experiments to determine exactly what happens in
174 | such cases, and not speak with "maybe" and "probably", but it doesn't
175 | matter. If your server has IPv6, you must set it up correctly and use
176 | this option. If not, you should not use this option.
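
The rule of thumb for ``--workers`` given above (two to five workers
per processor core) is easy to turn into numbers. The helper below is
only an illustration of that arithmetic, with a name I made up; it
knows nothing about your RAM, which is what usually decides where in
the range you should be.

.. code-block:: python

    import os


    def worker_range(cores):
        """Return the (low, high) worker counts for ``cores`` processor cores."""
        return 2 * cores, 5 * cores


    # os.cpu_count() may return None, so fall back to a single core.
    low, high = worker_range(os.cpu_count() or 1)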
177 |
178 | Configuring systemd
179 | -------------------
180 |
181 | The only thing that remains is to make Gunicorn start automatically. For
182 | this, we will configure it as a service in systemd.
183 |
184 | .. note:: Older systems don't have systemd
185 |
186 | systemd is relatively new. It exists only in Debian 8 and
187 | later, and Ubuntu 15.04 and later. In older systems you need to
188 | start Gunicorn in another way. I recommend supervisor_, which you can
189 | install with ``apt install supervisor``.
190 |
191 | .. _supervisor: http://supervisord.org/
192 |
193 | The first program the kernel starts after it boots is systemd. For this
194 | reason, the process id of systemd is 1. Enter the command ``ps 1`` and
195 | you will probably see that the process with id 1 is ``/sbin/init``, but
196 | if you look at it with ``ls -lh /sbin/init``, you will see it's a
197 | symbolic link to systemd.
198 |
199 | After systemd starts, it has many tasks, one of which is to start and
200 | manage the system services. We will tell it that Gunicorn is one of
201 | these services by creating file
202 | ``/etc/systemd/system/$DJANGO_PROJECT.service``, with the following
203 | contents:
204 |
205 | .. code-block:: ini
206 |
207 | [Unit]
208 | Description=$DJANGO_PROJECT
209 |
210 | [Service]
211 | User=$DJANGO_USER
212 | Group=$DJANGO_GROUP
213 | Environment="PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT"
214 | Environment="DJANGO_SETTINGS_MODULE=settings"
215 | ExecStart=/opt/$DJANGO_PROJECT/venv/bin/gunicorn \
216 | --workers=4 \
217 | --log-file=/var/log/$DJANGO_PROJECT/gunicorn.log \
218 | --bind=127.0.0.1:8000 --bind=[::1]:8000 \
219 | $DJANGO_PROJECT.wsgi:application
220 |
221 | [Install]
222 | WantedBy=multi-user.target
223 |
224 | After creating that file, if you enter ``service $DJANGO_PROJECT
225 | start``, it will start Gunicorn. However, it will not start it
226 | automatically at boot until we tell it ``systemctl enable
227 | $DJANGO_PROJECT``.
228 |
229 | The ``[Service]`` section of the configuration file should be
230 | self-explanatory, so I will only explain the other two sections. Systemd
231 | doesn't only manage services; it also manages devices, sockets, swap
232 | space, and other stuff. All these are called units; "unit" is, so to
233 | speak, the superclass. The ``[Unit]`` section contains configuration
234 | that is common to all unit types. The only option we need to specify
235 | there is ``Description``, which is free text. Its purpose is only to
236 | show in the UI of management tools. Although $DJANGO_PROJECT will work
237 | as a description, it's better to use something more verbose. As the
238 | systemd documentation says,
239 |
240 | "Apache2 Web Server" is a good example. Bad examples are
241 | "high-performance light-weight HTTP server" (too generic) or
242 | "Apache2" (too specific and meaningless for people who do not know
243 | Apache).
244 |
245 | The ``[Install]`` section tells systemd what to do when the service is
246 | enabled. The ``WantedBy`` option names the units that should pull this
247 | one in. If, for example, we wanted Gunicorn to be started whenever nginx
248 | is started, we would specify ``WantedBy=nginx.service``. This is too
249 | strict a dependency, so we just specify ``WantedBy=multi-user.target``.
250 | A target is a unit type that represents a state of the system. The
251 | multi-user target is a state all GNU/Linux systems reach in normal
252 | operation. Desktop systems go beyond that to the "graphical" target,
253 | which "wants" a multi-user system and adds a graphical login screen to
254 | it; but we want Gunicorn to start regardless of whether we have a
255 | graphical login screen (we probably don't, as it wastes server resources).
256 |
257 | As I already said, you tell systemd to automatically start the service
258 | at boot (and automatically stop it at system shutdown) in this way:
259 |
260 | .. code-block:: bash
261 |
262 | systemctl enable $DJANGO_PROJECT
263 |
264 | Do you remember that in nginx and Apache you enable a site just by
265 | creating a symbolic link to ``sites-available`` from ``sites-enabled``?
266 | Likewise, ``systemctl enable`` does nothing but create a symbolic link.
267 | The dependencies we have specified in the ``[Install]`` section of the
268 | configuration file determine where the symbolic link will be created
269 | (sometimes more than one symbolic link is created). After you enable
270 | the service, try to restart the server, and check that your Django
271 | project has started automatically.
272 |
273 | As you may have guessed, you can disable the service like this:
274 |
275 | .. code-block:: bash
276 |
277 | systemctl disable $DJANGO_PROJECT
278 |
279 | This does not make use of the information in the ``[Install]`` section;
280 | it just removes all symbolic links.
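
To make the symbolic link business concrete, here is a small Python sketch of what enabling and disabling boil down to for a unit with ``WantedBy=multi-user.target``. It is not how systemd is implemented. It works in a throwaway temporary directory that stands in for ``/etc/systemd/system``, and the unit name ``myproject.service`` is made up.

```python
import os
import tempfile


def enable(unit_path, system_dir, wanted_by="multi-user.target"):
    """Mimic ``systemctl enable``: link the unit into <wanted_by>.wants/."""
    wants_dir = os.path.join(system_dir, wanted_by + ".wants")
    os.makedirs(wants_dir, exist_ok=True)
    link = os.path.join(wants_dir, os.path.basename(unit_path))
    os.symlink(unit_path, link)
    return link


def disable(link):
    """Mimic ``systemctl disable``: remove the symbolic link."""
    os.remove(link)


if __name__ == "__main__":
    # A temporary directory stands in for /etc/systemd/system.
    with tempfile.TemporaryDirectory() as system_dir:
        unit = os.path.join(system_dir, "myproject.service")  # hypothetical name
        open(unit, "w").close()
        link = enable(unit, system_dir)
        print(link, "->", os.readlink(link))
        disable(link)
        print("still enabled?", os.path.lexists(link))
```

On a real server you can see the same thing with ``ls -l /etc/systemd/system/multi-user.target.wants/`` after enabling the service.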
281 |
282 | More about systemd
283 | ------------------
284 |
285 | While I don't want to bother you with history, if you don't read this
286 | section you will eventually get confused by the many ways you can manage
287 | a service. For example, if you want to tell nginx to reload its
288 | configuration, you can do it with either of these commands:
289 |
290 | .. code-block:: bash
291 |
292 | systemctl reload nginx
293 | service nginx reload
294 | /etc/init.d/nginx reload
295 |
296 | Before systemd, the first program that was started by the kernel was
297 | ``init``. This was much less smart than systemd and did not know what a
298 | "service" is. All ``init`` could do was execute programs or scripts. So
299 | if we wanted to start a service we would write a script that started the
300 | service and put it in ``/etc/init.d``, and enable it by linking it from
301 | ``/etc/rc2.d``. When ``init`` brought the system to "runlevel 2", the
302 | equivalent of systemd's multi-user target, it would execute the scripts
303 | in ``/etc/rc2.d``. Actually it wasn't ``init`` itself that did that, but
304 | other programs that ``init`` was configured to run, but this doesn't
305 | matter. What matters is that the way you would start, stop, or restart
306 | nginx, or tell it to reload its configuration, or check its running
307 | status, was this:
308 |
309 | .. code-block:: bash
310 |
311 | /etc/init.d/nginx start
312 | /etc/init.d/nginx stop
313 | /etc/init.d/nginx restart
314 | /etc/init.d/nginx reload
315 | /etc/init.d/nginx status
316 |
317 | The problem with these commands was that they might not always work
318 | correctly, mostly because of environment variables that might have been
319 | set, so the ``service`` script was introduced around 2005, which, as its
320 | documentation says, runs an init script "in as predictable an
321 | environment as possible, removing most environment variables and with
322 | the current working directory set to /." So a better alternative for the
323 | above commands was
324 |
325 | .. code-block:: bash
326 |
327 | service nginx start
328 | service nginx stop
329 | service nginx restart
330 | service nginx reload
331 | service nginx status
332 |
333 | The new way of doing these with systemd is the following:
334 |
335 | .. code-block:: bash
336 |
337 | systemctl start nginx
338 | systemctl stop nginx
339 | systemctl restart nginx
340 | systemctl reload nginx
341 | systemctl status nginx
342 |
343 | Both ``systemctl`` and ``service`` will work the same with your Gunicorn
344 | service, because ``service`` is a backwards compatible way to run
345 | ``systemctl``. You can't manage your service with an ``/etc/init.d``
346 | script, because we haven't created any such script (and it would have
347 | been very tedious to do so, which is why we preferred to use supervisor
348 | before we had systemd). For nginx and Apache, all three ways are
349 | available, because most services packaged with the operating system are
350 | still managed with init scripts, and systemd has a backwards compatible
351 | way of dealing with such scripts. In future versions of Debian and
352 | Ubuntu, it is likely that the init scripts will be replaced with systemd
353 | configuration files like the one we wrote for Gunicorn, so the
354 | ``/etc/init.d`` way will cease to exist.
355 |
356 | Of the remaining two newer ways, I don't know which is better.
357 | ``service`` has the benefit that it exists in non-Linux Unix systems,
358 | such as FreeBSD, so if you use both GNU/Linux and FreeBSD you can use
359 | the same command in both. The ``systemctl`` version may be more
360 | consistent with other systemd commands, like the ones for enabling and
361 | disabling services. Use whichever you like.
362 |
363 | .. _top_memory:
364 |
365 | The top command: memory management
366 | ----------------------------------
367 |
368 | If your server gets busy and you wonder whether its RAM and CPU are
369 | enough, the Linux ``top`` command is a useful tool. Execute it simply by
370 | entering ``top``. You can exit ``top`` by pressing ``q`` on the
371 | keyboard.
372 |
373 | When you execute ``top`` you will see an image similar to :numref:`top`.
374 |
375 | .. _top:
376 |
377 | .. figure:: _static/top.png
378 |
379 | The ``top`` command
380 |
381 | Let's examine **available RAM** first, which in :numref:`top` is
382 | indicated in the red box. The output of ``top`` is designed so that it
383 | fits in an 80-character wide terminal. For the RAM, the five values
384 | (total, used, free, buffers, and cached) can't fit on the line that is
385 | labeled "KiB Mem", so the last one has been moved to the line below,
386 | that is, the "cached Mem" indication belongs in "KiB Mem" and not in
387 | "KiB Swap".
388 |
389 | The "total" amount of RAM is simply the total amount of RAM; it is as
390 | much as you asked your virtual server to have. The "used" plus the
391 | "free" equals the total. Linux does heavy caching, which I explain
392 | below, so the "used" should be close to the total, and the "free" should
393 | be close to zero.
394 |
395 | Since RAM is much faster than the disk, Linux caches information from
396 | the disk in RAM. It does so in a variety of ways:
397 |
398 | * If you open a file, read it, close it, then you open it
399 | again and read it again, the second time it will be much faster; this
400 | is because Linux has cached the contents of the file in RAM.
401 | * Whenever you write a file, you are likely to read it again, so Linux
402 | caches it.
403 | * In order to speed up disk writing, Linux doesn't actually write to the
404 | disk when your program says ``f.write(data)``, not even when you close
405 | the file, not even when your program ends. It keeps the data in the
406 | cache and writes it later, attempting to optimize disk head movement.
407 | This is why some data may be lost when the system is powered off
408 | instead of properly shut down.
409 |
410 | The part of RAM that is used for Linux's disk cache is what ``top``
411 | shows as "buffers" and "cached". Buffers is also a kind of cache, so it
412 | is the sum of "buffers" and "cached" that matters (the difference between
413 | "buffers" and "cached" doesn't really matter unless you are a kernel
414 | developer). "Buffers" is usually negligible, so it's enough to look
415 | only at "cached".
416 |
417 | Linux doesn't want your RAM sitting around doing nothing, so if there is
418 | RAM available, it will use it for caching. Give it more RAM and it will
419 | cache more. If your server has a substantial amount of RAM labeled
420 | "free", it may mean that you have so much RAM that Linux can't fill it
421 | even with its disk cache. This probably means the machine is larger
422 | than it needs to be, so it's a waste of resources. If, on the other
423 | hand, the cache is very small, this may mean that the system is short on
424 | RAM. On a healthy system, the cache should be 20–50% of RAM.
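
If you'd rather check this from a script than by eyeballing ``top``, the numbers are available in ``/proc/meminfo``. The following is a minimal sketch, assuming the usual ``Name: value kB`` layout of that file; the function names are my own.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers (KiB)."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[name.strip()] = int(fields[0])
    return values


def cache_fraction(meminfo):
    """Return (buffers + cached) / total, the figure discussed above."""
    return (meminfo["Buffers"] + meminfo["Cached"]) / meminfo["MemTotal"]


if __name__ == "__main__":
    with open("/proc/meminfo") as f:  # Linux only
        info = parse_meminfo(f.read())
    print("total KiB:", info["MemTotal"])
    print("free KiB: ", info["MemFree"])
    print("disk cache: {:.0%} of RAM".format(cache_fraction(info)))
```

Treat the percentage it prints as a rough indicator, just like the rule of thumb above.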
425 |
426 | Since we are talking about RAM, let's also examine the **amount of RAM
427 | used by processes**. By default ``top`` sorts processes by CPU usage,
428 | but you can type ``M`` (Shift + ``m``) to sort by memory usage (you can
429 | go back to sort by CPU usage by typing ``P``). The RAM used by each
430 | process is indicated by the "RES" column in KiB and the "%MEM" column in
431 | percentage.
432 |
433 | There are two related columns: "VIRT", for virtual memory, and "SHR",
434 | for shared memory. First of all, you need to forget the Microsoft
435 | terminology. Windows calls "virtual memory" what everyone else calls
436 | "swap space"; and what everyone else calls "virtual memory" is a very
437 | different thing from swap space. In order to better understand what
438 | virtual memory is, let's see it with this C program (it doesn't matter
439 | if you don't speak C):
440 |
441 | .. code-block:: c
442 |
443 | #include <stdio.h>
444 | #include <stdlib.h>
445 | #include <string.h>
446 | #include <errno.h>
447 |
448 | int main(void) {
449 |     int c;  /* int rather than char, so that EOF can be told apart */
450 |     void *p;
451 |
452 |     /* Allocate 2 GB of memory */
453 |     p = malloc(2L * 1024 * 1024 * 1024);
454 |     if (!p) {
455 |         fprintf(stderr, "Can't allocate memory: %s\n",
456 |                 strerror(errno));
457 |         exit(1);
458 |     }
459 |
460 |     /* Do nothing until the user presses Enter */
461 |     fputs("Press Enter to continue...", stderr);
462 |     while ((c = fgetc(stdin)) != EOF && c != '\n')
463 |         ;
464 |
465 |     /* Free memory and exit */
466 |     free(p);
467 |     exit(0);
468 | }
469 |
470 | When I run this program on my laptop, and while it is waiting for me to
471 | press Enter, this is what ``top`` shows about it::
472 |
473 | . PID ... VIRT RES SHR S %CPU %MEM ... COMMAND
474 | 13687 ... 2101236 688 612 S 0.0 0.0 ... virtdemo
475 |
476 | It indicates 2 GB VIRT, but actually uses less than 1 MB of RAM, while
477 | swap usage is still at zero. Overall, running the program has had a
478 | negligible effect on the system. The reason is that the ``malloc``
479 | function has only allocated virtual memory; "virtual" as in "not real".
480 | The operating system has provided 2 GB of virtual address space to the
481 | program, but the program has not used any of that. If the program had
482 | used some of this virtual memory (i.e. if it had written to it), the
483 | operating system would have automatically allocated some RAM and would
484 | have mapped the used virtual address space to the real address space in
485 | the RAM.
486 |
487 | So virtual memory is neither swap nor swap plus RAM; it's virtual. The
488 | operating system maps only the used part of the process's virtual memory
489 | space to something real; usually RAM, sometimes swap. Many programs
490 | allocate much more virtual memory than they actually use. For this
491 | reason, the VIRT column of ``top`` is not really useful. The RES
492 | column, which stands for "resident", indicates the part of RAM actually
493 | used.
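
If you don't have a C compiler at hand, you can watch the same effect from Python. This sketch is a variation on the C program, not a translation of it: it reserves address space with an anonymous ``mmap`` instead of ``malloc``, uses only 64 MiB, and reads the process's *peak* resident size from ``resource.getrusage`` (reported in KiB on Linux). The function names are my own.

```python
import mmap
import resource

SIZE = 64 * 1024 * 1024  # 64 MiB; small enough to be harmless


def peak_rss():
    """Peak resident set size of this process (KiB on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def demo(size=SIZE):
    """Return peak RSS before mapping, after mapping, and after touching."""
    before = peak_rss()
    region = mmap.mmap(-1, size)  # anonymous mapping: address space only
    after_map = peak_rss()
    # Writing one byte per page forces the kernel to back each page with RAM.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    after_touch = peak_rss()
    region.close()
    return before, after_map, after_touch


if __name__ == "__main__":
    before, after_map, after_touch = demo()
    print("peak RSS before mapping: ", before)
    print("peak RSS after mapping:  ", after_map)
    print("peak RSS after touching: ", after_touch)
```

On a typical Linux system I would expect the second number to be practically the same as the first, and the third to be larger by roughly the size of the mapping, but the exact figures depend on your system.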
494 |
495 | The SHR column indicates how much memory the program potentially shares
496 | with other processes. Usually all of that memory is included in the RES
497 | column. For example, in :numref:`top`, there are four ``apache2``
498 | processes which I show again here::
499 |
500 | . PID ... VIRT RES SHR S %CPU %MEM ... COMMAND
501 | 23268 ... 458772 37752 26820 S 0.2 3.7 ... apache2
502 | 16481 ... 461176 55132 41840 S 0.1 5.4 ... apache2
503 | 23237 ... 455604 14884 9032 S 0.1 1.5 ... apache2
504 | 23374 ... 459716 38876 27296 S 0.1 3.8 ... apache2
505 |
506 | It is unlikely that the total amount of RAM used by these four processes
507 | is the sum of the RES column (about 140 MB); it is more likely that
508 | something like 9 MB is shared among all of them, which would bring the
509 | total to about 110 MB. Maybe even less. They might also be sharing
510 | something (such as system libraries) with non-apache processes. It is
511 | not really possible to know how much of the memory marked as shared is
512 | actually being shared, and by how many processes, but it is something
513 | you need to take into account in order to explain why the total memory
514 | usage on your system is less than the sum of the resident memory for all
515 | processes.
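
The back-of-the-envelope estimate can be written down as a few lines of arithmetic. The assumption, and it is only an assumption, is that the smallest SHR value among the four processes is shared by all of them and should therefore be counted once rather than four times.

```python
# RES and SHR (KiB) of the four apache2 processes listed above.
processes = [
    (37752, 26820),
    (55132, 41840),
    (14884, 9032),
    (38876, 27296),
]

res_total = sum(res for res, _ in processes)
# Assumption: the smallest SHR value is shared by all four processes.
shared_by_all = min(shr for _, shr in processes)
# Count the shared part once instead of once per process.
estimate = res_total - (len(processes) - 1) * shared_by_all

print("sum of RES: {:.0f} MiB".format(res_total / 1024))
print("estimate:   {:.0f} MiB".format(estimate / 1024))
```

With the exact figures this gives about 143 MiB and 117 MiB, in the same ballpark as the rounded numbers I used in the text.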
516 |
517 | Let's now talk about **swap**. Swap is disk space used for temporarily
518 | writing (swapping) RAM. Linux uses it in two cases. The first one is if
519 | a program has actually used some RAM but has left it unused for a long
520 | time. If a process has written something to RAM but has not read it back
521 | for several hours, it means the RAM is being wasted. Linux doesn't like
522 | that, so it may save that part of RAM to the disk (to the swap space),
523 | which will free up the RAM for something more useful (such as caching).
524 | This is the case in :numref:`top`. The system is far from low on memory,
525 | and yet it has used a considerable amount of swap space. The only
526 | explanation is that some processes have had unused data in RAM for too
527 | long. When one of these processes eventually attempts to use swapped
528 | memory, the operating system will move it from the swap space back to
529 | the RAM (if there's not enough free RAM, it will swap something else or
530 | discard some of its cache).
531 |
532 | The second case in which Linux will use swap is if it's low on memory.
533 | This is a bad thing to happen and will greatly slow down the system,
534 | sometimes to a grinding halt. You can understand that this is the case
535 | from the fact that swap usage will be considerable while at the same
536 | time the free and cached RAM will be very low. Sometimes you will be
537 | unable to even run ``top`` when this happens.
538 |
539 | Whereas in Windows the swap space (confusingly called "virtual memory")
540 | is a file, on Linux it is usually a disk partition. You can find out
541 | where swap is stored on your system by examining the contents of file
542 | ``/proc/swaps``, for example by executing ``cat /proc/swaps``. (The
543 | "files" inside the ``/proc`` directory aren't real; they are created by
544 | the kernel and they do not exist on the disk. ``cat`` prints the
545 | contents of files, similar to ``less``, but does not paginate.)
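
Since ``/proc/swaps`` is plain text, it is also easy to read from a program. The following is a small sketch that assumes the usual five columns (filename, type, size, used, priority; sizes in KiB); the function name is my own.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts (sizes in KiB)."""
    lines = text.strip().splitlines()
    entries = []
    for line in lines[1:]:  # the first line is the column header
        filename, kind, size, used, priority = line.split()
        entries.append({
            "filename": filename,
            "type": kind,
            "size": int(size),
            "used": int(used),
            "priority": int(priority),
        })
    return entries


if __name__ == "__main__":
    with open("/proc/swaps") as f:  # Linux only
        for entry in parse_swaps(f.read()):
            print("{filename} ({type}): {used} of {size} KiB used".format(**entry))
```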
546 |
547 | .. _top_cpu:
548 |
549 | The top command: CPU usage
550 | --------------------------
551 |
552 | The third line of ``top`` has eight numbers which add up to 100%. They
553 | are user, system, nice, idle, waiting, hardware interrupts, software
554 | interrupts, and steal, and indicate where the CPU spent its time in the
555 | last three seconds:
556 |
557 | * **us** (user) and **sy** (system) indicate how much of its time the
558 | processor was running programs in user mode and in kernel mode. Most
559 | code runs in user mode; but when a process asks the Linux kernel to do
560 | something (allocate memory, access the disk, network, or other device,
561 | start another process, etc.), the processor switches to kernel mode, which
562 | means it has some privileges that user mode doesn't have. (For example,
563 | kernel mode has access to all RAM and can modify the mapping between
564 | the processes' virtual memory and RAM/swap; whereas user mode simply
565 | has access to the virtual address space and doesn't know what happens
566 | behind the scenes.)
567 | * **ni** (nice) indicates how much of its time the processor was running
568 | processes with a positive "niceness" value. If many processes need the CPU at
569 | the same time, a "nice" process has lower priority. The "niceness" is
570 | a number up to 19. A process with a "niceness" of 19 will practically
571 | only run when the CPU would otherwise be idle. For example, the GNOME
572 | desktop environment's Desktop Search finds stuff in your files, and it
573 | does so very fast because it uses indexes. These indexes are updated
574 | in the background by the "tracker" process, which runs with a
575 | "niceness" of 19 in order to not make the rest of the system slower.
576 | Processes may also run with a negative niceness (down to -20), which
577 | means they have higher priority. In the list of processes, the NI
578 | column indicates the "niceness". Most processes have the default zero
579 | niceness, and it is unlikely you will ever need to know more about all
580 | that.
581 | * **id** (idle) and **wa** (waiting) indicate how much time the CPU was
582 | sitting around doing nothing. "Waiting" is a special case of idle; it
583 | means that while the CPU was idle there was at least one process
584 | waiting for disk I/O. A high value of "waiting" indicates heavy disk
585 | usage.
586 | * The meaning of time spent in **hi** (hardware interrupts) and **si**
587 | (software interrupts) is very technical. If this is non-negligible, it
588 | indicates heavy I/O (such as disk or network).
589 | * **st** (steal) is for virtual machines. When nonzero, it indicates
590 | that for that amount of time the virtual machine needed to run
591 | something on the (virtual) CPU, but it had to wait because the real
592 | CPU was unavailable, either because it was doing something else (e.g.
593 | servicing another virtual machine on the same host) or because of
594 | reaching the CPU usage quota.
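
The eight percentages come from counters that the kernel keeps in ``/proc/stat``. The first line of that file starts with ``cpu`` and lists cumulative times in the order user, nice, system, idle, iowait, irq, softirq, steal. A minimal sketch of the calculation, with function names of my own choosing, is to take two samples and see how the difference is split:

```python
import time

FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Return the first eight counters of a "cpu ..." line from /proc/stat."""
    parts = line.split()
    return dict(zip(FIELDS, (int(x) for x in parts[1:9])))


def percentages(earlier, later):
    """Share of CPU time spent in each state between two samples."""
    deltas = {name: later[name] - earlier[name] for name in FIELDS}
    total = sum(deltas.values()) or 1  # avoid dividing by zero
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu():
    with open("/proc/stat") as f:  # Linux only
        return parse_cpu_line(f.readline())


if __name__ == "__main__":
    first = read_cpu()
    time.sleep(3)  # the same interval top uses by default
    second = read_cpu()
    for name, value in percentages(first, second).items():
        print("{}: {:.1f}%".format(name, value))
```

This is only meant to show where the numbers come from; I don't claim it reproduces ``top`` digit for digit.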
595 |
596 | If the machine has more than one CPU or core, the "%Cpu(s)" line of
597 | ``top`` shows data collectively for all CPUs; but you can press ``1`` to
598 | toggle between that and showing information for each individual CPU.
599 |
600 | In the processes list, the %CPU column indicates the amount of time the
601 | CPU was working for that process, either in user mode or in kernel mode
602 | (when kernel code is running, most of the time it is in order to service
603 | a process, so this time is accounted for in the process). The %CPU
604 | column can add up to more than 100% if you have more than one core; for
605 | four cores it can add up to 400% and so on.
606 |
607 | Finally, let's discuss the CPU load. When your system is doing
608 | nothing, the CPU load is zero. If there is one process using the CPU,
609 | the load is one. If there is one process using the CPU and another
610 | process that wants to run and is queued for the CPU to become available,
611 | the load is two. The three numbers in the orange box in :numref:`top`
612 | are the load average in the last one, five, and 15 minutes. The load
613 | average should generally be less than the number of CPU cores, and
614 | preferably under 0.7 times the number of cores. It's OK if it spikes
615 | sometimes, so the load average for the last minute can occasionally go
616 | over the number of cores, but the 5- or 15-minute average should stay
617 | low. For more information about the load average, there's an excellent
618 | blog post by Andre Lewis, `Understanding Linux CPU Load - when should
619 | you be worried?`_
620 |
621 | .. _Understanding Linux CPU Load - when should you be worried?: http://blog.scoutapp.com/articles/2009/07/31/understanding-load-averages
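
The rule of thumb about the load average is easy to turn into a quick check. In this sketch the thresholds come from the paragraph above, while the three labels ("fine", "worth watching", "overloaded") are just names I made up.

```python
import os


def load_status(load, cores):
    """Classify a load average using the rule of thumb from the text."""
    if load < 0.7 * cores:
        return "fine"
    if load < cores:
        return "worth watching"
    return "overloaded"


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    one, five, fifteen = os.getloadavg()  # Unix only
    for label, load in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print("{:>6}: {:.2f} ({})".format(label, load, load_status(load, cores)))
```

Remember that a spike in the one-minute figure is usually nothing to worry about; it's the 5- and 15-minute figures that matter.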
622 |
623 |
624 | Chapter summary
625 | ---------------
626 |
627 | * Install ``gunicorn`` in your virtualenv.
628 | * Create file ``/etc/systemd/system/$DJANGO_PROJECT.service`` with
629 | these contents:
630 |
631 | .. code-block:: ini
632 |
633 | [Unit]
634 | Description=$DJANGO_PROJECT
635 |
636 | [Service]
637 | User=$DJANGO_USER
638 | Group=$DJANGO_GROUP
639 | Environment="PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT"
640 | Environment="DJANGO_SETTINGS_MODULE=settings"
641 | ExecStart=/opt/$DJANGO_PROJECT/venv/bin/gunicorn \
642 | --workers=4 \
643 | --log-file=/var/log/$DJANGO_PROJECT/gunicorn.log \
644 | --bind=127.0.0.1:8000 --bind=[::1]:8000 \
645 | $DJANGO_PROJECT.wsgi:application
646 |
647 | [Install]
648 | WantedBy=multi-user.target
649 |
650 | * Enable the service with ``systemctl enable $DJANGO_PROJECT``, and
651 | start/stop/restart it or get its status with ``systemctl $COMMAND
652 | $DJANGO_PROJECT``, where $COMMAND is start, stop, restart or status.
653 |
--------------------------------------------------------------------------------
/08-postgresql.rst:
--------------------------------------------------------------------------------
1 | PostgreSQL
2 | ==========
3 |
4 | Why PostgreSQL?
5 | ---------------
6 |
7 | So far we have been using SQLite. Can we continue to do so? The answer,
8 | as always, is "it depends". Most probably you can't.
9 |
10 | I'm using SQLite in production in one application I've made for an
11 | eshop hosted by BigCommerce. It gets the orders from the BigCommerce API
12 | and formats them on a PDF for printing on labels. It has no models, and
13 | all the data is stored in BigCommerce. The only significant data stored
14 | in SQLite is the users' names and passwords used for login, by
15 | ``django.contrib.auth``. It's hardly three users. Recreating them would
16 | be easier than maintaining a PostgreSQL installation. So SQLite it is.
17 |
18 | What if your database is small and you don't have many users, but you
19 | store mission-critical data in the database? That's a hard one. The
20 | thing is, no-one really knows if SQLite is appropriate, because no-one
21 | is using it for mission-critical data. Thunderbird doesn't use it for
22 | storing emails, but for storing indexes, which can be recreated.
23 | Likewise for Firefox. The `SQLite people claim`_ it's appropriate for
24 | mission-critical applications, but industry experience on that is
25 | practically nonexistent. I've never seen corruption in SQLite. I've seen
26 | corruption in PostgreSQL, but we are comparing apples to oranges. I have
27 | a gut feeling (but no hard data) that I can trust SQLite more than
28 | MySQL.
29 |
30 | If I ever choose to use SQLite for mission-critical data, I will make
31 | sure I not only back up the database file, but also back up a plain text
32 | dump of the database. I trust plain text dumps more than database files
33 | in case there is silent corruption that can go unnoticed for some time.
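
Python's ``sqlite3`` module can produce such a plain text dump through the connection's ``iterdump()`` method. Here is a minimal sketch; the function name is my own, and the script takes the path of the database file as its first command-line argument, because I don't want to guess where yours lives.

```python
import sqlite3
import sys


def plain_text_dump(connection):
    """Return the whole database as SQL text, one statement per line."""
    return "\n".join(connection.iterdump())


if __name__ == "__main__":
    # Usage: python dump.py /path/to/database.db > dump.sql
    connection = sqlite3.connect(sys.argv[1])
    try:
        print(plain_text_dump(connection))
    finally:
        connection.close()
```

The output is a series of SQL statements that you can read, compare between backups, and feed to a fresh database to restore it.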
34 |
35 | One problem with SQLite is that you may choose to go with it now that
36 | your database is small and your users are few, but you can't really be
37 | certain what it will be like in three or five years. If for some reason
38 | the database has grown or the users have increased, SQLite might be
39 | unable to handle it. Migrating to PostgreSQL at that stage could be a
40 | nightmare. So the safe option is to use PostgreSQL straight from the
41 | beginning.
42 |
43 | As for MySQL, I never understood why it has become so popular when
44 | there's PostgreSQL around. My only explanation is it was marketed
45 | better. PostgreSQL is more powerful, it is easier, and it has better
46 | documentation. If you have a reason to use MySQL, it's probably that you
47 | already know it, or that people around you know it (e.g. it is
48 | company policy). In that case, hopefully you don't need any help from
49 | me. Otherwise, choose PostgreSQL and read the rest of this chapter.
50 |
51 | .. _SQLite people claim: https://www.sqlite.org/testing.html
52 |
53 | Getting started with PostgreSQL
54 | -------------------------------
55 |
56 | You may have noticed that I prefer to tell you to do things first and
57 | then explain them. Same thing again. We will quickly install PostgreSQL
58 | and configure Django to use it. You won't clearly understand what
59 | you are doing. After we finish, you will have some long sections to read.
60 | You *must* read them, however. **The way to avoid doing the reading is
61 | to forget about PostgreSQL and continue using SQLite.** It is risky to
62 | put your customer's data on a system that you don't understand and that
63 | you've set up just by blindly following instructions.
64 |
65 | .. code-block:: bash
66 |
67 | apt install postgresql
68 |
69 | This will install PostgreSQL and create a cluster; I will explain later
70 | what this means.
71 |
72 | .. warning:: Make sure the locale is right
73 |
74 | When PostgreSQL installs, it uses the encoding specified by the
75 | default system locale (found in ``/etc/default/locale``). If this is
76 | not UTF-8, the databases will be using an encoding other than UTF-8.
77 | You really don't want that. If you aren't certain, you can check,
78 | using the procedure I explained in
79 | :ref:`setting_up_the_system_locale`, that the default system locale
80 | is appropriate. You can also check that PostgreSQL was installed with
81 | the correct locale with this command:
82 |
83 | .. code-block:: bash
84 |
85 | su postgres -c 'psql -l'
86 |
87 | This will list your databases and some information about them,
88 | including their locale. Immediately after installation, there should
89 | be three databases (I explain them later on).
90 |
91 | If you make an error and install PostgreSQL while the locale is
92 | wrong, the easiest way to fix the problem is to drop and recreate the
93 | cluster. I explain later what "cluster" means, but what you need to
94 | know is that the following procedure will permanently and irrevocably
95 | delete all your databases. **Be careful not to type the commands in
96 | the wrong window** (you could delete the databases of the wrong
97 | server). Fix your locale as described in
98 | :ref:`setting_up_the_system_locale`, then execute the following
99 | commands:
100 |
101 | .. code-block:: bash
102 |
103 | service postgresql stop
104 | pg_dropcluster 9.5 main
105 | pg_createcluster 9.5 main
106 | service postgresql start
107 |
108 | If you have a database with useful data, obviously you can't do this.
109 | Fixing the problem is more advanced and isn't covered by this
110 | chapter; there is a `question at Stackoverflow`_ that treats it, but
111 | better finish this chapter first to get a grip on the basics.
112 |
113 | .. _question at Stackoverflow: http://stackoverflow.com/questions/5090858/how-do-you-change-the-character-encoding-of-a-postgres-database
114 |
115 | Let's now try to connect to PostgreSQL with a client program:
116 |
117 | .. code-block:: bash
118 |
119 | su postgres -c 'psql template1'
120 |
121 | This connects you with the "template1" database and gives you a prompt
122 | ending in ``#``. You can give it some commands like ``\l`` to list the
123 | databases (there are three just after installation). Let's create a
124 | user and a database. I will use placeholders $DJANGO_DB_USER,
125 | $DJANGO_DB_PASSWORD, and $DJANGO_DATABASE. We normally use the same as
126 | $DJANGO_PROJECT for both $DJANGO_DB_USER and $DJANGO_DATABASE, and I
127 | have the habit of using the SECRET_KEY as the database password, but in
128 | principle all these can be different; so I will be using these different
129 | placeholders here to signal to you that they denote something different.
130 |
131 | .. We use "text" instead of "sql" in the following code block because
132 | the highlighter is confused by the placeholders and shows warnings.
133 |
134 | .. code-block:: text
135 |
136 | CREATE USER $DJANGO_DB_USER PASSWORD '$DJANGO_DB_PASSWORD';
137 | CREATE DATABASE $DJANGO_DATABASE OWNER $DJANGO_DB_USER;
138 |
139 | The command to exit ``psql`` is ``\q``.
140 |
141 | Next, we need to install ``psycopg2``:
142 |
143 | .. code-block:: bash
144 |
145 | apt install python-psycopg2 python3-psycopg2
146 |
147 | This will work only if you have created your virtualenv with the
148 | ``--system-site-packages`` option, which is what I told you to do many
149 | pages ago. Otherwise, you need to ``pip install psycopg2`` inside the
150 | virtualenv. Most people do it in the second way. However, attempting to
151 | install ``psycopg2`` with ``pip`` will require compilation, and
152 | compilation can be tricky, and different ``psycopg2`` versions might
153 | behave differently, and in my experience the easiest and safest way is
154 | to install the version of ``psycopg2`` that is packaged with the
155 | operating system. If your site-wide Python installation is clean
156 | (meaning you have used ``pip`` only in virtualenvs),
157 | ``--system-site-packages`` works great.
158 |
159 | Finally, change your ``DATABASES`` setting to this:
160 |
161 | .. code-block:: python
162 |
163 | DATABASES = {
164 |     'default': {
165 |         'ENGINE': 'django.db.backends.postgresql',
166 |         'NAME': '$DJANGO_DATABASE',
167 |         'USER': '$DJANGO_DB_USER',
168 |         'PASSWORD': '$DJANGO_DB_PASSWORD',
169 |         'HOST': 'localhost',
170 |         'PORT': 5432,
171 |     }
172 | }
173 |
174 | From now on, Django should be using PostgreSQL (you may need to restart
175 | Gunicorn). You should be able to set up your database with this:
176 |
177 | .. code-block:: bash
178 |
179 | PYTHONPATH=/etc/opt/$DJANGO_PROJECT:/opt/$DJANGO_PROJECT \
180 | DJANGO_SETTINGS_MODULE=settings \
181 | su $DJANGO_USER -c \
182 | "/opt/$DJANGO_PROJECT/venv/bin/python \
183 | /opt/$DJANGO_PROJECT/manage.py migrate"
184 |
185 |
186 | PostgreSQL connections
187 | ----------------------
188 |
189 | A short while ago we ran this innocent-looking command:
190 |
191 | .. code-block:: bash
192 |
193 | su postgres -c 'psql template1'
194 |
195 | Now let's explain what this does. Brace yourself, as it will take
196 | several sections. Better go make some tea, relax, and come back.
197 |
198 | A web server listens on TCP port 80 and a client, usually a browser,
199 | connects to that port and asks for some information. The server and the
200 | client communicate in a language, in this case the Hypertext Transfer
201 | Protocol or HTTP. In very much the same way, the PostgreSQL server is
202 | listening on a communication port and a client connects to that port.
203 | The client and the server communicate in the PostgreSQL Frontend/Backend
204 | Protocol.
205 |
206 | In the case of the ``psql template1`` command, ``psql``, the PostgreSQL
207 | interactive terminal, is the client. It connects to the server, and gets
208 | commands from you. If you tell it ``\l``, it asks the server for the
209 | list of databases. If you give it an SQL command, it sends it to the
210 | server and gets the response from the server.
211 |
212 | When you connect to a web server with your browser, you always provide
213 | the server address in the form of a URL. But here we only provided a
214 | database name. We could have told it the server as follows (but it's not
215 | going to work without a fight, because the user authentication kicks in,
216 | which I explain in the next section):
217 |
218 | .. code-block:: bash
219 |
220 | psql --host=localhost --port=5432 template1
221 |
222 | You might think ``localhost`` and 5432 are the default, but they aren't. The
223 | default is Unix domain socket ``/var/run/postgresql/.s.PGSQL.5432``.
224 | Let's see what this means.
225 |
226 | If you think about it, TCP is nothing more than a way for different
227 | processes to communicate. One process, the browser, opens a
228 | communication channel to another process, the web server. Unix domain
229 | sockets are an alternative interprocess communication system that has
230 | some advantages but only works on the same machine. Two processes on the
231 | same machine that want to communicate can do so via a socket; one
232 | process, the server, will create the socket, and another, the client,
233 | will connect to the socket. One of the philosophies of Unix is that
234 | everything looks like a file, so Unix domain sockets look like files,
235 | but they don't occupy any space on your disk. The client opens what
236 | looks like a file, and sends and receives data from it.
237 |
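If you want to see a Unix domain socket in action, here is a minimal Python sketch. It is only an illustration and has nothing to do with PostgreSQL's own code; the ``demo.sock`` file name, the messages, and the function names are made up, and a thread stands in for the server process.

```python
import os
import socket
import tempfile
import threading


def serve_once(server_sock):
    """Accept one client, read its message and send a reply."""
    conn, _ = server_sock.accept()
    with conn:
        request = conn.recv(1024)
        conn.sendall(b"server got: " + request)


def demo():
    # The socket shows up as a file inside this temporary directory.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "demo.sock")

        # The server creates the socket and waits for a client.
        server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        server.bind(path)
        server.listen(1)
        worker = threading.Thread(target=serve_once, args=(server,))
        worker.start()

        # The client opens what looks like a file and exchanges data.
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)
        client.sendall(b"hello")
        reply = client.recv(1024)

        client.close()
        worker.join()
        server.close()
        return reply
```

Calling ``demo()`` should return the server's reply. While it runs, ``demo.sock`` is visible in the temporary directory like any other file.
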
238 | When the PostgreSQL server starts, it creates the socket
239 | ``/var/run/postgresql/.s.PGSQL.5432``. The "5432" means nothing
240 | to the system; if the socket had been named
241 | ``/var/run/postgresql/hello.world``, it would have worked exactly the
242 | same. The PostgreSQL developers chose to include the "5432" in the name
243 | of the socket as a convenience, in order to signify that this socket
244 | leads to the same PostgreSQL server as the one listening on TCP port
245 | 5432. This is useful in the rare case where many PostgreSQL instances
246 | (called "clusters", which I explain later) are running on the same
247 | machine.
248 |
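Although the operating system doesn't care about the name, PostgreSQL clients do follow this naming convention: when you give no host, the client builds the socket file name from the socket directory and the port number. Here is a small sketch of that rule, assuming the Debian/Ubuntu socket directory mentioned above (``socket_path`` is a made-up helper, not a PostgreSQL function):

```python
def socket_path(port=5432, directory="/var/run/postgresql"):
    """Return the Unix domain socket file for a server on the given port."""
    return "{}/.s.PGSQL.{}".format(directory, port)
```

So a second instance listening on port 5433 would be reached through ``socket_path(5433)``, that is, ``/var/run/postgresql/.s.PGSQL.5433``.
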
249 | .. hint:: Hidden files
250 |
251 | In Unix, when a file's name begins with a dot, it's "hidden". This means
252 | that ``ls`` doesn't normally show it, and that when you use wildcards
253 | such as ``*`` to denote all files, the shell will not include it.
254 | Otherwise it's not different from non-hidden files.
255 |
256 | To list the contents of a directory including hidden files, use the
257 | ``-a`` option:
258 |
259 | .. code-block:: bash
260 |
261 | ls -a /var/run/postgresql
262 |
263 | This will include ``.`` and ``..``, which denote the directory itself
264 | and the parent directory (``/var/run/postgresql/.`` is the same as
265 | ``/var/run/postgresql``; ``/var/run/postgresql/..`` is the same as
266 | ``/var/run``). You can use ``-A`` instead of ``-a`` to include all
267 | hidden files except ``.`` and ``..``.
268 |
269 | PostgreSQL roles and authentication
270 | -----------------------------------
271 |
272 | After a client such as ``psql`` connects to the TCP port or to the Unix
273 | domain socket of the PostgreSQL server, it must authenticate before
274 | doing anything else. It must log in, so to speak, as a user. Like many
275 | other relational database management systems (RDBMSs), PostgreSQL keeps
276 | its own list of users and has a sophisticated permissions system with
277 | which different users have different permissions on different databases
278 | and tables. This is useful in desktop applications. In the Greek tax
279 | office, for example, employees run a program on their computer, and the
280 | program asks them for their username and password, with which they log in
281 | to the tax office RDBMS, which is Oracle, and Oracle decides what this
282 | user can or cannot access.
283 |
284 | Web applications changed that. Instead of PostgreSQL managing the users
285 | and their permissions, we have a single PostgreSQL user,
286 | $DJANGO_DB_USER, as which Django connects to PostgreSQL, and this user
287 | has full permissions on the $DJANGO_DATABASE database. The actual users and
288 | their permissions are managed by ``django.contrib.auth``. What a user
289 | can or cannot do is decided by Django, not by PostgreSQL. This is a pity
290 | because ``django.contrib.auth`` (or the equivalent in other web
291 | frameworks) largely duplicates functionality that already exists in the
292 | RDBMS, and because having the RDBMS check the permissions is more robust
293 | and more secure. I believe that the reason web frameworks were developed
294 | this way is independence from any specific RDBMS, but I don't really
295 | know. Whatever the reason, we will live with that, but I am telling you
296 | the story so that you can understand why we need to create a PostgreSQL
297 | user for Django to connect to PostgreSQL as.
298 |
299 | Just as in Unix the user "root" is the superuser, meaning it has full
300 | permissions, and likewise the "administrator" in Windows, in PostgreSQL
301 | the superuser is "postgres". I am talking about the database user, not
302 | the operating system user. There is also an operating system "postgres"
303 | user, but here I don't mean the user that is stored in ``/etc/passwd``
304 | and which you can give as an argument to ``su``; I mean a PostgreSQL
305 | user. The fact that there exists an operating system user that happens
306 | to have the same username is irrelevant.
307 |
308 | Let's go back to our innocent-looking command:
309 |
310 | .. code-block:: bash
311 |
312 | su postgres -c 'psql template1'
313 |
314 | As I explained, since we don't specify the database server, ``psql`` by
315 | default connects to the Unix domain socket
316 | ``/var/run/postgresql/.s.PGSQL.5432``. The first thing it must do after
317 | connecting is authenticate. We could have specified a user to
318 | authenticate as with the ``--username`` option. Since we did not,
319 | ``psql`` uses the default. The default is what the ``PGUSER``
320 | environment variable says, and if this is absent, it is the username of
321 | the current operating system user. In our case, the operating system
322 | user is ``postgres``, because we executed ``su postgres``; so ``psql``
323 | attempts to authenticate as the PostgreSQL user ``postgres``.
324 |
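The order in which ``psql`` picks the user can be sketched in a few lines of Python. This is a simplified illustration, not the actual client code; ``default_pg_user`` and its parameters are names I made up.

```python
import os
import pwd


def default_pg_user(cli_username=None, environ=None, os_user=None):
    """Return the PostgreSQL user a client would try to authenticate as."""
    if environ is None:
        environ = os.environ
    # 1. An explicit --username option wins.
    if cli_username:
        return cli_username
    # 2. Otherwise use the PGUSER environment variable, if set.
    if environ.get("PGUSER"):
        return environ["PGUSER"]
    # 3. Otherwise fall back to the current operating system user.
    if os_user is None:
        os_user = pwd.getpwuid(os.geteuid()).pw_name
    return os_user
```

With no option and no ``PGUSER``, the function falls through to the operating system user, which is ``postgres`` after ``su postgres``.
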
325 | To make sure you understand this clearly, try to run ``psql template1``
326 | as root:
327 |
328 | .. code-block:: bash
329 |
330 | psql template1
331 |
332 | What does it tell you? Can you understand why? If not, please re-read
333 | the previous paragraph. Note that after you have just installed
334 | PostgreSQL, it has only one user, ``postgres``.
335 |
336 | So, ``psql`` connected to ``/var/run/postgresql/.s.PGSQL.5432`` and
337 | asked to authenticate as ``postgres``. At this point, you might have
338 | expected the server to request a password, which it didn't. The reason
339 | is that PostgreSQL supports many different authentication methods, and
340 | password authentication is only one of them. In this case, it used
341 | another method, "peer authentication". By default, PostgreSQL is
342 | configured to use peer authentication when the connection is local (that
343 | is, through the Unix domain socket) and password authentication when the
344 | connection is through TCP. So try this instead to see that it will ask
345 | for a password:
346 |
347 | .. code-block:: bash
348 |
349 | su postgres -c 'psql --host=localhost template1'
350 |
351 | You don't know the ``postgres`` password, so just provide an empty
352 | password and see that it refuses the connection. I don't know the
353 | password either. I believe that Debian/Ubuntu sets no password (i.e.
354 | invalid password) at installation time. You can set a valid password
355 | with ``ALTER USER postgres PASSWORD 'topsecret'``, but don't do that.
356 | There is no reason for the ``postgres`` user to connect to the database
357 | with password authentication, it could be a security risk, and you
358 | certainly don't want to add yet another password to your password
359 | manager.
360 |
361 | Let's go back to what we were saying. ``psql`` connected to the socket
362 | and asked to authenticate as ``postgres``. The server decided to use
363 | peer authentication, because the connection is local. In peer
364 | authentication, the server asks the operating system: "who is the user
365 | who connected to the socket?" The operating system replied: "postgres".
366 | The server checks that the operating system user name is the same as the
367 | PostgreSQL user name which the client has requested to authenticate as.
368 | If it is, the server allows the connection. So the Unix ``postgres``
369 | user can always connect locally (through the socket) as the PostgreSQL
370 | ``postgres`` user, and the Unix ``joe`` user can always connect locally
371 | as the PostgreSQL ``joe`` user.
372 |
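On Linux, the question "who is the user who connected to the socket?" can be asked with the ``SO_PEERCRED`` socket option. The following Python sketch shows the idea. It is not PostgreSQL's implementation, the function names are mine, and the final comparison ignores refinements such as user name maps.

```python
import os
import socket
import struct


def peer_uid(sock):
    """Ask the kernel for the user ID at the other end of a Unix socket.

    Relies on SO_PEERCRED, which is Linux-specific.
    """
    # The kernel returns a struct of pid (int), uid and gid (unsigned ints).
    layout = "iII"
    creds = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(layout)
    )
    _pid, uid, _gid = struct.unpack(layout, creds)
    return uid


def peer_auth_allows(os_username, requested_role):
    """Peer authentication succeeds only if the two names are identical."""
    return os_username == requested_role
```

For a connection made by the Unix user ``postgres``, ``peer_auth_allows("postgres", "postgres")`` is true, while ``peer_auth_allows("root", "postgres")`` is false. The client cannot lie about the first argument, because it comes from the kernel and not from the client.
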
373 | So, in fact, if $DJANGO_USER and $DJANGO_DB_USER are the same (and they
374 | are if so far you have followed everything I said), you could use these
375 | Django settings:
376 |
377 | .. code-block:: python
378 |
379 | DATABASES = {
380 |     'default': {
381 |         'ENGINE': 'django.db.backends.postgresql_psycopg2',
382 |         'NAME': '$DJANGO_DATABASE',
383 |         'USER': '$DJANGO_DB_USER',
384 |     }
385 | }
386 |
387 | In this case, Django will connect to PostgreSQL using the Unix domain
388 | socket, and PostgreSQL will authenticate it with peer authentication.
389 | This is quite cool, because you don't need to manage yet another
390 | password. However, I don't recommend it. First, most of your colleagues
391 | will have trouble understanding that setup, and you can't expect
392 | everyone to sit down and read everything and understand everything in
393 | detail. Second, next month you may decide to put Django and PostgreSQL
394 | on different machines, and using password authentication you make your
395 | Django settings ready for that change. It's also better, both for
396 | automation and your sanity, to have similar Django settings on all your
397 | deployments, and not to make some of them different just because it
398 | happens that PostgreSQL and Django run on the same machine there.
399 |
400 | Remember that when we created the $DJANGO_DATABASE database, we made
401 | $DJANGO_DB_USER its owner?
402 |
403 | .. We use "text" instead of "sql" in the following code block because
404 | the highlighter is confused by the placeholders and shows warnings.
405 |
406 | .. code-block:: text
407 |
408 | CREATE DATABASE $DJANGO_DATABASE OWNER $DJANGO_DB_USER;
409 |
410 | The owner of a database has full permission to do anything in that
411 | database: create and drop tables; update, insert and delete any rows
412 | from any tables; grant other users permission to do these things; and
413 | drop the entire database. This is by far the easiest and recommended way
414 | to give $DJANGO_DB_USER the required permissions.
415 |
416 | Before I move to the next section, two more things you need to know.
417 | PostgreSQL authentication is configurable. The configuration is at
418 | ``/etc/postgresql/9.x/main/pg_hba.conf``. Avoid touching it, as it is a
419 | bit complicated. The default (peer authentication for Unix domain socket
420 | connections, password authentication for TCP connections) works fine for
421 | most cases. The only problem you are likely to face is that the default
422 | configuration does not allow connection from other machines, only from
423 | localhost. So if you ever put PostgreSQL on a different machine from
424 | Django, you will need to modify the configuration.
425 |
426 | Finally, PostgreSQL used to have users and groups, but the PostgreSQL
427 | developers found out that these two types of entity had so much in
428 | common that they joined them into a single type that is called "role". A
429 | role can be a member of another role, just as a user could belong to a
430 | group. This is why you will see "role joe does not exist" in error
431 | messages, and why ``CREATE USER`` and ``CREATE ROLE`` are exactly the
432 | same thing.
433 |
434 | PostgreSQL databases and clusters
435 | ---------------------------------
436 |
437 | Several pages ago, we gave this command:
438 |
439 | .. code-block:: bash
440 |
441 | su postgres -c 'psql template1'
442 |
443 | I have explained where it connected and how it authenticated, and to
444 | finish this up I only need to explain why we told it to connect to the
445 | "template1" database.
446 |
447 | The thing is, there was actually no theoretical need to connect to a
448 | database. The only two commands we gave it were these:
449 |
450 | .. We use "text" instead of "sql" in the following code block because
451 | the highlighter is confused by the placeholders and shows warnings.
452 |
453 | .. code-block:: text
454 |
455 | CREATE USER $DJANGO_DB_USER PASSWORD '$DJANGO_DB_PASSWORD';
456 | CREATE DATABASE $DJANGO_DATABASE OWNER $DJANGO_DB_USER;
457 |
458 | I also told you, as an experiment, to give it the ``\l`` command,
459 | which lists the databases.
460 |
461 | All three commands are independent of database and would work exactly
462 | the same regardless of which database we are connected to. However,
463 | whenever a client connects to PostgreSQL, it *must* connect to a
464 | database. There is no way to tell the server "hello, I'm user postgres,
465 | authenticate me, but I don't want to connect to any specific database
466 | because I only want to do work that is independent of any specific
467 | database". Since you must connect to a database, you can choose any of
468 | the three that are always known to exist: ``postgres``, ``template0``,
469 | and ``template1``. It is a long held custom to connect to ``template1``
470 | in such cases (although ``postgres`` is a bit better, but more on that
471 | below).
472 |
473 | The official PostgreSQL documentation explains ``template0`` and
474 | ``template1`` so perfectly that I will simply copy it here:
475 |
476 | CREATE DATABASE actually works by copying an existing database. By
477 | default, it copies the standard system database named ``template1``.
478 | Thus that database is the "template" from which new databases are
479 | made. If you add objects to ``template1``, these objects will be
480 | copied into subsequently created user databases. This behavior
481 | allows site-local modifications to the standard set of objects in
482 | databases. For example, if you install the procedural language
483 | PL/Perl in ``template1``, it will automatically be available in user
484 | databases without any extra action being taken when those databases
485 | are created.
486 |
487 | There is a second standard system database named ``template0``. This
488 | database contains the same data as the initial contents of
489 | ``template1``, that is, only the standard objects predefined by your
490 | version of PostgreSQL. ``template0`` should never be changed after
491 | the database cluster has been initialized. By instructing CREATE
492 | DATABASE to copy ``template0`` instead of ``template1``, you can
493 | create a "virgin" user database that contains none of the site-local
494 | additions in ``template1``. This is particularly handy when
495 | restoring a ``pg_dump`` dump: the dump script should be restored in
496 | a virgin database to ensure that one recreates the correct contents
497 | of the dumped database, without conflicting with objects that might
498 | have been added to ``template1`` later on.
499 |
500 | There's more about that in `Section 22.3`_ of the documentation. In
501 | practice, I never touch ``template1`` either. I like to have PostGIS in
502 | the template, but what I do is create another template,
503 | ``template_postgis``, for the purpose.
504 |
505 | .. _section 22.3: https://www.postgresql.org/docs/9.6/static/manage-ag-templatedbs.html
506 |
507 | Before explaining what the ``postgres`` database is for, we need to look
508 | at an alternative way of creating users and databases. Instead of using
509 | ``psql`` and executing ``CREATE USER`` and ``CREATE DATABASE``, you can
510 | run these commands:
511 |
512 | .. code-block:: bash
513 |
514 | su postgres -c "createuser --pwprompt $DJANGO_DB_USER"
515 | su postgres -c "createdb --owner=$DJANGO_DB_USER $DJANGO_DATABASE"
516 |
517 | Like ``psql``, ``createuser`` and ``createdb`` are PostgreSQL clients;
518 | they do nothing more than connect to the PostgreSQL server, construct
519 | ``CREATE USER`` and ``CREATE DATABASE`` commands from the arguments you
520 | have given, and send these commands to the server. As I've explained,
521 | whenever a client connects to PostgreSQL, it *must* connect to a
522 | database. What ``createuser`` and ``createdb`` (and other PostgreSQL
523 | utility programs) do is connect to the ``postgres`` database. So
524 | ``postgres`` is actually an empty, dummy database used when a client
525 | needs to connect to the PostgreSQL server without caring about the
526 | database.
527 |
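To give you a feeling for what "construct a command from the arguments" means, here is a rough Python sketch. It is an assumption-laden simplification, not the code of ``createuser`` or ``createdb``: the function names are invented, identifiers are always quoted, and the password is left out entirely.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_user_sql(role):
    """Build a CREATE USER statement (password deliberately omitted)."""
    return "CREATE USER {};".format(quote_ident(role))


def create_database_sql(database, owner):
    """Build a CREATE DATABASE statement with the given owner."""
    return "CREATE DATABASE {} OWNER {};".format(
        quote_ident(database), quote_ident(owner)
    )
```

The resulting strings are what a client would send to the server after connecting to the ``postgres`` database.
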
528 | I hinted above that it is better to use ``psql postgres`` than ``psql
529 | template1`` (though most people use the latter). The reason is that
530 | sometimes you may accidentally create tables while being connected to
531 | the wrong database. I have screwed up my ``template1`` database more
532 | than once this way. You don't want to accidentally modify your
533 | ``template1`` database, but it's not a big deal if you modify your
534 | ``postgres`` database. So use that one instead when you want to connect
535 | with ``psql``. The only reason I have so far told you to use the suboptimal
536 | ``psql template1`` is that I thought you would be confused by the many
537 | instances of "postgres" (there's an operating system user, a PostgreSQL
538 | user, and a database named thus).
539 |
540 | Now let's finally explain what a cluster is. Let's see it with an
541 | example. Remember that nginx reads ``/etc/nginx/nginx.conf`` and listens
542 | on port 80? Well, it's entirely possible to start another instance of
543 | nginx on the same server, that reads ``/home/antonis/nginx.conf`` and
544 | listens on another port. That other instance will have different lock
545 | files, different log files, different configuration files, and can have
546 | different directory roots, so it can be totally independent. It's very
547 | rarely needed, but it can be done (I've done it once to debug a
548 | problem on a production server that I couldn't reproduce in development).
549 | Likewise, you can start a second instance of PostgreSQL, that uses
550 | different configuration files and a different data file directory, and
551 | listens on a different port (and different Unix domain socket). Since it
552 | is totally independent of the other instance, it also has its own users
553 | and its own databases, and is served by different server processes.
554 | These server processes could even be run by different operating system
555 | users (but in practice we use the same user, ``postgres``, for all of
556 | them). Each such instance of PostgreSQL is called a cluster. The vast
557 | majority of installations have a single cluster called "main", so you
558 | needn't worry further about it; just be aware that this is why the
559 | configuration files are in ``/etc/postgresql/9.x/main``, why the data
560 | files are in ``/var/lib/postgresql/9.x/main``, and why the log files are
561 | named ``/var/log/postgresql/postgresql-9.x-main.log``. If you ever
562 | create a second cluster on the same machine, you will be doing something
563 | advanced, like setting up certain kinds of replication. If you are doing
564 | such an advanced thing now, you are probably reading the wrong book.
565 |
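The naming pattern of these paths can be summarized in a short Python sketch. ``cluster_paths`` is a hypothetical helper of mine; it only formats the Debian/Ubuntu paths mentioned above and does not inspect the system.

```python
def cluster_paths(version, cluster="main"):
    """Return where Debian/Ubuntu keeps the files of a PostgreSQL cluster."""
    return {
        "config_dir": "/etc/postgresql/{}/{}".format(version, cluster),
        "data_dir": "/var/lib/postgresql/{}/{}".format(version, cluster),
        "log_file": "/var/log/postgresql/postgresql-{}-{}.log".format(
            version, cluster
        ),
    }
```

For example, ``cluster_paths("9.5")["config_dir"]`` gives ``/etc/postgresql/9.5/main``.
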
566 | Further reading
567 | ---------------
568 |
569 | You may have noticed that I close most chapters with a summary, which,
570 | among other things, repeats most of the code and configuration snippets
571 | of the chapter. In this chapter I have no summary to write, because I
572 | have already written it; it's Section `Getting started with
573 | PostgreSQL`_. In the rest of the chapter I merely explained it.
574 |
575 | I explain in the next chapter, but it is so important that I must repeat
576 | it here, that **you should not back up your PostgreSQL database by
577 | copying its data files from /var/lib/postgresql**. If you do such a
578 | thing, you risk being unable to restore it when you need it. Read the
579 | next chapter for more information.
580 |
581 | I hope I wrote enough to get you started. You should be able to use
582 | PostgreSQL in production now, and learn a little more as you go on.
583 | Its great documentation is the natural place to continue. If you ever do
584 | anything advanced, Gregory Smith's *PostgreSQL High Performance* is a nice
585 | book.
586 |
--------------------------------------------------------------------------------
/10-recovery2.rst:
--------------------------------------------------------------------------------
1 | Recovery part 2
2 | ===============
3 |
4 | Restoring a file or directory
5 | -----------------------------
6 |
7 | You made some changes to ``/etc/opt/$DJANGO_PROJECT/settings.py``,
8 | changed your mind, and you want it back? No problem:
9 |
10 | .. code-block:: bash
11 |
12 | duply main fetch etc/opt/$DJANGO_PROJECT/settings.py \
13 | /tmp/restored_settings.py
14 |
15 | This will fetch the most recent version of the file from backup and will
16 | put it in ``/tmp/restored_settings.py``. Note that when you specify the
17 | source file there is no leading slash.
18 |
19 | You can also fetch previous versions of the file:
20 |
21 | .. code-block:: bash
22 |
23 | # Fetch it as it was 4 days ago
24 | duply main fetch etc/opt/$DJANGO_PROJECT/settings.py \
25 | /tmp/restored_settings.py 4D
26 |
27 | # Fetch it as it was on 4 January 2017
28 | duply main fetch etc/opt/$DJANGO_PROJECT/settings.py \
29 | /tmp/restored_settings.py 2017-01-04
30 |
31 | Here is how to restore the entire backup into ``/tmp/restored_files``:
32 |
33 | .. code-block:: bash
34 |
35 | duply main restore /tmp/restored_files
36 |
37 | As before, you can append age specifiers such as ``4D`` or
38 | ``2017-01-04`` to the command. Note that restoring a large backup can
39 | incur charges by your backup storage provider.
40 |
41 | You should probably never restore files directly to their original
42 | location. Instead, restore into ``/tmp`` or ``/var/tmp`` and move
43 | or copy them.
44 |
45 | .. _restoring_sqlite:
46 |
47 | Restoring SQLite
48 | ----------------
49 |
50 | Restoring SQLite is very simple. Assuming the dump file is in
51 | ``/tmp/restored_files/var/backups/sqlite-$DJANGO_PROJECT.dump``, you
52 | should be able to recreate your database file thus:
53 |
54 | .. code-block:: bash
55 |
56 | sqlite3 /tmp/$DJANGO_PROJECT.db \
57 | /dev/null
85 |
86 | ``psql`` shows a lot of output, which we don't need. We redirect the
87 | output to ``/dev/null``, which in Unix-like systems is a black hole; it
88 | is a device file that merely discards everything written to it. We
89 | discard only the standard output, not the standard error, because we
90 | want to see error messages. If everything goes well, it should show only
91 | one error message:
92 |
93 | ERROR: role "postgres" already exists
94 |
95 | The file written to by ``pg_dumpall`` contains SQL commands that can be
96 | used to recreate all databases. At the beginning of the file there are
97 | commands that first create the users. One of these users is
98 | ``postgres``, but this already exists in your new cluster, hence the
99 | error message. (The dump file also includes commands to create the
100 | databases, but ``pg_dumpall`` is smart enough to not include database
101 | creation commands for template0, template1, and postgres.)
102 |
103 | .. hint:: Playing with redirections
104 |
105 | You might want to redirect the standard error as well as the standard
106 | output. You can do it like this:
107 |
108 | .. code-block:: bash
109 |
110 | su postgres -c 'psql -f postgresql.dump postgres' \
111 | >/tmp/psql.out 2>/tmp/psql.err
112 |
113 | This actually means "redirect file descriptor 1 to /tmp/psql.out and
114 | file descriptor 2 to /tmp/psql.err". Instead of ``>file`` you can
115 | write ``1>file``, but 1 is the default and is almost always omitted.
116 | File descriptor 1 is always standard output, and 2 is
117 | always standard error. There are several use cases for redirecting
118 | the standard error, and one of them is if you want to keep a record
119 | of the error messages so that you can examine them later.
120 |
121 | One problem is that ``psql`` actually throws error messages
122 | interspersed with standard output messages, and if you separate
123 | output from error you might not know at which stage the error
124 | occurred. If you want to log the error messages in the same file and
125 | in the correct position in relation to the output messages, you can
126 | do this:
127 |
128 | .. code-block:: bash
129 |
130 | su postgres -c 'psql -f postgresql.dump postgres' \
131 | >/tmp/psql.out 2>&1
132 |
133 | The ``2>&1`` means "redirect the standard error to the same place
134 | where you're putting the standard output".
135 |
136 | However, this will not always work as you expect because the standard
137 | output is buffered whereas the standard error is unbuffered; so
138 | sometimes error messages can appear in the file **before** output
139 | that was supposed to be printed before the error.
140 |
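The same redirection exists in most programming languages. As a rough illustration, here is how the ``2>&1`` variant could look in Python with ``subprocess``; the child program and the function name are invented for the example.

```python
import subprocess
import sys
import tempfile

# A tiny child program that writes one line to each stream.
CHILD = (
    "import sys\n"
    "print('to stdout')\n"
    "print('to stderr', file=sys.stderr)\n"
)


def run_merged(command):
    """Run command with stderr sent to the same file as stdout (2>&1)."""
    with tempfile.TemporaryFile() as out:
        subprocess.run(command, stdout=out, stderr=subprocess.STDOUT)
        out.seek(0)
        return out.read().decode()


if __name__ == "__main__":
    print(run_merged([sys.executable, "-c", CHILD]))
```

Both lines end up in the same file, but, because of the buffering issue described in the hint, not necessarily in the order in which the child program printed them.
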
141 | If something goes wrong and you want to start over, here is how, but
142 | **be careful not to type these in the wrong window** (you could delete a
143 | production cluster in another server):
144 |
145 | .. code-block:: bash
146 |
147 | service postgresql stop
148 | pg_dropcluster 9.5 main
149 | pg_createcluster 9.5 main
150 | service postgresql start
151 |
152 | The second command will remove the "main" cluster of PostgreSQL version
153 | 9.5 (replace that with your actual PostgreSQL version). The third
154 | command will initialize a brand new cluster.
155 |
156 | .. _restoring_an_entire_system:
157 |
158 | Restoring an entire system
159 | --------------------------
160 |
161 | A few sections ago we saw how to restore all backed up files in a
162 | temporary directory such as ``/tmp/restored_files``. If your server (the
163 | "backed up server") has exploded, you might be tempted to set up a new
164 | server (the "restored server") and then just restore the entire backup
165 | directly in the root directory instead of a temporary directory. This
166 | won't work correctly, however. For example, if you restore all of
167 | ``/var/lib``, you will overwrite ``/var/lib/apt`` and ``/var/lib/dpkg``,
168 | where the system keeps track of what packages it has installed, so it
169 | will think it has installed all the packages that had been installed in
170 | the backed up server, and the system will essentially be broken. Or if
171 | you restore ``/etc/network`` you might overwrite the restored system's
172 | network configuration with the network configuration of the backed up
173 | server. So you can't do this; you need to restore the backup in
174 | ``/tmp/restored_files`` and then selectively move or copy stuff from
175 | there to its normal place.
176 |
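The idea of "selectively copy" can be sketched in Python as a whitelist of directories. This is only an illustration under my own assumptions: ``restore_selected`` and ``SAFE_TO_RESTORE`` are made-up names, the list is not complete, and unlike ``cp -a`` this sketch does not preserve file ownership, so in the actual plan below I use ``cp -a``.

```python
import shutil
from pathlib import Path

# Directories that are safe to copy back wholesale (illustrative list).
SAFE_TO_RESTORE = ("opt", "etc/opt", "var/opt")


def restore_selected(restored_root, target_root, relative_dirs=SAFE_TO_RESTORE):
    """Copy only the whitelisted directories from the restored tree."""
    restored_root = Path(restored_root)
    target_root = Path(target_root)
    copied = []
    for rel in relative_dirs:
        source = restored_root / rel
        if not source.is_dir():
            continue  # this directory was not in the backup
        # Merge into the target; anything not whitelisted is never touched.
        shutil.copytree(
            source, target_root / rel, symlinks=True, dirs_exist_ok=True
        )
        copied.append(rel)
    return copied
```

Something like ``restore_selected("/tmp/restored_files", "/")`` would then copy the three directories but leave ``/var/lib`` and ``/etc/network`` alone.
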
177 | Below I present a complete recovery plan that you can use whenever your
178 | system needs recovery. It should be applicable in its entirety only when
179 | you need a complete recovery; however, if you need a partial recovery
180 | you can still follow it and omit some parts as you go. **I assume the
181 | backed up system only had Django apps deployed in the way I have
182 | described in the rest of this book.** If you have something else
183 | installed, or if you have deployed in a different way (e.g. in different
184 | directories), you **must** replace the plan with one of your own.
185 |
186 | You must also make sure that you have access to the recovery plan even
187 | if the server goes down; that is, don't store the recovery plan on a
188 | server that is among those that may need to be recovered.
189 |
190 | .. hint:: The rm command
191 |
192 | In various places in the following recovery plan, I tell you to use
193 | the ``rm`` command, which is the Unix command that removes files.
194 | With the ``-r`` option it recursively removes directories, and ``-f``
195 | means "ask no questions". The following will delete the nginx
196 | configuration, asking no questions:
197 |
198 | .. code-block:: bash
199 |
200 | rm -rf /etc/nginx
201 |
202 | ``rm`` accepts many arguments, so ``rm -rf /etc/nginx /etc/apache2``
203 | will delete both directories. Accidentally inserting a space, as in
204 | ``rm -rf / etc/nginx``, could delete almost your entire system.
205 |
208 | 1. Notify management, or the customer, or whoever is affected and needs
209 | to be informed.
210 |
211 | 2. Take notes. In particular, mark on this recovery plan anything that
212 | needs improvement.
213 |
214 | 3. Create a new server and add your ssh key.
215 |
216 | 4. Change the DNS so that $DOMAIN, www.$DOMAIN, and any other needed
217 | name points to the IP address of the new server (see
218 | :ref:`adding_dns_records`).
219 |
220 | 5. Create a user and group for your Django project (see
221 | :ref:`creating_user`).
222 |
223 | 6. Install packages:
224 |
225 | .. code-block:: bash
226 |
227 | apt install python python3 \
228 | python-virtualenv python3-virtualenv \
229 | postgresql python-psycopg2 python3-psycopg2 \
230 | sqlite3 dma nginx-light duply
231 |
232 | (Ignore questions on how to set up dma; we will restore its
233 | configuration from the backup later.)
234 |
235 | If you use Apache, install ``apache2`` instead of ``nginx-light``.
236 | The actual list of packages you need might be different (but you
237 | can also find this out while restoring).
238 |
239 | 7. Check duplicity version with ``duplicity --version``; if earlier
240 | than 0.7.6 and your backups are in Backblaze B2, install a more
241 | recent version of duplicity as explained in
242 | :ref:`Installing duplicity in Debian
243 | `.
244 |
245 | 8. Create the duply configuration directory and file as explained in
246 | :ref:`setting_up_duplicity_and_duply` (you don't need to create any
247 | files beside ``conf``, you don't need ``exclude`` or ``pre``).
248 |
249 | 9. Restore the backup in ``/var/tmp/restored_files``:
250 |
251 | .. code-block:: bash
252 |
253 | duply main restore /var/tmp/restored_files
254 |
255 | 10. Restore the ``/opt``, ``/var/opt`` and ``/etc/opt`` directories:
256 |
257 | .. code-block:: bash
258 |
259 | cd /var/tmp/restored_files
260 | cp -a var/opt/* /var/opt/
261 | cp -a etc/opt/* /etc/opt/
262 | cp -a opt/* /opt/
263 |
264 | (If you have excluded ``/opt`` from backup, clone/copy your Django
265 | project in ``/opt`` and create the virtualenv as described in
266 | :ref:`the_program_files`.)
267 |
268 | 11. Create the log directory as explained in :ref:`the_log_directory`.
269 |
270 | 12. Restore your nginx configuration:
271 |
272 | .. code-block:: bash
273 |
274 | service nginx stop
275 | rm -r /etc/nginx
276 | cp -a /var/tmp/restored_files/etc/nginx /etc
277 | service nginx start
278 |
279 | If you use Apache, restore your Apache configuration instead:
280 |
281 | .. code-block:: bash
282 |
283 | service apache2 stop
284 | rm -r /etc/apache2
285 | cp -a /var/tmp/restored_files/etc/apache2 /etc/
286 | service apache2 start
287 |
288 | 13. Create your static files directory and run ``collectstatic`` as
289 | explained in :ref:`Static and media files `.
290 |
291 | 14. Restore the systemd service file for your Django project and enable
292 | the service:
293 |
294 | .. code-block:: bash
295 |
296 | cd /var/tmp/restored_files
297 | cp etc/systemd/system/$DJANGO_PROJECT.service \
298 | /etc/systemd/system/
299 | systemctl enable $DJANGO_PROJECT
300 |
301 | 15. Restore the configuration for the DragonFly Mail Agent:
302 |
303 | .. code-block:: bash
304 |
305 | rm -r /etc/dma
306 | cp -a /var/tmp/restored_files/etc/dma /etc/
307 |
308 | 16. Create the cache directory as described in :ref:`caching`.
309 |
310 | 17. Restore the databases as explained in :ref:`restoring_sqlite` and
311 | :ref:`restoring_postgresql`.
312 |
313 | 18. Restore the duply configuration:
314 |
315 | .. code-block:: bash
316 |
317 | rm -r /etc/duply
318 | cp -a /var/tmp/restored_files/etc/duply /etc/
319 |
320 | 19. Restore the ``duply`` cron job:
321 |
322 | .. code-block:: bash
323 |
324 | cp /var/tmp/restored_files/etc/cron.daily/duply /etc/cron.daily/
325 |
326 | (You may want to list ``/var/tmp/restored_files/etc/cron.daily`` and
327 | ``/etc/cron.daily`` to see if there is any other cronjob that needs
328 | restoring.)
329 |
330 | 20. Start the Django project and verify it works:
331 |
332 | .. code-block:: bash
333 |
334 | service $DJANGO_PROJECT start
335 |
336 | 21. Restart the system and verify it works:
337 |
338 | .. code-block:: bash
339 |
340 | shutdown -r now
341 |
342 | The system might work perfectly without restart; the reason we restart
343 | it is to verify that if the server restarts, all services will start up
344 | properly.
345 |
346 | After you've finished, update your recovery plan with the notes you
347 | took.
348 |
349 | Recovery testing
350 | ----------------
351 |
352 | In the previous chapter I said several times that you must test your
353 | recovery. Your recovery testing plan depends on the extent to which
354 | downtime is an issue.
355 |
356 | If downtime is not an issue, that is, you can find a date and time in
357 | which the system is not being used, the simplest way to test the
358 | recovery is to shut down the server, pretend it has been entirely
359 | deleted, and follow the recovery plan in the previous section to bring
360 | the system up on a new server. Keep the old server off for a week or a
361 | month or until you feel confident it really has no useful information,
362 | then delete it.
363 |
364 | If you can't have much downtime, maybe there are times when the system
365 | is not being written to. Many web apps are like this; you want them to
366 | always be readable by the visitors, but maybe they are not being updated
367 | off hours. In that case, notify management or the customer about what
368 | you are going to do, pick an appropriate time, and test the recovery
369 | with the following procedure:
370 |
371 | 1. In the DNS, verify that the TTL of $DOMAIN, www.$DOMAIN, and any
372 | other necessary record is no more than 300 seconds or 5 minutes (see
373 | :ref:`adding_dns_records`).
374 |
375 | 2. Follow the recovery plan of the previous section to bring up the
376 | system on a new server, **but omit the step about changing the
377 | DNS**. (Hint: you can :ref:`edit your own hosts file
378 | ` while checking if the new system works.)
379 |
380 | 3. After the system works and you've fixed all problems, change the DNS
381 | so that $DOMAIN, www.$DOMAIN, and any other needed name points to
382 | the IP address of the new server (see :ref:`adding_dns_records`).
383 |
384 | 4. Wait for five minutes, then shut down the old server.
385 |
386 | You could have zero downtime by only following the first two steps
387 | instead of all four, and after you are satisfied discard the *new*
388 | server instead of the old one. However, you can't really be certain you
389 | haven't left something out if you don't use the new server
390 | operationally. So while following half the testing plan can be a good
391 | idea as a preliminary test in order to get an idea of how much time will
392 | be needed by the actual test, staying there and not doing the actual
393 | test is a bad idea.
394 |
395 | If you think you can't afford any downtime at all, you are doing
396 | something wrong. You *will* have downtime when you accidentally delete a
397 | database, when there is a hardware or network error, and in many other
398 | cases. Pretending you won't is a bad idea. If you really can't afford
399 | downtime, you should set up high availability (which is a lot of work and
400 | could fill several books by itself). If you don't, it means that the
401 | business *can* afford a little downtime once in a while, so having a
402 | little scheduled downtime once a year shouldn't be a big deal.
403 |
404 | In fact, I think that, in theory at least, recovery should be tested
405 | during business hours, possibly without notifying the business in
406 | advance (except to get permission to do it, but not to arrange a
407 | specific time). Recovery isn't merely a system administrator's issue,
408 | and an additional recovery plan for management might need to be
409 | created, that describes how the business will handle the situation (what
410 | to tell the customers, what the employees should do, and so on).
411 | Recovery with downtime during business hours can be a good exercise for
412 | the whole business, not just for the administrator.
413 |
414 | Copying offline
415 | ---------------
416 |
417 | Briefly, here is how to copy the server's data to your local machine:
418 |
419 | .. code-block:: bash
420 |
421 | awk '{ print $2 }' /etc/duply/main/exclude >/tmp/exclude
422 | tar czf - --exclude-from=/tmp/exclude / | \
423 | split --bytes=200M - \
424 | /tmp/`hostname`-`date --iso-8601`.tar.gz.
425 |
426 | This will need some explanation, of course, but it will create one or more
427 | files with filenames similar to the following:
428 |
429 | | ``/tmp/myserver-2017-01-22.tar.gz.aa``
430 | | ``/tmp/myserver-2017-01-22.tar.gz.ab``
431 | | ``/tmp/myserver-2017-01-22.tar.gz.ac``
432 |
433 | We will talk about downloading them later on. Now let's examine what we
434 | did. We will check the last command (i.e. the ``tar`` and ``split``)
435 | first.
436 |
437 | We've seen the ``tar`` command earlier, in :ref:`Installing duplicity in
438 | Debian `. The "c" in "czf" means we will
439 | create an archive; the "z" means the archive will be compressed; the "f"
440 | followed by a file name specifies the name of the archive; "f" followed
441 | by a hyphen means the archive will be created in the standard output.
442 | The last argument to the ``tar`` command specifies which directory
443 | should be put in the archive; in our case it's a mere slash, which means
444 | the root directory. The ``--exclude-from=/tmp/exclude`` option means
445 | that files and directories specified in the ``/tmp/exclude`` file should
446 | not be included in the archive.
447 |
448 | This would create an archive with all the files we need, but it might be
449 | too large. If your external disk is formatted in FAT32, it might not be
450 | able to hold files larger than 2 GB. So we take the data thrown at the
451 | standard output and we split it in manageable chunks of 200 MB each.
452 | This is what the ``split`` command does. The hyphen in ``split`` means
453 | "split the standard input". The last argument to ``split`` is the file
454 | prefix; the files ``split`` creates are named ``PREFIXaa``,
455 | ``PREFIXab``, and so on.
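
If you want to see ``split`` in action without touching any real data,
here is a small sketch you can run in a scratch directory. The file
names ``original.dat`` and ``rejoined.dat`` and the ``demo.tar.gz.``
prefix are made up for the demonstration, and it uses 1000-byte chunks
instead of 200 MB ones:

.. code-block:: bash

    # Work in a throwaway directory so nothing real is touched.
    workdir=$(mktemp -d)
    cd "$workdir"

    # Create 2500 bytes of sample data to stand in for the tar stream.
    head -c 2500 /dev/zero | tr '\0' 'x' > original.dat

    # Read standard input ("-") and write chunks of at most 1000 bytes,
    # named demo.tar.gz.aa, demo.tar.gz.ab, and so on.
    split --bytes=1000 - demo.tar.gz. < original.dat
    ls -l demo.tar.gz.*

    # The shell expands the glob in alphabetical order, so concatenating
    # the chunks reproduces the original data byte for byte.
    cat demo.tar.gz.* > rejoined.dat
    cmp original.dat rejoined.dat && echo "chunks rejoin cleanly"

Since the sample data is 2500 bytes, ``split`` produces three chunks,
the last one shorter than the others.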
456 |
457 | The backticks in the specified prefix are a neat shell trick: the shell
458 | executes the command within the backticks, takes the command's standard
459 | output, and inserts it in the command line. So the shell will first
460 | execute ``hostname`` and ``date --iso-8601``, it will then create the
461 | command line for ``split`` that contains among other things the output
462 | of these commands, and then it will execute ``split`` giving it the
463 | calculated command line. We have chosen a prefix that ends in
464 | ``.tar.gz``, because that is what compressed tar files end in. If you
465 | concatenate these files into a single file ending in ``.tar.gz``, that
466 | will be the compressed tar file. We will see how to concatenate them two
467 | sections ahead.
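
Backticks also have a newer equivalent, ``$(...)``, which does the same
thing and is easier to read when nested. A minimal sketch, where the
variable names ``prefix_old`` and ``prefix_new`` exist only for
illustration:

.. code-block:: bash

    # Both forms run the command and substitute its standard output
    # into the command line before the line itself is executed.
    prefix_old=/tmp/`hostname`-`date --iso-8601`.tar.gz.
    prefix_new="/tmp/$(hostname)-$(date --iso-8601).tar.gz."

    # Print the result; this is what split would receive as its prefix.
    echo "$prefix_new"

Either form can be used in the ``split`` command; this chapter sticks to
backticks.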
468 |
469 | Finally, let's explain the first command, which creates
470 | ``/tmp/exclude``. We want to exclude the same directories as those
471 | specified in ``/etc/duply/main/exclude``. However, the syntax used by
472 | duplicity is different from the syntax used by ``tar``. Duplicity needs
473 | the pathnames to be preceded by a minus sign and a space, whereas
474 | ``tar`` just wants them listed. So the first command merely strips the
475 | minus sign. ``awk`` is actually a whole programming language, but you
476 | don't need to learn it (I don't know it either). The ``{ print $2 }``
477 | means "print the second item of each line". While ``awk`` is the
478 | canonical way of doing this in Unix-like systems, you could do it with
479 | Python if you prefer, but it's much harder:
480 |
481 | .. code-block:: bash
482 |
483 | python -c "import sys;\
484 | print('\n'.join([x.split()[1] for x in sys.stdin]))" \
485 | </etc/duply/main/exclude >/tmp/exclude
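
To see what the conversion does, you can try it on a small sample file
first. The sketch below assumes an exclude file with three made-up
entries in the format described above; your real
``/etc/duply/main/exclude`` will contain different paths:

.. code-block:: bash

    # Hypothetical sample in duplicity's format: minus sign, space, path.
    sample=$(mktemp)
    printf -- '- /dev\n- /proc\n- /sys\n' > "$sample"

    # awk splits each line on whitespace; $2 is the second item,
    # which is the path without the leading minus sign.
    converted=$(awk '{ print $2 }' "$sample")
    echo "$converted"

The result is one path per line, which is the form that ``tar
--exclude-from`` expects.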
486 |
487 | Now let's **download the archive**. That's easy using ``scp`` (on
488 | Unix-like systems) or ``pscp`` (on Windows). Assuming the external disk
489 | is plugged in and available as $EXTERNAL_DISK (i.e. something like
490 | ``/media/user/DISK`` on GNU/Linux, and something like ``E:\`` on
491 | Windows), you can put it directly in there like this:
492 |
493 | .. code-block:: bash
494 |
495 | scp root@$SERVER_IP_ADDRESS:/tmp/*.tar.gz.* $EXTERNAL_DISK
496 |
497 | In Windows, use ``pscp`` instead of ``scp``. You can also use graphical
498 | tools, but command-line tools are often more convenient.
499 |
500 | In Unix-like systems, a better command is ``rsync``:
501 |
502 | .. code-block:: bash
503 |
504 | rsync root@$SERVER_IP_ADDRESS:/tmp/*.tar.gz.* $EXTERNAL_DISK
505 |
506 | If for some reason the transfer is interrupted and you restart it,
507 | ``rsync`` will only transfer the parts of the files that have not yet
508 | been transferred. ``rsync`` must be installed both on the server and
509 | locally for this to work. You may have success with Windows rsync
510 | programs such as DeltaCopy.
511 |
512 | One problem with the above scheme is that we temporarily store the split
513 | tar file on the server, and the server might not have enough disk space
514 | for that. In that case, if you run a Unix-like system locally, this
515 | might work:
516 |
517 | .. code-block:: bash
518 |
519 | ssh root@$SERVER_IP_ADDRESS \
520 | "awk '{ print \$2 }' /etc/duply/main/exclude \
521 | >/tmp/exclude; \
522 | tar czf - --exclude-from=/tmp/exclude /" | \
523 | split --bytes=200M - \
524 | $EXTERNAL_DISK/$SERVER_NAME-`date --iso-8601`.tar.gz.
525 |
526 | The ``ssh`` command will login to the remote server and execute the
527 | commands ``awk`` and ``tar``, and it will capture their standard output
528 | (i.e. ``tar``'s standard output, because ``awk``'s is redirected) and it
529 | will throw it in its own standard output.
530 |
531 | The trickiest part of this ``ssh`` command is that, in the ``awk``, we
532 | have escaped the dollar sign with a backslash. ``awk`` is a programming
533 | language, and ``{ print $2 }`` is an ``awk`` program. ``awk`` must
534 | literally receive the string ``{ print $2 }`` as its program. When we
535 | give a local shell the command ``awk '{ print $2 }'``, the shell leaves
536 | the ``{ print $2 }`` as it is, because it is enclosed in single quotes.
537 | If, instead, we used double quotes, we would use ``awk "{ print \$2
538 | }"``, otherwise, if we simply used ``$2``, the shell would try to expand
539 | it to whatever ``$2`` means (see :ref:`Bash syntax `).
540 | Now the string given to ``ssh`` is a double-quoted string. The *local*
541 | shell gets that string and performs expansions and runs ``ssh`` after it
542 | has done these expansions; and ``ssh`` gets the resulting string,
543 | executes a shell remotely, and gives it that string. You can understand
544 | the rest of the story with a bit of thinking.
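
If you'd rather experiment than think, the following sketch reproduces
the double expansion locally. It is only an approximation: ``sh -c``
stands in for ``ssh``, because it also hands its string argument to a
second shell, and the sample file with its two entries is made up:

.. code-block:: bash

    # Hypothetical sample file in the same format as the duply exclude file.
    sample=$(mktemp)
    printf -- '- /proc\n- /sys\n' > "$sample"

    # Escaped: the first shell turns \$2 into $2 (and expands $sample),
    # so awk receives the program { print $2 } and prints the paths.
    escaped=$(sh -c "awk '{ print \$2 }' $sample")

    # Unescaped: the first shell expands $2, which is normally empty,
    # before the second shell starts; awk receives { print } and
    # echoes whole lines, minus signs included.
    unescaped=$(sh -c "awk '{ print $2 }' $sample")

    printf 'escaped:\n%s\nunescaped:\n%s\n' "$escaped" "$unescaped"

The single quotes don't protect ``$2`` here, because they are ordinary
characters inside the double-quoted string as far as the first shell is
concerned.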
545 |
546 | If you aren't running a Unix-like system locally, something else you can
547 | do is use another Debian/Ubuntu server that you have on the network and
548 | does have the disk space. You can also temporarily create one at Digital
549 | Ocean just for the job. After running the above command to create the
550 | backup and store it in the temporary server, you can then copy it to
551 | your local machine and external disk.
552 |
553 | You may have noticed we did not back up the databases. I assume that your
554 | normal backup script does this every day, and it stores the saved
555 | databases in ``/var/backups``. You need to be careful, however, to not
556 | run the ``tar`` command at the same time cron and duply run
557 | ``/etc/duply/main/pre``, otherwise you might be copying them at exactly
558 | the time they are being overwritten.
559 |
560 | Storing and rotating external disks
561 | -----------------------------------
562 |
563 | In the previous chapter I told you you need two external disks. Store
564 | one of them at your office and the other elsewhere—at your home, at your
565 | boss's home, at a bank vault, at a backup storage company, or at your
566 | customer's office or home (however don't give your customer a disk that
567 | also contains data of other customers of yours). Whatever place you
568 | choose, I will be calling it "off site". So you will be keeping one disk
569 | off site and one on site. Whenever you want to perform an offline backup
570 | (say once per month), connect the disk you have on site, delete all the
571 | files it contains, and perform the procedure described in the previous
572 | section to back up your servers on it. After that, physically label it
573 | with the date (overwriting or removing the previous label), and move it
574 | off site. Bring the other disk on site and let it sit there until the
575 | next offline backup.
576 |
577 | Why do we use two disks instead of just one? Well, it's quite
578 | conceivable that your online data (and online backup) will be severely
579 | damaged, and you can perform an offline backup, wiping out the previous
580 | one, before realizing the server's severely damaged. In that case, your
581 | offline disk will contain damaged data. Or the attacker might wait for
582 | you to plug in the backup disk, and then wipe it out and proceed to wipe
583 | out the online backup and your servers.
584 |
585 | You might object that there is a weakness to this plan because the two
586 | disks are at the same location, off site, when you take there the
587 | recently used disk and exchange it with the older one. I wouldn't worry
588 | too much about this. Offline backups are extra backups anyway, and you
589 | hope to never need to use them. While it's possible that someone can get
590 | access to all your passwords and delete all your online servers and
591 | backups, the probability of this happening at the same time as the
592 | physical destruction of your two offline disks at the limited time they
593 | are both off site is so low that you should probably worry more about
594 | your plane crashing.
595 |
596 | With this scheme, you might lose up to one month of data. Normally this
597 | is too much, but maybe for the extreme case we are talking about it's
598 | OK. Only you can judge that. If you think it's unacceptable, you might
599 | perform offline backups more often. If you do them more often than once
600 | every two weeks, it would be better to use more external disks.
601 |
602 | Recovering from offline backups
603 | -------------------------------
604 |
605 | You will probably never need to recover from offline backups, so we
606 | won't go into much detail. If a disaster happens and you need to restore
607 | from offline, the most important thing you need to care about is the
608 | safety of your external disk. Make **absolutely certain** you will only
609 | plug it into a safe computer, one that is certainly not compromised by any
610 | attacker. Do this very slowly and think about every step. After plugging
611 | the external disk in, copy its files to the computer's disk, then unplug
612 | the external disk immediately and keep it safe.
613 |
614 | Recovery is the same as what's described in
615 | :ref:`restoring_an_entire_system`, except for the steps that use duply
616 | and duplicity to restore the backup in ``/var/tmp/restored_files``.
617 | Instead, copy the ``.tar.gz.XX`` files to the server's ``/var/tmp``
618 | directory; use ``scp`` or ``pscp`` or ``rsync`` for that (``rsync`` is
619 | the best if you have it). When you have them all, join them in one
620 | piece with the concatenation command, ``cat``, then untar them:
621 |
622 | .. code-block:: bash
623 |
624 | cd /var/tmp
625 | cat *.tar.gz.* >backup.tar.gz
626 | mkdir restored_files
627 | cd restored_files
628 | tar xf ../backup.tar.gz
629 |
630 | If you are low on disk space, you might join the concatenation command
631 | with the tar command, like this:
632 |
633 | .. code-block:: bash
634 |
635 | cd /var/tmp
636 | mkdir restored_files
637 | cd restored_files
638 | cat ../*.tar.gz.* | tar xzf -
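
Before extracting on a real server, you may want to check that all the
parts are there and readable. The following sketch uses a made-up
``source`` directory and tiny 100-byte parts; it builds a split archive
the same way as the backup does, lists it with ``tar tzf``, and only
then extracts it:

.. code-block:: bash

    # Work in a throwaway directory with a tiny hypothetical file tree.
    workdir=$(mktemp -d)
    cd "$workdir"
    mkdir -p source/etc restored_files
    echo "hello" > source/etc/example.conf

    # Same pipeline shape as the backup: compressed tar stream, split in parts.
    tar czf - -C source . | split --bytes=100 - backup.tar.gz.

    # List the contents first; if the joined stream is corrupt or cut
    # short, this fails before anything is written to disk.
    cat backup.tar.gz.* | tar tzf - > /dev/null && echo "archive reads cleanly"

    # Extract into restored_files. The "z" is needed because tar cannot
    # detect the compression by itself when it reads from a pipe.
    cat backup.tar.gz.* | tar xzf - -C restored_files

On the real server the parts would be the ``.tar.gz.XX`` files you
copied to ``/var/tmp``.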
639 |
640 | Scheduling manual operations
641 | ----------------------------
642 |
643 | In the previous chapter, I described stuff that you will eventually
644 | set up in such a way that it runs alone. Your servers will be backing up
645 | themselves without your knowing anything about it. In contrast, all the
646 | procedures I described in this chapter are to be manually executed by a
647 | human:
648 |
649 | * Restoring part of a system or the whole system
650 | * Recovery testing
651 | * Copying offline
652 | * Recovering from offline backups
653 |
654 | Some of these procedures will be triggered by an event, such as losing
655 | data. Recovery testing, however, and copying offline, will not be
656 | triggered; *you* must take care that they occur. This can be as simple
657 | as adding a few recurring entries to your calendar, or as hard as
658 | inventing foolproof procedures to be added to the company's operations
659 | manual. Whatever you do, you must make sure it works. **If you don't
660 | test recovery, it is almost certain it will take too long when you need
661 | it, and it is quite likely you will be unable to recover at all.**
662 |
663 | Chapter summary
664 | ---------------
665 |
666 | * Use the provided recovery plan or devise your own.
667 | * Make sure you will have access to the recovery plan (and all required
668 | information such as logins and passwords) even if your server stops
669 | existing.
670 | * Test your recovery plan once a year or so.
671 | * Back up online as well as to offline disks, and store the disks safely.
672 | * Don't back up to offline disks at the same time as the system is
673 | performing its online backup.
674 | * Create an offline backup schedule and a recovery testing schedule and
675 | make sure they are being followed.
676 |
--------------------------------------------------------------------------------
/CHANGES.txt:
--------------------------------------------------------------------------------
1 | 2.1 (2018-08-13)
2 | ----------------
3 |
4 | - License has been changed to CC-by-nc-sa
5 | - The book used to have some prerequisite knowledge and, if the reader
6 | didn't have it, it recommended to take the free "Linux servers 101"
7 | course. This is no longer the case. Instead, the content of the course
8 | has been incorporated in the first chapter of the book.
9 |
10 | 2.0 (2017-05-05)
11 | ----------------
12 |
13 | - New cover.
14 | - Chapters and sections are now numbered in epub.
15 | - Figures are now numbered.
16 | - Improved static files figures.
17 | - Added two sections explaining the "top" command, and while at it,
18 | explained a great deal about memory and process management.
19 | - Added section "Clearing sessions" to the settings chapter.
20 | - The scheme with settings/base.py and other files that import from each
21 | other is no longer advocated. Instead, the default Django layout is
22 | assumed.
23 | - Gevent is no longer proposed.
24 | - Pointed out the risk of a nightmarish migration later if using SQLite
25 | now.
26 | - Fixed a few typos, trivial errors, and made some language
27 | improvements.
28 |
29 | 1.0 (2017-01-26)
30 | ----------------
31 |
32 | - Added two chapters: Recovery part 1 and 2.
33 | - Improved the formatting of admonitions both in PDF and EPUB.
34 | - Fixed some typos and formatting errors.
35 | - Removed references to chapter numbers, as in EPUB they are unnumbered.
36 | - In PostgreSQL, replaced the drop-and-recreate-the-cluster procedure with an
37 | easier and saner one.
38 |
39 | 0.2 (2017-01-19)
40 | ----------------
41 |
42 | - Fixed error on PDF where instead of http://$DOMAIN/ it would show something
43 | like http://\protect\P1\dollarsignDOMAIN.
44 | - Added page with meta information.
45 | - Fixed some typos.
46 |
47 | 0.1 (2016-12-22)
48 | ----------------
49 |
50 | - Initial release.
51 |
--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
1 | # Makefile for Sphinx documentation
2 | #
3 |
4 | # Custom variables
5 | BASEFILENAME = DeployingDjangoonasingleDebianorUbuntuserver
6 |
7 | # You can set these variables from the command line.
8 | SPHINXOPTS =
9 | SPHINXBUILD = if ! [ `pip freeze | grep --color=none Sphinx` = `cat requirements.txt` ]; then echo 'Wrong sphinx version'; exit 1; fi; sphinx-build
10 | PAPER =
11 | BUILDDIR = _build
12 |
13 | # User-friendly check for sphinx-build
14 | ifeq ($(shell which $(SPHINXBUILD) >/dev/null 2>&1; echo $$?), 1)
15 | $(error The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed, then set the SPHINXBUILD environment variable to point to the full path of the '$(SPHINXBUILD)' executable. Alternatively you can add the directory with the executable to your PATH. If you don't have Sphinx installed, grab it from http://sphinx-doc.org/)
16 | endif
17 |
18 | # Internal variables.
19 | PAPEROPT_a4 = -D latex_paper_size=a4
20 | PAPEROPT_letter = -D latex_paper_size=letter
21 | ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
22 | # the i18n builder cannot share the environment and doctrees with the others
23 | I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
24 |
25 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext
26 |
27 | help:
28 | @echo "Please use \`make <target>' where <target> is one of"
29 | @echo " html to make standalone HTML files"
30 | @echo " dirhtml to make HTML files named index.html in directories"
31 | @echo " singlehtml to make a single large HTML file"
32 | @echo " pickle to make pickle files"
33 | @echo " json to make JSON files"
34 | @echo " htmlhelp to make HTML files and a HTML help project"
35 | @echo " qthelp to make HTML files and a qthelp project"
36 | @echo " devhelp to make HTML files and a Devhelp project"
37 | @echo " epub to make an epub"
38 | @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
39 | @echo " latexpdf to make LaTeX files and run them through pdflatex"
40 | @echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
41 | @echo " text to make text files"
42 | @echo " man to make manual pages"
43 | @echo " texinfo to make Texinfo files"
44 | @echo " info to make Texinfo files and run them through makeinfo"
45 | @echo " gettext to make PO message catalogs"
46 | @echo " changes to make an overview of all changed/added/deprecated items"
47 | @echo " xml to make Docutils-native XML files"
48 | @echo " pseudoxml to make pseudoxml-XML files for display purposes"
49 | @echo " linkcheck to check all external links for integrity"
50 | @echo " doctest to run all doctests embedded in the documentation (if enabled)"
51 |
52 | clean:
53 | rm -rf $(BUILDDIR)/*
54 |
55 | html:
56 | $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
57 | @echo
58 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
59 |
60 | dirhtml:
61 | $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
62 | @echo
63 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
64 |
65 | singlehtml:
66 | $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
67 | @echo
68 | @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
69 |
70 | pickle:
71 | $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
72 | @echo
73 | @echo "Build finished; now you can process the pickle files."
74 |
75 | json:
76 | $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
77 | @echo
78 | @echo "Build finished; now you can process the JSON files."
79 |
80 | htmlhelp:
81 | $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
82 | @echo
83 | @echo "Build finished; now you can run HTML Help Workshop with the" \
84 | ".hhp project file in $(BUILDDIR)/htmlhelp."
85 |
86 | qthelp:
87 | $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
88 | @echo
89 | @echo "Build finished; now you can run "qcollectiongenerator" with the" \
90 | ".qhcp project file in $(BUILDDIR)/qthelp, like this:"
91 | @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/DeployingDjangoonasingleDebianorUbuntuserver.qhcp"
92 | @echo "To view the help file:"
93 | @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/DeployingDjangoonasingleDebianorUbuntuserver.qhc"
94 |
95 | devhelp:
96 | $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
97 | @echo
98 | @echo "Build finished."
99 | @echo "To view the help file:"
100 | @echo "# mkdir -p $$HOME/.local/share/devhelp/DeployingDjangoonasingleDebianorUbuntuserver"
101 | @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/DeployingDjangoonasingleDebianorUbuntuserver"
102 | @echo "# devhelp"
103 |
104 | epub:
105 | $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
106 | @echo
107 | @echo "Build finished. The epub file is in $(BUILDDIR)/epub."
108 |
109 | latex:
110 | mv index.rst index.rst.bak
111 | cp index.rst-latex index.rst
112 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
113 | mv index.rst.bak index.rst
114 | ./fixlatex <$(BUILDDIR)/latex/${BASEFILENAME}.tex >/tmp/${BASEFILENAME}.tex
115 | mv /tmp/${BASEFILENAME}.tex $(BUILDDIR)/latex/${BASEFILENAME}.tex
116 | @echo
117 | @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
118 | @echo "Run \`make' in that directory to run these through (pdf)latex" \
119 | "(use \`make latexpdf' here to do that automatically)."
120 |
121 | latexpdf:
122 | mv index.rst index.rst.bak
123 | cp index.rst-latex index.rst
124 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
125 | mv index.rst.bak index.rst
126 | ./fixlatex <$(BUILDDIR)/latex/${BASEFILENAME}.tex >/tmp/${BASEFILENAME}.tex
127 | mv /tmp/${BASEFILENAME}.tex $(BUILDDIR)/latex/${BASEFILENAME}.tex
128 | @echo "Running LaTeX files through pdflatex..."
129 | $(MAKE) -C $(BUILDDIR)/latex all-pdf
130 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
131 |
132 | latexpdfja:
133 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
134 | @echo "Running LaTeX files through platex and dvipdfmx..."
135 | $(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
136 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
137 |
138 | text:
139 | $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
140 | @echo
141 | @echo "Build finished. The text files are in $(BUILDDIR)/text."
142 |
143 | man:
144 | $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
145 | @echo
146 | @echo "Build finished. The manual pages are in $(BUILDDIR)/man."
147 |
148 | texinfo:
149 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
150 | @echo
151 | @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
152 | @echo "Run \`make' in that directory to run these through makeinfo" \
153 | "(use \`make info' here to do that automatically)."
154 |
155 | info:
156 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
157 | @echo "Running Texinfo files through makeinfo..."
158 | make -C $(BUILDDIR)/texinfo info
159 | @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
160 |
161 | gettext:
162 | $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
163 | @echo
164 | @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
165 |
166 | changes:
167 | $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
168 | @echo
169 | @echo "The overview file is in $(BUILDDIR)/changes."
170 |
171 | linkcheck:
172 | $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
173 | @echo
174 | @echo "Link check complete; look for any errors in the above output " \
175 | "or in $(BUILDDIR)/linkcheck/output.txt."
176 |
177 | doctest:
178 | $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
179 | @echo "Testing of doctests in the sources finished, look at the " \
180 | "results in $(BUILDDIR)/doctest/output.txt."
181 |
182 | xml:
183 | $(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
184 | @echo
185 | @echo "Build finished. The XML files are in $(BUILDDIR)/xml."
186 |
187 | pseudoxml:
188 | $(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
189 | @echo
190 | @echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
191 |
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
1 | =========================
2 | Book on Django Deployment
3 | =========================
4 |
5 | Get the book
6 | ============
7 |
8 | `Read online at readthedocs`_ or get it in epub or pdf at the "releases" page.
9 |
10 | .. _read online at readthedocs: https://djangodeployment.readthedocs.io/
11 |
12 | Compiling the source
13 | ====================
14 |
15 | ::
16 |
17 | apt install texlive-latex-extra
18 | mkvirtualenv ddbook
19 | pip install -r requirements.txt
20 | make latexpdf
21 | make epub
22 |
23 | After the above, the PDF should be in ``_build/latex`` and the epub in
24 | ``_build/epub``.
25 |
26 | Contributing
27 | ============
28 |
29 | If you want something to be fixed or added, please add an issue.
30 |
31 | If you fix or add something, please add a pull request. When fixing/adding
32 | configuration and code snippets, please use (and fix) ``testscript`` to verify
33 | that things work.
34 |
35 | Copyright and license
36 | =====================
37 |
38 | Please see file ``meta.rst``.
39 |
--------------------------------------------------------------------------------
/_static/cover.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/cover.png
--------------------------------------------------------------------------------
/_static/epub.css:
--------------------------------------------------------------------------------
1 | /*
2 | * epub.css_t
3 | * ~~~~~~~~~~
4 | *
5 | * Sphinx stylesheet -- epub theme.
6 | *
7 | * :copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
8 | * :license: BSD, see LICENSE for details.
9 | *
10 | */
11 |
12 | /* -- main layout ----------------------------------------------------------- */
13 |
14 | div.clearer {
15 | clear: both;
16 | }
17 |
18 | a:link, a:visited {
19 | color: #3333ff;
20 | text-decoration: underline;
21 | }
22 |
23 | img {
24 | border: 0;
25 | max-width: 100%;
26 | }
27 |
28 | /* -- relbar ---------------------------------------------------------------- */
29 |
30 | div.related {
31 | width: 100%;
32 | font-family: sans-serif;
33 | font-size: 90%;
34 | }
35 |
36 | div.related h3 {
37 | display: none;
38 | }
39 |
40 | div.related ul {
41 | margin: 0;
42 | padding: 0 0 0 10px;
43 | list-style: none;
44 | }
45 |
46 | div.related li {
47 | display: inline;
48 | }
49 |
50 | div.related li.right {
51 | float: right;
52 | margin-right: 5px;
53 | }
54 |
55 | /* -- sidebar --------------------------------------------------------------- */
56 |
57 | div.sphinxsidebarwrapper {
58 | padding: 10px 5px 0 10px;
59 | }
60 |
61 | div.sphinxsidebar {
62 | float: left;
63 | width: 230px;
64 | margin-left: -100%;
65 | font-size: 90%;
66 | }
67 |
68 | div.sphinxsidebar ul {
69 | list-style: none;
70 | }
71 |
72 | div.sphinxsidebar ul ul,
73 | div.sphinxsidebar ul.want-points {
74 | margin-left: 20px;
75 | list-style: square;
76 | }
77 |
78 | div.sphinxsidebar ul ul {
79 | margin-top: 0;
80 | margin-bottom: 0;
81 | }
82 |
83 | div.sphinxsidebar form {
84 | margin-top: 10px;
85 | }
86 |
87 | div.sphinxsidebar input {
88 | border: 1px solid #98dbcc;
89 | font-family: sans-serif;
90 | font-size: 100%;
91 | }
92 |
93 | img {
94 | border: 0;
95 | max-width: 100%;
96 | }
97 |
98 | /* -- search page ----------------------------------------------------------- */
99 |
100 | ul.search {
101 | margin: 10px 0 0 20px;
102 | padding: 0;
103 | }
104 |
105 | ul.search li {
106 | padding: 5px 0 5px 20px;
107 | background-image: url(file.png);
108 | background-repeat: no-repeat;
109 | background-position: 0 7px;
110 | }
111 |
112 | ul.search li a {
113 | font-weight: bold;
114 | }
115 |
116 | ul.search li div.context {
117 | color: #888;
118 | margin: 2px 0 0 30px;
119 | text-align: left;
120 | }
121 |
122 | ul.keywordmatches li.goodmatch a {
123 | font-weight: bold;
124 | }
125 |
126 | /* -- index page ------------------------------------------------------------ */
127 |
128 | table.contentstable {
129 | width: 90%;
130 | }
131 |
132 | table.contentstable p.biglink {
133 | line-height: 150%;
134 | }
135 |
136 | a.biglink {
137 | font-size: 130%;
138 | }
139 |
140 | span.linkdescr {
141 | font-style: italic;
142 | padding-top: 5px;
143 | font-size: 90%;
144 | }
145 |
146 | /* -- general index --------------------------------------------------------- */
147 |
148 | table.indextable td {
149 | text-align: left;
150 | vertical-align: top;
151 | }
152 |
153 | table.indextable dl, table.indextable dd {
154 | margin-top: 0;
155 | margin-bottom: 0;
156 | }
157 |
158 | table.indextable tr.pcap {
159 | height: 10px;
160 | }
161 |
162 | table.indextable tr.cap {
163 | margin-top: 10px;
164 | background-color: #f2f2f2;
165 | }
166 |
167 | img.toggler {
168 | margin-right: 3px;
169 | margin-top: 3px;
170 | cursor: pointer;
171 | }
172 |
173 | /* -- general body styles --------------------------------------------------- */
174 |
175 | a.headerlink {
176 | visibility: hidden;
177 | }
178 |
179 | div.body p.caption {
180 | text-align: inherit;
181 | }
182 |
183 | div.body td {
184 | text-align: left;
185 | }
186 |
187 | .field-list ul {
188 | padding-left: 100%; /* superseded by the later .field-list ul rule (padding-left: 1em) */
189 | }
190 |
191 | .first {
192 | margin-top: 0 !important;
193 | }
194 |
195 | p.rubric {
196 | margin-top: 30px;
197 | font-weight: bold;
198 | }
199 |
200 | .align-left {
201 | text-align: left;
202 | }
203 |
204 | .align-center {
205 | text-align: center;
206 | }
207 |
208 | .align-right {
209 | text-align: right;
210 | }
211 |
212 | /* -- sidebars -------------------------------------------------------------- */
213 |
214 | div.sidebar {
215 | margin: 0 0 0.5em 1em;
216 | border: 1px solid #ddb;
217 | padding: 7px 7px 0 7px;
218 | background-color: #ffe;
219 | width: 40%;
220 | float: right;
221 | }
222 |
223 | p.sidebar-title {
224 | font-weight: bold;
225 | }
226 |
227 | /* -- topics ---------------------------------------------------------------- */
228 |
229 | div.topic {
230 | border: 1px solid #ccc;
231 | padding: 7px 7px 0 7px;
232 | margin: 10px 0 10px 0;
233 | }
234 |
235 | p.topic-title {
236 | font-size: 110%;
237 | font-weight: bold;
238 | margin-top: 10px;
239 | }
240 |
241 | /* -- admonitions ----------------------------------------------------------- */
242 |
243 | div.admonition {
244 | background-color: #d9edf7; /* Added by A.X. */
245 | border: solid #bce8f1 1px; /* Added by A.X. */
246 | margin-top: 10px;
247 | margin-bottom: 10px;
248 | padding: 7px;
249 | }
250 |
251 | div.admonition dt {
252 | font-weight: bold;
253 | }
254 |
255 | div.admonition dl {
256 | margin-bottom: 0;
257 | }
258 |
259 | p.admonition-title {
260 | margin: 0px 10px 5px 0px;
261 | font-weight: bold;
262 | }
263 |
264 | div.body p.centered {
265 | text-align: center;
266 | margin-top: 25px;
267 | }
268 |
269 | /* -- tables ---------------------------------------------------------------- */
270 |
271 | table.docutils {
272 | border: 0;
273 | border-collapse: collapse;
274 | }
275 |
276 | table caption span.caption-number {
277 | font-style: italic;
278 | }
279 |
280 | table caption span.caption-text {
281 | }
282 |
283 | table.docutils td, table.docutils th {
284 | padding: 1px 8px 1px 5px;
285 | border-top: 0;
286 | border-left: 0;
287 | border-right: 0;
288 | border-bottom: 1px solid #aaa;
289 | }
290 |
291 | table.field-list td, table.field-list th {
292 | border: 0 !important;
293 | }
294 |
295 | table.footnote td, table.footnote th {
296 | border: 0 !important;
297 | }
298 |
299 | th {
300 | text-align: left;
301 | padding-right: 5px;
302 | }
303 |
304 | table.citation {
305 | border-left: solid 1px gray;
306 | margin-left: 1px;
307 | }
308 |
309 | table.citation td {
310 | border-bottom: none;
311 | }
312 |
313 | /* -- figures --------------------------------------------------------------- */
314 |
315 | div.figure p.caption span.caption-number {
316 | font-style: italic;
317 | }
318 |
319 | div.figure p.caption span.caption-text {
320 | }
321 |
322 | /* -- other body styles ----------------------------------------------------- */
323 |
324 | ol.arabic {
325 | list-style: decimal;
326 | }
327 |
328 | ol.loweralpha {
329 | list-style: lower-alpha;
330 | }
331 |
332 | ol.upperalpha {
333 | list-style: upper-alpha;
334 | }
335 |
336 | ol.lowerroman {
337 | list-style: lower-roman;
338 | }
339 |
340 | ol.upperroman {
341 | list-style: upper-roman;
342 | }
343 |
344 | dl {
345 | margin-bottom: 15px;
346 | }
347 |
348 | dd p {
349 | margin-top: 0px;
350 | }
351 |
352 | dd ul, dd table {
353 | margin-bottom: 10px;
354 | }
355 |
356 | dd {
357 | margin-top: 3px;
358 | margin-bottom: 10px;
359 | margin-left: 30px;
360 | }
361 |
362 | dt:target, .highlighted {
363 | background-color: #ddd;
364 | }
365 |
366 | dl.glossary dt {
367 | font-weight: bold;
368 | font-size: 110%;
369 | }
370 |
371 | .field-list ul {
372 | margin: 0;
373 | padding-left: 1em;
374 | }
375 |
376 | .field-list p {
377 | margin: 0;
378 | }
379 |
380 | .optional {
381 | font-size: 130%;
382 | }
383 |
384 | .sig-paren {
385 | font-size: larger;
386 | }
387 |
388 | .versionmodified {
389 | font-style: italic;
390 | }
391 |
392 | .system-message {
393 | background-color: #fda;
394 | padding: 5px;
395 | border: 3px solid red;
396 | }
397 |
398 | .footnote:target {
399 | background-color: #dddddd;
400 | }
401 |
402 | .line-block {
403 | display: block;
404 | margin-top: 1em;
405 | margin-bottom: 1em;
406 | }
407 |
408 | .line-block .line-block {
409 | margin-top: 0;
410 | margin-bottom: 0;
411 | margin-left: 1.5em;
412 | }
413 |
414 | .guilabel, .menuselection {
415 | font-style: italic;
416 | }
417 |
418 | .accelerator {
419 | text-decoration: underline;
420 | }
421 |
422 | .classifier {
423 | font-style: oblique;
424 | }
425 |
426 | abbr, acronym {
427 | border-bottom: dotted 1px;
428 | cursor: help;
429 | }
430 |
431 | /* -- code displays --------------------------------------------------------- */
432 |
433 | pre {
434 | font-family: monospace;
435 | overflow: auto;
436 | overflow-y: hidden;
437 | }
438 |
439 | td.linenos pre {
440 | padding: 5px 0px;
441 | border: 0;
442 | background-color: transparent;
443 | color: #aaa;
444 | }
445 |
446 | table.highlighttable {
447 | margin-left: 0.5em;
448 | }
449 |
450 | table.highlighttable td {
451 | padding: 0 0.5em 0 0.5em;
452 | }
453 |
454 | code {
455 | font-family: monospace;
456 | }
457 |
458 | div.code-block-caption span.caption-number {
459 | padding: 0.1em 0.3em;
460 | font-style: italic;
461 | }
462 |
463 | div.code-block-caption span.caption-text {
464 | }
465 |
466 | div.literal-block-wrapper {
467 | padding: 1em 1em 0;
468 | }
469 |
470 | div.literal-block-wrapper div.highlight {
471 | margin: 0;
472 | }
473 |
474 | code.descname {
475 | background-color: transparent;
476 | font-weight: bold;
477 | font-size: 1.2em;
478 | }
479 |
480 | code.descclassname {
481 | background-color: transparent;
482 | }
483 |
484 | code.xref, a code {
485 | background-color: transparent;
486 | font-weight: bold;
487 | }
488 |
489 | h1 code, h2 code, h3 code, h4 code, h5 code, h6 code {
490 | background-color: transparent;
491 | }
492 |
493 | /* -- math display ---------------------------------------------------------- */
494 |
495 | img.math {
496 | vertical-align: middle;
497 | }
498 |
499 | div.body div.math p {
500 | text-align: center;
501 | }
502 |
503 | span.eqno {
504 | float: right;
505 | }
506 |
507 | /* -- special divs --------------------------------------------------------- */
508 |
509 | div.quotebar {
510 | background-color: #e3eff1;
511 | max-width: 250px;
512 | float: right;
513 | font-family: sans-serif;
514 | padding: 7px 7px;
515 | border: 1px solid #ccc;
516 | }
517 | div.footer {
518 | background-color: #e3eff1;
519 | padding: 3px 8px 3px 0;
520 | clear: both;
521 | font-family: sans-serif;
522 | font-size: 80%;
523 | text-align: right;
524 | }
525 |
526 | div.footer a {
527 | text-decoration: underline;
528 | }
529 |
530 | /* -- link-target ----------------------------------------------------------- */
531 |
532 | .link-target {
533 | font-size: 80%;
534 | }
535 |
536 | table .link-target {
537 | /* Do not show links in tables, there is not enough space */
538 | display: none;
539 | }
540 |
541 | /* -- font-face ------------------------------------------------------------- */
542 |
543 | /*
544 | @font-face {
545 | font-family: "LiberationNarrow";
546 | font-style: normal;
547 | font-weight: normal;
548 | src: url("res:///Data/fonts/LiberationNarrow-Regular.otf")
549 | format("opentype");
550 | }
551 | @font-face {
552 | font-family: "LiberationNarrow";
553 | font-style: italic;
554 | font-weight: normal;
555 | src: url("res:///Data/fonts/LiberationNarrow-Italic.otf")
556 | format("opentype");
557 | }
558 | @font-face {
559 | font-family: "LiberationNarrow";
560 | font-style: normal;
561 | font-weight: bold;
562 | src: url("res:///Data/fonts/LiberationNarrow-Bold.otf")
563 | format("opentype");
564 | }
565 | @font-face {
566 | font-family: "LiberationNarrow";
567 | font-style: italic;
568 | font-weight: bold;
569 | src: url("res:///Data/fonts/LiberationNarrow-BoldItalic.otf")
570 | format("opentype");
571 | }
572 | */
573 |
--------------------------------------------------------------------------------
/_static/how-static-files-work-apache.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/how-static-files-work-apache.png
--------------------------------------------------------------------------------
/_static/how-static-files-work-nginx.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/how-static-files-work-nginx.png
--------------------------------------------------------------------------------
/_static/output_of_ls.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/output_of_ls.png
--------------------------------------------------------------------------------
/_static/putty-config.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/putty-config.png
--------------------------------------------------------------------------------
/_static/top.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/djangodeployment/django-deployment-book/9843ac13772b29fe1e0e88f410191e421bc168f3/_static/top.png
--------------------------------------------------------------------------------
/_templates/epub-cover.html:
--------------------------------------------------------------------------------
1 | {#
2 | epub/epub-cover.html
3 | ~~~~~~~~~~~~~~~~~~~~
4 |
5 | Sample template for the html cover page.
6 |
7 | :copyright: Copyright 2007-2017 by the Sphinx team, see AUTHORS.
8 | :license: BSD, see LICENSE for details.
9 | #}
10 | {%- extends "layout.html" %}
11 | {%- block header %}{% endblock %}
12 | {%- block rootrellink %}{% endblock %}
13 | {%- block relbaritems %}{% endblock %}
14 | {%- block sidebarlogo %}{% endblock %}
15 | {%- block linktags %}{% endblock %}
16 | {%- block relbar1 %}{% endblock %}
17 | {%- block sidebar1 %}{% endblock %}
18 | {%- block sidebar2 %}{% endblock %}
19 | {%- block footer %}{% endblock %}
20 |
21 | {% block content %}
22 |