├── .gitattributes
├── README.md
├── README_CN.md
└── media
    └── logo.png
/.gitattributes:
--------------------------------------------------------------------------------
1 | * text=auto
2 | README.md linguist-language=Python
3 | *.md linguist-documentation=false
4 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 |

3 |
4 |
5 | ### Translations
6 | > Feel free to create an issue if you have translated this guide or want to request a translation.
7 | - :ukraine: [Ukrainian](https://codeguida.com/post/2020) by Codeguida
8 | - 🇨🇳 [Chinese](https://github.com/hzlmn/diy-async-web-framework/blob/master/README_CN.md) thanks [@Amaindex](https://github.com/Amaindex)
9 |
10 |
11 | ### Introduction
12 |
13 | Asynchronous programming has become much more popular in the Python community over the last few years. Libraries like `aiohttp` show incredible growth in usage: they handle a large number of concurrent connections while still maintaining good code readability and simplicity. Not long ago, Django [committed](https://docs.djangoproject.com/en/dev/releases/3.0/#asgi-support) to adding async support in its next major version. So the future of asynchronous Python looks pretty bright. However, for many developers who come from the standard blocking model, the inner workings of these tools can be confusing. In this short guide, I try to go behind the scenes and clarify the process by rebuilding a small `aiohttp` clone from scratch. We will start with a basic sample from the official documentation and progressively add the functionality that we all like. So let's start.
14 |
15 | I assume that you already have a basic understanding of [asyncio](https://docs.python.org/3/library/asyncio.html) to follow this guide, but if you need a refresher, here are a few articles that may help:
16 |
17 | - [Intro to asyncio](https://www.blog.pythonlibrary.org/2016/07/26/python-3-an-intro-to-asyncio)
18 | - [Understanding asynchronous programming in Python](https://dbader.org/blog/understanding-asynchronous-programming-in-python)
19 |
20 | For the impatient, the final source code is available at [`hzlmn/sketch`](https://github.com/hzlmn/sketch).
21 |
22 | ## Related projects
23 | - [500 Lines or Less](https://github.com/aosabook/500lines)
24 |
25 |
26 | ## Table of contents :book:
27 |
28 | * [Asyncio low-level APIs, Transports & Protocols](#asyncio-low-level-apis-transports--protocols)
29 |
30 | * [Making server protocol](#making-server-protocol)
31 |
32 | * [Request/Response objects](#requestresponse-objects)
33 |
34 | * [Application & UrlDispatcher](#application--urldispatcher)
35 |
36 | * [Going further](#going-further)
37 | * [Route params](#route-params)
38 |
39 | * [Middlewares](#middlewares)
40 |
41 | * [App lifecycle hooks](#app-lifecycle-hooks)
42 |
43 | * [Better exceptions](#better-exceptions)
44 |
45 | * [Graceful shutdown](#graceful-shutdown)
46 |
47 | * [Sample application](#sample-application)
48 |
49 | * [Conclusion](#conclusion)
50 |
51 | ## Asyncio low-level APIs, Transports & Protocols
52 | Asyncio has come a long way to become what it is now. It started out as a lower-level tool called "tulip", and back then writing higher-level applications was not as enjoyable as it is today.
53 |
54 | For most use cases `asyncio` is now a pretty high-level API, but it also provides a set of low-level helpers for library authors to manage event loops and implement networking/IPC protocols.
55 |
56 | Out of the box it supports only `TCP`, `UDP`, `SSL` and subprocesses. Libraries implement their own higher-level protocols (HTTP, FTP, etc.) on top of the base transports and available APIs.
57 |
58 | All communication is done by chaining `Transport`s and `Protocol`s. In simple words, a `Transport` describes how data is exchanged, and a `Protocol` is responsible for choosing which data to exchange.
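
To make that split concrete, here is a minimal sketch that drives a protocol by hand, without any network. `UpperProtocol` and `FakeTransport` are hypothetical names invented for this illustration; a real transport is created by the event loop and has many more methods than `write`.

```python
import asyncio


class UpperProtocol(asyncio.Protocol):
    """Decides WHICH bytes to send: here, the upper-cased input."""

    def connection_made(self, transport):
        self.transport = transport

    def data_received(self, data):
        self.transport.write(data.upper())


class FakeTransport:
    """Stands in for HOW bytes travel: it just records what was written."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


proto = UpperProtocol()
transport = FakeTransport()
proto.connection_made(transport)   # the event loop normally does this
proto.data_received(b"hello")      # ...and this, when bytes arrive
print(transport.written)           # [b'HELLO']
```

The protocol never touches a socket, which is why the same protocol class can in principle run over TCP, SSL or a test double like the one above.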
59 |
60 | `asyncio` has great official docs, so you can read more about transports and protocols [here](https://docs.python.org/3.8/library/asyncio-protocol.html#asyncio-transport).
61 |
62 | To get a first grasp, let's write a simple `TCP` server that echoes messages back.
63 |
64 |
65 | `server.py`
66 |
67 | ```python
68 | import asyncio
69 |
70 | class Server(asyncio.Protocol):
71 |     def connection_made(self, transport):
72 |         self._transport = transport
73 |
74 |     def data_received(self, data):
75 |         message = data.decode()
76 |
77 |         self._transport.write(data)
78 |
79 |         self._transport.close()
80 |
81 | loop = asyncio.get_event_loop()
82 |
83 | coro = loop.create_server(Server, '127.0.0.1', 8080)
84 | server = loop.run_until_complete(coro)
85 |
86 | try:
87 |     loop.run_forever()
88 | except KeyboardInterrupt:
89 |     pass
90 |
91 | server.close()
92 | loop.run_until_complete(server.wait_closed())
93 | loop.close()
94 | ```
95 |
96 | ```shell
97 | $ curl http://127.0.0.1:8080
98 | GET / HTTP/1.1
99 | Host: 127.0.0.1:8080
100 | User-Agent: curl/7.54.0
101 | Accept: */*
102 | ```
103 |
104 | As you can see from the example above, the code is pretty simple, but it is not yet a practical base for writing a high-level application.
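
If you prefer to check the echo behaviour from Python rather than curl, the following sketch may help. It starts the same kind of echo protocol on an ephemeral port (port `0` is an assumption made here so the example does not collide with anything already listening on `8080`) and talks to it with `asyncio.open_connection`.

```python
import asyncio


class Echo(asyncio.Protocol):
    def connection_made(self, transport):
        self._transport = transport

    def data_received(self, data):
        # Send the bytes back unchanged, then close the connection.
        self._transport.write(data)
        self._transport.close()


async def main():
    loop = asyncio.get_running_loop()
    server = await loop.create_server(Echo, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"ping")
    await writer.drain()
    reply = await reader.read()  # reads until the server closes the connection

    writer.close()
    await writer.wait_closed()
    server.close()
    await server.wait_closed()
    return reply


reply = asyncio.run(main())
print(reply)
```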
105 |
106 | Because `HTTP` works over the `TCP` transport, we can already send `HTTP` requests to our server. However, we receive them in raw form, and working with them this way would be annoying. So as the next step we need to add a better `HTTP` handling mechanism.
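
To get a feeling for why raw requests are awkward, here is a deliberately naive sketch that splits one by hand. `naive_parse` is a hypothetical helper written only for this illustration; it assumes the whole request arrives in a single chunk and ignores most of what the HTTP specification allows, which is exactly why a real parser is worth using.

```python
raw = (
    b"GET /hello?name=bob HTTP/1.1\r\n"
    b"Host: 127.0.0.1:8080\r\n"
    b"Accept: */*\r\n"
    b"\r\n"
)


def naive_parse(data):
    """Split a raw request into method, path, headers and body (toy code)."""
    head, _, body = data.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("latin-1").split("\r\n")
    method, path, version = request_line.split(" ")

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()

    return method, path, headers, body


method, path, headers, body = naive_parse(raw)
print(method, path, headers["Host"])
```

Even this tiny version has to deal with line endings, header separators and encodings, and it still cannot handle a request that is split across several `data_received` calls.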
107 |
108 |
109 | ## Making server protocol
110 |
111 | Let's add request parsing so we can extract useful information such as headers, body and path, and work with them instead of raw text. Parsing is a complex topic and certainly out of the scope of this guide, which is why we will use [httptools](https://github.com/MagicStack/httptools) by MagicStack: it is very fast, standards-compliant and pretty flexible.
112 |
113 | `aiohttp`, on the other hand, has its own hand-written Python parser as well as a binding to Node's [`http-parser`](https://github.com/nodejs/http-parser/tree/77310eeb839c4251c07184a5db8885a572a08352).
114 |
115 | Let's write our parsing class, which will be used as a mixin for our main `Server` class.
116 |
117 | `http_parser.py`
118 | ```python
119 | class HttpParserMixin:
120 |     def on_body(self, data):
121 |         self._body = data
122 |
123 |     def on_url(self, url):
124 |         self._url = url
125 |
126 |     def on_message_complete(self):
127 |         print(f"Received request to {self._url.decode(self._encoding)}")
128 |
129 |     def on_header(self, header, value):
130 |         header = header.decode(self._encoding)
131 |         self._headers[header] = value.decode(self._encoding)
132 | ```
133 | Now that we have a working `HttpParserMixin`, let's modify our `Server` a bit and apply the mixin.
134 |
135 | `server.py`
136 | ```python
137 | import asyncio
138 |
139 | from httptools import HttpRequestParser
140 |
141 | from .http_parser import HttpParserMixin
142 |
143 | class Server(asyncio.Protocol, HttpParserMixin):
144 |     def __init__(self, loop):
145 |         self._loop = loop
146 |         self._encoding = "utf-8"
147 |         self._url = None
148 |         self._headers = {}
149 |         self._body = None
150 |         self._transport = None
151 |         self._request_parser = HttpRequestParser(self)
152 |
153 |     def connection_made(self, transport):
154 |         self._transport = transport
155 |
156 |     def connection_lost(self, *args):
157 |         self._transport = None
158 |
159 |     def data_received(self, data):
160 |         # Pass data to our parser
161 |         self._request_parser.feed_data(data)
162 | ```
163 |
164 |
165 | So far, our server can understand incoming `HTTP` requests and extract some important information from them. Now let's add a simple runner to it.
166 |
167 | `server.py`
168 | ```python
169 | if __name__ == "__main__":
170 |     loop = asyncio.get_event_loop()
171 |     # The protocol factory is called once per connection, so each client gets its own Server.
172 |     server = loop.run_until_complete(loop.create_server(lambda: Server(loop), port=8080))
173 |
174 |     try:
175 |         print("Started server on ::8080")
176 |         loop.run_until_complete(server.serve_forever())
177 |     except KeyboardInterrupt:
178 |         server.close()
179 |         loop.run_until_complete(server.wait_closed())
180 |         loop.stop()
181 |
182 | ```
183 | ```sh
184 | > python server.py
185 | Started server on ::8080
186 | ```
187 | ```sh
188 | > curl http://127.0.0.1:8080/hello
189 | ```
190 |
191 | ## Request/Response objects
192 | At this point we have a working server that can parse `HTTP` calls, but for our apps we need better abstractions to work with.
193 |
194 | Let's create a base `Request` class that groups together all the information about an incoming `HTTP` request. We will use the `yarl` library for dealing with URLs, so make sure you have installed it with pip.
195 |
196 | `request.py`
197 | ```python
198 | import json
199 |
200 | from yarl import URL
201 |
202 | class Request:
203 |     _encoding = "utf_8"
204 |
205 |     def __init__(self, method, url, headers, version=None, body=None, app=None):
206 |         self._version = version
207 |         self._method = method.decode(self._encoding)
208 |         self._url = URL(url.decode(self._encoding))
209 |         self._headers, self._body = headers, body
210 |         self.app = app  # application that owns this request (may be None)
211 |
212 |     @property
213 |     def method(self):
214 |         return self._method
215 |
216 |     @property
217 |     def url(self):
218 |         return self._url
219 |
220 |     @property
221 |     def headers(self):
222 |         return self._headers
223 |
224 |     def text(self):
225 |         if self._body is not None:
226 |             return self._body.decode(self._encoding)
227 |
228 |     def json(self):
229 |         text = self.text()
230 |         if text is not None:
231 |             return json.loads(text)
232 |
233 |     def __repr__(self):
234 |         return f"<Request at 0x{id(self):x}>"
235 | ```
236 |
237 | As a next step, we also need a structure that helps us describe an outgoing `HTTP` response in a programmer-friendly manner and convert it to raw `HTTP`, which can then be passed to the `asyncio.Transport`.
238 |
239 | `response.py`
240 | ```python
241 | import http.server
242 |
243 | web_responses = http.server.BaseHTTPRequestHandler.responses
244 |
245 | class Response:
246 |     _encoding = "utf-8"
247 |
248 |     def __init__(
249 |         self,
250 |         body=None,
251 |         status=200,
252 |         content_type="text/plain",
253 |         headers=None,
254 |         version="1.1",
255 |     ):
256 |         self._version = version
257 |         self._status = status
258 |         self._body = body
259 |         self._content_type = content_type
260 |         if headers is None:
261 |             headers = {}
262 |         self._headers = headers
263 |
264 |     @property
265 |     def body(self):
266 |         return self._body
267 |
268 |     @property
269 |     def status(self):
270 |         return self._status
271 |
272 |     @property
273 |     def content_type(self):
274 |         return self._content_type
275 |
276 |     @property
277 |     def headers(self):
278 |         return self._headers
279 |
280 |     def add_body(self, data):
281 |         self._body = data
282 |
283 |     def add_header(self, key, value):
284 |         self._headers[key] = value
285 |
286 |     def __str__(self):
287 |         """We will use this in our handlers, it is actually generation of raw HTTP response,
288 |         that will be passed to our TCP transport
289 |         """
290 |         status_msg, _ = web_responses.get(self._status)
291 |         body = self._body or ""  # treat a missing body as an empty one
292 |         messages = [
293 |             f"HTTP/{self._version} {self._status} {status_msg}",
294 |             f"Content-Type: {self._content_type}",
295 |             f"Content-Length: {len(body.encode(self._encoding))}",
296 |         ]
297 |
298 |         if self.headers:
299 |             for header, value in self.headers.items():
300 |                 messages.append(f"{header}: {value}")
301 |
302 |         # A blank line always separates the headers from the (possibly empty) body.
303 |         messages.append("\r\n" + body)
304 |
305 |         return "\r\n".join(messages)
306 |
307 |     def __repr__(self):
308 |         return f"<Response at 0x{id(self):x}>"
309 | ```
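
For reference, this is roughly what the raw text produced for a response looks like on the wire. The sketch below uses a hypothetical standalone helper, `build_response`, rather than the `Response` class, and relies only on the standard library's `http.HTTPStatus` for the reason phrase.

```python
from http import HTTPStatus


def build_response(status, body, content_type="text/plain"):
    """Serialize a minimal HTTP/1.1 response: status line, headers, blank line, body."""
    payload = body.encode("utf-8")
    head = [
        f"HTTP/1.1 {status} {HTTPStatus(status).phrase}",
        f"Content-Type: {content_type}",
        # Content-Length counts bytes, not characters.
        f"Content-Length: {len(payload)}",
    ]
    return "\r\n".join(head).encode("ascii") + b"\r\n\r\n" + payload


raw = build_response(200, "hello")
print(raw)
# b'HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello'
```

Note the empty line (`\r\n\r\n`) between the headers and the body: clients rely on it to know where the headers end.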
310 |
311 | As you can see, the code is pretty straightforward: we encapsulate all our data and provide proper getters. `Request` also has a few helpers for reading the body, `text` and `json`, that will be used later. We also need to update our `Server` to actually construct a `Request` object from the message.
312 |
313 | It should be created once the whole request has been processed, so we need to add it to the `on_message_complete` event handler in our parser mixin.
314 |
315 | `http_parser.py`
316 | ```python
317 | class HttpParserMixin:
318 |     ...
319 |
320 |     def on_message_complete(self):
321 |         self._request = self._request_class(
322 |             version=self._request_parser.get_http_version(),
323 |             method=self._request_parser.get_method(),
324 |             url=self._url,
325 |             headers=self._headers,
326 |             body=self._body,
327 |         )
328 |
329 |     ...
330 | ```
331 |
332 | The server also needs a small modification to create a `Response` object and
333 | pass its encoded value to the `asyncio.Transport`.
334 |
335 | `server.py`
336 | ```python
337 | from .request import Request
338 | from .response import Response
339 | ...
340 | class Server(asyncio.Protocol, HttpParserMixin):
341 |     ...
342 |
343 |     def __init__(self, loop):
344 |         ...
345 |         self._request = None
346 |         self._request_class = Request
347 |
348 |     ...
349 |
350 |     def data_received(self, data):
351 |         self._request_parser.feed_data(data)
352 |         # on_message_complete has set self._request, assuming the whole request arrived in this chunk.
353 |         resp = Response(body=f"Received request on {self._request.url}")
354 |         self._transport.write(str(resp).encode(self._encoding))
355 |
356 |         self._transport.close()
357 | ```
358 |
359 | Now, running our `server.py`, we will see `Received request on /path` in response to the curl call `http://localhost:8080/path`.
360 |
361 | ## Application & UrlDispatcher
362 |
363 | At this stage we already have a simple working server that can process HTTP requests, and Request/Response objects for dealing with the request cycle. However, our hand-crafted toolkit still misses a few important concepts. First of all, right now we have only one main request handler. Large applications have lots of them for different routes, so we certainly need a mechanism for registering multiple route handlers.
364 |
365 | So, let's try to build the simplest possible `UrlDispatcher`: just an object with an internal dict that stores a (method, path) tuple as the key and the actual handler as the value. We also need a handler for the situation where a user tries to reach an unrecognized route.
366 |
367 | `router.py`
368 | ```python
369 | from .response import Response
370 |
371 | class UrlDispatcher:
372 |     def __init__(self):
373 |         self._routes = {}
374 |
375 |     async def _not_found(self, request):
376 |         return Response(f"Not found {request.url} on this server", status=404)
377 |
378 |     def add_route(self, method, path, handler):
379 |         self._routes[(method, path)] = handler
380 |
381 |     def resolve(self, request):
382 |         key = (request.method, request.url.path)
383 |         if key not in self._routes:
384 |             return self._not_found
385 |         return self._routes[key]
386 | ```
387 |
388 | Sure, we are missing lots of stuff like parameterized routes, but we will add them later on. For now let's keep it simple.
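
The dispatching idea itself fits in a few lines. The sketch below is a stripped-down illustration that uses plain strings instead of `Request` and `Response` objects; `routes`, `resolve`, `index` and `not_found` are names made up for the example.

```python
import asyncio

routes = {}


async def index(path):
    return f"Hello at {path}"


async def not_found(path):
    return f"Not found {path} on this server"


# Register a handler under a (method, path) key.
routes[("GET", "/")] = index


def resolve(method, path):
    # Fall back to the not-found handler when no route is registered.
    return routes.get((method, path), not_found)


async def main():
    first = await resolve("GET", "/")("/")
    second = await resolve("GET", "/invalid")("/invalid")
    return first, second


first, second = asyncio.run(main())
print(first)   # Hello at /
print(second)  # Not found /invalid on this server
```

Because the key includes the method, a `POST` to `/` also falls through to `not_found` here; telling "wrong method" apart from "unknown path" needs a little more bookkeeping.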
389 |
390 |
391 |
392 | Next, we need an `Application` container that combines all app-related information, because dealing with the underlying `Server` directly would be annoying.
393 |
394 | ```python
395 | import asyncio
396 |
397 | from .router import UrlDispatcher
398 | from .server import Server
399 | from .response import Response
400 |
401 | class Application:
402 |     def __init__(self, loop=None):
403 |         if loop is None:
404 |             loop = asyncio.get_event_loop()
405 |
406 |         self._loop = loop
407 |         self._router = UrlDispatcher()
408 |
409 |     @property
410 |     def loop(self):
411 |         return self._loop
412 |
413 |     @property
414 |     def router(self):
415 |         return self._router
416 |
417 |     def _make_server(self):
418 |         return Server(loop=self._loop, handler=self._handler, app=self)
419 |
420 |     async def _handler(self, request, response_writer):
421 |         """Process incoming request"""
422 |         handler = self._router.resolve(request)
423 |         resp = await handler(request)
424 |
425 |         if not isinstance(resp, Response):
426 |             raise RuntimeError(f"expect Response instance but got {type(resp)}")
427 |
428 |         response_writer(resp)
429 |
430 | ```
431 |
432 | We need to modify our `Server` a bit and add a `response_writer` method that is responsible for passing data to the transport. The initializer should also be changed to accept `handler` and `app` arguments, which will be used to call the corresponding handlers.
433 |
434 | `server.py`
435 | ```python
436 |
437 | class Server(asyncio.Protocol, HttpParserMixin):
438 |     ...
439 |
440 |     def __init__(self, loop, handler, app):
441 |         self._loop = loop
442 |         self._app = app
443 |         self._encoding = "utf-8"
444 |         self._url = self._body = self._transport = None
445 |         self._headers = {}
446 |         self._request_parser = HttpRequestParser(self)
447 |         self._request = None
448 |         self._request_class = Request
449 |         self._request_handler = handler
450 |         self._request_handler_task = None
451 |
452 |     def response_writer(self, response):
453 |         self._transport.write(str(response).encode(self._encoding))
454 |         self._transport.close()
455 |
456 |     ...
457 |
458 | ```
459 |
460 | `http_parser.py`
461 | ```python
462 | class HttpParserMixin:
463 |     def on_body(self, data):
464 |         self._body = data
465 |
466 |     def on_url(self, url):
467 |         self._url = url
468 |
469 |     def on_message_complete(self):
470 |         self._request = self._request_class(
471 |             version=self._request_parser.get_http_version(),
472 |             method=self._request_parser.get_method(),
473 |             url=self._url,
474 |             headers=self._headers,
475 |             body=self._body, app=self._app,
476 |         )
477 |
478 |         self._request_handler_task = self._loop.create_task(
479 |             self._request_handler(self._request, self.response_writer)
480 |         )
481 |
482 |     def on_header(self, header, value):
483 |         header = header.decode(self._encoding)
484 |         self._headers[header] = value.decode(self._encoding)
485 | ```
486 |
487 | Finally, now that the basic functionality is ready and we can register new routes and handlers, let's add a simple helper for actually running our app instance (similar to `web.run_app` in `aiohttp`).
488 |
489 | `application.py`
490 | ```python
491 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
492 |     if loop is None:
493 |         loop = asyncio.get_event_loop()
494 |
495 |     # Pass the factory itself: asyncio calls it once per connection, so every client gets its own Server.
496 |     server = loop.run_until_complete(
497 |         loop.create_server(app._make_server, host=host, port=port)
498 |     )
499 |
500 |     try:
501 |         print(f"Started server on {host}:{port}")
502 |         loop.run_until_complete(server.serve_forever())
503 |     except KeyboardInterrupt:
504 |         server.close()
505 |         loop.run_until_complete(server.wait_closed())
506 |         loop.stop()
507 | ```
508 |
509 | And now it's time to make a simple app with our fresh toolkit.
510 |
511 | `app.py`
512 | ```python
513 | import asyncio
514 |
515 | from .response import Response
516 | from .application import Application, run_app
517 |
518 | app = Application()
519 |
520 | async def handler(request):
521 |     return Response(f"Hello at {request.url}")
522 |
523 | app.router.add_route("GET", "/", handler)
524 |
525 | if __name__ == "__main__":
526 |     run_app(app)
527 |
528 | ```
529 | If you run it and then make a `GET` request to `/`, you will see `Hello at /`, and a `404` response for all other routes. Hooray, we did it! However, there is still plenty of room for improvement.
530 |
531 | ```shell
532 | $ curl 127.0.0.1:8080/
533 | Hello at /
534 |
535 | $ curl 127.0.0.1:8080/invalid
536 | Not found /invalid on this server
537 |
538 | ```
539 |
540 | ## Going further
541 |
542 | So far, we have all the basic functionality up and running, but we still need to change certain things in our "framework". First of all, as we discussed earlier, our router is missing parameterized routes, a "must have" feature of all modern libraries. Next we need to add support for middlewares, another very common and powerful concept. A great thing about `aiohttp` that I am pretty much in love with is its application lifecycle hooks (e.g. `on_startup`, `on_shutdown`, `on_cleanup`), so we certainly should try to implement them as well.
543 |
544 | ## Route params
545 | Currently, our `UrlDispatcher` is pretty lean and works with registered URL paths as plain strings. The first thing we need is support for patterns like `/user/{username}` in our `resolve` method. We also need a `_format_pattern` helper that is responsible for generating the actual regular expression from a parameterized string. As you may have noticed, we also have another helper, `_method_not_allowed`, and methods for simpler definition of `GET`, `POST`, etc. routes.
546 |
547 |
548 | `router.py`
549 | ```python
550 | import re
551 |
552 | from functools import partialmethod
553 |
554 | from .response import Response
555 |
556 | class UrlDispatcher:
557 |     _param_regex = r"{(?P<param>\w+)}"
558 |
559 |     def __init__(self):
560 |         self._routes = {}
561 |
562 |     async def _not_found(self, request):
563 |         return Response(f"Could not find {request.url.raw_path}", status=404)
564 |
565 |     async def _method_not_allowed(self, request):
566 |         return Response(f"{request.method} not allowed for {request.url.raw_path}", status=405)
567 |
568 |     def resolve(self, request):
569 |         method_mismatch = False
570 |         for (method, pattern), handler in self._routes.items():
571 |             match = re.fullmatch(pattern, request.url.raw_path)
572 |             if match is not None and method == request.method:
573 |                 return match.groupdict(), handler
574 |             # Remember whether some route matched the path under a different method.
575 |             method_mismatch = method_mismatch or match is not None
576 |
577 |         handler = self._method_not_allowed if method_mismatch else self._not_found
578 |         return None, handler
579 |
580 |     def _format_pattern(self, path):
581 |         if not re.search(self._param_regex, path):
582 |             return path
583 |
584 |         regex = r""
585 |         last_pos = 0
586 |
587 |         for match in re.finditer(self._param_regex, path):
588 |             regex += path[last_pos: match.start()]
589 |             param = match.group("param")
590 |             regex += r"(?P<%s>\w+)" % param
591 |             last_pos = match.end()
592 |
593 |         return regex + path[last_pos:]  # keep any literal text after the last parameter
594 |
595 |     def add_route(self, method, path, handler):
596 |         pattern = self._format_pattern(path)
597 |         self._routes[(method, pattern)] = handler
598 |
599 |     add_get = partialmethod(add_route, "GET")
600 |
601 |     add_post = partialmethod(add_route, "POST")
602 |
603 |     add_put = partialmethod(add_route, "PUT")
604 |
605 |     add_head = partialmethod(add_route, "HEAD")
606 |
607 |     add_options = partialmethod(add_route, "OPTIONS")
608 | ```
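
The pattern conversion can also be tried in isolation. The following sketch is a compact variant of the same idea built on `re.sub`; the names `PARAM` and `to_regex` are made up for this example and are not part of the guide's code.

```python
import re

PARAM = re.compile(r"{(?P<param>\w+)}")


def to_regex(path):
    # Replace every {name} placeholder with a named group; literal text is kept as-is.
    return PARAM.sub(lambda m: r"(?P<%s>\w+)" % m.group("param"), path)


pattern = to_regex("/user/{username}/posts/{post_id}")
match = re.fullmatch(pattern, "/user/john_doe/posts/42")

print(pattern)            # /user/(?P<username>\w+)/posts/(?P<post_id>\w+)
print(match.groupdict())  # {'username': 'john_doe', 'post_id': '42'}
```

`re.fullmatch` is used so that a pattern must cover the whole path; with `re.match`, a route registered as `/` would also match `/anything`.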
609 |
610 | We also need to modify our application container, because the `resolve` method of `UrlDispatcher` now returns `match_info` and `handler`. So inside `Application._handler` change the following lines.
611 |
612 | `application.py`
613 | ```python
614 | class Application:
615 |     ...
616 |     async def _handler(self, request, response_writer):
617 |         """Process incoming request"""
618 |         match_info, handler = self._router.resolve(request)
619 |
620 |         request.match_info = match_info
621 |
622 |         ...
623 |
624 | ```
625 | ## Middlewares
626 |
627 | For those who aren't familiar with this concept: in simple words, a `middleware` is just a coroutine that can modify the incoming request object or change the response of a handler. It is run on each request to the server. The implementation is pretty trivial for our needs. First of all, we need to add a list of registered middlewares to our `Application` object and change `Application._handler` a little to run through them. Each middleware should work with the result of the previous one in the chain.
628 |
629 | `application.py`
630 | ```python
631 | from functools import partial
632 | ...
633 |
634 | class Application:
635 |     def __init__(self, loop=None, middlewares=None):
636 |         ...
637 |         # Keep a copy of the given middlewares, or start with an empty list.
638 |         self._middlewares = list(middlewares or [])
639 |
640 |     ...
641 |
642 |     async def _handler(self, request, response_writer):
643 |         """Process incoming request"""
644 |         match_info, handler = self._router.resolve(request)
645 |
646 |         request.match_info = match_info
647 |
648 |         if self._middlewares:
649 |             for md in self._middlewares:
650 |                 handler = partial(md, handler=handler)
651 |
652 |         resp = await handler(request)
653 |
654 |         ...
655 | ```
656 |
657 | Now let's try to add a request-logging middleware to our simple application.
658 |
659 | `app.py`
660 | ```python
661 | import asyncio
662 |
663 | from .response import Response
664 | from .application import Application, run_app
665 |
666 | async def log_middleware(request, handler):
667 |     print(f"Received request to {request.url.raw_path}")
668 |     return await handler(request)
669 |
670 | app = Application(middlewares=[log_middleware])
671 |
672 | async def handler(request):
673 |     return Response(f"Hello at {request.url}")
674 |
675 | app.router.add_route("GET", "/", handler)
676 |
677 | if __name__ == "__main__":
678 |     run_app(app)
679 |
680 | ```
681 | If we run it, we should see the `Received request to /` message in the server output for each incoming request.
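
One detail worth knowing is the order in which chained middlewares run. The sketch below reproduces the `partial`-based wrapping loop on its own, with a hypothetical `make_middleware` factory that records calls instead of printing. With this loop, the middleware registered last ends up outermost and therefore runs first.

```python
import asyncio
from functools import partial

calls = []


async def handler(request):
    calls.append("handler")
    return "response"


def make_middleware(name):
    async def middleware(request, handler):
        calls.append(name)
        return await handler(request)

    return middleware


async def main():
    wrapped = handler
    # Same wrapping loop as in Application._handler.
    for md in [make_middleware("first"), make_middleware("second")]:
        wrapped = partial(md, handler=wrapped)
    return await wrapped("request")


result = asyncio.run(main())
print(calls)  # ['second', 'first', 'handler']
```

If you would rather have middlewares run in registration order, iterating over `reversed(...)` when wrapping is one way to get that.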
682 |
683 |
684 | ## App lifecycle hooks
685 |
686 | As the next step, let's add support for running certain coroutines in response to events like starting and stopping the server. It is a pretty neat feature of `aiohttp`. There are many signals, such as `on_startup`, `on_shutdown` and `on_response_prepare`, to name a few, but for our needs let's keep it as simple as possible and just implement `startup` & `shutdown` helpers.
687 |
688 | Inside `Application` we need to add a list of handlers for each event, with proper encapsulation and getters. Then we add the actual `startup` and `shutdown` coroutines and the corresponding calls to the `run_app` helper.
689 |
690 | `application.py`
691 | ```python
692 | class Application:
693 |     def __init__(self, loop=None, middlewares=None):
694 |         ...
695 |         self._on_startup = []
696 |         self._on_shutdown = []
697 |
698 |     ...
699 |
700 |     @property
701 |     def on_startup(self):
702 |         return self._on_startup
703 |
704 |     @property
705 |     def on_shutdown(self):
706 |         return self._on_shutdown
707 |
708 |     async def startup(self):
709 |         coros = [func(self) for func in self._on_startup]
710 |         await asyncio.gather(*coros)  # gather() no longer takes a loop argument (removed in Python 3.10)
711 |
712 |     async def shutdown(self):
713 |         coros = [func(self) for func in self._on_shutdown]
714 |         await asyncio.gather(*coros)
715 |
716 |     ...
717 |
718 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
719 |     if loop is None:
720 |         loop = asyncio.get_event_loop()
721 |
722 |     # asyncio calls the factory once per connection, so every client gets its own Server.
723 |
724 |     loop.run_until_complete(app.startup())
725 |
726 |     server = loop.run_until_complete(
727 |         loop.create_server(app._make_server, host=host, port=port)
728 |     )
729 |
730 |     try:
731 |         print(f"Started server on {host}:{port}")
732 |         loop.run_until_complete(server.serve_forever())
733 |     except KeyboardInterrupt:
734 |         loop.run_until_complete(app.shutdown())
735 |         server.close()
736 |         loop.run_until_complete(server.wait_closed())
737 |         loop.stop()
738 | ```
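
The hook mechanism can be exercised without a server at all. This is a minimal sketch with a hypothetical `App` class that only keeps the startup part; `open_db` and `warm_cache` are placeholder hooks invented for the example.

```python
import asyncio


class App:
    def __init__(self):
        self.on_startup = []
        self.state = {}

    async def startup(self):
        # Every hook receives the app, and all hooks are awaited together.
        await asyncio.gather(*(hook(self) for hook in self.on_startup))


async def open_db(app):
    app.state["db"] = "connected"


async def warm_cache(app):
    app.state["cache"] = "warm"


app = App()
app.on_startup.extend([open_db, warm_cache])
asyncio.run(app.startup())
print(app.state)
```

Because `asyncio.gather` runs the hooks concurrently, a hook should not rely on another hook having finished first; awaiting them one by one in a loop would be the alternative if ordering matters.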
739 |
740 | ## Better exceptions
741 |
742 | At this step we have most of the core features added; however, we still lack exception handling. A great feature of `aiohttp` is that it allows you to work with web exceptions as native Python exceptions.
743 | This is done by inheriting from both the `Exception` and `Response` classes, and it is a really flexible mechanism that we would like to have as well.
744 |
745 | So, first let's create our base `HTTPException` class and a few helpers based on it that we might need: `HTTPNotFound` for unrecognized paths, `HTTPBadRequest` for user-side issues and `HTTPFound` for redirecting.
746 |
747 | ```python
748 | from .response import Response
749 |
750 | class HTTPException(Response, Exception):
751 |     status_code = None
752 |
753 |     def __init__(self, reason=None, content_type=None):
754 |         self._reason = reason
755 |         self._content_type = content_type
756 |
757 |         Response.__init__(
758 |             self,
759 |             body=self._reason,
760 |             status=self.status_code,
761 |             content_type=self._content_type or "text/plain",
762 |         )
763 |
764 |         Exception.__init__(self, self._reason)
765 |
766 |
767 | class HTTPNotFound(HTTPException):
768 |     status_code = 404
769 |
770 |
771 | class HTTPBadRequest(HTTPException):
772 |     status_code = 400
773 |
774 |
775 | class HTTPFound(HTTPException):
776 |     status_code = 302
777 |
778 |     def __init__(self, location, reason=None, content_type=None):
779 |         super().__init__(reason=reason, content_type=content_type)
780 |         self.add_header("Location", location)
781 | ```
782 |
783 | Then we need to modify our `Application._handler` a bit to actually catch web exceptions.
784 |
785 | `application.py`
786 | ```python
787 | class Application:
788 |     ...
789 |     async def _handler(self, request, response_writer):
790 |         """Process incoming request"""
791 |         try:
792 |             match_info, handler = self._router.resolve(request)
793 |
794 |             request.match_info = match_info
795 |
796 |             if self._middlewares:
797 |                 for md in self._middlewares:
798 |                     handler = partial(md, handler=handler)
799 |
800 |             resp = await handler(request)
801 |         except HTTPException as exc:
802 |             resp = exc
803 |
804 | ...
805 | ```
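
The whole raise-and-catch flow can be seen in a self-contained sketch. The classes below are simplified stand-ins for the guide's `Response` and `HTTPException` (only `body` and `status` are kept), and `dispatch` is a hypothetical reduction of `Application._handler`.

```python
import asyncio


class Response:
    def __init__(self, body=None, status=200):
        self.body = body
        self.status = status


class HTTPException(Response, Exception):
    status_code = None

    def __init__(self, reason=None):
        Response.__init__(self, body=reason, status=self.status_code)
        Exception.__init__(self, reason)


class HTTPNotFound(HTTPException):
    status_code = 404


async def handler(request):
    raise HTTPNotFound(reason="no such page")


async def dispatch(request):
    try:
        resp = await handler(request)
    except HTTPException as exc:
        # The exception object is itself a valid response.
        resp = exc
    return resp


resp = asyncio.run(dispatch("request"))
print(resp.status, resp.body)  # 404 no such page
```

Since the caught exception is also a `Response`, it passes the `isinstance(resp, Response)` check and can be written to the transport like any other response.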
806 |
807 | Now we can also drop the `_not_found` & `_method_not_allowed` helpers from our `UrlDispatcher` and instead just raise proper exceptions.
808 |
809 | `router.py`
810 | ```python
811 | class UrlDispatcher:
812 |     ...
813 |     def resolve(self, request):
814 |         method_mismatch = False
815 |         for (method, pattern), handler in self._routes.items():
816 |             match = re.fullmatch(pattern, request.url.raw_path)
817 |             if match is not None and method == request.method:
818 |                 return match.groupdict(), handler
819 |             method_mismatch = method_mismatch or match is not None
820 |
821 |         if method_mismatch:
822 |             raise HTTPBadRequest(reason=f"{request.method} not allowed for {request.url.raw_path}")
823 |         raise HTTPNotFound(reason=f"Could not find {request.url.raw_path}")
824 |
825 | ...
826 | ```
827 |
828 | Another good addition is a standard formatted response for internal server errors, because we don't want an unexpected error to break the app. Let's add a simple HTML template as well as a tiny helper for formatting exceptions.
829 |
830 | `helpers.py`
831 | ```python
832 | import traceback
833 |
834 | from .response import Response
835 |
836 | server_exception_templ = """
837 | <div>
838 |     <h1>500 Internal server error</h1>
839 |     <span>Server got itself in trouble : <b>{exc}</b></span>
840 |     <p>{traceback}</p>
841 | </div>
842 | """
843 |
844 |
845 | def format_exception(exc):
846 |     resp = Response(status=500, content_type="text/html")
847 |     trace = traceback.format_exc().replace("\n", "<br>")
848 |     msg = server_exception_templ.format(exc=str(exc), traceback=trace)
849 |     resp.add_body(msg)
850 |     return resp
851 | ```
852 |
853 | As simple as that. Now we just catch all `Exception`s inside our `Application._handler` and generate the actual HTML response with our helper.
854 |
855 | `application.py`
856 | ```python
857 | class Application:
858 |     ...
859 |     async def _handler(self, request, response_writer):
860 |         """Process incoming request"""
861 |         try:
862 |             match_info, handler = self._router.resolve(request)
863 |
864 |             request.match_info = match_info
865 |
866 |             if self._middlewares:
867 |                 for md in self._middlewares:
868 |                     handler = partial(md, handler=handler)
869 |
870 |             resp = await handler(request)
871 |         except HTTPException as exc:
872 |             resp = exc
873 |         except Exception as exc:
874 |             resp = format_exception(exc)
875 |         ...
876 | ```
877 |
878 | ## Graceful shutdown
879 | As a final touch, we need to add signal handling to shut down our application properly. So, let's change `run_app` as follows.
880 |
881 | `application.py`
882 | ```python
883 | import signal
884 | ...
885 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
886 |     if loop is None:
887 |         loop = asyncio.get_event_loop()
888 |
889 |     # asyncio calls the factory once per connection, so every client gets its own Server.
890 |
891 |     loop.run_until_complete(app.startup())
892 |
893 |     server = loop.run_until_complete(
894 |         loop.create_server(app._make_server, host=host, port=port)
895 |     )
896 |
897 |     loop.add_signal_handler(
898 |         signal.SIGTERM, lambda: asyncio.ensure_future(app.shutdown())
899 |     )
900 |
901 |     ...
902 | ```
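
To see `add_signal_handler` in action without a server, here is a small sketch that sends `SIGTERM` to its own process and waits for the handler to fire. It assumes a Unix-like system and that the code runs in the main thread, since `loop.add_signal_handler` is not available on Windows; using an `asyncio.Event` as the shutdown flag is a choice made for this example only.

```python
import asyncio
import os
import signal


async def main():
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()

    # The handler runs inside the event loop, so it may touch asyncio objects safely.
    loop.add_signal_handler(signal.SIGTERM, stop.set)

    # Simulate an external `kill <pid>` shortly after startup.
    loop.call_later(0.05, os.kill, os.getpid(), signal.SIGTERM)

    await asyncio.wait_for(stop.wait(), timeout=2)
    loop.remove_signal_handler(signal.SIGTERM)
    return "shutdown requested"


result = asyncio.run(main())
print(result)
```

In a real runner, the code after `stop.wait()` would be the place to await the shutdown hooks and close the server before the loop exits.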
903 |
904 | ## Sample application
905 | Now that we have our toolkit ready, let's complete our previous sample application with the lifecycle hooks and exceptions that we just added.
906 |
907 | `app.py`
908 | ```python
909 | from .application import Application, run_app
910 | from .exceptions import HTTPNotFound  # assumption: the exception classes above live in exceptions.py
911 | from .response import Response
912 |
913 | async def on_startup(app):
914 |     # you may query here actual db, but for an example let's just use simple set.
915 |     app.db = {"john_doe"}
916 |
917 | async def log_middleware(request, handler):
918 |     print(f"Received request to {request.url.raw_path}")
919 |     return await handler(request)
920 |
921 | async def handler(request):
922 |     username = request.match_info["username"]
923 |     if username not in request.app.db:
924 |         raise HTTPNotFound(reason=f"No such user as {username} :(")
925 |
926 |     return Response(f"Welcome, {username}!")
927 |
928 | app = Application(middlewares=[log_middleware])
929 | app.on_startup.append(on_startup)
930 | app.router.add_get("/{username}", handler)
931 |
932 | if __name__ == "__main__":
933 |     run_app(app)
934 | ```
935 |
936 | If we did everything properly, you will see a log message on each request, a welcome message in response to a registered user, and `HTTPNotFound` for unregistered users and unrecognized paths.
937 |
938 | ## Conclusion
939 |
940 | Summing it up, in ~500 lines we hand-crafted a pretty simple yet powerful micro framework inspired by `aiohttp` & `sanic`. Of course, it is not production-ready software, as it still misses lots of useful and important features such as a more robust server, HTTP support that fully follows the specification, and WebSockets, to name a few. However, I believe that through this process we developed a better understanding of how such tools are built. As the famous physicist Richard Feynman said, “What I cannot create, I do not understand”. So I hope you enjoyed this guide, see ya! :wave:
941 |
942 |
943 |
--------------------------------------------------------------------------------
/README_CN.md:
--------------------------------------------------------------------------------
1 |
2 |

3 |
4 |
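5 | ### Introduction
6 |
7 | Asynchronous programming has become increasingly popular in the Python community over the last few years. Asynchronous libraries such as `aiohttp` have seen remarkable growth in usage, because they handle a large number of concurrent connections while keeping code readable and simple. Not long ago, Django also [committed](https://docs.djangoproject.com/en/dev/releases/3.0/#asgi-support) to adding async support in its next major version. All signs suggest that asynchronous Python has a bright future. However, for many developers who are used to the standard blocking model, the way these tools work can be confusing. In this short guide I build a simplified `aiohttp` clone from scratch, going behind the scenes to clarify how asynchronous programming in Python works. We will start from a basic example in the official documentation and gradually add the functionality we care about. Let's get started!
8 |
9 | This guide assumes that you already have a basic understanding of [asyncio](https://docs.python.org/3/library/asyncio.html). If you need a refresher, these articles may help: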
10 |
11 | - [Intro to asyncio](https://www.blog.pythonlibrary.org/2016/07/26/python-3-an-intro-to-asyncio)
12 | - [Understanding asynchronous programming in Python](https://dbader.org/blog/understanding-asynchronous-programming-in-python)
13 |
14 | If you can't wait, the final source code is available here: [`hzlmn/sketch`](https://github.com/hzlmn/sketch)
15 |
16 | ## Related projects
17 |
18 | - [500 Lines or Less](https://github.com/aosabook/500lines)
19 |
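20 | ## Table of contents :book:
21 |
22 | * [Asyncio low-level APIs: Transports and Protocols](#asyncio-low-level-apis-transports-and-protocols)
23 | * [Implementing the protocol on the server](#implementing-the-protocol-on-the-server)
24 | * [Request/Response objects](#requestresponse-objects)
25 | * [Application and UrlDispatcher](#application-and-urldispatcher)
26 | * [Going further](#going-further)
27 | * [Route parameters](#route-parameters)
28 | * [Middlewares](#middlewares)
29 | * [App lifecycle hooks](#app-lifecycle-hooks)
30 | * [Better exception handling](#better-exception-handling)
31 | * [Graceful shutdown](#graceful-shutdown)
32 | * [Sample application](#sample-application)
33 | * [Conclusion](#conclusion)
34 |
35 | ## Asyncio low-level APIs: Transports and Protocols
36 |
37 | `Asyncio` went through a long evolution to become what it is today. It started out as a low-level tool named "tulip", and back then writing high-level applications was not as pleasant as it is now.
38 |
39 | In most cases today `asyncio` is used through its high-level API, but the library also provides low-level helpers that let library authors manage event loops and implement network or inter-process communication protocols.
40 |
41 | Out of the box, `asyncio` supports only `TCP`, `UDP`, `SSL` and subprocesses. Other async libraries build the higher-level protocols they need, such as `HTTP` and `FTP`, on top of the base transports and programming interfaces that `asyncio` provides.
42 |
43 | All communication is done by chaining Transports and Protocols. Simply put, a Transport describes how we transfer data, while a Protocol decides which data to transfer.
44 |
45 | `asyncio` has excellent official documentation on Transports and Protocols; you can read it [here](https://docs.python.org/3.8/library/asyncio-protocol.html#asyncio-transport) to learn more.
46 |
47 | As the first step of the project, let's write a simple `TCP` echo server.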
48 |
49 | `server.py`
50 |
51 | ```python
52 | import asyncio
53 |
54 | class Server(asyncio.Protocol):
55 |     def connection_made(self, transport):
56 |         self._transport = transport
57 |
58 |     def data_received(self, data):
59 |         message = data.decode()
60 |
61 |         self._transport.write(data)
62 |
63 |         self._transport.close()
64 |
65 | loop = asyncio.get_event_loop()
66 |
67 | coro = loop.create_server(Server, '127.0.0.1', 8080)
68 | server = loop.run_until_complete(coro)
69 |
70 | try:
71 |     loop.run_forever()
72 | except KeyboardInterrupt:
73 |     pass
74 |
75 | server.close()
76 | loop.run_until_complete(server.wait_closed())
77 | loop.close()
78 | ```
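To see the Transport/Protocol pair work without leaving Python, the following sketch starts the same kind of echo protocol on a free local port and talks to it with a stream client. `EchoProtocol` and `echo_once` are names made up for this illustration; they are not part of the guide's code.

```python
import asyncio


class EchoProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        self._transport = transport

    def data_received(self, data):
        # Send the bytes straight back, then hang up.
        self._transport.write(data)
        self._transport.close()


async def echo_once(message: bytes) -> bytes:
    loop = asyncio.get_running_loop()
    # Port 0 asks the OS for any free port.
    server = await loop.create_server(EchoProtocol, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(message)
        await writer.drain()
        reply = await reader.read()  # read until the server closes
        writer.close()
        await writer.wait_closed()
        return reply
    finally:
        server.close()
        await server.wait_closed()


if __name__ == "__main__":
    print(asyncio.run(echo_once(b"ping")))
```

Note that `create_server` receives the protocol class itself as a factory, so every connection gets its own protocol instance.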
79 |
80 | ```shell
81 | $ curl http://127.0.0.1:8080
82 | GET / HTTP/1.1
83 | Host: 127.0.0.1:8080
84 | User-Agent: curl/7.54.0
85 | Accept: */*
86 | ```
87 |
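88 | As the example above shows, the code for an asynchronous server is very simple. However, it is not quite enough if you want to build a higher-level application.
89 |
90 | Since `HTTP` works on top of `TCP`, we can already send `HTTP` requests to our server. However, receiving and working with raw, unparsed `HTTP` messages is clearly difficult, so our next step is to add a better `HTTP` handling mechanism.
91 |
92 |
93 | ## Implementing the protocol on the server
94 |
95 | Let's add `HTTP` request parsing to the server, so that we can extract and use information such as the headers, body and path. Parsing `HTTP` requests is a complex topic that is far beyond the scope of this guide, so we will use [httptools](https://github.com/MagicStack/httptools) directly; it is a fast, compatible and quite flexible `HTTP` parser.
96 |
97 | `aiohttp`, for comparison, ships its own Python-based `HTTP` parser as well as a binding to Node's [`http-parser`](https://github.com/nodejs/http-parser/tree/77310eeb839c4251c07184a5db8885a572a08352).
98 |
99 | Next, we need to implement a parser class to combine with our server class.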
100 |
101 | `http_parser.py`
102 |
103 | ```python
104 | class HttpParserMixin:
105 |     def on_body(self, data):
106 |         self._body = data
107 |
108 |     def on_url(self, url):
109 |         self._url = url
110 |
111 |     def on_message_complete(self):
112 |         print(f"Received request to {self._url.decode(self._encoding)}")
113 |
114 |     def on_header(self, header, value):
115 |         header = header.decode(self._encoding)
116 |         self._headers[header] = value.decode(self._encoding)
117 | ```
118 |
119 | With the `HttpParserMixin` parser class in place, let's combine it with our `Server` class.
120 |
121 | `server.py`
122 |
123 | ```python
124 | import asyncio
125 |
126 | from httptools import HttpRequestParser
127 |
128 | from .http_parser import HttpParserMixin
129 |
130 | class Server(asyncio.Protocol, HttpParserMixin):
131 |     def __init__(self, loop):
132 |         self._loop = loop
133 |         self._encoding = "utf-8"
134 |         self._url = None
135 |         self._headers = {}
136 |         self._body = None
137 |         self._transport = None
138 |         self._request_parser = HttpRequestParser(self)
139 |
140 |     def connection_made(self, transport):
141 |         self._transport = transport
142 |
143 |     def connection_lost(self, *args):
144 |         self._transport = None
145 |
146 |     def data_received(self, data):
147 |         # Pass data to our parser
148 |         self._request_parser.feed_data(data)
149 | ```
150 |
151 | Now we finally have a server that can parse incoming `HTTP` requests and extract the important information from them. Let's run it.
152 |
153 | `server.py`
154 |
155 | ```python
156 | if __name__ == "__main__":
157 |     loop = asyncio.get_event_loop()
158 |     serv = Server(loop)
159 |     server = loop.run_until_complete(loop.create_server(lambda: serv, port=8080))
160 |
161 |     try:
162 |         print("Started server on ::8080")
163 |         loop.run_until_complete(server.serve_forever())
164 |     except KeyboardInterrupt:
165 |         server.close()
166 |         loop.run_until_complete(server.wait_closed())
167 |         loop.stop()
168 |
169 | ```
170 |
171 | ```sh
172 | > python server.py
173 | Started server on ::8080
174 | ```
175 |
176 | ```sh
177 | > curl http://127.0.0.1:8080/hello
178 | ```
179 |
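180 | ## Request/Response objects
181 |
182 | At this point we have a server that can parse `HTTP` requests. To build applications, though, we need some further abstractions.
183 |
184 | Let's create a `Request` class that groups all the information about an `HTTP` request. Make sure the `yarl` library is installed; we will use it to handle URLs.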
185 |
186 | `request.py`
187 |
188 | ```python
189 | import json
190 | from yarl import URL
191 |
192 | class Request:
193 |     _encoding = "utf_8"
194 |
195 |     def __init__(self, method, url, headers, version=None, body=None, app=None):
196 |         self._version = version
197 |         self._method = method.decode(self._encoding)
198 |         self._url = URL(url.decode(self._encoding))
199 |         self._headers = headers
200 |         self._body = body
201 |         self.app = app  # exposed to handlers as request.app
202 |
203 |     @property
204 |     def method(self):
205 |         return self._method
206 |
207 |     @property
208 |     def url(self):
209 |         return self._url
210 |
211 |     @property
212 |     def headers(self):
213 |         return self._headers
214 |
215 |     def text(self):
216 |         if self._body is not None:
217 |             return self._body.decode(self._encoding)
218 |
219 |     def json(self):
220 |         text = self.text()
221 |         if text is not None:
222 |             return json.loads(text)
223 |
224 |     def __repr__(self):
225 |         return f"<Request {self._method} {self._url}>"
226 | ```
227 |
228 | Next, we need a structure that lets us describe an `HTTP` response in a programmer-friendly way and turn it into a raw `HTTP` message that `asyncio.Transport` can handle.
229 |
230 | `response.py`
231 |
232 | ```python
233 | import http.server
234 |
235 | web_responses = http.server.BaseHTTPRequestHandler.responses
236 |
237 | class Response:
238 |     _encoding = "utf-8"
239 |
240 |     def __init__(
241 |         self,
242 |         body=None,
243 |         status=200,
244 |         content_type="text/plain",
245 |         headers=None,
246 |         version="1.1",
247 |     ):
248 |         self._version = version
249 |         self._status = status
250 |         self._body = body
251 |         self._content_type = content_type
252 |         if headers is None:
253 |             headers = {}
254 |         self._headers = headers
255 |
256 |     @property
257 |     def body(self):
258 |         return self._body
259 |
260 |     @property
261 |     def status(self):
262 |         return self._status
263 |
264 |     @property
265 |     def content_type(self):
266 |         return self._content_type
267 |
268 |     @property
269 |     def headers(self):
270 |         return self._headers
271 |
272 |     def add_body(self, data):
273 |         self._body = data
274 |
275 |     def add_header(self, key, value):
276 |         self._headers[key] = value
277 |
278 |     def __str__(self):
279 |         """Generate the raw HTTP response. Our handlers use this to build
280 |         the message that is passed to the TCP transport.
281 |         """
282 |         status_msg, _ = web_responses.get(self._status)
283 |
284 |         messages = [
285 |             f"HTTP/{self._version} {self._status} {status_msg}",
286 |             f"Content-Type: {self._content_type}",
287 |             f"Content-Length: {len((self._body or '').encode(self._encoding))}",
288 |         ]
289 |
290 |         if self.headers:
291 |             for header, value in self.headers.items():
292 |                 messages.append(f"{header}: {value}")
293 |
294 |         # A blank line always terminates the header section, even without a body.
295 |         messages.append("\r\n" + (self._body or ""))
296 |
297 |         return "\r\n".join(messages)
298 |
299 |     def __repr__(self):
300 |         return f"<Response {self._status}>"
301 | ```
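To make the wire format concrete, here is a small stand-alone sketch of the same serialization step. `render_response` is a hypothetical helper, not part of the guide's `Response` class, and it takes the reason phrase from the standard library's `http.HTTPStatus`.

```python
from http import HTTPStatus


def render_response(status=200, body="", headers=None, content_type="text/plain"):
    """Build the raw bytes of an HTTP/1.1 response."""
    payload = body.encode("utf-8")
    lines = [
        f"HTTP/1.1 {status} {HTTPStatus(status).phrase}",
        f"Content-Type: {content_type}",
        # Content-Length counts bytes, not characters.
        f"Content-Length: {len(payload)}",
    ]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; an empty line separates them from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("utf-8") + payload


if __name__ == "__main__":
    print(render_response(200, "Hello"))
```

Measuring the encoded payload matters as soon as the body contains non-ASCII characters, where the character count and the byte count differ.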
302 |
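303 | As shown above, the code is quite simple. We encapsulate all the data and provide getters for its properties. We also defined some helper methods we will use later for `text` and `json` bodies. The next task is to update the server so that it creates a `Request` object from the incoming message.
304 |
305 | The `Request` object should be created once the whole request has been parsed, so we add that to the parser class's `on_message_complete` event handler.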
306 |
307 | `http_parser.py`
308 |
309 | ```python
310 | class HttpParserMixin:
311 |     ...
312 |
313 |     def on_message_complete(self):
314 |         self._request = self._request_class(
315 |             version=self._request_parser.get_http_version(),
316 |             method=self._request_parser.get_method(),
317 |             url=self._url,
318 |             headers=self._headers,
319 |             body=self._body,
320 |         )
321 |
322 |     ...
323 | ```
324 |
325 | The `Server` class also needs a change so that it creates a `Response` object and passes the encoded message to `asyncio.Transport`.
326 |
327 | `server.py`
328 |
329 | ```python
330 | from .response import Response
331 | ...
332 |
333 | class Server(asyncio.Protocol, HttpParserMixin):
334 |     ...
335 |
336 |     def __init__(self, loop):
337 |         ...
338 |         self._request = None
339 |         self._request_class = Request
340 |
341 |     ...
342 |
343 |     def data_received(self, data):
344 |         self._request_parser.feed_data(data)
345 |
346 |         resp = Response(body=f"Received request on {self._request.url}")
347 |         self._transport.write(str(resp).encode(self._encoding))
348 |
349 |         self._transport.close()
350 | ```
351 |
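352 | Now if we run `server.py` again, we can request `http://localhost:8080/path` with curl and see `Received request on /path` in the response.
353 |
354 | ## Application and UrlDispatcher
355 |
356 | At this stage we have a server that parses `HTTP` requests, plus Request/Response objects that handle the request cycle. However, our hand-written toolkit still lacks some important concepts. First of all, we only have a single main request handler, whereas a larger application needs many handlers for different routes. So we need a mechanism for registering a handler per route.
357 |
358 | Let's implement the simplest possible `UrlDispatcher` with a built-in dictionary. Its keys are (method, path) tuples and its values are handlers. We also need a separate handler for requests whose route is not recognized.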
359 |
360 | `router.py`
361 |
362 | ```python
363 | from .response import Response
364 |
365 | class UrlDispatcher:
366 |     def __init__(self):
367 |         self._routes = {}
368 |
369 |     async def _not_found(self, request):
370 |         return Response(f"Not found {request.url} on this server", status=404)
371 |
372 |     def add_route(self, method, path, handler):
373 |         self._routes[(method, path)] = handler
374 |
375 |     def resolve(self, request):
376 |         key = (request.method, request.url.path)
377 |         if key not in self._routes:
378 |             return self._not_found
379 |         return self._routes[key]
380 | ```
381 |
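382 | Of course, we are still missing a lot, such as parametrized routes. We will add them later; for now let's keep things as simple as possible.
383 |
384 | Interacting directly with the low-level `Server` is cumbersome, so next we need an `Application` container that groups all application-related information.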
385 |
386 | ```python
387 | import asyncio
388 |
389 | from .router import UrlDispatcher
390 | from .server import Server
391 | from .response import Response
392 |
393 | class Application:
394 |     def __init__(self, loop=None):
395 |         if loop is None:
396 |             loop = asyncio.get_event_loop()
397 |
398 |         self._loop = loop
399 |         self._router = UrlDispatcher()
400 |
401 |     @property
402 |     def loop(self):
403 |         return self._loop
404 |
405 |     @property
406 |     def router(self):
407 |         return self._router
408 |
409 |     def _make_server(self):
410 |         return Server(loop=self._loop, handler=self._handler, app=self)
411 |
412 |     async def _handler(self, request, response_writer):
413 |         """Process incoming request"""
414 |         handler = self._router.resolve(request)
415 |         resp = await handler(request)
416 |
417 |         if not isinstance(resp, Response):
418 |             raise RuntimeError(f"expect Response instance but got {type(resp)}")
419 |
420 |         response_writer(resp)
421 |
422 | ```
423 |
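424 | We need to modify `Server` a little and add a `response_writer` method that passes data to the transport. We also add `handler` and `app` parameters to the `Server` constructor; they will be used to call the corresponding handlers.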
425 |
426 | `server.py`
427 |
428 | ```python
429 | class Server(asyncio.Protocol, HttpParserMixin):
430 |     ...
431 |     def __init__(self, loop, handler, app):
432 |         self._loop = loop
433 |         self._app = app
434 |         self._encoding = "utf-8"
435 |         self._url = None
436 |         self._headers = {}
437 |         self._body = None
438 |         self._transport = None
439 |         self._request_parser = HttpRequestParser(self)
440 |         self._request = None
441 |         self._request_class = Request
442 |         self._request_handler = handler
443 |         self._request_handler_task = None
444 |
445 |     def response_writer(self, response):
446 |         self._transport.write(str(response).encode(self._encoding))
447 |         self._transport.close()
448 |     ...
449 |
450 | ```
451 |
452 | `http_parser.py`
453 |
454 | ```python
455 | class HttpParserMixin:
456 |     def on_body(self, data):
457 |         self._body = data
458 |
459 |     def on_url(self, url):
460 |         self._url = url
461 |
462 |     def on_message_complete(self):
463 |         self._request = self._request_class(
464 |             version=self._request_parser.get_http_version(),
465 |             method=self._request_parser.get_method(),
466 |             url=self._url,
467 |             headers=self._headers,
468 |             body=self._body,
469 |             app=getattr(self, "_app", None),  # the app handed to the Server, if any
470 |         )
471 |         self._request_handler_task = self._loop.create_task(
472 |             self._request_handler(self._request, self.response_writer)
473 |         )
474 |
475 |     def on_header(self, header, value):
476 |         header = header.decode(self._encoding)
477 |         self._headers[header] = value.decode(self._encoding)
478 | ```
479 |
480 |
481 |
482 | Finally, the basic functionality is done and we can register new routes and handlers. Next, let's write a simple helper that runs our application instance (like `web.run_app` in `aiohttp`).
483 |
484 | `application.py`
485 |
486 | ```python
487 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
488 |     if loop is None:
489 |         loop = asyncio.get_event_loop()
490 |
491 |     serv = app._make_server()
492 |     server = loop.run_until_complete(
493 |         loop.create_server(lambda: serv, host=host, port=port)
494 |     )
495 |
496 |     try:
497 |         print(f"Started server on {host}:{port}")
498 |         loop.run_until_complete(server.serve_forever())
499 |     except KeyboardInterrupt:
500 |         server.close()
501 |         loop.run_until_complete(server.wait_closed())
502 |         loop.stop()
503 | ```
504 |
505 | Now it's time to create a simple application with our newly built toolkit.
506 |
507 | `app.py`
508 |
509 | ```python
510 | import asyncio
511 |
512 | from .response import Response
513 | from .application import Application, run_app
514 |
515 | app = Application()
516 |
517 | async def handler(request):
518 |     return Response(f"Hello at {request.url}")
519 |
520 | app.router.add_route("GET", "/", handler)
521 |
522 | if __name__ == "__main__":
523 |     run_app(app)
524 |
525 | ```
526 |
527 | If you run the program and send a `GET` request to `/`, you will see the `Hello at /` response. If you access any other route, you will receive a `404` response.
528 |
529 | ```shell
530 | $ curl 127.0.0.1:8080/
531 | Hello at /
532 |
533 | $ curl 127.0.0.1:8080/invalid
534 | Not found /invalid on this server
535 |
536 | ```
537 |
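538 | Nice, we got it working! That said, there is still a lot to improve in this project.
539 |
540 | ## Going further
541 |
542 | So far we have all the basics up and running, but some things in our "framework" still need improvement. First, as mentioned earlier, our router lacks parametrized routes, a feature every modern framework must have. Then we need to add support for middlewares, a very common and powerful concept. Finally, one of the `aiohttp` features I like most is the application lifecycle hooks (such as `on_startup`, `on_shutdown`, `on_cleanup`), so we should try to implement them as well.
543 |
544 |
545 |
546 | ## Route parameters
547 |
548 | Currently our `UrlDispatcher` is very lean and treats registered URL paths as plain strings. The first thing we need is support for patterns like `/user/{username}` in the `resolve` method. We also need a `_format_pattern` helper that generates an actual regular expression from a parametrized string. As you may notice, we also define a `_method_not_allowed` helper and a few methods for registering simple `GET`, `POST`, etc. routes.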
549 |
550 | `router.py`
551 |
552 | ```python
553 | import re
554 |
555 | from functools import partialmethod
556 |
557 | from .response import Response
558 |
559 | class UrlDispatcher:
560 |     _param_regex = r"{(?P<param>\w+)}"
561 |
562 |     def __init__(self):
563 |         self._routes = {}
564 |
565 |     async def _not_found(self, request):
566 |         return Response(f"Could not find {request.url.raw_path}", status=404)
567 |
568 |     async def _method_not_allowed(self, request):
569 |         return Response(f"{request.method} not allowed for {request.url.raw_path}", status=405)
570 |
571 |     def resolve(self, request):
572 |         fallback = self._not_found
573 |         for (method, pattern), handler in self._routes.items():
574 |             match = re.fullmatch(pattern, request.url.raw_path)
575 |             if match is None:
576 |                 continue
577 |             if method == request.method:
578 |                 return match.groupdict(), handler
579 |             # The path matched, but under a different HTTP method.
580 |             fallback = self._method_not_allowed
581 |         return None, fallback
582 |
583 |     def _format_pattern(self, path):
584 |         if not re.search(self._param_regex, path):
585 |             return path
586 |
587 |         regex = r""
588 |         last_pos = 0
589 |
590 |         for match in re.finditer(self._param_regex, path):
591 |             regex += path[last_pos: match.start()]
592 |             param = match.group("param")
593 |             regex += r"(?P<%s>\w+)" % param
594 |             last_pos = match.end()
595 |
596 |         return regex + path[last_pos:]  # keep any literal text after the last parameter
597 |
598 |     def add_route(self, method, path, handler):
599 |         pattern = self._format_pattern(path)
600 |         self._routes[(method, pattern)] = handler
601 |
602 |     add_get = partialmethod(add_route, "GET")
603 |
604 |     add_post = partialmethod(add_route, "POST")
605 |
606 |     add_put = partialmethod(add_route, "PUT")
607 |
608 |     add_head = partialmethod(add_route, "HEAD")
609 |
610 |     add_options = partialmethod(add_route, "OPTIONS")
611 | ```
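The pattern translation is easier to follow in isolation. The sketch below is a simplified variant that assumes we also want literal path segments escaped and the whole path matched; `compile_route` and `match_route` are hypothetical helper names, not methods of `UrlDispatcher`.

```python
import re

# Matches placeholders such as {username}.
PARAM = re.compile(r"{(?P<param>\w+)}")


def compile_route(path):
    """Turn '/user/{username}' into a compiled regular expression."""
    regex = ""
    last_pos = 0
    for match in PARAM.finditer(path):
        # Literal text between placeholders is escaped as-is.
        regex += re.escape(path[last_pos:match.start()])
        regex += r"(?P<%s>\w+)" % match.group("param")
        last_pos = match.end()
    regex += re.escape(path[last_pos:])
    return re.compile(regex)


def match_route(pattern, raw_path):
    """Return the captured parameters, or None when the path does not match."""
    match = pattern.fullmatch(raw_path)
    return None if match is None else match.groupdict()


if __name__ == "__main__":
    route = compile_route("/user/{username}")
    print(match_route(route, "/user/john_doe"))
```

Using `fullmatch` instead of `match` keeps a route like `/` from accidentally matching every path that merely starts with a slash.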
612 |
613 | We also need to update the `Application` container, because `UrlDispatcher.resolve` now returns `match_info` together with the matching `handler`. Change the following lines in `Application._handler`.
614 |
615 | `application.py`
616 |
617 | ```python
618 | class Application:
619 |     ...
620 |     async def _handler(self, request, response_writer):
621 |         """Process incoming request"""
622 |         match_info, handler = self._router.resolve(request)
623 |
624 |         request.match_info = match_info
625 |
626 |         ...
627 |
628 | ```
629 |
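630 | ## Middlewares
631 |
632 | Some readers may be unfamiliar with the concept of middleware. Simply put, a middleware is a coroutine that runs before the request reaches the handler and can modify the `Request` object passed to the handler or the `Response` object the handler produces. Our requirements are simple to implement. First, we add a list of registered middlewares to the `Application` object and modify `Application._handler` to run them. Note that each middleware should work with the result of the previous middleware in the chain, not with the original handler.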
633 |
634 | `application.py`
635 |
636 | ```python
637 | from functools import partial
638 | ...
639 |
640 | class Application:
641 |     def __init__(self, loop=None, middlewares=None):
642 |         ...
643 |         # Fall back to an empty list when no middlewares are given.
644 |         self._middlewares = list(middlewares or [])
645 |
646 |     ...
647 |
648 |     async def _handler(self, request, response_writer):
649 |         """Process incoming request"""
650 |         match_info, handler = self._router.resolve(request)
651 |
652 |         request.match_info = match_info
653 |
654 |         if self._middlewares:
655 |             for md in self._middlewares:
656 |                 handler = partial(md, handler=handler)
657 |
658 |         resp = await handler(request)
659 |
660 |         ...
661 | ```
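It is worth seeing in which order a chain built with `partial` actually runs. The following sketch wraps a handler the same way and records every call; `build_chain`, `make_middleware` and `demo` are hypothetical names used only for this illustration.

```python
import asyncio
from functools import partial


def build_chain(handler, middlewares):
    # Each middleware wraps whatever has been built so far.
    for md in middlewares:
        handler = partial(md, handler=handler)
    return handler


async def demo():
    calls = []

    async def handler(request):
        calls.append("handler")
        return f"Hello, {request}"

    def make_middleware(name):
        async def middleware(request, handler):
            calls.append(f"{name} before")
            response = await handler(request)
            calls.append(f"{name} after")
            return response
        return middleware

    chain = build_chain(handler, [make_middleware("first"), make_middleware("second")])
    response = await chain("world")
    return calls, response


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With this wrapping order the middleware registered last ends up outermost, so it sees the request first and the response last. Iterating over `reversed(middlewares)` would flip that if the opposite order is preferred.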
662 |
663 | Then let's add a request-logging middleware to our application.
664 |
665 | `app.py`
666 |
667 | ```python
668 | import asyncio
669 |
670 | from .response import Response
671 | from .application import Application, run_app
672 |
673 | async def log_middleware(request, handler):
674 |     print(f"Received request to {request.url.raw_path}")
675 |     return await handler(request)
676 |
677 | app = Application(middlewares=[log_middleware])
678 |
679 | async def handler(request):
680 |     return Response(f"Hello at {request.url}")
681 |
682 | app.router.add_route("GET", "/", handler)
683 |
684 | if __name__ == "__main__":
685 |     run_app(app)
686 |
687 | ```
688 |
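689 | Now when we run it, we will see a `Received request to /` message for each request.
690 |
691 | ## App lifecycle hooks
692 |
693 | Next, we need a way for the application to run coroutines on events such as server startup and shutdown. This is another neat `aiohttp` feature. There are many signals available, such as `on_startup`, `on_shutdown` and `on_response_prepared`, but we want to keep things as simple as possible, so we will implement only `startup` and `shutdown`.
694 |
695 | First, inside `Application` we create a list per event to hold its handlers and expose it as a property with a getter. Then we write the actual `startup` and `shutdown` coroutines and call them from `run_app`.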
696 |
697 | `application.py`
698 |
699 | ```python
700 | class Application:
701 |     def __init__(self, loop=None, middlewares=None):
702 |         ...
703 |         self._on_startup = []
704 |         self._on_shutdown = []
705 |
706 |     ...
707 |
708 |     @property
709 |     def on_startup(self):
710 |         return self._on_startup
711 |
712 |     @property
713 |     def on_shutdown(self):
714 |         return self._on_shutdown
715 |
716 |     async def startup(self):
717 |         coros = [func(self) for func in self._on_startup]
718 |         await asyncio.gather(*coros)  # the `loop` argument was removed in Python 3.10
719 |
720 |     async def shutdown(self):
721 |         coros = [func(self) for func in self._on_shutdown]
722 |         await asyncio.gather(*coros)
723 |
724 | ...
725 |
726 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
727 |     if loop is None:
728 |         loop = asyncio.get_event_loop()
729 |
730 |     serv = app._make_server()
731 |
732 |     loop.run_until_complete(app.startup())
733 |
734 |     server = loop.run_until_complete(
735 |         loop.create_server(lambda: serv, host=host, port=port)
736 |     )
737 |
738 |     try:
739 |         print(f"Started server on {host}:{port}")
740 |         loop.run_until_complete(server.serve_forever())
741 |     except KeyboardInterrupt:
742 |         loop.run_until_complete(app.shutdown())
743 |         server.close()
744 |         loop.run_until_complete(server.wait_closed())
745 |         loop.stop()
746 | ```
747 |
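748 | ## Better exception handling
749 |
750 | At this point most of the core features are in place, but we still lack exception handling. One of `aiohttp`'s powerful features is that it lets developers handle web exceptions like native Python exceptions. Its implementation combines the `Exception` class with the `Response` class, which is very flexible, so let's build a similar mechanism.
751 |
752 | First, we create an `HTTPException` base class and derive a few helper classes we may need: `HTTPNotFound` for unrecognized paths, `HTTPBadRequest` for client-side problems, and `HTTPFound` for redirects.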
753 |
754 | ```python
755 | from .response import Response
756 |
757 | class HTTPException(Response, Exception):
758 |     status_code = None
759 |
760 |     def __init__(self, reason=None, content_type=None):
761 |         self._reason = reason
762 |         self._content_type = content_type
763 |
764 |         Response.__init__(
765 |             self,
766 |             body=self._reason,
767 |             status=self.status_code,
768 |             content_type=self._content_type or "text/plain",
769 |         )
770 |
771 |         Exception.__init__(self, self._reason)
772 |
773 |
774 | class HTTPNotFound(HTTPException):
775 |     status_code = 404
776 |
777 |
778 | class HTTPBadRequest(HTTPException):
779 |     status_code = 400
780 |
781 |
782 | class HTTPFound(HTTPException):
783 |     status_code = 302
784 |
785 |     def __init__(self, location, reason=None, content_type=None):
786 |         super().__init__(reason=reason, content_type=content_type)
787 |         self.add_header("Location", location)
788 | ```
789 |
790 | Then we need to modify `Application._handler` to actually catch web exceptions.
791 |
792 | `application.py`
793 |
794 | ```python
795 | class Application:
796 |     ...
797 |     async def _handler(self, request, response_writer):
798 |         """Process incoming request"""
799 |         try:
800 |             match_info, handler = self._router.resolve(request)
801 |
802 |             request.match_info = match_info
803 |
804 |             if self._middlewares:
805 |                 for md in self._middlewares:
806 |                     handler = partial(md, handler=handler)
807 |
808 |             resp = await handler(request)
809 |         except HTTPException as exc:
810 |             resp = exc
811 |
812 |         ...
813 | ```
814 |
815 | Now we can drop the `_not_found` and `_method_not_allowed` helpers from `UrlDispatcher` and raise the corresponding exceptions instead.
816 |
817 | `router.py`
818 |
819 | ```python
820 | class UrlDispatcher:
821 |     ...
822 |     def resolve(self, request):
823 |         error = HTTPNotFound(reason=f"Could not find {request.url.raw_path}")
824 |         for (method, pattern), handler in self._routes.items():
825 |             match = re.fullmatch(pattern, request.url.raw_path)
826 |             if match is None:
827 |                 continue
828 |             if method == request.method:
829 |                 return match.groupdict(), handler
830 |             error = HTTPBadRequest(reason=f"{request.method} not allowed for {request.url.raw_path}")
831 |
832 |         raise error
833 |
834 |     ...
835 | ```
836 |
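837 | When something unexpected happens we don't want the application to break, so we should add a standard response for internal server errors. Let's write a simple HTML template and a helper for formatting exceptions.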
838 |
839 | `helpers.py`
840 |
841 | ```python
842 | import traceback
843 |
844 | from .response import Response
845 |
846 | server_exception_templ = """
847 | <div>
848 |     <h1>500 Internal server error</h1>
849 |     <span>Server got itself in trouble : <b>{exc}</b></span>
850 |     <p>{traceback}</p>
851 | </div>
852 | """
853 |
854 |
855 | def format_exception(exc):
856 |     resp = Response(status=500, content_type="text/html")
857 |     trace = traceback.format_exc().replace("\n", "<br>")
858 |     msg = server_exception_templ.format(exc=str(exc), traceback=trace)
859 |     resp.add_body(msg)
860 |     return resp
861 | ```
862 |
863 | That's simple enough. Now we catch every `Exception` raised in `Application._handler` and use our helper to generate the actual HTML response.
864 |
865 | `application.py`
866 |
867 | ```python
868 | class Application:
869 |     ...
870 |     async def _handler(self, request, response_writer):
871 |         """Process incoming request"""
872 |         try:
873 |             match_info, handler = self._router.resolve(request)
874 |
875 |             request.match_info = match_info
876 |
877 |             if self._middlewares:
878 |                 for md in self._middlewares:
879 |                     handler = partial(md, handler=handler)
880 |
881 |             resp = await handler(request)
882 |         except HTTPException as exc:
883 |             resp = exc
884 |         except Exception as exc:
885 |             resp = format_exception(exc)
886 |         ...
887 | ```
888 |
889 | ## Graceful shutdown
890 |
891 | Finally, we need to set up signal handling so that the application shuts down properly. Let's change `run_app` to the following:
892 |
893 | `application.py`
894 |
895 | ```python
896 | ...
897 |
898 | def run_app(app, host="127.0.0.1", port=8080, loop=None):
899 |     if loop is None:
900 |         loop = asyncio.get_event_loop()
901 |
902 |     serv = app._make_server()
903 |
904 |     loop.run_until_complete(app.startup())
905 |
906 |     server = loop.run_until_complete(
907 |         loop.create_server(lambda: serv, host=host, port=port)
908 |     )
909 |
910 |     loop.add_signal_handler(
911 |         signal.SIGTERM, lambda: asyncio.ensure_future(app.shutdown())
912 |     )
913 |
914 |     ...
915 | ```
916 |
917 | ## Sample application
918 |
919 | Our toolkit is ready. Now let's extend the earlier sample application with lifecycle hooks and exception handling.
920 |
921 | `app.py`
922 |
923 | ```python
924 | from .application import Application, run_app
925 | from .exceptions import HTTPNotFound  # assumes the HTTP exceptions live in exceptions.py
926 | from .response import Response
927 |
928 | async def on_startup(app):
929 |     # You could query a real database here; for this example a simple set is enough.
930 |     app.db = {"john_doe",}
931 |
932 | async def log_middleware(request, handler):
933 |     print(f"Received request to {request.url.raw_path}")
934 |     return await handler(request)
935 |
936 | async def handler(request):
937 |     username = request.match_info["username"]
938 |     if username not in request.app.db:
939 |         raise HTTPNotFound(reason=f"No such user as {username} :(")
940 |
941 |     return Response(f"Welcome, {username}!")
942 |
943 | app = Application(middlewares=[log_middleware])
944 | app.on_startup.append(on_startup)
945 | app.router.add_get("/{username}", handler)
946 |
947 | if __name__ == "__main__":
948 |     run_app(app)
949 | ```
950 |
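951 | If we did everything correctly, we will now see a log message for each request. The application responds with a welcome message to registered users, and with `HTTPNotFound` for unregistered users or unrecognized routes.
952 |
953 | ## Conclusion
954 |
955 | Inspired by `aiohttp` and `sanic`, we hand-crafted a simple yet powerful micro framework in about 500 lines of code. Admittedly, it is not production ready, as it still lacks many useful and important features such as a more robust server, full support for the HTTP specification and WebSockets, to name a few. Still, I believe that through this process we gained a better understanding of how such tools are built. As the famous physicist Richard Feynman said, “What I cannot create, I do not understand”. I hope you enjoyed this guide, see ya! :wave: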
956 |
957 |
--------------------------------------------------------------------------------
/media/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hzlmn/diy-async-web-framework/10fc8178e6c5f2e48ce18f36e39a1e38ea50956b/media/logo.png
--------------------------------------------------------------------------------