├── .gitignore ├── .testr.conf ├── README.rst ├── Requirements.rst ├── experiments.rst ├── pep-draft.rst ├── setup.cfg ├── setup.py └── wsging └── __init__.py /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | 5 | # C extensions 6 | *.so 7 | 8 | # Distribution / packaging 9 | .Python 10 | env/ 11 | build/ 12 | develop-eggs/ 13 | dist/ 14 | downloads/ 15 | eggs/ 16 | lib/ 17 | lib64/ 18 | parts/ 19 | sdist/ 20 | var/ 21 | *.egg-info/ 22 | .installed.cfg 23 | *.egg 24 | 25 | # PyInstaller 26 | # Usually these files are written by a python script from a template 27 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 28 | *.manifest 29 | *.spec 30 | 31 | # Installer logs 32 | pip-log.txt 33 | pip-delete-this-directory.txt 34 | 35 | # Unit test / coverage reports 36 | htmlcov/ 37 | .tox/ 38 | .coverage 39 | .cache 40 | nosetests.xml 41 | coverage.xml 42 | 43 | # Translations 44 | *.mo 45 | *.pot 46 | 47 | # Django stuff: 48 | *.log 49 | 50 | # Sphinx documentation 51 | docs/_build/ 52 | 53 | # PyBuilder 54 | target/ 55 | 56 | # Editors 57 | .*.swp 58 | *~ 59 | 60 | AUTHORS 61 | ChangeLog 62 | .testrepository 63 | -------------------------------------------------------------------------------- /.testr.conf: -------------------------------------------------------------------------------- 1 | [DEFAULT] 2 | test_command=${PYTHON:-python} -m subunit.run discover $LISTOPT $IDOPTION 3 | test_id_option=--load-list $IDFILE 4 | test_list_option=--list 5 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | This repository is for drafting a revision to the WSGI spec. 
2 | 3 | Who 4 | === 5 | 6 | Right now we're figuring out who has the time to contribute to this 7 | on the web-sig list; the "We" below refers to whoever ends up being 8 | involved :) 9 | 10 | We want to create a clean common API for applications and middleware 11 | written in a post HTTP/2 world - where single servers may accept up to 12 | all three of HTTP/1.x, HTTP/2 and Websocket connections, and 13 | applications and middleware want to be able to take advantage of 14 | HTTP/2 and websockets when available, but also degrade gracefully. We 15 | also want to ensure that there is a graceful incremental path to 16 | adoption of the new API, including Python 2.7 support, and shims to 17 | enable existing WSGI apps/middleware/servers to respectively be 18 | contained, contain-or-be-contained and contain, things written to this 19 | new API. We want a clean, fast and approachable API, and we want to 20 | ensure that it's no less friendly to work with than WSGI, for all that 21 | it will expose much more functionality. 22 | 23 | Governance 24 | ========== 25 | 26 | We plan to produce a PEP and reference code. That will be done by direct 27 | collaboration from folk with time and energy. We're not interested in 28 | debating everything to death until we have working code showing that 29 | we've got *something* feasible. At that point we'll move into the regular 30 | PEP process on python-ideas@python.org. This incrementally more public 31 | approach is intended to mitigate the burnout that has negatively 32 | affected previous WSGI overhaul approaches. 33 | 34 | Overall Process 35 | =============== 36 | 37 | There are three broad phases planned (though as with all design work, it 38 | won't be strictly delineated). That is, folk are encouraged to experiment 39 | with 40 | 41 | Phase 1 - requirements 42 | ++++++++++++++++++++++ 43 | 44 | Gathering requirements.
These will come from our needs, the needs of 45 | interop with other PEPs and of course the needs of the underlying 46 | HTTP/1.x, HTTP/2 and Websocket protocols. I'm allowing 3 months for this to 47 | ensure that we've had plenty of time for developers from Django, uWSGI and 48 | so on and so forth to participate. At the end of the 3 months (mid January) 49 | we'll stop iterating on requirements unless security or feasibility are 50 | involved. That is, security requirements and 'this *cannot* work' 51 | requirements can always be added. If and only if we've had broad input from 52 | the Python web community then we may finalise things earlier. 53 | 54 | Phase 2 - design 55 | ++++++++++++++++ 56 | 57 | Here we experiment with different ways of meeting the requirements from 58 | Phase 1. Again, we're allowing 3 months for experimentation and design work 59 | to take place. This needs to include proof of concept implementations that 60 | work in mod_wsgi, uWSGI, gunicorn etc. In mid-April, we'll pick a final 61 | design from the set of experiments that have taken place. 62 | 63 | Phase 3 - PEP 64 | +++++++++++++ 65 | 66 | Here we translate the final design into a PEP and enter it into the PEP 67 | process. This will be as fast or slow as the PEP process runs. 68 | 69 | 70 | Participating 71 | ============= 72 | 73 | While we know things that are broken about WSGI, we probably don't know them 74 | all, so telling us about design issues or capabilities you want to see as 75 | issues in this repository is useful. 76 | 77 | Specifically, please open issues at 78 | https://github.com/python-web-sig/wsgi-ng/issues for things you want to see 79 | in the new protocol. Whether that's 'WSGI makes me cry because X' or 'I want 80 | to be able to write middleware that proxies to a remote server over zeromq' 81 | - all issues are welcome. 82 | 83 | Secondly, we need to experiment to find a good protocol that meets all our 84 | requirements.
You can add experiments as pull requests against this 85 | repository. Experiments can be prose (e.g. a draft specification) or code 86 | (e.g. a test implementation of a draft specification). 87 | 88 | There is *a* draft spec based on PEP-3333 in the repository already - it is 89 | not special or privileged - we may end up with a totally new spec rather 90 | than that iterated one. That said please do also submit PRs to change it in 91 | light of requirements or issues that need resolving. This repository is a 92 | community resource where we can build consensus and working code together. 93 | 94 | Current status 95 | ============== 96 | 97 | We're just self assembling at the moment. 98 | 99 | We have a `requirements document `_. 100 | 101 | We have a `draft PEP based on 3333 `_. 102 | 103 | We may add some `experiments too `. 104 | -------------------------------------------------------------------------------- /Requirements.rst: -------------------------------------------------------------------------------- 1 | Overview 2 | ======== 3 | 4 | These are the functional requirements for any successful design. Where the 5 | requirements are too high level to be actionable, please help us refine them :). 6 | 7 | Requirements 8 | ============ 9 | 10 | #. Support servers speaking HTTP/1.x, HTTP/2 and Websockets (potentially all on 11 | a single port). 12 | #. Support graceful degradation for applications that can use HTTP/2 but still 13 | support HTTP/1.x requests. 14 | #. Graceful incremental adoption path - no upgrade-all-components requirement 15 | baked into the design. 16 | #. Support Python 2.7 and 3.x (where x is not yet discussed) 17 | #. Support the existing ecosystem of containers (such as mod_wsgi) 18 | new API. We want a clean, fast and approachable API, and we want to 19 | ensure that its no less friendly to work with than WSGI, for all that 20 | it will expose much more functionality. 21 | #. 
Apps need to be able to tell what protocol is in use, and what optional 22 | features are available. For instance, HTTP/2 PUSH_PROMISE is an optional 23 | feature that can be disabled by clients. Websockets needs to expose a 24 | socket-like object, and so on. 25 | #. Support websockets 26 | #. Support HTTP/2 27 | #. Support HTTP/1.x [ which may be just 'point at PEP-3333'. ] 28 | #. Continue to support lightweight shims being built on top such as 29 | https://github.com/Pylons/webob/blob/master/webob/request.py 30 | 31 | Corollaries 32 | =========== 33 | 34 | #. May well want to use `python futures `_ to get 35 | bytes on Python 2.7. 36 | #. Will need old to new and new to old shims to enable upgrading one layer in 37 | a middleware stack at a time. 38 | #. Cannot be coarsely incompatible with WSGI, or writing shims will be very 39 | fragile. 40 | #. Cannot hand the connection socket to apps (pending confirmation from 41 | container authors). 42 | 43 | Not requirements 44 | ================ 45 | 46 | These are things that we've discussed but haven't [yet] decided to make into 47 | requirements for the design, or which we have decided definitely are not 48 | requirements. 49 | 50 | Implementing new protocols as middleware 51 | ++++++++++++++++++++++++++++++++++++++++ 52 | 53 | gunicorn exposes the socket that requests were received on, allowing apps to 54 | write anything they want - e.g. taking over the socket, which is how websockets 55 | can be implemented there. gevent uses a custom handler_class to inject websocket 56 | data into the environment. Neither approach is standardised (such that all 57 | containers support it). Making new protocol implementations be something that 58 | can routinely be done without revving WSGI would be nice, but it's not clear that 59 | it's compatible with the requirement for supporting e.g. mod_wsgi. Input from 60 | container maintainers is needed!
61 | -------------------------------------------------------------------------------- /experiments.rst: -------------------------------------------------------------------------------- 1 | API experiments 2 | =============== 3 | 4 | API experiments may be added under wsging, to get a persistent sense of how 5 | different approaches may pan out. Benchmarks and other tests associated with 6 | this will be stored there too. 7 | -------------------------------------------------------------------------------- /pep-draft.rst: -------------------------------------------------------------------------------- 1 | PEP: Not assigned yet 2 | Title: Python Web Server Gateway Interface v2 3 | Version: $Revision$ 4 | Last-Modified: $Date$ 5 | Author: Robert Collins 6 | Discussions-To: Python Web-SIG 7 | Status: Draft 8 | Type: Informational 9 | Content-Type: text/x-rst 10 | Created: 25-Sep-2014 11 | Replaces: 333, 3333 12 | 13 | 14 | Preface for Readers of PEP \??? 15 | =============================== 16 | 17 | This is an updated version of PEP 3333, modified to address various 18 | limitations that have evolved since WSGI was first standardised. 19 | These include incremental upload handling, websockets and HTTP/2. 20 | 21 | The broad strategy is to define a new specification to handle these 22 | new features and clean up some cruft that has become apparent in WSGI. 23 | Then adapters to contain WSGI 1.0.1 apps in WSGI 2 servers, and vice 24 | versa, will be written, allowing incremental migration of WSGI stacks. 25 | 26 | Given the still huge deployment of Python2.7, we have limited 27 | ourselves to language features available in Python2.7, though the 28 | design and focus is for Python3.5 and up. 29 | 30 | 31 | Abstract 32 | ======== 33 | 34 | This document specifies a proposed standard interface between web 35 | servers and Python web applications or frameworks, to promote web 36 | application portability across a variety of web servers and web 37 | protocols.
Support is included for the features of HTTP/1.x, HTTP/2 38 | and Websockets. 39 | 40 | 41 | Differences to PEP \3333 42 | ======================== 43 | 44 | * Minimum Python version of 2.7 - the oldest supported Python at the 45 | time of the PEP. 46 | 47 | * Pre 2.2 support advice removed. 48 | 49 | * Server push is defined - see ``wsgi.associated_content``. 50 | 51 | * The pending changes from 52 | https://mail.python.org/pipermail/web-sig/2010-September/004655.html 53 | have been applied, barring the point about ``wsgi.input`` and 54 | ``CONTENT_LENGTH`` being out of sync which needs further discussion. 55 | 56 | * Prose about the interaction of chunking and ``CONTENT_LENGTH`` 57 | corrected. 58 | 59 | Original Rationale and Goals (from PEP \333) 60 | ============================================ 61 | 62 | Python currently boasts a wide variety of web application frameworks, 63 | such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to 64 | name just a few [1]_. This wide variety of choices can be a problem 65 | for new Python users, because generally speaking, their choice of web 66 | framework will limit their choice of usable web servers, and vice 67 | versa. 68 | 69 | By contrast, although Java has just as many web application frameworks 70 | available, Java's "servlet" API makes it possible for applications 71 | written with any Java web application framework to run in any web 72 | server that supports the servlet API. 73 | 74 | The availability and widespread use of such an API in web servers for 75 | Python -- whether those servers are written in Python (e.g. Medusa), 76 | embed Python (e.g. mod_python), or invoke Python via a gateway 77 | protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of 78 | framework from choice of web server, freeing users to choose a pairing 79 | that suits them, while freeing framework and server developers to 80 | focus on their preferred area of specialization. 
81 | 82 | This PEP, therefore, proposes a simple and universal interface between 83 | web servers and web applications or frameworks: the Python Web Server 84 | Gateway Interface (WSGIv2, or WSGI in the rest of this document. Where 85 | WSGI version 1 is intended, it will be explicitly identified as 86 | WSGIv1). 87 | 88 | But the mere existence of a WSGI spec does nothing to address the 89 | existing state of servers and frameworks for Python web applications. 90 | Server and framework authors and maintainers must actually implement 91 | WSGI for there to be any effect. 92 | 93 | However, since no existing servers or frameworks support WSGI, there 94 | is little immediate reward for an author who implements WSGI support. 95 | Thus, WSGI **must** be easy to implement, so that an author's initial 96 | investment in the interface can be reasonably low. 97 | 98 | Thus, simplicity of implementation on *both* the server and framework 99 | sides of the interface is absolutely critical to the utility of the 100 | WSGI interface, and is therefore the principal criterion for any 101 | design decisions. 102 | 103 | Note, however, that simplicity of implementation for a framework 104 | author is not the same thing as ease of use for a web application 105 | author. WSGI presents an absolutely "no frills" interface to the 106 | framework author, because bells and whistles like response objects and 107 | cookie handling would just get in the way of existing frameworks' 108 | handling of these issues. Again, the goal of WSGI is to facilitate 109 | easy interconnection of existing servers and applications or 110 | frameworks, not to create a new web framework. 111 | 112 | Note also that this goal precludes WSGI from requiring anything that 113 | is not already available in deployed versions of Python. Therefore, 114 | new standard library modules are not proposed or required by this 115 | specification, and nothing in WSGI requires a Python version greater 116 | than 2.7. 
(It would be a good idea, however, for future versions 117 | of Python to include support for this interface in web servers 118 | provided by the standard library.) 119 | 120 | In addition to ease of implementation for existing and future 121 | frameworks and servers, it should also be easy to create request 122 | preprocessors, response postprocessors, and other WSGI-based 123 | "middleware" components that look like an application to their 124 | containing server, while acting as a server for their contained 125 | applications. 126 | 127 | If middleware can be both simple and robust, and WSGI is widely 128 | available in servers and frameworks, it allows for the possibility 129 | of an entirely new kind of Python web application framework: one 130 | consisting of loosely-coupled WSGI middleware components. Indeed, 131 | existing framework authors may even choose to refactor their 132 | frameworks' existing services to be provided in this way, becoming 133 | more like libraries used with WSGI, and less like monolithic 134 | frameworks. This would then allow application developers to choose 135 | "best-of-breed" components for specific functionality, rather than 136 | having to commit to all the pros and cons of a single framework. 137 | 138 | Of course, as of this writing, that day is doubtless quite far off. 139 | In the meantime, it is a sufficient short-term goal for WSGI to 140 | enable the use of any framework with any server. 141 | 142 | Finally, it should be mentioned that the current version of WSGI 143 | does not prescribe any particular mechanism for "deploying" an 144 | application for use with a web server or server gateway. At the 145 | present time, this is necessarily implementation-defined by the 146 | server or gateway. 
After a sufficient number of servers and 147 | frameworks have implemented WSGI to provide field experience with 148 | varying deployment requirements, it may make sense to create 149 | another PEP, describing a deployment standard for WSGI servers and 150 | application frameworks. 151 | 152 | 153 | Specification Overview 154 | ====================== 155 | 156 | The WSGI interface has two sides: the "server" or "gateway" side, and 157 | the "application" or "framework" side. The server side invokes a 158 | callable object that is provided by the application side. The 159 | specifics of how that object is provided are up to the server or 160 | gateway. It is assumed that some servers or gateways will require an 161 | application's deployer to write a short script to create an instance 162 | of the server or gateway, and supply it with the application object. 163 | Other servers and gateways may use configuration files or other 164 | mechanisms to specify where an application object should be 165 | imported from, or otherwise obtained. 166 | 167 | In addition to "pure" servers/gateways and applications/frameworks, 168 | it is also possible to create "middleware" components that implement 169 | both sides of this specification. Such components act as an 170 | application to their containing server, and as a server to a 171 | contained application, and can be used to provide extended APIs, 172 | content transformation, navigation, and other useful functions. 173 | 174 | Throughout this specification, we will use the term "a callable" to 175 | mean "a function, method, class, or an instance with a ``__call__`` 176 | method". It is up to the server, gateway, or application implementing 177 | the callable to choose the appropriate implementation technique for 178 | their needs. Conversely, a server, gateway, or application that is 179 | invoking a callable **must not** have any dependency on what kind of 180 | callable was provided to it. 
Callables are only to be called, not 181 | introspected upon. 182 | 183 | 184 | A Note On String Types 185 | ---------------------- 186 | 187 | In general, HTTP deals with bytes, which means that this specification 188 | is mostly about handling bytes. 189 | 190 | However, the content of those bytes often has some kind of textual 191 | interpretation, and in Python, strings are the most convenient way 192 | to handle text. 193 | 194 | But in many Python versions and implementations, strings are Unicode, 195 | rather than bytes. This requires a careful balance between a usable 196 | API and correct translations between bytes and text in the context of 197 | HTTP... especially to support porting code between Python 198 | implementations with different ``str`` types. 199 | 200 | WSGI therefore defines two kinds of "string": 201 | 202 | * "Native" strings (which are always implemented using the type 203 | named ``str``) that are used for request/response headers and 204 | metadata 205 | 206 | * "Bytestrings" (which are implemented using the ``bytes`` type 207 | in Python 3, and ``str`` elsewhere), that are used for the bodies 208 | of requests and responses (e.g. POST/PUT input data and HTML page 209 | outputs). 210 | 211 | Do not be confused however: even if Python's ``str`` type is actually 212 | Unicode "under the hood", the *content* of native strings must 213 | still be translatable to bytes via the Latin-1 encoding! (See 214 | the section on `Unicode Issues`_ later in this document for more 215 | details.) 216 | 217 | In short: where you see the word "string" in this document, it refers 218 | to a "native" string, i.e., an object of type ``str``, whether it is 219 | internally implemented as bytes or unicode. Where you see references 220 | to "bytestring", this should be read as "an object of type ``bytes`` 221 | under Python 3, or type ``str`` under Python 2". 
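The distinction between the two kinds of string can be illustrated with a minimal sketch. It assumes Python 3, where native strings are ``str`` and bytestrings are ``bytes``. The helper names ``native_to_wire`` and ``wire_to_native`` are hypothetical and are not defined by this specification.

```python
def native_to_wire(value):
    """Encode a native string (header or metadata) to bytes.

    Native strings must be translatable via Latin-1, so a value containing
    code points above U+00FF raises UnicodeEncodeError here.
    """
    return value.encode('iso-8859-1')


def wire_to_native(raw):
    """Decode header bytes received from the wire into a native string."""
    return raw.decode('iso-8859-1')


# A header value is a native string; a response body is a bytestring.
content_type = 'text/plain'           # native string
body = 'Hello world!\n'.encode('utf-8')  # bytestring
```

Because Latin-1 maps every byte value to exactly one code point, the round trip between the two helpers is lossless for any header bytes.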
222 | 223 | And so, even though HTTP is in some sense "really just bytes", there 224 | are many API conveniences to be had by using whatever Python's 225 | default ``str`` type is. 226 | 227 | 228 | 229 | The Application/Framework Side 230 | ------------------------------ 231 | 232 | The application object is simply a callable object that accepts 233 | two arguments. The term "object" should not be misconstrued as 234 | requiring an actual object instance: a function, method, class, 235 | or instance with a ``__call__`` method are all acceptable for 236 | use as an application object. Application objects must be able 237 | to be invoked more than once, as virtually all servers/gateways 238 | (other than CGI) will make such repeated requests. 239 | 240 | (Note: although we refer to it as an "application" object, this 241 | should not be construed to mean that application developers will use 242 | WSGI as a web programming API! It is assumed that application 243 | developers will continue to use existing, high-level framework 244 | services to develop their applications. WSGI is a tool for 245 | framework and server developers, and is not intended to directly 246 | support application developers.) 247 | 248 | Here are two example application objects; one is a function, and the 249 | other is a class:: 250 | 251 | HELLO_WORLD = b"Hello world!\n" 252 | 253 | def simple_app(environ, start_response): 254 | """Simplest possible application object""" 255 | status = '200 OK' 256 | response_headers = [('Content-type', 'text/plain')] 257 | start_response(status, response_headers) 258 | return [HELLO_WORLD] 259 | 260 | class AppClass: 261 | """Produce the same output, but using a class 262 | 263 | (Note: 'AppClass' is the "application" here, so calling it 264 | returns an instance of 'AppClass', which is then the iterable 265 | return value of the "application callable" as required by 266 | the spec. 
267 | 268 | If we wanted to use *instances* of 'AppClass' as application 269 | objects instead, we would have to implement a '__call__' 270 | method, which would be invoked to execute the application, 271 | and we would need to create an instance for use by the 272 | server or gateway. 273 | """ 274 | 275 | def __init__(self, environ, start_response): 276 | self.environ = environ 277 | self.start = start_response 278 | 279 | def __iter__(self): 280 | status = '200 OK' 281 | response_headers = [('Content-type', 'text/plain')] 282 | self.start(status, response_headers) 283 | yield HELLO_WORLD 284 | 285 | 286 | The Server/Gateway Side 287 | ----------------------- 288 | 289 | The server or gateway invokes the application callable once for each 290 | request it receives from an HTTP client, that is directed at the 291 | application. To illustrate, here is a simple CGI gateway, implemented 292 | as a function taking an application object. Note that this simple 293 | example has limited error handling, because by default an uncaught 294 | exception will be dumped to ``sys.stderr`` and logged by the web 295 | server. 
296 | 297 | :: 298 | 299 | import os, sys 300 | 301 | enc, esc = sys.getfilesystemencoding(), 'surrogateescape' 302 | 303 | def unicode_to_wsgi(u): 304 | # Convert an environment variable to a WSGI "bytes-as-unicode" string 305 | return u.encode(enc, esc).decode('iso-8859-1') 306 | 307 | def wsgi_to_bytes(s): 308 | return s.encode('iso-8859-1') 309 | 310 | def run_with_cgi(application): 311 | environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()} 312 | environ['wsgi.input'] = sys.stdin.buffer 313 | environ['wsgi.errors'] = sys.stderr 314 | environ['wsgi.version'] = (1, 0) 315 | environ['wsgi.multithread'] = False 316 | environ['wsgi.multiprocess'] = True 317 | environ['wsgi.run_once'] = True 318 | 319 | if environ.get('HTTPS', 'off') in ('on', '1'): 320 | environ['wsgi.url_scheme'] = 'https' 321 | else: 322 | environ['wsgi.url_scheme'] = 'http' 323 | 324 | headers_set = [] 325 | headers_sent = [] 326 | 327 | def write(data): 328 | out = sys.stdout.buffer 329 | 330 | if not headers_set: 331 | raise AssertionError("write() before start_response()") 332 | 333 | elif not headers_sent: 334 | # Before the first output, send the stored headers 335 | status, response_headers = headers_sent[:] = headers_set 336 | out.write(wsgi_to_bytes('Status: %s\r\n' % status)) 337 | for header in response_headers: 338 | out.write(wsgi_to_bytes('%s: %s\r\n' % header)) 339 | out.write(wsgi_to_bytes('\r\n')) 340 | 341 | out.write(data) 342 | out.flush() 343 | 344 | def start_response(status, response_headers, exc_info=None): 345 | if exc_info: 346 | try: 347 | if headers_sent: 348 | # Re-raise original exception if headers sent 349 | raise exc_info[1].with_traceback(exc_info[2]) 350 | finally: 351 | exc_info = None # avoid dangling circular ref 352 | elif headers_set: 353 | raise AssertionError("Headers already set!") 354 | 355 | headers_set[:] = [status, response_headers] 356 | 357 | # Note: error checking on the headers should happen here, 358 | # *after* the headers are set. 
That way, if an error 359 | # occurs, start_response can only be re-called with 360 | # exc_info set. 361 | 362 | return write 363 | 364 | result = application(environ, start_response) 365 | try: 366 | for data in result: 367 | if data: # don't send headers until body appears 368 | write(data) 369 | if not headers_sent: 370 | write(b'') # send headers now if body was empty 371 | finally: 372 | if hasattr(result, 'close'): 373 | result.close() 374 | 375 | 376 | Middleware: Components that Play Both Sides 377 | ------------------------------------------- 378 | 379 | Note that a single object may play the role of a server with respect 380 | to some application(s), while also acting as an application with 381 | respect to some server(s). Such "middleware" components can perform 382 | such functions as: 383 | 384 | * Routing a request to different application objects based on the 385 | target URL, after rewriting the ``environ`` accordingly. 386 | 387 | * Allowing multiple applications or frameworks to run side-by-side 388 | in the same process 389 | 390 | * Load balancing and remote processing, by forwarding requests and 391 | responses over a network 392 | 393 | * Performing content postprocessing, such as applying XSL stylesheets 394 | 395 | The presence of middleware in general is transparent to both the 396 | "server/gateway" and the "application/framework" sides of the 397 | interface, and should require no special support. A user who 398 | desires to incorporate middleware into an application simply 399 | provides the middleware component to the server, as if it were 400 | an application, and configures the middleware component to 401 | invoke the application, as if the middleware component were a 402 | server. Of course, the "application" that the middleware wraps 403 | may in fact be another middleware component wrapping another 404 | application, and so on, creating what is referred to as a 405 | "middleware stack".
406 | 407 | For the most part, middleware must conform to the restrictions 408 | and requirements of both the server and application sides of 409 | WSGI. In some cases, however, requirements for middleware 410 | are more stringent than for a "pure" server or application, 411 | and these points will be noted in the specification. 412 | 413 | Here is a (tongue-in-cheek) example of a middleware component that 414 | converts ``text/plain`` responses to pig latin, using Joe Strout's 415 | ``piglatin.py``. (Note: a "real" middleware component would 416 | probably use a more robust way of checking the content type, and 417 | should also check for a content encoding. Also, this simple 418 | example ignores the possibility that a word might be split across 419 | a block boundary.) 420 | 421 | :: 422 | 423 | from piglatin import piglatin 424 | 425 | class LatinIter: 426 | 427 | """Transform iterated output to piglatin, if it's okay to do so 428 | 429 | Note that the "okayness" can change until the application yields 430 | its first non-empty bytestring, so 'transform_ok' has to be a mutable 431 | truth value. 
432 | """ 433 | 434 | def __init__(self, result, transform_ok): 435 | if hasattr(result, 'close'): 436 | self.close = result.close 437 | self._next = iter(result).__next__ 438 | self.transform_ok = transform_ok 439 | 440 | def __iter__(self): 441 | return self 442 | 443 | def __next__(self): 444 | if self.transform_ok: 445 | return piglatin(self._next()) # call must be byte-safe on Py3 446 | else: 447 | return self._next() 448 | 449 | class Latinator: 450 | 451 | # by default, don't transform output 452 | transform = False 453 | 454 | def __init__(self, application): 455 | self.application = application 456 | 457 | def __call__(self, environ, start_response): 458 | 459 | transform_ok = [] 460 | 461 | def start_latin(status, response_headers, exc_info=None): 462 | 463 | # Reset ok flag, in case this is a repeat call 464 | del transform_ok[:] 465 | 466 | for name, value in response_headers: 467 | if name.lower() == 'content-type' and value == 'text/plain': 468 | transform_ok.append(True) 469 | # Strip content-length if present, else it'll be wrong 470 | response_headers = [(name, value) 471 | for name, value in response_headers 472 | if name.lower() != 'content-length' 473 | ] 474 | break 475 | 476 | write = start_response(status, response_headers, exc_info) 477 | 478 | if transform_ok: 479 | def write_latin(data): 480 | write(piglatin(data)) # call must be byte-safe on Py3 481 | return write_latin 482 | else: 483 | return write 484 | 485 | return LatinIter(self.application(environ, start_latin), transform_ok) 486 | 487 | 488 | # Run foo_app under a Latinator's control, using the example CGI gateway 489 | from foo_app import foo_app 490 | run_with_cgi(Latinator(foo_app)) 491 | 492 | 493 | 494 | Specification Details 495 | ===================== 496 | 497 | The application object must accept two positional arguments. For 498 | the sake of illustration, we have named them ``environ`` and 499 | ``start_response``, but they are not required to have these names. 
500 | A server or gateway **must** invoke the application object using 501 | positional (not keyword) arguments. (E.g. by calling 502 | ``result = application(environ, start_response)`` as shown above.) 503 | 504 | The ``environ`` parameter is a dictionary object, containing CGI-style 505 | environment variables. This object **must** be a builtin Python 506 | dictionary (*not* a subclass, ``UserDict`` or other dictionary 507 | emulation), and the application is allowed to modify the dictionary 508 | in any way it desires. The dictionary must also include certain 509 | WSGI-required variables (described in a later section), and may 510 | also include server-specific extension variables, named according 511 | to a convention that will be described below. 512 | 513 | The ``start_response`` parameter is a callable accepting two 514 | required positional arguments, and one optional argument. For the sake 515 | of illustration, we have named these arguments ``status``, 516 | ``response_headers``, and ``exc_info``, but they are not required to 517 | have these names, and the application **must** invoke the 518 | ``start_response`` callable using positional arguments (e.g. 519 | ``start_response(status, response_headers)``). 520 | 521 | The ``status`` parameter is a status string of the form 522 | ``"999 Message here"``, and ``response_headers`` is a list of 523 | ``(header_name, header_value)`` tuples describing the HTTP response 524 | header. The optional ``exc_info`` parameter is described below in the 525 | sections on `The start_response() Callable`_ and `Error Handling`_. 526 | It is used only when the application has trapped an error and is 527 | attempting to display an error message to the browser. 
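As a rough illustration of the argument shapes described above, a server might check ``status`` and ``response_headers`` along the following lines. This is only a sketch, assuming Python 3. The helper name ``check_start_response_args`` and the exact checks it performs are assumptions made for illustration, not requirements of this specification.

```python
import re

# Assumed shape: three digits, one space, then a non-empty reason phrase.
_STATUS_RE = re.compile(r'\d{3} [^\r\n]+')


def check_start_response_args(status, response_headers):
    """Raise if the arguments do not have the documented shapes."""
    if not isinstance(status, str) or not _STATUS_RE.fullmatch(status):
        raise ValueError('status must look like "999 Message here"')
    if not isinstance(response_headers, list):
        raise TypeError('response_headers must be a list')
    for item in response_headers:
        if not (isinstance(item, tuple) and len(item) == 2):
            raise TypeError('each header must be a (name, value) tuple')
        name, value = item
        if not isinstance(name, str) or not isinstance(value, str):
            raise TypeError('header names and values must be native strings')


# Accepted: the arguments used by the example applications.
check_start_response_args('200 OK', [('Content-type', 'text/plain')])
```

Running such checks inside ``start_response`` itself means a malformed call fails at the point where the application made it.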
528 | 529 | The ``start_response`` callable **must** do any header verification and 530 | checking that will take place before returning (or raising an error) 531 | rather than deferring it until body bytes are supplied (via either the 532 | iterator or the ``write`` callable). This ensures that the error is 533 | raised as close to the causing code as possible. 534 | 535 | The ``start_response`` callable must return a ``write(body_data)`` 536 | callable that takes one positional parameter: a bytestring to be written 537 | as part of the HTTP response body. (Note: the ``write()`` callable is 538 | provided only to support certain existing frameworks' imperative output 539 | APIs; it should not be used by new applications or frameworks if it 540 | can be avoided. See the `Buffering and Streaming`_ section for more 541 | details.) 542 | 543 | When called by the server, the application object must return an 544 | iterable yielding zero or more bytestrings. This can be accomplished in a 545 | variety of ways, such as by returning a list of bytestrings, or by the 546 | application being a generator function that yields bytestrings, or 547 | by the application being a class whose instances are iterable. 548 | Regardless of how it is accomplished, the application object must 549 | always return an iterable yielding zero or more bytestrings. 550 | 551 | The server or gateway must transmit the yielded bytestrings to the client 552 | in an unbuffered fashion, completing the transmission of each bytestring 553 | before requesting another one. (In other words, applications 554 | **should** perform their own buffering. See the `Buffering and 555 | Streaming`_ section below for more on how application output must be 556 | handled.) 557 | 558 | The server or gateway should treat the yielded bytestrings as binary byte 559 | sequences: in particular, it should ensure that line endings are 560 | not altered. 
The application is responsible for ensuring that the 561 | bytestring(s) to be written are in a format suitable for the client. (The 562 | server or gateway **may** apply HTTP transfer encodings, or perform 563 | other transformations for the purpose of implementing HTTP features 564 | such as byte-range transmission. See `Other HTTP Features`_, below, 565 | for more details.) 566 | 567 | If a call to ``len(iterable)`` succeeds, the server must be able 568 | to rely on the result being accurate. That is, if the iterable 569 | returned by the application provides a working ``__len__()`` 570 | method, it **must** return an accurate result. (See 571 | the `Handling the Content-Length Header`_ section for information 572 | on how this would normally be used.) 573 | 574 | If the iterable returned by the application has a ``close()`` method, 575 | the server or gateway **must** call that method upon completion of the 576 | current request, whether the request was completed normally, or 577 | terminated early due to an application error during iteration or an early 578 | disconnect of the browser. (The ``close()`` method requirement is to 579 | support resource release by the application. This protocol is intended 580 | to complement PEP 342's generator support, and other common iterables 581 | with ``close()`` methods.) 582 | 583 | Applications returning a generator or other custom iterator **should not** 584 | assume the entire iterator will be consumed, as it **may** be closed early 585 | by the server. 586 | 587 | (Note: the application **must** invoke the ``start_response()`` 588 | callable before the iterable yields its first body bytestring, so that the 589 | server can send the headers before any body content. However, this 590 | invocation **may** be performed by the iterable's first iteration, so 591 | servers **must not** assume that ``start_response()`` has been called 592 | before they begin iterating over the iterable.) 
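The ``close()`` requirement above can be sketched from the server's side as follows. This is an illustration under stated assumptions rather than a reference implementation: ``transmit`` and its ``send`` parameter are hypothetical names, and header handling is omitted entirely.

```python
def transmit(application, environ, start_response, send):
    """Iterate over the application's result and always honour close().

    ``send`` is a hypothetical callable that writes one bytestring to the
    client. start_response() may legitimately be called only during the
    first iteration, so nothing here assumes it has run before the loop.
    """
    result = application(environ, start_response)
    try:
        for data in result:
            if data:  # empty bytestrings carry no body data
                send(data)
    finally:
        # Runs on normal completion, on an application error raised
        # during iteration, and when send() fails on a client disconnect.
        if hasattr(result, "close"):
            result.close()
```

The ``try``/``finally`` shape is the point of the sketch: the application's resources are released no matter how the request ends.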
593 | 594 | Finally, servers and gateways **must not** directly use any other 595 | attributes of the iterable returned by the application, unless it is an 596 | instance of a type specific to that server or gateway, such as a "file 597 | wrapper" returned by ``wsgi.file_wrapper`` (see `Optional 598 | Platform-Specific File Handling`_). In the general case, only 599 | attributes specified here, or accessed via e.g. the PEP 234 iteration 600 | APIs are acceptable. 601 | 602 | 603 | ``environ`` Variables 604 | --------------------- 605 | 606 | The ``environ`` dictionary is required to contain these CGI 607 | environment variables, as defined by the Common Gateway Interface 608 | specification [2]_. The following variables **must** be present, 609 | unless their value would be an empty string, in which case they 610 | **may** be omitted, except as otherwise noted below. 611 | 612 | ``REQUEST_METHOD`` 613 | The HTTP request method, such as ``"GET"`` or ``"POST"``. This 614 | cannot ever be an empty string, and so is always required. 615 | 616 | ``SCRIPT_NAME`` 617 | The initial portion of the request URL's "path" that corresponds to 618 | the application object, so that the application knows its virtual 619 | "location". This **may** be an empty string, if the application 620 | corresponds to the "root" of the server. 621 | 622 | ``PATH_INFO`` 623 | The remainder of the request URL's "path", designating the virtual 624 | "location" of the request's target within the application. This 625 | **may** be an empty string, if the request URL targets the 626 | application root and does not have a trailing slash. 627 | 628 | ``QUERY_STRING`` 629 | The portion of the request URL that follows the ``"?"``, if any. 630 | May be empty or absent. 631 | 632 | ``CONTENT_TYPE`` 633 | The contents of any ``Content-Type`` fields in the HTTP request. 634 | May be empty or absent. 635 | 636 | ``CONTENT_LENGTH`` 637 | The contents of any ``Content-Length`` fields in the HTTP request. 
638 | May be empty or absent. 639 | 640 | ``SERVER_NAME``, ``SERVER_PORT`` 641 | When combined with ``SCRIPT_NAME`` and ``PATH_INFO``, these two strings 642 | can be used to complete the URL. Note, however, that ``HTTP_HOST``, 643 | if present, should be used in preference to ``SERVER_NAME`` for 644 | reconstructing the request URL. See the `URL Reconstruction`_ 645 | section below for more detail. ``SERVER_NAME`` and ``SERVER_PORT`` 646 | can never be empty strings, and so are always required. 647 | 648 | ``SERVER_PROTOCOL`` 649 | The version of the protocol the client used to send the request. 650 | Typically this will be something like ``"HTTP/1.0"`` or ``"HTTP/1.1"`` 651 | and may be used by the application to determine how to treat any 652 | HTTP request headers. (This variable should probably be called 653 | ``REQUEST_PROTOCOL``, since it denotes the protocol used in the 654 | request, and is not necessarily the protocol that will be used in the 655 | server's response. However, for compatibility with CGI we have to 656 | keep the existing name.) 657 | 658 | ``HTTP_`` Variables 659 | Variables corresponding to the client-supplied HTTP request headers 660 | (i.e., variables whose names begin with ``"HTTP_"``). The presence or 661 | absence of these variables should correspond with the presence or 662 | absence of the appropriate HTTP header in the request. 663 | 664 | A server or gateway **should** attempt to provide as many other CGI 665 | variables as are applicable. In addition, if SSL is in use, the server 666 | or gateway **should** also provide as many of the Apache SSL environment 667 | variables [5]_ as are applicable, such as ``HTTPS=on`` and 668 | ``SSL_PROTOCOL``. Note, however, that an application that uses any CGI 669 | variables other than the ones listed above is necessarily non-portable 670 | to web servers that do not support the relevant extensions.
(For 671 | example, web servers that do not publish files will not be able to 672 | provide a meaningful ``DOCUMENT_ROOT`` or ``PATH_TRANSLATED``.) 673 | 674 | A WSGI-compliant server or gateway **should** document what variables 675 | it provides, along with their definitions as appropriate. Applications 676 | **should** check for the presence of any variables they require, and 677 | have a fallback plan in the event such a variable is absent. 678 | 679 | Note: missing variables (such as ``REMOTE_USER`` when no 680 | authentication has occurred) should be left out of the ``environ`` 681 | dictionary. Also note that CGI-defined variables must be native strings, 682 | if they are present at all. It is a violation of this specification 683 | for *any* CGI variable's value to be of any type other than ``str``. 684 | 685 | In addition to the CGI-defined variables, the ``environ`` dictionary 686 | **may** also contain arbitrary operating-system "environment variables", 687 | and **must** contain the following WSGI-defined variables: 688 | 689 | =========================== =============================================== 690 | Variable Value 691 | =========================== =============================================== 692 | ``wsgi.version`` The tuple ``(2, 0)``, representing WSGI 693 | version 2.0. 694 | 695 | ``wsgi.url_scheme`` A string representing the "scheme" portion of 696 | the URL at which the application is being 697 | invoked. Normally, this will have the value 698 | ``"http"`` or ``"https"``, as appropriate. 699 | 700 | ``wsgi.input`` An input stream (file-like object) from which 701 | the HTTP request body bytes can be read. 702 | (The server or gateway may perform reads 703 | on-demand as requested by the application, 704 | or it may pre-read the client's request 705 | body and buffer it in-memory or on disk, 706 | or use any other technique for providing 707 | such an input stream, according to its 708 | preference.) 
709 | 710 | ``wsgi.errors`` An output stream (file-like object) to which 711 | error output can be written, for the purpose of 712 | recording program or other errors in a 713 | standardized and possibly centralized location. 714 | This should be a "text mode" stream; i.e., 715 | applications should use ``"\n"`` as a line 716 | ending, and assume that it will be converted to 717 | the correct line ending by the server/gateway. 718 | 719 | (On platforms where the ``str`` type is unicode, 720 | the error stream **should** accept and log 721 | arbitrary unicode without raising an error; it 722 | is allowed, however, to substitute characters 723 | that cannot be rendered in the stream's encoding.) 724 | 725 | For many servers, ``wsgi.errors`` will be the 726 | server's main error log. Alternatively, this 727 | may be ``sys.stderr``, or a log file of some 728 | sort. The server's documentation should 729 | include an explanation of how to configure this 730 | or where to find the recorded output. A server 731 | or gateway may supply different error streams 732 | to different applications, if this is desired. 733 | 734 | ``wsgi.multithread`` This value should evaluate true if the 735 | application object may be simultaneously 736 | invoked by another thread in the same process, 737 | and should evaluate false otherwise. 738 | 739 | ``wsgi.multiprocess`` This value should evaluate true if an 740 | equivalent application object may be 741 | simultaneously invoked by another process, 742 | and should evaluate false otherwise. 743 | 744 | ``wsgi.run_once`` This value should evaluate true if the server 745 | or gateway expects (but does not guarantee!) 746 | that the application will only be invoked this 747 | one time during the life of its containing 748 | process. Normally, this will only be true for 749 | a gateway based on CGI (or something similar).
750 | 751 | ``wsgi.associated_content`` This is a function which can be called 752 | whenever the application wishes to signal 753 | that some associated content should be 754 | pushed to the client. Server push is an 755 | HTTP/2 feature: if the connection is not 756 | able to provide server push, this will be 757 | a no-op. The function takes one required 758 | parameter, the encoded URL to the object 759 | to push. An optional parameter ``weight`` can 760 | be used to override the weight (HTTP/2 761 | section 5.3). 762 | Different gateways/servers may implement 763 | ``associated_content`` to suit their 764 | environment. For instance, if mod_spdy is 765 | being used with Apache and mod_wsgi, the 766 | header ``X-Associated-Content`` may be 767 | added when the object headers are sent 768 | (so that calling this after ``start_response`` 769 | has no effect). In gateways that 770 | implement HTTP/2 themselves, calling the 771 | function may trigger an immediate 772 | ``PUSH_PROMISE`` on the connection socket. 773 | =========================== =============================================== 774 | 775 | Finally, the ``environ`` dictionary may also contain server-defined 776 | variables. These variables should be named using only lower-case 777 | letters, numbers, dots, and underscores, and should be prefixed with 778 | a name that is unique to the defining server or gateway. For 779 | example, ``mod_python`` might define variables with names like 780 | ``mod_python.some_variable``.
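To illustrate the advice that applications check for the variables they need and keep a fallback plan, here is a hedged sketch of application-side ``environ`` access. The helper names ``request_summary`` and ``push_related`` are assumptions made for this example. The ``callable`` guard in ``push_related`` is defensive: it covers servers that predate this draft and so do not supply ``wsgi.associated_content``.

```python
def request_summary(environ):
    """Read CGI-style variables, tolerating the ones that may be omitted."""
    method = environ["REQUEST_METHOD"]            # never empty, always present
    script_name = environ.get("SCRIPT_NAME", "")  # may be omitted when empty
    path_info = environ.get("PATH_INFO", "")
    query_string = environ.get("QUERY_STRING", "")
    try:
        content_length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        content_length = 0  # malformed value: treated as "no body" in this sketch
    return {
        "method": method,
        "script_name": script_name,
        "path_info": path_info,
        "query_string": query_string,
        "content_length": content_length,
    }


def push_related(environ, urls):
    """Ask the server to push related resources, if it offers the hook.

    Returns the number of URLs handed to the server. On connections
    without server push the server-side function is a no-op.
    """
    push = environ.get("wsgi.associated_content")
    if not callable(push):
        return 0
    for url in urls:
        push(url)  # one required argument: the encoded URL
    return len(urls)
```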
781 | 782 | 783 | Input and Error Streams 784 | ~~~~~~~~~~~~~~~~~~~~~~~ 785 | 786 | The input and error streams provided by the server must support 787 | the following methods: 788 | 789 | =================== ========== ======== 790 | Method Stream Notes 791 | =================== ========== ======== 792 | ``read(size)`` ``input`` 1 793 | ``readline(hint)`` ``input`` 1, 2 794 | ``readlines(hint)`` ``input`` 1, 3 795 | ``__iter__()`` ``input`` 796 | ``flush()`` ``errors`` 4 797 | ``write(str)`` ``errors`` 798 | ``writelines(seq)`` ``errors`` 799 | =================== ========== ======== 800 | 801 | The semantics of each method are as documented in the Python Library 802 | Reference, except for these notes as listed in the table above: 803 | 804 | 1. The server is not required to read past the client's specified 805 | ``Content-Length``, and **should** simulate an end-of-file 806 | condition if the application attempts to read past that point. 807 | The application **should not** attempt to read more data than is 808 | specified by the ``CONTENT_LENGTH`` variable. 809 | 810 | A server **must** allow ``read()`` to be called without an argument, 811 | and return the remainder of the client's input stream. This implies 812 | blocking until the stream source is closed in HTTP/2. 813 | 814 | A server **must** return empty bytestrings from any attempt to 815 | read from an empty or exhausted input stream. 816 | 817 | 2. Servers **must** support the optional ``hint`` argument to ``readline()``. 818 | 819 | 3. Note that the ``hint`` argument to ``readline()`` and ``readlines()`` 820 | is optional for both caller and implementer. The application is 821 | free not to supply it, and the server or gateway is free to ignore 822 | it. 823 | 824 | 4. Since the ``errors`` stream may not be rewound, servers and gateways 825 | are free to forward write operations immediately, without buffering. 826 | In this case, the ``flush()`` method may be a no-op.
Portable 827 | applications, however, cannot assume that output is unbuffered 828 | or that ``flush()`` is a no-op. They must call ``flush()`` if 829 | they need to ensure that output has in fact been written. (For 830 | example, to minimize intermingling of data from multiple processes 831 | writing to the same error log.) 832 | 833 | The methods listed in the table above **must** be supported by all 834 | servers conforming to this specification. Applications conforming 835 | to this specification **must not** use any other methods or attributes 836 | of the ``input`` or ``errors`` objects. In particular, applications 837 | **must not** attempt to close these streams, even if they possess 838 | ``close()`` methods. 839 | 840 | 841 | The ``start_response()`` Callable 842 | --------------------------------- 843 | 844 | The second parameter passed to the application object is a callable 845 | of the form ``start_response(status, response_headers, exc_info=None)``. 846 | (As with all WSGI callables, the arguments must be supplied 847 | positionally, not by keyword.) The ``start_response`` callable is 848 | used to begin the HTTP response, and it must return a 849 | ``write(body_data)`` callable (see the `Buffering and Streaming`_ 850 | section, below). 851 | 852 | The ``status`` argument is an HTTP "status" string like ``"200 OK"`` 853 | or ``"404 Not Found"``. That is, it is a string consisting of a 854 | Status-Code and a Reason-Phrase, in that order and separated by a 855 | single space, with no surrounding whitespace or other characters. 856 | (See RFC 2616, Section 6.1.1 for more information.) The string 857 | **must not** contain control characters, and must not be terminated 858 | with a carriage return, linefeed, or combination thereof. 859 | 860 | The ``response_headers`` argument is a list of ``(header_name, 861 | header_value)`` tuples. It must be a Python list; i.e. 
862 | ``type(response_headers) is ListType``, and the server **may** change 863 | its contents in any way it desires. Each ``header_name`` must be a 864 | valid HTTP header field-name (as defined by RFC 2616, Section 4.2), 865 | without a trailing colon or other punctuation. 866 | 867 | Each ``header_value`` **must not** include *any* control characters, 868 | including carriage returns or linefeeds, either embedded or at the end. 869 | (These requirements are to minimize the complexity of any parsing that 870 | must be performed by servers, gateways, and intermediate response 871 | processors that need to inspect or modify response headers.) 872 | 873 | In general, the server or gateway is responsible for ensuring that 874 | correct headers are sent to the client: if the application omits 875 | a header required by HTTP (or other relevant specifications that are in 876 | effect), the server or gateway **must** add it. For example, the HTTP 877 | ``Date:`` and ``Server:`` headers would normally be supplied by the 878 | server or gateway. 879 | 880 | (A reminder for server/gateway authors: HTTP header names are 881 | case-insensitive, so be sure to take that into consideration when 882 | examining application-supplied headers!) 883 | 884 | Applications and middleware are forbidden from using HTTP/1.1 885 | "hop-by-hop" features or headers, any equivalent features in HTTP/1.0, 886 | or any headers that would affect the persistence of the client's 887 | connection to the web server. These features are the 888 | exclusive province of the actual web server, and a server or gateway 889 | **should** consider it a fatal error for an application to attempt 890 | sending them, and raise an error if they are supplied to 891 | ``start_response()``. (For more specifics on "hop-by-hop" features and 892 | headers, please see the `Other HTTP Features`_ section below.) 
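The status and header rules above lend themselves to an up-front check inside ``start_response``. The following is a sketch under stated assumptions: ``check_response`` is a hypothetical helper name, the token pattern follows the RFC 2616 definition of a field-name, and the hop-by-hop set is the list from RFC 2616, Section 13.5.1. A production server would probably check more than this.

```python
import re

# RFC 2616 "token" characters: anything but control characters and separators.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_CONTROL = re.compile(r"[\x00-\x1f\x7f]")
# Three digits, one space, then a reason phrase that does not start with
# whitespace and contains no control characters.
_STATUS = re.compile(r"[0-9]{3} [^\x00-\x20\x7f][^\x00-\x1f\x7f]*")

# Hop-by-hop headers listed in RFC 2616, Section 13.5.1.
HOP_BY_HOP = frozenset([
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
])


def check_response(status, response_headers):
    """Raise TypeError or ValueError if the arguments break the rules above."""
    if type(response_headers) is not list:
        raise TypeError("response_headers must be a list")
    if not _STATUS.fullmatch(status) or status != status.strip():
        raise ValueError("malformed status: %r" % (status,))
    for name, value in response_headers:
        if not _TOKEN.fullmatch(name):
            raise ValueError("invalid header name: %r" % (name,))
        if _CONTROL.search(value):
            raise ValueError("control character in header %r" % (name,))
        if name.lower() in HOP_BY_HOP:  # header names are case-insensitive
            raise ValueError("hop-by-hop header not allowed: %r" % (name,))
```

Running these checks when ``start_response`` is called, rather than when body bytes arrive, keeps the error close to the application code that caused it.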
893 | 894 | Servers **must** check for errors in the headers at the time 895 | ``start_response`` is called, so that an error can be raised while 896 | the application is still running. 897 | 898 | However, the ``start_response`` callable **must not** actually transmit the 899 | response headers. Instead, it must store them for the server or 900 | gateway to transmit **only** after the first iteration of the 901 | application return value that yields a non-empty bytestring, or upon 902 | the application's first invocation of the ``write()`` callable. In 903 | other words, response headers must not be sent until there is actual 904 | body data available, or until the application's returned iterable is 905 | exhausted. (The only possible exception to this rule is if the 906 | response headers explicitly include a ``Content-Length`` of zero.) 907 | 908 | This delaying of response header transmission is to ensure that buffered 909 | and asynchronous applications can replace their originally intended 910 | output with error output, up until the last possible moment. For 911 | example, the application may need to change the response status from 912 | "200 OK" to "500 Internal Error", if an error occurs while the body is 913 | being generated within an application buffer. 914 | 915 | The ``exc_info`` argument, if supplied, must be a Python 916 | ``sys.exc_info()`` tuple. This argument should be supplied by the 917 | application only if ``start_response`` is being called by an error 918 | handler. If ``exc_info`` is supplied, and no HTTP headers have been 919 | output yet, ``start_response`` should replace the currently-stored 920 | HTTP response headers with the newly-supplied ones, thus allowing the 921 | application to "change its mind" about the output when an error has 922 | occurred. 
923 | 924 | However, if ``exc_info`` is provided, and the HTTP headers have already 925 | been sent, ``start_response`` **must** raise an error, and **should** 926 | re-raise using the ``exc_info`` tuple. That is:: 927 | 928 | raise exc_info[1].with_traceback(exc_info[2]) 929 | 930 | This will re-raise the exception trapped by the application, and in 931 | principle should abort the application. (It is not safe for the 932 | application to attempt error output to the browser once the HTTP 933 | headers have already been sent.) The application **must not** trap 934 | any exceptions raised by ``start_response``, if it called 935 | ``start_response`` with ``exc_info``. Instead, it should allow 936 | such exceptions to propagate back to the server or gateway. See 937 | `Error Handling`_ below, for more details. 938 | 939 | The application **may** call ``start_response`` more than once, if and 940 | only if the ``exc_info`` argument is provided. More precisely, it is 941 | a fatal error to call ``start_response`` without the ``exc_info`` 942 | argument if ``start_response`` has already been called within the 943 | current invocation of the application. This includes the case where 944 | the first call to ``start_response`` raised an error. (See the example 945 | CGI gateway above for an illustration of the correct logic.) 946 | 947 | Note: servers, gateways, or middleware implementing ``start_response`` 948 | **should** ensure that no reference is held to the ``exc_info`` 949 | parameter beyond the duration of the function's execution, to avoid 950 | creating a circular reference through the traceback and frames 951 | involved. The simplest way to do this is something like:: 952 | 953 | def start_response(status, response_headers, exc_info=None): 954 | if exc_info: 955 | try: 956 | # do stuff w/exc_info here 957 | finally: 958 | exc_info = None # Avoid circular ref. 959 | 960 | The example CGI gateway provides another illustration of this 961 | technique. 
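Putting the ``exc_info`` rules together, a server-side ``start_response`` might be organised roughly as below. This is an illustrative sketch, not the example CGI gateway: ``ResponseState`` and ``report_error`` are hypothetical names, headers are only stored (never transmitted), and header validation is left out.

```python
import sys


class ResponseState:
    """Hypothetical per-request bookkeeping for a server or gateway."""

    def __init__(self):
        self.called = False        # has start_response been invoked at all?
        self.headers = None        # stored (status, response_headers)
        self.headers_sent = False  # flips once body data forces them out
        self.body = []

    def start_response(self, status, response_headers, exc_info=None):
        try:
            if exc_info is not None:
                if self.headers_sent:
                    # Too late to change the response: re-raise the
                    # exception the application trapped.
                    raise exc_info[1].with_traceback(exc_info[2])
            elif self.called:
                raise RuntimeError(
                    "start_response() called again without exc_info")
            self.called = True
            # Store (or replace) the headers; do not transmit them yet.
            self.headers = (status, list(response_headers))
            return self.write
        finally:
            exc_info = None  # avoid a circular reference via the traceback

    def write(self, body_data):
        self.headers_sent = True  # a real server would send headers here
        self.body.append(body_data)


def report_error(state, status="500 Internal Server Error"):
    """Application-side helper: call from inside an ``except`` block."""
    headers = [("Content-Type", "text/plain")]
    return state.start_response(status, headers, sys.exc_info())
```

Before any body data is written, ``report_error`` replaces the stored headers; afterwards it re-raises the original exception so that the server can abort the response.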
962 | 963 | 964 | Handling the ``Content-Length`` Header 965 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 966 | 967 | If the application supplies a ``Content-Length`` header, the server 968 | **should not** transmit more bytes to the client than the header 969 | allows, and **must** stop iterating over the response when enough 970 | data has been sent, or raise an error if the application tries to 971 | ``write()`` past that point. (Of course, if the application does 972 | not provide *enough* data to meet its stated ``Content-Length``, 973 | the server **should** close the connection and log or otherwise 974 | report the error.) 975 | 976 | If the application does not supply a ``Content-Length`` header, a 977 | server or gateway may choose one of several approaches to handling 978 | it. The simplest of these is to close the client connection when 979 | the response is completed. 980 | 981 | Under some circumstances, however, the server or gateway may be 982 | able to either generate a ``Content-Length`` header, or at least 983 | avoid the need to close the client connection. If the application 984 | does *not* call the ``write()`` callable, and returns an iterable 985 | whose ``len()`` is 1, then the server can automatically determine 986 | ``Content-Length`` by taking the length of the first bytestring yielded 987 | by the iterable. 988 | 989 | And, if the server and client both support HTTP/1.1 "chunked 990 | encoding" [3]_, then the server **may** use chunked encoding to send 991 | a chunk for each ``write()`` call or bytestring yielded by the 992 | iterable, thus not using a ``Content-Length`` header at all. This 993 | allows the server to keep the client connection alive, if it wishes 994 | to do so. Note that the server **must** comply fully with RFC 2616 995 | when doing this, or else fall back to one of the other strategies for 996 | dealing with the absence of ``Content-Length``. 
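The single-element case described above could be handled along these lines. ``add_content_length`` is a hypothetical helper, and the sketch assumes the caller has already confirmed that the application did not use ``write()``. It also assumes the caller still calls ``close()`` on the original result object, since the returned iterator does not forward it.

```python
import itertools


def add_content_length(result, response_headers):
    """Return (headers, iterable), adding Content-Length when it is knowable.

    Only the case spelled out above is handled: the application supplied no
    Content-Length and ``len(result)`` is exactly 1.
    """
    if any(name.lower() == "content-length" for name, _ in response_headers):
        return response_headers, result
    try:
        count = len(result)
    except TypeError:  # generators and other length-less iterables
        return response_headers, result
    if count != 1:
        return response_headers, result
    iterator = iter(result)
    first = next(iterator)
    headers = response_headers + [("Content-Length", str(len(first)))]
    # Re-attach the block that was pulled out so no body data is lost.
    return headers, itertools.chain([first], iterator)
```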
997 | 998 | (Note: applications and middleware **must not** apply any kind of 999 | ``Transfer-Encoding`` to their output, such as chunking or gzipping; 1000 | as "hop-by-hop" operations, these encodings are the province of the 1001 | actual web server/gateway. See `Other HTTP Features`_ below, for 1002 | more details.) 1003 | 1004 | 1005 | Buffering and Streaming 1006 | ----------------------- 1007 | 1008 | Generally speaking, applications will achieve the best throughput 1009 | by buffering their (modestly-sized) output and sending it all at 1010 | once. This is a common approach in existing frameworks such as 1011 | Zope: the output is buffered in a StringIO or similar object, then 1012 | transmitted all at once, along with the response headers. 1013 | 1014 | The corresponding approach in WSGI is for the application to simply 1015 | return a single-element iterable (such as a list) containing the 1016 | response body as a single bytestring. This is the recommended approach 1017 | for the vast majority of application functions, which render 1018 | HTML pages whose text easily fits in memory. 1019 | 1020 | For large files, however, or for specialized uses of HTTP streaming 1021 | (such as multipart "server push"), an application may need to provide 1022 | output in smaller blocks (e.g. to avoid loading a large file into 1023 | memory). It's also sometimes the case that part of a response may 1024 | be time-consuming to produce, but it would be useful to send ahead the 1025 | portion of the response that precedes it. 1026 | 1027 | In these cases, applications will usually return an iterator (often 1028 | a generator-iterator) that produces the output in a block-by-block 1029 | fashion. These blocks may be broken to coincide with multipart 1030 | boundaries (for "server push"), or just before time-consuming 1031 | tasks (such as reading another block of an on-disk file).
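A block-by-block response of the kind just described might look like the sketch below. The names ``stream_blocks`` and ``make_streaming_app`` and the ``block_size`` default are assumptions chosen for illustration. An in-memory ``io.BytesIO`` object stands in for an on-disk file.

```python
import io


def stream_blocks(fileobj, block_size=8192):
    """Yield the contents of ``fileobj`` one block at a time."""
    try:
        while True:
            block = fileobj.read(block_size)
            if not block:
                break
            yield block
    finally:
        # Runs when the generator is exhausted or when the server calls
        # close() on it early, so the file is released either way.
        fileobj.close()


def make_streaming_app(open_body, block_size=8192):
    """Build an application that streams whatever ``open_body()`` returns."""
    def application(environ, start_response):
        start_response("200 OK",
                       [("Content-Type", "application/octet-stream")])
        return stream_blocks(open_body(), block_size)
    return application


# Usage with an in-memory "file"; a real application would open one on disk.
app = make_streaming_app(lambda: io.BytesIO(b"abcdefghij"), block_size=4)
```

Because the body is a generator, only one block is held in memory at a time, and the ``finally`` clause cooperates with the ``close()`` protocol if the server stops iterating early.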
1032 | 1033 | WSGI servers, gateways, and middleware **must not** delay the 1034 | transmission of any block; they **must** either fully transmit 1035 | the block to the client, or guarantee that they will continue 1036 | transmission even while the application is producing its next block. 1037 | A server/gateway or middleware may provide this guarantee in one of 1038 | three ways: 1039 | 1040 | 1. Send the entire block to the operating system (and request 1041 | that any O/S buffers be flushed) before returning control 1042 | to the application, OR 1043 | 1044 | 2. Use a different thread to ensure that the block continues 1045 | to be transmitted while the application produces the next 1046 | block. 1047 | 1048 | 3. (Middleware only) send the entire block to its parent 1049 | gateway/server 1050 | 1051 | By providing this guarantee, WSGI allows applications to ensure 1052 | that transmission will not become stalled at an arbitrary point 1053 | in their output data. This is critical for proper functioning 1054 | of e.g. multipart "server push" streaming, where data between 1055 | multipart boundaries should be transmitted in full to the client. 1056 | 1057 | 1058 | Middleware Handling of Block Boundaries 1059 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1060 | 1061 | In order to better support asynchronous applications and servers, 1062 | middleware components **must not** block iteration waiting for 1063 | multiple values from an application iterable. If the middleware 1064 | needs to accumulate more data from the application before it can 1065 | produce any output, it **must** yield an empty bytestring. 1066 | 1067 | To put this requirement another way, a middleware component **must 1068 | yield at least one value** each time its underlying application 1069 | yields a value. If the middleware cannot yield any other value, 1070 | it must yield an empty bytestring. 
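The "yield at least one value per application value" rule can be sketched with a middleware that needs whole lines before it can produce output. ``LineUpper`` and ``upper_lines`` are hypothetical names. Upper-casing is chosen only because ``bytes.upper()`` preserves length, so any ``Content-Length`` set by the application stays correct.

```python
class LineUpper:
    """Upper-case complete lines, yielding b"" while a line is incomplete."""

    def __init__(self, result):
        self._iterator = iter(result)
        self._pending = b""
        self._done = False
        if hasattr(result, "close"):
            self.close = result.close  # forward close() to the application

    def __iter__(self):
        return self

    def __next__(self):
        if self._done:
            raise StopIteration
        try:
            chunk = next(self._iterator)
        except StopIteration:
            self._done = True
            tail, self._pending = self._pending, b""
            if tail:
                return tail.upper()  # flush the final unterminated line
            raise
        self._pending += chunk
        head, newline, rest = self._pending.rpartition(b"\n")
        if not newline:
            # No complete line yet: yield an empty bytestring instead of
            # blocking to pull more values from the application.
            return b""
        self._pending = rest
        return (head + newline).upper()


def upper_lines(application):
    """Wrap ``application`` so its body passes through LineUpper."""
    def middleware(environ, start_response):
        return LineUpper(application(environ, start_response))
    return middleware
```

Each call to ``__next__`` consumes at most one value from the wrapped iterable, so a server can interleave other work between blocks.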
1071 | 1072 | This requirement ensures that asynchronous applications and servers 1073 | can conspire to reduce the number of threads that are required 1074 | to run a given number of application instances simultaneously. 1075 | 1076 | Note also that this requirement means that middleware **must** 1077 | return an iterable as soon as its underlying application returns 1078 | an iterable. It is also forbidden for middleware to use the 1079 | ``write()`` callable to transmit data that is yielded by an 1080 | underlying application. Middleware may only use their parent 1081 | server's ``write()`` callable to transmit data that the 1082 | underlying application sent using a middleware-provided ``write()`` 1083 | callable. 1084 | 1085 | 1086 | The ``write()`` Callable 1087 | ~~~~~~~~~~~~~~~~~~~~~~~~ 1088 | 1089 | Some existing application framework APIs support unbuffered 1090 | output in a different manner than WSGI. Specifically, they 1091 | provide a "write" function or method of some kind to write 1092 | an unbuffered block of data, or else they provide a buffered 1093 | "write" function and a "flush" mechanism to flush the buffer. 1094 | 1095 | Unfortunately, such APIs cannot be implemented in terms of 1096 | WSGI's "iterable" application return value, unless threads 1097 | or other special mechanisms are used. 1098 | 1099 | Therefore, to allow these frameworks to continue using an 1100 | imperative API, WSGI includes a special ``write()`` callable, 1101 | returned by the ``start_response`` callable. 1102 | 1103 | New WSGI applications and frameworks **should not** use the 1104 | ``write()`` callable if it is possible to avoid doing so. The 1105 | ``write()`` callable is strictly a hack to support imperative 1106 | streaming APIs. 
In general, applications should produce their 1107 | output via their returned iterable, as this makes it possible 1108 | for web servers to interleave other tasks in the same Python thread, 1109 | potentially providing better throughput for the server as a whole. 1110 | 1111 | The ``write()`` callable is returned by the ``start_response()`` 1112 | callable, and it accepts a single parameter: a bytestring to be 1113 | written as part of the HTTP response body, that is treated exactly 1114 | as though it had been yielded by the output iterable. In other 1115 | words, before ``write()`` returns, it must guarantee that the 1116 | passed-in bytestring was either completely sent to the client, or 1117 | that it is buffered for transmission while the application 1118 | proceeds onward. 1119 | 1120 | An application **must** return an iterable object, even if it 1121 | uses ``write()`` to produce all or part of its response body. 1122 | The returned iterable **may** be empty (i.e. yield no non-empty 1123 | bytestrings), but if it *does* yield non-empty bytestrings, that output 1124 | must be treated normally by the server or gateway (i.e., it must be 1125 | sent or queued immediately). Applications **must not** invoke 1126 | ``write()`` from within their return iterable, and therefore any 1127 | bytestrings yielded by the iterable are transmitted after all bytestrings 1128 | passed to ``write()`` have been sent to the client. 1129 | 1130 | 1131 | Unicode Issues 1132 | -------------- 1133 | 1134 | HTTP does not directly support Unicode, and neither does this 1135 | interface. All encoding/decoding must be handled by the application; 1136 | all strings passed to or from the server must be of type ``str`` or 1137 | ``bytes``, never ``unicode``. The result of using a ``unicode`` 1138 | object where a string object is required, is undefined. 
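Since encoding is left to the application, a response helper might look like this sketch. ``text_response`` is a hypothetical name and UTF-8 is only an example default; the application is free to choose any encoding it declares to the client.

```python
def text_response(start_response, text, encoding="utf-8"):
    """Encode ``text`` and return a one-element bytestring body.

    The server never sees Unicode body data: the text is encoded here,
    and the chosen encoding is declared in the Content-Type header.
    """
    body = text.encode(encoding)
    response_headers = [
        ("Content-Type", "text/plain; charset=" + encoding),
        ("Content-Length", str(len(body))),  # length in bytes, not characters
    ]
    start_response("200 OK", response_headers)
    return [body]
```

Note that ``Content-Length`` is computed from the encoded bytes; for non-ASCII text it differs from ``len(text)``.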
1139 | 1140 | Note also that strings passed to ``start_response()`` as a status or 1141 | as response headers **must** follow RFC 2616 with respect to encoding. 1142 | That is, they must either be ISO-8859-1 characters, or use RFC 2047 1143 | MIME encoding. 1144 | 1145 | On Python platforms where the ``str`` or ``StringType`` type is in 1146 | fact Unicode-based (e.g. Jython, IronPython, Python 3, etc.), all 1147 | "strings" referred to in this specification must contain only 1148 | code points representable in ISO-8859-1 encoding (``\u0000`` through 1149 | ``\u00FF``, inclusive). It is a fatal error for an application to 1150 | supply strings containing any other Unicode character or code point. 1151 | Similarly, servers and gateways **must not** supply 1152 | strings to an application containing any other Unicode characters. 1153 | 1154 | Again, all objects referred to in this specification as "strings" 1155 | **must** be of type ``str`` or ``StringType``, and **must not** be 1156 | of type ``unicode`` or ``UnicodeType``. And, even if a given platform 1157 | allows for more than 8 bits per character in ``str``/``StringType`` 1158 | objects, only the lower 8 bits may be used, for any value referred 1159 | to in this specification as a "string". 1160 | 1161 | For values referred to in this specification as "bytestrings" 1162 | (i.e., values read from ``wsgi.input``, passed to ``write()`` 1163 | or yielded by the application), the value **must** be of type 1164 | ``bytes`` under Python 3, and ``str`` in earlier versions of 1165 | Python. 1166 | 1167 | 1168 | Error Handling 1169 | -------------- 1170 | 1171 | In general, applications **should** try to trap their own, internal 1172 | errors, and display a helpful message in the browser. (It is up 1173 | to the application to decide what "helpful" means in this context.) 
1174 | 1175 | However, to display such a message, the application must not have 1176 | actually sent any data to the browser yet, or else it risks corrupting 1177 | the response. WSGI therefore provides a mechanism to either allow the 1178 | application to send its error message, or be automatically aborted: 1179 | the ``exc_info`` argument to ``start_response``. Here is an example 1180 | of its use:: 1181 | 1182 | try: 1183 | # regular application code here 1184 | status = "200 Froody" 1185 | response_headers = [("content-type", "text/plain")] 1186 | start_response(status, response_headers) 1187 | return [b"normal body goes here"] 1188 | except: 1189 | # XXX should trap runtime issues like MemoryError, KeyboardInterrupt 1190 | # in a separate handler before this bare 'except:'... 1191 | status = "500 Oops" 1192 | response_headers = [("content-type", "text/plain")] 1193 | start_response(status, response_headers, sys.exc_info()) 1194 | return [b"error body goes here"] 1195 | 1196 | If no output has been written when an exception occurs, the call to 1197 | ``start_response`` will return normally, and the application will 1198 | return an error body to be sent to the browser. However, if any output 1199 | has already been sent to the browser, ``start_response`` will reraise 1200 | the provided exception. This exception **should not** be trapped by 1201 | the application, and so the application will abort. The server or 1202 | gateway can then trap this (fatal) exception and abort the response. 1203 | 1204 | Servers **should** trap and log any exception that aborts an 1205 | application or the iteration of its return value. If a partial 1206 | response has already been written to the browser when an application 1207 | error occurs, the server or gateway **may** attempt to add an error 1208 | message to the output, if the already-sent headers indicate a 1209 | ``text/*`` content type that the server knows how to modify cleanly.
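The server side of this contract can be sketched as follows. This is a simplified illustration, assuming Python 3; ``make_start_response`` and the ``state`` dictionary are hypothetical names, and a real server or gateway would also transmit the headers and validate its arguments.

```python
def make_start_response(state):
    """Build a start_response bound to ``state``, a hypothetical dict that
    tracks 'headers_set' and 'headers_sent' for a single request."""

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if state["headers_sent"]:
                    # Output already reached the client: re-raise the
                    # application's original exception so that it aborts.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid keeping a traceback reference cycle
        elif state["headers_set"]:
            raise AssertionError("headers already set")
        # Nothing has been sent yet, so (re)place the pending status/headers.
        state["headers_set"] = (status, list(response_headers))
        return lambda data: None  # stand-in for write()

    return start_response
```

With ``state = {"headers_set": None, "headers_sent": False}``, an error response started with ``exc_info`` simply replaces the pending headers; once ``headers_sent`` is true, the same call re-raises the application's exception instead.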
1210 | 1211 | Some middleware may wish to provide additional exception handling 1212 | services, or intercept and replace application error messages. In 1213 | such cases, middleware may choose to **not** re-raise the ``exc_info`` 1214 | supplied to ``start_response``, but instead raise a middleware-specific 1215 | exception, or simply return without an exception after storing the 1216 | supplied arguments. This will then cause the application to return 1217 | its error body iterable (or invoke ``write()``), allowing the middleware 1218 | to capture and modify the error output. These techniques will work as 1219 | long as application authors: 1220 | 1221 | 1. Always provide ``exc_info`` when beginning an error response 1222 | 1223 | 2. Never trap errors raised by ``start_response`` when ``exc_info`` is 1224 | being provided 1225 | 1226 | 1227 | HTTP 1.1 Expect/Continue 1228 | ------------------------ 1229 | 1230 | Servers and gateways that implement HTTP 1.1 **must** provide 1231 | transparent support for HTTP 1.1's "expect/continue" mechanism. This 1232 | may be done in any of several ways: 1233 | 1234 | 1. Respond to requests containing an ``Expect: 100-continue`` request 1235 | with an immediate "100 Continue" response, and proceed normally. 1236 | 1237 | 2. Proceed with the request normally, but provide the application 1238 | with a ``wsgi.input`` stream that will send the "100 Continue" 1239 | response if/when the application first attempts to read from the 1240 | input stream. The read request must then remain blocked until the 1241 | client responds. 1242 | 1243 | 3. Wait until the client decides that the server does not support 1244 | expect/continue, and sends the request body on its own. (This 1245 | is suboptimal, and is not recommended.) 1246 | 1247 | Note that these behavior restrictions do not apply for HTTP 1.0 1248 | requests, or for requests that are not directed to an application 1249 | object. 
For more information on HTTP 1.1 Expect/Continue, see RFC 1250 | 2616, sections 8.2.3 and 10.1.1. 1251 | 1252 | 1253 | Other HTTP Features 1254 | ------------------- 1255 | 1256 | In general, servers and gateways should "play dumb" and allow the 1257 | application complete control over its output. They should only make 1258 | changes that do not alter the effective semantics of the application's 1259 | response. It is always possible for the application developer to add 1260 | middleware components to supply additional features, so server/gateway 1261 | developers should be conservative in their implementation. In a sense, 1262 | a server should consider itself to be like an HTTP "gateway server", 1263 | with the application being an HTTP "origin server". (See RFC 2616, 1264 | section 1.3, for the definition of these terms.) 1265 | 1266 | However, because WSGI servers and applications do not communicate via 1267 | HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to WSGI 1268 | internal communications. WSGI applications **must not** generate any 1269 | "hop-by-hop" headers [4]_, attempt to use HTTP features that would 1270 | require them to generate such headers, or rely on the content of 1271 | any incoming "hop-by-hop" headers in the ``environ`` dictionary. 1272 | WSGI servers **must** handle any supported inbound "hop-by-hop" headers 1273 | on their own, such as by decoding any inbound ``Transfer-Encoding``, 1274 | including chunked encoding if applicable. 1275 | 1276 | Applying these principles to a variety of HTTP features, it should be 1277 | clear that a server **may** handle cache validation via the 1278 | ``If-None-Match`` and ``If-Modified-Since`` request headers and the 1279 | ``Last-Modified`` and ``ETag`` response headers. 
However, it is 1280 | not required to do this, and the application **should** perform its 1281 | own cache validation if it wants to support that feature, since 1282 | the server/gateway is not required to do such validation. 1283 | 1284 | Similarly, a server **may** re-encode or transport-encode an 1285 | application's response, but the application **should** use a 1286 | suitable content encoding on its own, and **must not** apply a 1287 | transport encoding. A server **may** transmit byte ranges of the 1288 | application's response if requested by the client, and the 1289 | application doesn't natively support byte ranges. Again, however, 1290 | the application **should** perform this function on its own if desired. 1291 | 1292 | Note that these restrictions on applications do not necessarily mean 1293 | that every application must reimplement every HTTP feature; many HTTP 1294 | features can be partially or fully implemented by middleware 1295 | components, thus freeing both server and application authors from 1296 | implementing the same features over and over again. 1297 | 1298 | 1299 | Thread Support 1300 | -------------- 1301 | 1302 | Thread support, or lack thereof, is also server-dependent. 1303 | Servers that can run multiple requests in parallel, **should** also 1304 | provide the option of running an application in a single-threaded 1305 | fashion, so that applications or frameworks that are not thread-safe 1306 | may still be used with that server. 1307 | 1308 | 1309 | 1310 | Implementation/Application Notes 1311 | ================================ 1312 | 1313 | 1314 | Server Extension APIs 1315 | --------------------- 1316 | 1317 | Some server authors may wish to expose more advanced APIs, that 1318 | application or framework authors can use for specialized purposes. 1319 | For example, a gateway based on ``mod_python`` might wish to expose 1320 | part of the Apache API as a WSGI extension. 
1321 | 1322 | In the simplest case, this requires nothing more than defining an 1323 | ``environ`` variable, such as ``mod_python.some_api``. But, in many 1324 | cases, the possible presence of middleware can make this difficult. 1325 | For example, an API that offers access to the same HTTP headers that 1326 | are found in ``environ`` variables, might return different data if 1327 | ``environ`` has been modified by middleware. 1328 | 1329 | In general, any extension API that duplicates, supplants, or bypasses 1330 | some portion of WSGI functionality runs the risk of being incompatible 1331 | with middleware components. Server/gateway developers should *not* 1332 | assume that nobody will use middleware, because some framework 1333 | developers specifically intend to organize or reorganize their 1334 | frameworks to function almost entirely as middleware of various kinds. 1335 | 1336 | So, to provide maximum compatibility, servers and gateways that 1337 | provide extension APIs that replace some WSGI functionality, **must** 1338 | design those APIs so that they are invoked using the portion of the 1339 | API that they replace. For example, an extension API to access HTTP 1340 | request headers must require the application to pass in its current 1341 | ``environ``, so that the server/gateway may verify that HTTP headers 1342 | accessible via the API have not been altered by middleware. If the 1343 | extension API cannot guarantee that it will always agree with 1344 | ``environ`` about the contents of HTTP headers, it must refuse service 1345 | to the application, e.g. by raising an error, returning ``None`` 1346 | instead of a header collection, or whatever is appropriate to the API. 1347 | 1348 | Similarly, if an extension API provides an alternate means of writing 1349 | response data or headers, it should require the ``start_response`` 1350 | callable to be passed in, before the application can obtain the 1351 | extended service. 
If the object passed in is not the same one that 1352 | the server/gateway originally supplied to the application, it cannot 1353 | guarantee correct operation and must refuse to provide the extended 1354 | service to the application. 1355 | 1356 | These guidelines also apply to middleware that adds information such 1357 | as parsed cookies, form variables, sessions, and the like to 1358 | ``environ``. Specifically, such middleware should provide these 1359 | features as functions which operate on ``environ``, rather than simply 1360 | stuffing values into ``environ``. This helps ensure that information 1361 | is calculated from ``environ`` *after* any middleware has done any URL 1362 | rewrites or other ``environ`` modifications. 1363 | 1364 | It is very important that these "safe extension" rules be followed by 1365 | both server/gateway and middleware developers, in order to avoid a 1366 | future in which middleware developers are forced to delete any and all 1367 | extension APIs from ``environ`` to ensure that their mediation isn't 1368 | being bypassed by applications using those extensions! 1369 | 1370 | 1371 | Application Configuration 1372 | ------------------------- 1373 | 1374 | This specification does not define how a server selects or obtains an 1375 | application to invoke. These and other configuration options are 1376 | highly server-specific matters. It is expected that server/gateway 1377 | authors will document how to configure the server to execute a 1378 | particular application object, and with what options (such as 1379 | threading options). 1380 | 1381 | Framework authors, on the other hand, should document how to create an 1382 | application object that wraps their framework's functionality. The 1383 | user, who has chosen both the server and the application framework, 1384 | must connect the two together. 
However, since both the framework and 1385 | the server now have a common interface, this should be merely a 1386 | mechanical matter, rather than a significant engineering effort for 1387 | each new server/framework pair. 1388 | 1389 | Finally, some applications, frameworks, and middleware may wish to 1390 | use the ``environ`` dictionary to receive simple string configuration 1391 | options. Servers and gateways **should** support this by allowing 1392 | an application's deployer to specify name-value pairs to be placed in 1393 | ``environ``. In the simplest case, this support can consist merely of 1394 | copying all operating system-supplied environment variables from 1395 | ``os.environ`` into the ``environ`` dictionary, since the deployer in 1396 | principle can configure these externally to the server, or in the 1397 | CGI case they may be able to be set via the server's configuration 1398 | files. 1399 | 1400 | Applications **should** try to keep such required variables to a 1401 | minimum, since not all servers will support easy configuration of 1402 | them. Of course, even in the worst case, persons deploying an 1403 | application can create a script to supply the necessary configuration 1404 | values:: 1405 | 1406 | from the_app import application 1407 | 1408 | def new_app(environ, start_response): 1409 | environ['the_app.configval1'] = 'something' 1410 | return application(environ, start_response) 1411 | 1412 | But, most existing applications and frameworks will probably only need 1413 | a single configuration value from ``environ``, to indicate the location 1414 | of their application or framework-specific configuration file(s). (Of 1415 | course, applications should cache such configuration, to avoid having 1416 | to re-read it upon each invocation.) 
1417 | 1418 | 1419 | URL Reconstruction 1420 | ------------------ 1421 | 1422 | If an application wishes to reconstruct a request's complete URL, it 1423 | may do so using the following algorithm, contributed by Ian Bicking:: 1424 | 1425 | from urllib import quote 1426 | url = environ['wsgi.url_scheme']+'://' 1427 | 1428 | if environ.get('HTTP_HOST'): 1429 | url += environ['HTTP_HOST'] 1430 | else: 1431 | url += environ['SERVER_NAME'] 1432 | 1433 | if environ['wsgi.url_scheme'] == 'https': 1434 | if environ['SERVER_PORT'] != '443': 1435 | url += ':' + environ['SERVER_PORT'] 1436 | else: 1437 | if environ['SERVER_PORT'] != '80': 1438 | url += ':' + environ['SERVER_PORT'] 1439 | 1440 | url += quote(environ.get('SCRIPT_NAME', '')) 1441 | url += quote(environ.get('PATH_INFO', '')) 1442 | if environ.get('QUERY_STRING'): 1443 | url += '?' + environ['QUERY_STRING'] 1444 | 1445 | Note that such a reconstructed URL may not be precisely the same URI 1446 | as requested by the client. Server rewrite rules, for example, may 1447 | have modified the client's originally requested URL to place it in a 1448 | canonical form. 1449 | 1450 | 1451 | Optional Platform-Specific File Handling 1452 | ---------------------------------------- 1453 | 1454 | Some operating environments provide special high-performance file- 1455 | transmission facilities, such as the Unix ``sendfile()`` call. 1456 | Servers and gateways **may** expose this functionality via an optional 1457 | ``wsgi.file_wrapper`` key in the ``environ``. 
An application 1458 | **may** use this "file wrapper" to convert a file or file-like object 1459 | into an iterable that it then returns, e.g.:: 1460 | 1461 | if 'wsgi.file_wrapper' in environ: 1462 | return environ['wsgi.file_wrapper'](filelike, block_size) 1463 | else: 1464 | return iter(lambda: filelike.read(block_size), b'') 1465 | 1466 | If the server or gateway supplies ``wsgi.file_wrapper``, it must be 1467 | a callable that accepts one required positional parameter, and one 1468 | optional positional parameter. The first parameter is the file-like 1469 | object to be sent, and the second parameter is an optional block 1470 | size "suggestion" (which the server/gateway need not use). The 1471 | callable **must** return an iterable object, and **must not** perform 1472 | any data transmission until and unless the server/gateway actually 1473 | receives the iterable as a return value from the application. 1474 | (To do otherwise would prevent middleware from being able to interpret 1475 | or override the response data.) 1476 | 1477 | To be considered "file-like", the object supplied by the application 1478 | must have a ``read()`` method that takes an optional size argument. 1479 | It **may** have a ``close()`` method, and if so, the iterable returned 1480 | by ``wsgi.file_wrapper`` **must** have a ``close()`` method that 1481 | invokes the original file-like object's ``close()`` method. If the 1482 | "file-like" object has any other methods or attributes with names 1483 | matching those of Python built-in file objects (e.g. ``fileno()``), 1484 | the ``wsgi.file_wrapper`` **may** assume that these methods or 1485 | attributes have the same semantics as those of a built-in file object. 1486 | 1487 | The actual implementation of any platform-specific file handling 1488 | must occur **after** the application returns, and the server or 1489 | gateway checks to see if a wrapper object was returned.
(Again, 1490 | because of the presence of middleware, error handlers, and the like, 1491 | it is not guaranteed that any wrapper created will actually be used.) 1492 | 1493 | Apart from the handling of ``close()``, the semantics of returning a 1494 | file wrapper from the application should be the same as if the 1495 | application had returned ``iter(filelike.read, b'')``. In other words, 1496 | transmission should begin at the current position within the "file" 1497 | at the time that transmission begins, and continue until the end is 1498 | reached, or until ``Content-Length`` bytes have been written. (If 1499 | the application doesn't supply a ``Content-Length``, the server **may** 1500 | generate one from the file using its knowledge of the underlying file 1501 | implementation.) 1502 | 1503 | Of course, platform-specific file transmission APIs don't usually 1504 | accept arbitrary "file-like" objects. Therefore, a 1505 | ``wsgi.file_wrapper`` has to introspect the supplied object for 1506 | things such as a ``fileno()`` (Unix-like OSes) or a 1507 | ``java.nio.FileChannel`` (under Jython) in order to determine if 1508 | the file-like object is suitable for use with the platform-specific 1509 | API it supports. 1510 | 1511 | Note that even if the object is *not* suitable for the platform API, 1512 | the ``wsgi.file_wrapper`` **must** still return an iterable that wraps 1513 | ``read()`` and ``close()``, so that applications using file wrappers 1514 | are portable across platforms.
Here's a simple platform-agnostic 1515 | file wrapper class, suitable for old (pre 2.2) and new Pythons alike:: 1516 | 1517 | class FileWrapper: 1518 | 1519 | def __init__(self, filelike, blksize=8192): 1520 | self.filelike = filelike 1521 | self.blksize = blksize 1522 | if hasattr(filelike, 'close'): 1523 | self.close = filelike.close 1524 | 1525 | def __getitem__(self, key): 1526 | data = self.filelike.read(self.blksize) 1527 | if data: 1528 | return data 1529 | raise IndexError 1530 | 1531 | and here is a snippet from a server/gateway that uses it to provide 1532 | access to a platform-specific API:: 1533 | 1534 | environ['wsgi.file_wrapper'] = FileWrapper 1535 | result = application(environ, start_response) 1536 | 1537 | try: 1538 | if isinstance(result, FileWrapper): 1539 | # check if result.filelike is usable w/platform-specific 1540 | # API, and if so, use that API to transmit the result. 1541 | # If not, fall through to normal iterable handling 1542 | # loop below. 1543 | 1544 | for data in result: 1545 | # etc. 1546 | 1547 | finally: 1548 | if hasattr(result, 'close'): 1549 | result.close() 1550 | 1551 | 1552 | Questions and Answers 1553 | ===================== 1554 | 1555 | 1. Why must ``environ`` be a dictionary? What's wrong with using a 1556 | subclass? 1557 | 1558 | The rationale for requiring a dictionary is to maximize portability 1559 | between servers. The alternative would be to define some subset of 1560 | a dictionary's methods as being the standard and portable 1561 | interface. In practice, however, most servers will probably find a 1562 | dictionary adequate to their needs, and thus framework authors will 1563 | come to expect the full set of dictionary features to be available, 1564 | since they will be there more often than not. But, if some server 1565 | chooses *not* to use a dictionary, then there will be 1566 | interoperability problems despite that server's "conformance" to 1567 | spec. 
Therefore, making a dictionary mandatory simplifies the 1568 | specification and guarantees interoperability. 1569 | 1570 | Note that this does not prevent server or framework developers from 1571 | offering specialized services as custom variables *inside* the 1572 | ``environ`` dictionary. This is the recommended approach for 1573 | offering any such value-added services. 1574 | 1575 | 2. Why can you call ``write()`` *and* yield bytestrings/return an 1576 | iterable? Shouldn't we pick just one way? 1577 | 1578 | If we supported only the iteration approach, then current 1579 | frameworks that assume the availability of "push" suffer. But, if 1580 | we only support pushing via ``write()``, then server performance 1581 | suffers for transmission of e.g. large files (if a worker thread 1582 | can't begin work on a new request until all of the output has been 1583 | sent). Thus, this compromise allows an application framework to 1584 | support both approaches, as appropriate, but with only a little 1585 | more burden to the server implementor than a push-only approach 1586 | would require. 1587 | 1588 | 3. What's the ``close()`` for? 1589 | 1590 | When writes are done during the execution of an application 1591 | object, the application can ensure that resources are released 1592 | using a try/finally block. But, if the application returns an 1593 | iterable, any resources used will not be released until the 1594 | iterable is garbage collected. The ``close()`` idiom allows an 1595 | application to release critical resources at the end of a request, 1596 | and it's forward-compatible with the support for try/finally in 1597 | generators that's proposed by PEP 325. 1598 | 1599 | 4. Why is this interface so low-level? I want feature X! (e.g. 1600 | cookies, sessions, persistence, ...) 1601 | 1602 | This isn't Yet Another Python Web Framework. It's just a way for 1603 | frameworks to talk to web servers, and vice versa.
If you want 1604 | these features, you need to pick a web framework that provides the 1605 | features you want. And if that framework lets you create a WSGI 1606 | application, you should be able to run it in most WSGI-supporting 1607 | servers. Also, some WSGI servers may offer additional services via 1608 | objects provided in their ``environ`` dictionary; see the 1609 | applicable server documentation for details. (Of course, 1610 | applications that use such extensions will not be portable to other 1611 | WSGI-based servers.) 1612 | 1613 | 5. Why use CGI variables instead of good old HTTP headers? And why 1614 | mix them in with WSGI-defined variables? 1615 | 1616 | Many existing web frameworks are built heavily upon the CGI spec, 1617 | and existing web servers know how to generate CGI variables. In 1618 | contrast, alternative ways of representing inbound HTTP information 1619 | are fragmented and lack market share. Thus, using the CGI 1620 | "standard" seems like a good way to leverage existing 1621 | implementations. As for mixing them with WSGI variables, 1622 | separating them would just require two dictionary arguments to be 1623 | passed around, while providing no real benefits. 1624 | 1625 | 6. What about the status string? Can't we just use the number, 1626 | passing in ``200`` instead of ``"200 OK"``? 1627 | 1628 | Doing this would complicate the server or gateway, by requiring 1629 | them to have a table of numeric statuses and corresponding 1630 | messages. By contrast, it is easy for an application or framework 1631 | author to type the extra text to go with the specific response code 1632 | they are using, and existing frameworks often already have a table 1633 | containing the needed messages. So, on balance it seems better to 1634 | make the application/framework responsible, rather than the server 1635 | or gateway. 1636 | 1637 | 7. Why is ``wsgi.run_once`` not guaranteed to run the app only once? 
1638 | 1639 | Because it's merely a suggestion to the application that it should 1640 | "rig for infrequent running". This is intended for application 1641 | frameworks that have multiple modes of operation for caching, 1642 | sessions, and so forth. In a "multiple run" mode, such frameworks 1643 | may preload caches, and may not write e.g. logs or session data to 1644 | disk after each request. In "single run" mode, such frameworks 1645 | avoid preloading and flush all necessary writes after each request. 1646 | 1647 | However, in order to test an application or framework to verify 1648 | correct operation in the latter mode, it may be necessary (or at 1649 | least expedient) to invoke it more than once. Therefore, an 1650 | application should not assume that it will definitely not be run 1651 | again, just because it is called with ``wsgi.run_once`` set to 1652 | ``True``. 1653 | 1654 | 8. Feature X (dictionaries, callables, etc.) are ugly for use in 1655 | application code; why don't we use objects instead? 1656 | 1657 | All of these implementation choices of WSGI are specifically 1658 | intended to *decouple* features from one another; recombining these 1659 | features into encapsulated objects makes it somewhat harder to 1660 | write servers or gateways, and an order of magnitude harder to 1661 | write middleware that replaces or modifies only small portions of 1662 | the overall functionality. 1663 | 1664 | In essence, middleware wants to have a "Chain of Responsibility" 1665 | pattern, whereby it can act as a "handler" for some functions, 1666 | while allowing others to remain unchanged. This is difficult to do 1667 | with ordinary Python objects, if the interface is to remain 1668 | extensible. For example, one must use ``__getattr__`` or 1669 | ``__getattribute__`` overrides, to ensure that extensions (such as 1670 | attributes defined by future WSGI versions) are passed through. 
1671 | 1672 | This type of code is notoriously difficult to get 100% correct, and 1673 | few people will want to write it themselves. They will therefore 1674 | copy other people's implementations, but fail to update them when 1675 | the person they copied from corrects yet another corner case. 1676 | 1677 | Further, this necessary boilerplate would be pure excise, a 1678 | developer tax paid by middleware developers to support a slightly 1679 | prettier API for application framework developers. But, 1680 | application framework developers will typically only be updating 1681 | *one* framework to support WSGI, and in a very limited part of 1682 | their framework as a whole. It will likely be their first (and 1683 | maybe their only) WSGI implementation, and thus they will likely 1684 | implement with this specification ready to hand. Thus, the effort 1685 | of making the API "prettier" with object attributes and suchlike 1686 | would likely be wasted for this audience. 1687 | 1688 | We encourage those who want a prettier (or otherwise improved) WSGI 1689 | interface for use in direct web application programming (as opposed 1690 | to web framework development) to develop APIs or frameworks that 1691 | wrap WSGI for convenient use by application developers. In this 1692 | way, WSGI can remain conveniently low-level for server and 1693 | middleware authors, while not being "ugly" for application 1694 | developers. 1695 | 1696 | 1697 | Proposed/Under Discussion 1698 | ========================= 1699 | 1700 | These items are currently being discussed on the Web-SIG and elsewhere, 1701 | or are on the PEP author's "to-do" list: 1702 | 1703 | * Should ``wsgi.input`` be an iterator instead of a file? This would 1704 | help for asynchronous applications and chunked-encoding input 1705 | streams. 1706 | 1707 | * Optional extensions are being discussed for pausing iteration of an 1708 | application's output until input is available or until a callback 1709 | occurs. 
1710 | 1711 | * Add a section about synchronous vs. asynchronous apps and servers, 1712 | the relevant threading models, and issues/design goals in these 1713 | areas. 1714 | 1715 | 1716 | Acknowledgements 1717 | ================ 1718 | 1719 | Thanks go to the many folks on the Web-SIG mailing list whose 1720 | thoughtful feedback made this revised draft possible. Especially: 1721 | 1722 | * Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up 1723 | on the first draft as not offering any advantages over "plain old 1724 | CGI", thus encouraging me to look for a better approach. 1725 | 1726 | * Ian Bicking, who helped nag me into properly specifying the 1727 | multithreading and multiprocess options, as well as badgering me to 1728 | provide a mechanism for servers to supply custom extension data to 1729 | an application. 1730 | 1731 | * Tony Lownds, who came up with the concept of a ``start_response`` 1732 | function that took the status and headers, returning a ``write`` 1733 | function. His input also guided the design of the exception handling 1734 | facilities, especially in the area of allowing for middleware that 1735 | overrides application error messages. 1736 | 1737 | * Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython 1738 | (well before the spec was finalized) helped to shape the "supporting 1739 | older versions of Python" section, as well as the optional 1740 | ``wsgi.file_wrapper`` facility, and some of the early bytes/unicode 1741 | decisions. 1742 | 1743 | * Mark Nottingham, who reviewed the spec extensively for issues with 1744 | HTTP RFC compliance, especially with regard to HTTP/1.1 features that 1745 | I didn't even know existed until he pointed them out. 1746 | 1747 | * Graham Dumpleton, who worked tirelessly (even in the face of my laziness 1748 | and stupidity) to get some sort of Python 3 version of WSGI out, who 1749 | proposed the "native strings" vs. 
"byte strings" concept, and thoughtfully 1750 | wrestled through a great many HTTP, ``wsgi.input``, and other 1751 | amendments. Most, if not all, of the credit for this new PEP 1752 | belongs to him. 1753 | 1754 | 1755 | References 1756 | ========== 1757 | 1758 | .. [1] The Python Wiki "Web Programming" topic 1759 | (http://www.python.org/cgi-bin/moinmoin/WebProgramming) 1760 | 1761 | .. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft 1762 | (http://ken.coar.org/cgi/draft-coar-cgi-v11-03.txt) 1763 | 1764 | .. [3] "Chunked Transfer Coding" -- HTTP/1.1, section 3.6.1 1765 | (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1) 1766 | 1767 | .. [4] "End-to-end and Hop-by-hop Headers" -- HTTP/1.1, Section 13.5.1 1768 | (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1) 1769 | 1770 | .. [5] mod_ssl Reference, "Environment Variables" 1771 | (http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25) 1772 | 1773 | .. [6] Procedural issues regarding modifications to PEP \333 1774 | (http://mail.python.org/pipermail/python-dev/2010-September/104114.html) 1775 | 1776 | .. [7] SVN revision history for PEP \3333, showing differences from PEP 333 1777 | (http://svn.python.org/view/peps/trunk/pep-3333.txt?r1=84854&r2=HEAD) 1778 | 1779 | Copyright 1780 | ========= 1781 | 1782 | This document has been placed in the public domain. 1783 | 1784 | 1785 | 1786 | .. 
1787 | Local Variables: 1788 | mode: indented-text 1789 | indent-tabs-mode: nil 1790 | sentence-end-double-space: t 1791 | fill-column: 70 1792 | End: 1793 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | name = wsging 3 | author = Robert Collins 4 | author-email = rbtcollins@hp.com 5 | summary = WSGI revamp for HTTP/2, websockets, etc 6 | description-file = 7 | README.rst 8 | home-page = https://github.com/python-web-sig/wsgi-ng 9 | classifier = 10 | Development Status :: 4 - Beta 11 | Intended Audience :: Developers 12 | License :: OSI Approved :: Apache Software License 13 | Operating System :: OS Independent 14 | Programming Language :: Python 15 | Programming Language :: Python :: 2 16 | Programming Language :: Python :: 2.7 17 | Programming Language :: Python :: 3 18 | Programming Language :: Python :: 3.3 19 | 20 | [files] 21 | packages = 22 | wsging 23 | 24 | [global] 25 | setup-hooks = 26 | pbr.hooks.setup_hook 27 | 28 | [build_sphinx] 29 | source-dir = doc/source 30 | build-dir = doc/build 31 | all_files = 1 32 | 33 | [upload_sphinx] 34 | upload-dir = doc/build/html 35 | 36 | [wheel] 37 | universal = 1 38 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # Copyright (c) 2014 Hewlett-Packard Development Company, L.P. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License.
6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 13 | # implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | import setuptools 18 | setuptools.setup(setup_requires=['pbr'], pbr=True) 19 | -------------------------------------------------------------------------------- /wsging/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/python-web-sig/wsgi-ng/efddbc5a8b043a0e52623c599de53f45a557c508/wsging/__init__.py --------------------------------------------------------------------------------