├── .gitignore
├── .testr.conf
├── README.rst
├── Requirements.rst
├── experiments.rst
├── pep-draft.rst
├── setup.cfg
├── setup.py
└── wsging
    └── __init__.py


/.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled / optimized / DLL files
 2 | __pycache__/
 3 | *.py[cod]
 4 | 
 5 | # C extensions
 6 | *.so
 7 | 
 8 | # Distribution / packaging
 9 | .Python
10 | env/
11 | build/
12 | develop-eggs/
13 | dist/
14 | downloads/
15 | eggs/
16 | lib/
17 | lib64/
18 | parts/
19 | sdist/
20 | var/
21 | *.egg-info/
22 | .installed.cfg
23 | *.egg
24 | 
25 | # PyInstaller
26 | #  Usually these files are written by a python script from a template
27 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
28 | *.manifest
29 | *.spec
30 | 
31 | # Installer logs
32 | pip-log.txt
33 | pip-delete-this-directory.txt
34 | 
35 | # Unit test / coverage reports
36 | htmlcov/
37 | .tox/
38 | .coverage
39 | .cache
40 | nosetests.xml
41 | coverage.xml
42 | 
43 | # Translations
44 | *.mo
45 | *.pot
46 | 
47 | # Django stuff:
48 | *.log
49 | 
50 | # Sphinx documentation
51 | docs/_build/
52 | 
53 | # PyBuilder
54 | target/
55 | 
56 | # Editors
57 | .*.swp
58 | *~
59 | 
60 | AUTHORS
61 | ChangeLog
62 | .testrepository
63 | 


--------------------------------------------------------------------------------
/.testr.conf:
--------------------------------------------------------------------------------
1 | [DEFAULT]
2 | test_command=${PYTHON:-python} -m subunit.run discover $LISTOPT $IDOPTION
3 | test_id_option=--load-list $IDFILE
4 | test_list_option=--list
5 | 


--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
  1 | This repository is for drafting a revision to the WSGI spec.
  2 | 
  3 | Who
  4 | ===
  5 | 
  6 | Right now we're figuring out, on the web-sig list, who has the time to
  7 | contribute to this. The "We" below refers to whoever ends up being
  8 | involved :)
  9 | 
 10 | We want to create a clean common API for applications and middleware
 11 | written in a post HTTP/2 world - where single servers may accept up to
 12 | all three of HTTP/1.x, HTTP/2 and Websocket connections, and
 13 | applications and middleware want to be able to take advantage of
 14 | HTTP/2 and websockets when available, but also degrade gracefully. We
 15 | also want to ensure that there is a graceful incremental path to
 16 | adoption of the new API, including Python 2.7 support, and shims to
 17 | enable existing WSGI apps/middleware/servers to respectively be
 18 | contained, contain-or-be-contained and contain, things written to this
 19 | new API. We want a clean, fast and approachable API, and we want to
 20 | ensure that it's no less friendly to work with than WSGI, for all that
 21 | it will expose much more functionality.
 22 | 
 23 | Governance
 24 | ==========
 25 | 
 26 | We plan to produce a PEP and reference code. That will be done by direct
 27 | collaboration from folk with time and energy. We're not interested in
 28 | debating everything to death until we have working code showing that
 29 | we've got *something* feasible. At that point we'll move into the regular
 30 | PEP process on python-ideas@python.org. This incrementally more public
 31 | approach is intended to mitigate against burnout that has negatively
 32 | affected previous WSGI overhaul approaches.
 33 | 
 34 | Overall Process
 35 | ===============
 36 | 
 37 | There are three broad phases planned (though as with all design work, it
 38 | won't be strictly delineated). That is, folk are encouraged to experiment
 39 | with 
 40 | 
 41 | Phase 1 - requirements
 42 | ++++++++++++++++++++++
 43 | 
 44 | Gathering requirements. These will come from our needs, the needs of
 45 | interop with other PEPs and of course the needs of the underlying
 46 | HTTP/1.x, HTTP/2 and Websocket protocols. I'm allowing 3 months for this to
 47 | ensure that we've had plenty of time for developers from Django, uWSGI and
 48 | so on and so forth to participate. At the end of the 3 months (mid January)
 49 | we'll stop iterating on requirements unless security or feasibility are
 50 | involved. That is, security requirements and 'this *cannot* work'
 51 | requirements can always be added. If and only if we've had broad input from
 52 | the Python web community then we may finalise things earlier.
 53 | 
 54 | Phase 2 - design
 55 | ++++++++++++++++
 56 | 
 57 | Here we experiment with different ways of meeting the requirements from
 58 | Phase 1. Again, we're allowing 3 months for experimentation and design work
 59 | to take place. This needs to include proof of concept implementations that
 60 | work in mod_wsgi, uWSGI, gunicorn etc. In mid-April, we'll pick a final
 61 | design from the set of experiments that have taken place.
 62 | 
 63 | Phase 3 - PEP
 64 | +++++++++++++
 65 | 
 66 | Here we translate the final design into a PEP and enter it into the PEP
 67 | process. This will be as fast or slow as the PEP process runs.
 68 | 
 69 | 
 70 | Participating
 71 | =============
 72 | 
 73 | While we know things that are broken about WSGI, we probably don't know them
 74 | all, so telling us about design issues or capabilities you want to see as
 75 | issues in this repository is useful.
 76 | 
 77 | Specifically, please open issues at
 78 | https://github.com/python-web-sig/wsgi-ng/issues for things you want to see
 79 | in the new protocol. Whether that's 'WSGI makes me cry because X' or 'I want
 80 | to be able to write middleware that proxies to a remote server over zeromq'
 81 | - all issues are welcome.
 82 | 
 83 | Secondly, we need to experiment to find a good protocol that meets all our
 84 | requirements. You can add experiments as pull requests against this
 85 | repository. Experiments can be prose (e.g. a draft specification) or code
 86 | (e.g. a test implementation of a draft specification).
 87 | 
 88 | There is *a* draft spec based on PEP-3333 in the repository already - it is
 89 | not special or privileged - we may end up with a totally new spec rather
 90 | than that iterated one. That said please do also submit PRs to change it in
 91 | light of requirements or issues that need resolving. This repository is a
 92 | community resource where we can build consensus and working code together.
 93 | 
 94 | Current status
 95 | ==============
 96 | 
 97 | We're just self-assembling at the moment.
 98 | 
 99 | We have a `requirements document <Requirements.rst>`_.
100 | 
101 | We have a `draft PEP based on 3333 <pep-draft.rst>`_.
102 | 
103 | We may add some `experiments too <experiments.rst>`_.
104 | 


--------------------------------------------------------------------------------
/Requirements.rst:
--------------------------------------------------------------------------------
 1 | Overview
 2 | ========
 3 | 
 4 | These are the functional requirements for any successful design. Where the
 5 | requirements are too high level to be actionable, please help us refine them :).
 6 | 
 7 | Requirements
 8 | ============
 9 | 
10 | #. Support servers speaking HTTP/1.x, HTTP/2 and Websockets (potentially all on
11 |    a single port).
12 | #. Support graceful degradation for applications that can use HTTP/2 but still
13 |    support HTTP/1.x requests.
14 | #. Graceful incremental adoption path - no upgrade-all-components requirement
15 |    baked into the design.
16 | #. Support Python 2.7 and 3.x (where x is not yet discussed)
17 | #. Support the existing ecosystem of containers (such as mod_wsgi) with the
18 |    new API. We want a clean, fast and approachable API, and we want to
19 |    ensure that it's no less friendly to work with than WSGI, for all that
20 |    it will expose much more functionality.
21 | #. Apps need to be able to tell what protocol is in use, and what optional
22 |    features are available. For instance, HTTP/2 PUSH PROMISE is an optional
23 |    feature that can be disabled by clients. Websockets needs to expose a socket
24 |    like object, and so on.
25 | #. Support websockets
26 | #. Support HTTP/2
27 | #. Support HTTP/1.x [ which may be just 'point at PEP-3333'. ]
28 | #. Continue to support lightweight shims being built on top such as
29 |    https://github.com/Pylons/webob/blob/master/webob/request.py
30 | 
31 | Corollaries
32 | ===========
33 | 
34 | #. May well want to use `python futures <http://python-futures.org>`_ to get
35 |    bytes on Python 2.7.
36 | #. Will need old to new and new to old shims to enable upgrading one layer in
37 |    a middleware stack at a time.
38 | #. Cannot be coarsely incompatible with WSGI, or writing shims will be very
39 |    fragile.
40 | #. Cannot hand the connection socket to apps (pending confirmation from
41 |    container authors).
42 | 
43 | Not requirements
44 | ================
45 | 
46 | These are things that we've discussed but haven't [yet] decided to make into
47 | requirements for the design, or which we have decided definitely are not
48 | requirements.
49 | 
50 | Implementing new protocols as middleware
51 | ++++++++++++++++++++++++++++++++++++++++
52 | 
53 | gunicorn exposes the socket that requests were received on, allowing apps to
54 | write anything they want - e.g. taking over the socket, which is how websockets
55 | can be implemented there. gevent uses a custom handler_class to inject websocket
56 | data into the environment. Neither approach is standardised (such that all
57 | containers support it). Making new protocol implementations be something that
58 | can routinely be done without revving WSGI would be nice, but it's not clear that
59 | it's compatible with the requirement for supporting e.g. mod_wsgi. Input from
60 | container maintainers is needed!
61 | 


--------------------------------------------------------------------------------
/experiments.rst:
--------------------------------------------------------------------------------
1 | API experiments
2 | ===============
3 | 
4 | API experiments may be added under wsging, to get a persistent sense of how
5 | different approaches may pan out. Benchmarks and other tests associated with
6 | this will be stored there too.
7 | 


--------------------------------------------------------------------------------
/pep-draft.rst:
--------------------------------------------------------------------------------
   1 | PEP: Not assigned yet
   2 | Title: Python Web Server Gateway Interface v2
   3 | Version: $Revision$
   4 | Last-Modified: $Date$
   5 | Author: Robert Collins <rbtcollins@hp.com>
   6 | Discussions-To: Python Web-SIG <web-sig@python.org>
   7 | Status: Draft
   8 | Type: Informational
   9 | Content-Type: text/x-rst
  10 | Created: 25-Sep-2014
  11 | Replaces: 333, 3333
  12 | 
  13 | 
  14 | Preface for Readers of PEP \???
  15 | ===============================
  16 | 
  17 | This is an updated version of PEP 3333, modified to address various
  18 | limitations that have evolved since WSGI was first standardised.
  19 | These include incremental upload handling, websockets and HTTP/2.
  20 | 
  21 | The broad strategy is to define a new specification to handle these
  22 | new features and clean up some cruft that has become apparent in WSGI.
  23 | Then adapters to contain WSGI 1.0.1 apps in WSGI 2 servers, and vice
  24 | versa, will be written, allowing incremental migration of WSGI stacks.
  25 | 
  26 | Given the still huge deployment of Python 2.7, we have limited
  27 | ourselves to language features available in Python 2.7, though the
  28 | design and focus is for Python 3.5 and up.
  29 | 
  30 | 
  31 | Abstract
  32 | ========
  33 | 
  34 | This document specifies a proposed standard interface between web
  35 | servers and Python web applications or frameworks, to promote web
  36 | application portability across a variety of web servers and web
  37 | protocols. Support is included for the features of HTTP/1.x, HTTP/2
  38 | and Websockets.
  39 | 
  40 | 
  41 | Differences to PEP \3333
  42 | ========================
  43 | 
  44 | * Minimum Python version of 2.7 - the oldest supported Python at the
  45 |   time of the PEP.
  46 | 
  47 | * Pre 2.2 support advice removed.
  48 | 
  49 | * Server push is defined - see ``wsgi.associated_content``.
  50 | 
  51 | * The pending changes from
  52 |   https://mail.python.org/pipermail/web-sig/2010-September/004655.html
  53 |   have been applied, barring the point about ``wsgi.input`` and
  54 |   ``CONTENT_LENGTH`` being out of sync which needs further discussion.
  55 | 
  56 | * Prose about the interaction of chunking and ``CONTENT_LENGTH``
  57 |   corrected.
  58 | 
  59 | Original Rationale and Goals (from PEP \333)
  60 | ============================================
  61 | 
  62 | Python currently boasts a wide variety of web application frameworks,
  63 | such as Zope, Quixote, Webware, SkunkWeb, PSO, and Twisted Web -- to
  64 | name just a few [1]_.  This wide variety of choices can be a problem
  65 | for new Python users, because generally speaking, their choice of web
  66 | framework will limit their choice of usable web servers, and vice
  67 | versa.
  68 | 
  69 | By contrast, although Java has just as many web application frameworks
  70 | available, Java's "servlet" API makes it possible for applications
  71 | written with any Java web application framework to run in any web
  72 | server that supports the servlet API.
  73 | 
  74 | The availability and widespread use of such an API in web servers for
  75 | Python -- whether those servers are written in Python (e.g. Medusa),
  76 | embed Python (e.g. mod_python), or invoke Python via a gateway
  77 | protocol (e.g. CGI, FastCGI, etc.) -- would separate choice of
  78 | framework from choice of web server, freeing users to choose a pairing
  79 | that suits them, while freeing framework and server developers to
  80 | focus on their preferred area of specialization.
  81 | 
  82 | This PEP, therefore, proposes a simple and universal interface between
  83 | web servers and web applications or frameworks: the Python Web Server
  84 | Gateway Interface (WSGIv2, or WSGI in the rest of this document. Where
  85 | WSGI version 1 is intended, it will be explicitly identified as
  86 | WSGIv1).
  87 | 
  88 | But the mere existence of a WSGI spec does nothing to address the
  89 | existing state of servers and frameworks for Python web applications.
  90 | Server and framework authors and maintainers must actually implement
  91 | WSGI for there to be any effect.
  92 | 
  93 | However, since no existing servers or frameworks support WSGI, there
  94 | is little immediate reward for an author who implements WSGI support.
  95 | Thus, WSGI **must** be easy to implement, so that an author's initial
  96 | investment in the interface can be reasonably low.
  97 | 
  98 | Thus, simplicity of implementation on *both* the server and framework
  99 | sides of the interface is absolutely critical to the utility of the
 100 | WSGI interface, and is therefore the principal criterion for any
 101 | design decisions.
 102 | 
 103 | Note, however, that simplicity of implementation for a framework
 104 | author is not the same thing as ease of use for a web application
 105 | author.  WSGI presents an absolutely "no frills" interface to the
 106 | framework author, because bells and whistles like response objects and
 107 | cookie handling would just get in the way of existing frameworks'
 108 | handling of these issues.  Again, the goal of WSGI is to facilitate
 109 | easy interconnection of existing servers and applications or
 110 | frameworks, not to create a new web framework.
 111 | 
 112 | Note also that this goal precludes WSGI from requiring anything that
 113 | is not already available in deployed versions of Python.  Therefore,
 114 | new standard library modules are not proposed or required by this
 115 | specification, and nothing in WSGI requires a Python version greater
 116 | than 2.7.  (It would be a good idea, however, for future versions
 117 | of Python to include support for this interface in web servers
 118 | provided by the standard library.)
 119 | 
 120 | In addition to ease of implementation for existing and future
 121 | frameworks and servers, it should also be easy to create request
 122 | preprocessors, response postprocessors, and other WSGI-based
 123 | "middleware" components that look like an application to their
 124 | containing server, while acting as a server for their contained
 125 | applications.
 126 | 
 127 | If middleware can be both simple and robust, and WSGI is widely
 128 | available in servers and frameworks, it allows for the possibility
 129 | of an entirely new kind of Python web application framework: one
 130 | consisting of loosely-coupled WSGI middleware components.  Indeed,
 131 | existing framework authors may even choose to refactor their
 132 | frameworks' existing services to be provided in this way, becoming
 133 | more like libraries used with WSGI, and less like monolithic
 134 | frameworks.  This would then allow application developers to choose
 135 | "best-of-breed" components for specific functionality, rather than
 136 | having to commit to all the pros and cons of a single framework.
 137 | 
 138 | Of course, as of this writing, that day is doubtless quite far off.
 139 | In the meantime, it is a sufficient short-term goal for WSGI to
 140 | enable the use of any framework with any server.
 141 | 
 142 | Finally, it should be mentioned that the current version of WSGI
 143 | does not prescribe any particular mechanism for "deploying" an
 144 | application for use with a web server or server gateway.  At the
 145 | present time, this is necessarily implementation-defined by the
 146 | server or gateway.  After a sufficient number of servers and
 147 | frameworks have implemented WSGI to provide field experience with
 148 | varying deployment requirements, it may make sense to create
 149 | another PEP, describing a deployment standard for WSGI servers and
 150 | application frameworks.
 151 | 
 152 | 
 153 | Specification Overview
 154 | ======================
 155 | 
 156 | The WSGI interface has two sides: the "server" or "gateway" side, and
 157 | the "application" or "framework" side.  The server side invokes a
 158 | callable object that is provided by the application side.  The
 159 | specifics of how that object is provided are up to the server or
 160 | gateway.  It is assumed that some servers or gateways will require an
 161 | application's deployer to write a short script to create an instance
 162 | of the server or gateway, and supply it with the application object.
 163 | Other servers and gateways may use configuration files or other
 164 | mechanisms to specify where an application object should be
 165 | imported from, or otherwise obtained.
 166 | 
 167 | In addition to "pure" servers/gateways and applications/frameworks,
 168 | it is also possible to create "middleware" components that implement
 169 | both sides of this specification.  Such components act as an
 170 | application to their containing server, and as a server to a
 171 | contained application, and can be used to provide extended APIs,
 172 | content transformation, navigation, and other useful functions.
 173 | 
 174 | Throughout this specification, we will use the term "a callable" to
 175 | mean "a function, method, class, or an instance with a ``__call__``
 176 | method".  It is up to the server, gateway, or application implementing
 177 | the callable to choose the appropriate implementation technique for
 178 | their needs.  Conversely, a server, gateway, or application that is
 179 | invoking a callable **must not** have any dependency on what kind of
 180 | callable was provided to it.  Callables are only to be called, not
 181 | introspected upon.
 182 | 
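As a small illustration of the "callables are only to be called" rule, the
sketch below passes several kinds of callable through one helper that never
inspects what it was given. All names here (``greet``, ``Greeter``,
``invoke``) are made up for the example and have nothing to do with WSGI
itself.

```python
import functools


def greet(name):
    return 'hello ' + name


class Greeter(object):
    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return self.greeting + ' ' + name

    def shout(self, name):
        return (self.greeting + ' ' + name).upper()


def invoke(target, name):
    # Only call the target; never branch on its type or attributes.
    return target(name)


candidates = [
    greet,                      # plain function
    Greeter('hello'),           # instance with __call__
    Greeter('hello').shout,     # bound method
    functools.partial(greet),   # another callable object
]
results = [invoke(candidate, 'world') for candidate in candidates]
```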
 183 | 
 184 | A Note On String Types
 185 | ----------------------
 186 | 
 187 | In general, HTTP deals with bytes, which means that this specification
 188 | is mostly about handling bytes.
 189 | 
 190 | However, the content of those bytes often has some kind of textual
 191 | interpretation, and in Python, strings are the most convenient way
 192 | to handle text.
 193 | 
 194 | But in many Python versions and implementations, strings are Unicode,
 195 | rather than bytes.  This requires a careful balance between a usable
 196 | API and correct translations between bytes and text in the context of
 197 | HTTP...  especially to support porting code between Python
 198 | implementations with different ``str`` types.
 199 | 
 200 | WSGI therefore defines two kinds of "string":
 201 | 
 202 | * "Native" strings (which are always implemented using the type
 203 |   named ``str``) that are used for request/response headers and
 204 |   metadata
 205 | 
 206 | * "Bytestrings" (which are implemented using the ``bytes`` type
 207 |   in Python 3, and ``str`` elsewhere), that are used for the bodies
 208 |   of requests and responses (e.g. POST/PUT input data and HTML page
 209 |   outputs).
 210 | 
 211 | Do not be confused however: even if Python's ``str`` type is actually 
 212 | Unicode "under the hood", the *content* of native strings must
 213 | still be translatable to bytes via the Latin-1 encoding!  (See
 214 | the section on `Unicode Issues`_ later in  this document for more
 215 | details.)
 216 | 
 217 | In short: where you see the word "string" in this document, it refers 
 218 | to a "native" string, i.e., an object of type ``str``, whether it is 
 219 | internally implemented as bytes or unicode.  Where you see references 
 220 | to "bytestring", this should be read as "an object of type ``bytes`` 
 221 | under Python 3, or type ``str`` under Python 2".
 222 | 
 223 | And so, even though HTTP is in some sense "really just bytes", there
 224 | are  many API conveniences to be had by using whatever Python's
 225 | default  ``str`` type is.
 226 | 
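The Latin-1 rule for native strings can be seen in a short Python 3 sketch.
The helper names below are hypothetical; the only library calls used are
``bytes.decode`` and ``str.encode``. The example assumes the wire bytes
happen to be UTF-8 text, which is an application-level choice and not
something this specification dictates.

```python
def wire_to_native(raw):
    """Bytes from the wire -> "bytes-as-unicode" native string (Python 3)."""
    return raw.decode('iso-8859-1')


def native_to_wire(value):
    """Native string -> the exact bytes it stands for."""
    return value.encode('iso-8859-1')


def native_to_text(value, encoding='utf-8'):
    """Recover real text, assuming the underlying bytes use `encoding`."""
    return native_to_wire(value).decode(encoding)


raw = b'caf\xc3\xa9'               # UTF-8 bytes for "café"
native = wire_to_native(raw)       # one code point per byte
round_tripped = native_to_wire(native)
text = native_to_text(native)
```

Because every byte maps to exactly one code point below U+0100, the round
trip is lossless. A native string containing any code point above U+00FF
cannot be encoded this way and raises ``UnicodeEncodeError``.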
 227 | 
 228 | 
 229 | The Application/Framework Side
 230 | ------------------------------
 231 | 
 232 | The application object is simply a callable object that accepts
 233 | two arguments.  The term "object" should not be misconstrued as
 234 | requiring an actual object instance: a function, method, class,
 235 | or instance with a ``__call__`` method are all acceptable for
 236 | use as an application object.  Application objects must be able
 237 | to be invoked more than once, as virtually all servers/gateways
 238 | (other than CGI) will make such repeated requests.
 239 | 
 240 | (Note: although we refer to it as an "application" object, this
 241 | should not be construed to mean that application developers will use
 242 | WSGI as a web programming API!  It is assumed that application
 243 | developers will continue to use existing, high-level framework
 244 | services to develop their applications.  WSGI is a tool for
 245 | framework and server developers, and is not intended to directly
 246 | support application developers.)
 247 | 
 248 | Here are two example application objects; one is a function, and the
 249 | other is a class::
 250 | 
 251 |     HELLO_WORLD = b"Hello world!\n"  
 252 | 
 253 |     def simple_app(environ, start_response):
 254 |         """Simplest possible application object"""
 255 |         status = '200 OK'
 256 |         response_headers = [('Content-type', 'text/plain')]
 257 |         start_response(status, response_headers)
 258 |         return [HELLO_WORLD]
 259 | 
 260 |     class AppClass:
 261 |         """Produce the same output, but using a class
 262 | 
 263 |         (Note: 'AppClass' is the "application" here, so calling it
 264 |         returns an instance of 'AppClass', which is then the iterable
 265 |         return value of the "application callable" as required by
 266 |         the spec.
 267 | 
 268 |         If we wanted to use *instances* of 'AppClass' as application
 269 |         objects instead, we would have to implement a '__call__'
 270 |         method, which would be invoked to execute the application,
 271 |         and we would need to create an instance for use by the
 272 |         server or gateway.
 273 |         """
 274 | 
 275 |         def __init__(self, environ, start_response):
 276 |             self.environ = environ
 277 |             self.start = start_response
 278 | 
 279 |         def __iter__(self):
 280 |             status = '200 OK'
 281 |             response_headers = [('Content-type', 'text/plain')]
 282 |             self.start(status, response_headers)
 283 |             yield HELLO_WORLD
 284 | 
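The ``AppClass`` docstring mentions the alternative of using *instances* as
application objects. A minimal sketch of that variant follows;
``InstanceApp`` and ``application`` are hypothetical names chosen for
illustration.

```python
class InstanceApp(object):
    """Application object whose *instances* are the WSGI callables."""

    def __init__(self, body):
        # Per-application configuration lives on the instance.
        self.body = body

    def __call__(self, environ, start_response):
        headers = [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(len(self.body))),
        ]
        start_response('200 OK', headers)
        return [self.body]


# The deployer creates one instance and hands it to the server or gateway.
application = InstanceApp(b"Hello world!\n")
```

Unlike ``AppClass``, which is instantiated once per request, a single
``InstanceApp`` instance is reused for every request, so it should not keep
per-request state on ``self``.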
 285 | 
 286 | The Server/Gateway Side
 287 | -----------------------
 288 | 
 289 | The server or gateway invokes the application callable once for each
 290 | request it receives from an HTTP client, that is directed at the
 291 | application.  To illustrate, here is a simple CGI gateway, implemented
 292 | as a function taking an application object.  Note that this simple
 293 | example has limited error handling, because by default an uncaught
 294 | exception will be dumped to ``sys.stderr`` and logged by the web
 295 | server.
 296 | 
 297 | ::
 298 | 
 299 |     import os, sys
 300 | 
 301 |     enc, esc = sys.getfilesystemencoding(), 'surrogateescape'
 302 | 
 303 |     def unicode_to_wsgi(u):
 304 |         # Convert an environment variable to a WSGI "bytes-as-unicode" string
 305 |         return u.encode(enc, esc).decode('iso-8859-1')
 306 | 
 307 |     def wsgi_to_bytes(s):
 308 |         return s.encode('iso-8859-1')
 309 | 
 310 |     def run_with_cgi(application):
 311 |         environ = {k: unicode_to_wsgi(v) for k,v in os.environ.items()}
 312 |         environ['wsgi.input']        = sys.stdin.buffer
 313 |         environ['wsgi.errors']       = sys.stderr
 314 |         environ['wsgi.version']      = (1, 0)
 315 |         environ['wsgi.multithread']  = False
 316 |         environ['wsgi.multiprocess'] = True
 317 |         environ['wsgi.run_once']     = True
 318 | 
 319 |         if environ.get('HTTPS', 'off') in ('on', '1'):
 320 |             environ['wsgi.url_scheme'] = 'https'
 321 |         else:
 322 |             environ['wsgi.url_scheme'] = 'http'
 323 | 
 324 |         headers_set = []
 325 |         headers_sent = []
 326 | 
 327 |         def write(data):
 328 |             out = sys.stdout.buffer
 329 | 
 330 |             if not headers_set:
 331 |                  raise AssertionError("write() before start_response()")
 332 | 
 333 |             elif not headers_sent:
 334 |                  # Before the first output, send the stored headers
 335 |                  status, response_headers = headers_sent[:] = headers_set
 336 |                  out.write(wsgi_to_bytes('Status: %s\r\n' % status))
 337 |                  for header in response_headers:
 338 |                      out.write(wsgi_to_bytes('%s: %s\r\n' % header))
 339 |                  out.write(wsgi_to_bytes('\r\n'))
 340 | 
 341 |             out.write(data)
 342 |             out.flush()
 343 | 
 344 |         def start_response(status, response_headers, exc_info=None):
 345 |             if exc_info:
 346 |                 try:
 347 |                     if headers_sent:
 348 |                         # Re-raise original exception if headers sent
 349 |                         raise exc_info[1].with_traceback(exc_info[2])
 350 |                 finally:
 351 |                     exc_info = None     # avoid dangling circular ref
 352 |             elif headers_set:
 353 |                 raise AssertionError("Headers already set!")
 354 | 
 355 |             headers_set[:] = [status, response_headers]
 356 | 
 357 |             # Note: error checking on the headers should happen here,
 358 |             # *after* the headers are set.  That way, if an error
 359 |             # occurs, start_response can only be re-called with
 360 |             # exc_info set.
 361 | 
 362 |             return write
 363 | 
 364 |         result = application(environ, start_response)
 365 |         try:
 366 |             for data in result:
 367 |                 if data:    # don't send headers until body appears
 368 |                     write(data)
 369 |             if not headers_sent:
 370 |                 write(b'')  # send headers now if body was empty
 371 |         finally:
 372 |             if hasattr(result, 'close'):
 373 |                 result.close()
 374 | 
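For experimenting without a CGI environment, the same calling convention can
be exercised in-process. The sketch below is not part of the specification:
``call_app`` and ``demo_app`` are hypothetical names, and
``wsgiref.util.setup_testing_defaults`` from the standard library is used
only to fill in a plausible ``environ``.

```python
from wsgiref.util import setup_testing_defaults


def call_app(application, environ=None):
    """Invoke a WSGI application once and collect its response in memory."""
    environ = dict(environ or {})
    setup_testing_defaults(environ)   # fills in REQUEST_METHOD, wsgi.input, ...
    captured = {}
    chunks = []

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = list(response_headers)
        return chunks.append          # stands in for the write() callable

    result = application(environ, start_response)
    try:
        for chunk in result:
            chunks.append(chunk)
    finally:
        # Same obligation as the CGI gateway: always close the iterable.
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], b''.join(chunks)


def demo_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!\n']


status, headers, body = call_app(demo_app)
```

The status and headers are read only after iteration finishes, because an
application written as a generator may not call ``start_response`` until its
first chunk is requested.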
 375 | 
 376 | Middleware: Components that Play Both Sides
 377 | -------------------------------------------
 378 | 
 379 | Note that a single object may play the role of a server with respect
 380 | to some application(s), while also acting as an application with
 381 | respect to some server(s).  Such "middleware" components can perform
 382 | such functions as:
 383 | 
 384 | * Routing a request to different application objects based on the
 385 |   target URL, after rewriting the ``environ`` accordingly.
 386 | 
 387 | * Allowing multiple applications or frameworks to run side-by-side
 388 |   in the same process
 389 | 
 390 | * Load balancing and remote processing, by forwarding requests and
 391 |   responses over a network
 392 | 
 393 | * Performing content postprocessing, such as applying XSL stylesheets
 394 | 
 395 | The presence of middleware in general is transparent to both the
 396 | "server/gateway" and the "application/framework" sides of the
 397 | interface, and should require no special support.  A user who
 398 | desires to incorporate middleware into an application simply
 399 | provides the middleware component to the server, as if it were
 400 | an application, and configures the middleware component to
 401 | invoke the application, as if the middleware component were a
 402 | server.  Of course, the "application" that the middleware wraps
 403 | may in fact be another middleware component wrapping another
 404 | application, and so on, creating what is referred to as a
 405 | "middleware stack".
 406 | 
 407 | For the most part, middleware must conform to the restrictions
 408 | and requirements of both the server and application sides of
 409 | WSGI.  In some cases, however, requirements for middleware
 410 | are more stringent than for a "pure" server or application,
 411 | and these points will be noted in the specification.
 412 | 
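The "middleware stack" described above can be sketched with two layers of a
very small middleware that appends a response header. ``add_header``,
``inner_app`` and the ``X-Inner``/``X-Outer`` header names are hypothetical
and exist only for this illustration.

```python
def add_header(name, value):
    """Return a middleware factory that appends one response header."""
    def middleware(application):
        def wrapped(environ, start_response):
            def start_with_header(status, response_headers, exc_info=None):
                # Copy rather than mutate the application's header list.
                response_headers = list(response_headers) + [(name, value)]
                return start_response(status, response_headers, exc_info)
            # Acts as a server towards the wrapped application...
            return application(environ, start_with_header)
        # ...and as an application towards whatever contains it.
        return wrapped
    return middleware


def inner_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'ok\n']


# Two layers: the outer middleware wraps the inner one, which wraps the app.
stack = add_header('X-Outer', '1')(add_header('X-Inner', '2')(inner_app))
```

Neither layer knows whether it is wrapping a "pure" application or another
middleware component, which is the transparency property the text relies on.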
 413 | Here is a (tongue-in-cheek) example of a middleware component that
 414 | converts ``text/plain`` responses to pig latin, using Joe Strout's
 415 | ``piglatin.py``.  (Note: a "real" middleware component would
 416 | probably use a more robust way of checking the content type, and
 417 | should also check for a content encoding.  Also, this simple
 418 | example ignores the possibility that a word might be split across
 419 | a block boundary.)
 420 | 
 421 | ::
 422 | 
 423 |     from piglatin import piglatin
 424 | 
 425 |     class LatinIter:
 426 | 
 427 |         """Transform iterated output to piglatin, if it's okay to do so
 428 | 
 429 |         Note that the "okayness" can change until the application yields
 430 |         its first non-empty bytestring, so 'transform_ok' has to be a mutable
 431 |         truth value.
 432 |         """
 433 | 
 434 |         def __init__(self, result, transform_ok):
 435 |             if hasattr(result, 'close'):
 436 |                 self.close = result.close
 437 |             self._next = iter(result).__next__
 438 |             self.transform_ok = transform_ok
 439 | 
 440 |         def __iter__(self):
 441 |             return self
 442 | 
 443 |         def __next__(self):
 444 |             if self.transform_ok:
 445 |                 return piglatin(self._next())   # call must be byte-safe on Py3
 446 |             else:
 447 |                 return self._next()
 448 | 
 449 |     class Latinator:
 450 | 
 451 |         # by default, don't transform output
 452 |         transform = False
 453 | 
 454 |         def __init__(self, application):
 455 |             self.application = application
 456 | 
 457 |         def __call__(self, environ, start_response):
 458 | 
 459 |             transform_ok = []
 460 | 
 461 |             def start_latin(status, response_headers, exc_info=None):
 462 | 
 463 |                 # Reset ok flag, in case this is a repeat call
 464 |                 del transform_ok[:]
 465 | 
 466 |                 for name, value in response_headers:
 467 |                     if name.lower() == 'content-type' and value == 'text/plain':
 468 |                         transform_ok.append(True)
 469 |                         # Strip content-length if present, else it'll be wrong
 470 |                         response_headers = [(name, value)
 471 |                             for name, value in response_headers
 472 |                                 if name.lower() != 'content-length'
 473 |                         ]
 474 |                         break
 475 | 
 476 |                 write = start_response(status, response_headers, exc_info)
 477 | 
 478 |                 if transform_ok:
 479 |                     def write_latin(data):
 480 |                         write(piglatin(data))   # call must be byte-safe on Py3
 481 |                     return write_latin
 482 |                 else:
 483 |                     return write
 484 | 
 485 |             return LatinIter(self.application(environ, start_latin), transform_ok)
 486 | 
 487 | 
 488 |     # Run foo_app under a Latinator's control, using the example CGI gateway
 489 |     from foo_app import foo_app
 490 |     run_with_cgi(Latinator(foo_app))
 491 | 
 492 | 
 493 | 
 494 | Specification Details
 495 | =====================
 496 | 
 497 | The application object must accept two positional arguments.  For
 498 | the sake of illustration, we have named them ``environ`` and
 499 | ``start_response``, but they are not required to have these names.
 500 | A server or gateway **must** invoke the application object using
 501 | positional (not keyword) arguments.  (E.g. by calling
 502 | ``result = application(environ, start_response)`` as shown above.)
 503 | 
 504 | The ``environ`` parameter is a dictionary object, containing CGI-style
 505 | environment variables.  This object **must** be a builtin Python
 506 | dictionary (*not* a subclass, ``UserDict`` or other dictionary
 507 | emulation), and the application is allowed to modify the dictionary
 508 | in any way it desires.  The dictionary must also include certain
 509 | WSGI-required variables (described in a later section), and may
 510 | also include server-specific extension variables, named according
 511 | to a convention that will be described below.
 512 | 
 513 | The ``start_response`` parameter is a callable accepting two
 514 | required positional arguments, and one optional argument.  For the sake
 515 | of illustration, we have named these arguments ``status``,
 516 | ``response_headers``, and ``exc_info``, but they are not required to
 517 | have these names, and the application **must** invoke the
 518 | ``start_response`` callable using positional arguments (e.g.
 519 | ``start_response(status, response_headers)``).
 520 | 
 521 | The ``status`` parameter is a status string of the form
 522 | ``"999 Message here"``, and ``response_headers`` is a list of
 523 | ``(header_name, header_value)`` tuples describing the HTTP response
 524 | header.  The optional ``exc_info`` parameter is described below in the
 525 | sections on `The start_response() Callable`_ and `Error Handling`_.
 526 | It is used only when the application has trapped an error and is
 527 | attempting to display an error message to the browser.
 528 | 
 529 | The ``start_response`` callable **must** do any header verification and
 530 | checking that will take place before returning (or raising an error)
 531 | rather than deferring it until body bytes are supplied (via either the
 532 | iterator or the ``write`` callable). This ensures that the error is
 533 | raised as close to the causing code as possible.
 534 | 
 535 | The ``start_response`` callable must return a ``write(body_data)``
 536 | callable that takes one positional parameter: a bytestring to be written
 537 | as part of the HTTP response body.  (Note: the ``write()`` callable is
 538 | provided only to support certain existing frameworks' imperative output
 539 | APIs; it should not be used by new applications or frameworks if it
 540 | can be avoided.  See the `Buffering and Streaming`_ section for more
 541 | details.)
 542 | 
 543 | When called by the server, the application object must return an
 544 | iterable yielding zero or more bytestrings.  This can be accomplished in a
 545 | variety of ways, such as by returning a list of bytestrings, or by the
 546 | application being a generator function that yields bytestrings, or
 547 | by the application being a class whose instances are iterable.
 548 | Regardless of how it is accomplished, the application object must
 549 | always return an iterable yielding zero or more bytestrings.
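
For illustration only, the three shapes mentioned above might look like
the following sketch.  The names ``list_app``, ``generator_app``, and
``ClassApp`` are hypothetical and are not defined by this specification.

```python
def list_app(environ, start_response):
    """Return the whole body as a single-element list."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello world!\n"]


def generator_app(environ, start_response):
    """Generator function: calling it returns an iterator of bytestrings."""
    # Runs on the first iteration, not when the application is called.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"Hello "
    yield b"world!\n"


class ClassApp:
    """Class whose instances are iterable; calling the class builds one."""

    def __init__(self, environ, start_response):
        self.environ = environ
        self.start = start_response

    def __iter__(self):
        self.start("200 OK", [("Content-Type", "text/plain")])
        yield b"Hello world!\n"
```

In each case the server sees the same thing: a callable that takes two
positional arguments and returns an iterable of bytestrings.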
 550 | 
 551 | The server or gateway must transmit the yielded bytestrings to the client
 552 | in an unbuffered fashion, completing the transmission of each bytestring
 553 | before requesting another one.  (In other words, applications
 554 | **should** perform their own buffering.  See the `Buffering and
 555 | Streaming`_ section below for more on how application output must be
 556 | handled.)
 557 | 
 558 | The server or gateway should treat the yielded bytestrings as binary byte
 559 | sequences: in particular, it should ensure that line endings are
 560 | not altered.  The application is responsible for ensuring that the
 561 | bytestring(s) to be written are in a format suitable for the client.  (The
 562 | server or gateway **may** apply HTTP transfer encodings, or perform
 563 | other transformations for the purpose of implementing HTTP features
 564 | such as byte-range transmission.  See `Other HTTP Features`_, below,
 565 | for more details.)
 566 | 
 567 | If a call to ``len(iterable)`` succeeds, the server must be able
 568 | to rely on the result being accurate.  That is, if the iterable
 569 | returned by the application provides a working ``__len__()``
 570 | method, it **must** return an accurate result.  (See
 571 | the `Handling the Content-Length Header`_ section for information
 572 | on how this would normally be used.)
 573 | 
 574 | If the iterable returned by the application has a ``close()`` method,
 575 | the server or gateway **must** call that method upon completion of the
 576 | current request, whether the request was completed normally, or
 577 | terminated early due to an application error during iteration or an early
 578 | disconnect of the browser.  (The ``close()`` method requirement is to
 579 | support resource release by the application.  This protocol is intended
 580 | to complement PEP 342's generator support, and other common iterables
 581 | with ``close()`` methods.)
 582 | 
 583 | Applications returning a generator or other custom iterator **should not**
 584 | assume the entire iterator will be consumed, as it **may** be closed early
 585 | by the server.
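
A minimal sketch of how a server might honor the ``close()``
requirement is shown below.  The function name ``send_body`` and its
``transmit`` argument are assumptions made for this example, not part
of the specification.

```python
def send_body(result, transmit):
    """Iterate over an application's return value, always calling close().

    ``transmit`` is a hypothetical callable that sends one bytestring to
    the client; it may raise if the client disconnects early.
    """
    try:
        for data in result:
            transmit(data)
    finally:
        # Reached on normal completion, on an application error during
        # iteration, and on an early client disconnect.
        close = getattr(result, "close", None)
        if close is not None:
            close()
```

Because the call sits in a ``finally`` clause, the application gets a
chance to release its resources whichever way the request ends.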
 586 | 
 587 | (Note: the application **must** invoke the ``start_response()``
 588 | callable before the iterable yields its first body bytestring, so that the
 589 | server can send the headers before any body content.  However, this
 590 | invocation **may** be performed by the iterable's first iteration, so
 591 | servers **must not** assume that ``start_response()`` has been called
 592 | before they begin iterating over the iterable.)
 593 | 
 594 | Finally, servers and gateways **must not** directly use any other
 595 | attributes of the iterable returned by the application, unless it is an
 596 | instance of a type specific to that server or gateway, such as a "file
 597 | wrapper" returned by ``wsgi.file_wrapper`` (see `Optional
 598 | Platform-Specific File Handling`_).  In the general case, only
 599 | attributes specified here, or accessed via e.g. the PEP 234 iteration
 600 | APIs are acceptable.
 601 | 
 602 | 
 603 | ``environ`` Variables
 604 | ---------------------
 605 | 
 606 | The ``environ`` dictionary is required to contain these CGI
 607 | environment variables, as defined by the Common Gateway Interface
 608 | specification [2]_.  The following variables **must** be present,
 609 | unless their value would be an empty string, in which case they
 610 | **may** be omitted, except as otherwise noted below.
 611 | 
 612 | ``REQUEST_METHOD``
 613 |   The HTTP request method, such as ``"GET"`` or ``"POST"``.  This
 614 |   cannot ever be an empty string, and so is always required.
 615 | 
 616 | ``SCRIPT_NAME``
 617 |   The initial portion of the request URL's "path" that corresponds to
 618 |   the application object, so that the application knows its virtual
 619 |   "location".  This **may** be an empty string, if the application
 620 |   corresponds to the "root" of the server.
 621 | 
 622 | ``PATH_INFO``
 623 |   The remainder of the request URL's "path", designating the virtual
 624 |   "location" of the request's target within the application.  This
 625 |   **may** be an empty string, if the request URL targets the
 626 |   application root and does not have a trailing slash.
 627 | 
 628 | ``QUERY_STRING``
 629 |   The portion of the request URL that follows the ``"?"``, if any.
 630 |   May be empty or absent.
 631 | 
 632 | ``CONTENT_TYPE``
 633 |   The contents of any ``Content-Type`` fields in the HTTP request.
 634 |   May be empty or absent.
 635 | 
 636 | ``CONTENT_LENGTH``
 637 |   The contents of any ``Content-Length`` fields in the HTTP request.
 638 |   May be empty or absent.
 639 | 
 640 | ``SERVER_NAME``, ``SERVER_PORT``
 641 |   When combined with ``SCRIPT_NAME`` and ``PATH_INFO``, these two strings
 642 |   can be used to complete the URL.  Note, however, that ``HTTP_HOST``,
 643 |   if present, should be used in preference to ``SERVER_NAME`` for
 644 |   reconstructing the request URL.  See the `URL Reconstruction`_
 645 |   section below for more detail.  ``SERVER_NAME`` and ``SERVER_PORT``
 646 |   can never be empty strings, and so are always required.
 647 | 
 648 | ``SERVER_PROTOCOL``
 649 |   The version of the protocol the client used to send the request.
 650 |   Typically this will be something like ``"HTTP/1.0"`` or ``"HTTP/1.1"``
 651 |   and may be used by the application to determine how to treat any
 652 |   HTTP request headers.  (This variable should probably be called
 653 |   ``REQUEST_PROTOCOL``, since it denotes the protocol used in the
 654 |   request, and is not necessarily the protocol that will be used in the
 655 |   server's response.  However, for compatibility with CGI we have to
 656 |   keep the existing name.)
 657 | 
 658 | ``HTTP_`` Variables
 659 |   Variables corresponding to the client-supplied HTTP request headers
 660 |   (i.e., variables whose names begin with ``"HTTP_"``).  The presence or
 661 |   absence of these variables should correspond with the presence or
 662 |   absence of the appropriate HTTP header in the request.
 663 | 
 664 | A server or gateway **should** attempt to provide as many other CGI
 665 | variables as are applicable.  In addition, if SSL is in use, the server
 666 | or gateway **should** also provide as many of the Apache SSL environment
 667 | variables [5]_ as are applicable, such as ``HTTPS=on`` and
 668 | ``SSL_PROTOCOL``.  Note, however, that an application that uses any CGI
 669 | variables other than the ones listed above is necessarily non-portable
 670 | to web servers that do not support the relevant extensions.  (For
 671 | example, web servers that do not publish files will not be able to
 672 | provide a meaningful ``DOCUMENT_ROOT`` or ``PATH_TRANSLATED``.)
 673 | 
 674 | A WSGI-compliant server or gateway **should** document what variables
 675 | it provides, along with their definitions as appropriate.  Applications
 676 | **should** check for the presence of any variables they require, and
 677 | have a fallback plan in the event such a variable is absent.
 678 | 
 679 | Note: missing variables (such as ``REMOTE_USER`` when no
 680 | authentication has occurred) should be left out of the ``environ``
 681 | dictionary.  Also note that CGI-defined variables must be native strings,
 682 | if they are present at all.  It is a violation of this specification
 683 | for *any* CGI variable's value to be of any type other than ``str``.
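
The rules above can be checked mechanically.  The following is only a
sketch: ``check_cgi_variables`` and the two name tuples are hypothetical
helpers, and the list of CGI names is limited to the variables named in
this section.

```python
# Variables this section says can never be empty, so always required.
ALWAYS_REQUIRED = ("REQUEST_METHOD", "SERVER_NAME", "SERVER_PORT")

# CGI variables named in this section (HTTP_* keys are matched by prefix).
CGI_NAMES = ALWAYS_REQUIRED + (
    "SCRIPT_NAME", "PATH_INFO", "QUERY_STRING",
    "CONTENT_TYPE", "CONTENT_LENGTH", "SERVER_PROTOCOL",
)


def check_cgi_variables(environ):
    """Return a list of problems found in the CGI part of ``environ``."""
    problems = []
    if type(environ) is not dict:
        return ["environ must be a builtin dict"]
    for name in ALWAYS_REQUIRED:
        if not environ.get(name):
            problems.append("%s is missing or empty" % name)
    for name, value in environ.items():
        if not isinstance(name, str):
            continue
        is_cgi = name in CGI_NAMES or name.startswith("HTTP_")
        if is_cgi and type(value) is not str:
            problems.append("%s must be a str, got %s"
                            % (name, type(value).__name__))
    return problems
```

A validating middleware could call such a function before passing
``environ`` on, and report any problems to ``wsgi.errors``.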
 684 | 
 685 | In addition to the CGI-defined variables, the ``environ`` dictionary
 686 | **may** also contain arbitrary operating-system "environment variables",
 687 | and **must** contain the following WSGI-defined variables:
 688 | 
 689 | =========================== ===============================================
 690 | Variable                    Value
 691 | =========================== ===============================================
 692 | ``wsgi.version``            The tuple ``(2, 0)``, representing WSGI
 693 |                             version 2.0.
 694 | 
 695 | ``wsgi.url_scheme``         A string representing the "scheme" portion of
 696 |                             the URL at which the application is being
 697 |                             invoked.  Normally, this will have the value
 698 |                             ``"http"`` or ``"https"``, as appropriate.
 699 | 
 700 | ``wsgi.input``              An input stream (file-like object) from which
 701 |                             the HTTP request body bytes can be read.
 702 |                             (The server or gateway may perform reads
 703 |                             on-demand as requested by the application,
 704 |                             or it may pre-read the client's request
 705 |                             body and buffer it in-memory or on disk,
 706 |                             or use any other technique for providing
 707 |                             such an input stream, according to its
 708 |                             preference.)
 709 | 
 710 | ``wsgi.errors``             An output stream (file-like object) to which
 711 |                             error output can be written, for the purpose of
 712 |                             recording program or other errors in a
 713 |                             standardized and possibly centralized location.
 714 |                             This should be a "text mode" stream; i.e.,
 715 |                             applications should use ``"\n"`` as a line
 716 |                             ending, and assume that it will be converted to
 717 |                             the correct line ending by the server/gateway.
 718 | 
 719 |                             (On platforms where the ``str`` type is unicode,
 720 |                             the error stream **should** accept and log
 721 |                             arbitrary unicode without raising an error; it
 722 |                             is allowed, however, to substitute characters
 723 |                             that cannot be rendered in the stream's encoding.)
 724 | 
 725 |                             For many servers, ``wsgi.errors`` will be the
 726 |                             server's main error log. Alternatively, this
 727 |                             may be ``sys.stderr``, or a log file of some
 728 |                             sort.  The server's documentation should
 729 |                             include an explanation of how to configure this
 730 |                             or where to find the recorded output.  A server
 731 |                             or gateway may supply different error streams
 732 |                             to different applications, if this is desired.
 733 | 
 734 | ``wsgi.multithread``        This value should evaluate true if the
 735 |                             application object may be simultaneously
 736 |                             invoked by another thread in the same process,
 737 |                             and should evaluate false otherwise.
 738 | 
 739 | ``wsgi.multiprocess``       This value should evaluate true if an
 740 |                             equivalent application object may be
 741 |                             simultaneously invoked by another process,
 742 |                             and should evaluate false otherwise.
 743 | 
 744 | ``wsgi.run_once``           This value should evaluate true if the server
 745 |                             or gateway expects (but does not guarantee!)
 746 |                             that the application will only be invoked this
 747 |                             one time during the life of its containing
 748 |                             process.  Normally, this will only be true for
 749 |                             a gateway based on CGI (or something similar).
 750 | 
 751 | ``wsgi.associated_content`` This is a function which can be called
 752 |                             whenever the application wishes to signal
 753 |                             that some associated content should be
 754 |                             pushed to the client. Server push is an
 755 |                             HTTP/2 feature: if the connection is not
 756 |                             able to provide server push, this will be
 757 |                             a no-op. The function takes one required
 758 |                             parameter, the encoded URL to the object
 759 |                             to push. An optional parameter ``weight`` can
 760 |                             be used to override the weight (HTTP/2
 761 |                             section 5.3).
 762 |                             Different gateways/servers may implement
 763 |                             ``associated_content`` to suit their
 764 |                             environment. For instance, if mod_spdy is
 765 |                             being used with Apache and mod_wsgi, the
 766 |                             header ``X-Associated-Content`` may be
 767 |                             added when the object headers are sent
 768 |                             (so calling this after ``start_response``
 769 |                             has no effect). In gateways that
 770 |                             implement HTTP/2 themselves, calling the
 771 |                             function may trigger an immediate
 772 |                             ``PUSH_PROMISE`` on the connection socket.
 773 | =========================== ===============================================
 774 | 
 775 | Finally, the ``environ`` dictionary may also contain server-defined
 776 | variables.  These variables should be named using only lower-case
 777 | letters, numbers, dots, and underscores, and should be prefixed with
 778 | a name that is unique to the defining server or gateway.  For
 779 | example, ``mod_python`` might define variables with names like
 780 | ``mod_python.some_variable``.
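
As a rough sketch, a CGI-like gateway without HTTP/2 support might
populate the WSGI-defined keys as follows.  The function name
``wsgi_variables`` and the ``example_gateway.request_id`` key are
invented for this example, and the flag values are assumptions about
one particular kind of gateway.

```python
import io
import sys


def wsgi_variables(input_stream, url_scheme="http"):
    """Return the WSGI-defined keys for a single-threaded, CGI-like gateway."""

    def associated_content(url, weight=None):
        # This connection cannot provide server push, so do nothing.
        return None

    return {
        "wsgi.version": (2, 0),
        "wsgi.url_scheme": url_scheme,
        "wsgi.input": input_stream,
        "wsgi.errors": sys.stderr,
        "wsgi.multithread": False,
        "wsgi.multiprocess": True,
        "wsgi.run_once": True,
        "wsgi.associated_content": associated_content,
        # Server-defined variable: lower-case, prefixed with a unique name.
        "example_gateway.request_id": "1",
    }


environ = wsgi_variables(io.BytesIO(b""))
```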
 781 | 
 782 | 
 783 | Input and Error Streams
 784 | ~~~~~~~~~~~~~~~~~~~~~~~
 785 | 
 786 | The input and error streams provided by the server must support
 787 | the following methods:
 788 | 
 789 | ===================  ==========  ========
 790 | Method               Stream      Notes
 791 | ===================  ==========  ========
 792 | ``read(size)``       ``input``   1
 793 | ``readline(hint)``   ``input``   1, 2
 794 | ``readlines(hint)``  ``input``   1, 3
 795 | ``__iter__()``       ``input``
 796 | ``flush()``          ``errors``  4
 797 | ``write(str)``       ``errors``
 798 | ``writelines(seq)``  ``errors``
 799 | ===================  ==========  ========
 800 | 
 801 | The semantics of each method are as documented in the Python Library
 802 | Reference, except for these notes as listed in the table above:
 803 | 
 804 | 1. The server is not required to read past the client's specified
 805 |    ``Content-Length``, and **should** simulate an end-of-file
 806 |    condition if the application attempts to read past that point.
 807 |    The application **should not** attempt to read more data than is
 808 |    specified by the ``CONTENT_LENGTH`` variable.
 809 | 
 810 |    A server **must** allow ``read()`` to be called without an argument,
 811 |    and return the remainder of the client's input stream. This implies
 812 |    blocking until the stream source is closed in HTTP/2.
 813 | 
 814 |    A server **must** return empty bytestrings from any attempt to
 815 |    read from an empty or exhausted input stream.
 816 | 
 817 | 2. Servers **must** support the optional ``hint`` argument to ``readline()``.
 818 | 
 819 | 3. Note that the ``hint`` argument to ``readlines()`` is optional
 820 |    for both caller and implementer.  The application is
 821 |    free not to supply it, and the server or gateway is free to ignore
 822 |    it.
 823 | 
 824 | 4. Since the ``errors`` stream may not be rewound, servers and gateways
 825 |    are free to forward write operations immediately, without buffering.
 826 |    In this case, the ``flush()`` method may be a no-op.  Portable
 827 |    applications, however, cannot assume that output is unbuffered
 828 |    or that ``flush()`` is a no-op.  They must call ``flush()`` if
 829 |    they need to ensure that output has in fact been written.  (For
 830 |    example, to minimize intermingling of data from multiple processes
 831 |    writing to the same error log.)
 832 | 
 833 | The methods listed in the table above **must** be supported by all
 834 | servers conforming to this specification.  Applications conforming
 835 | to this specification **must not** use any other methods or attributes
 836 | of the ``input`` or ``errors`` objects.  In particular, applications
 837 | **must not** attempt to close these streams, even if they possess
 838 | ``close()`` methods.
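
One way a server might simulate the end-of-file condition described in
note 1 is to wrap the raw connection stream.  The class below is a
sketch under that assumption; ``LimitedInput`` is a hypothetical name,
and a real server would also need to handle request bodies that have
no ``Content-Length``.

```python
import io


class LimitedInput:
    """Wrap a raw byte stream and simulate EOF after ``length`` bytes."""

    def __init__(self, raw, length):
        self._raw = raw
        self._remaining = length

    def read(self, size=-1):
        # With no argument (or a negative one), return everything left.
        if size is None or size < 0 or size > self._remaining:
            size = self._remaining
        data = self._raw.read(size) if size else b""
        self._remaining -= len(data)
        return data

    def readline(self, hint=-1):
        if hint is None or hint < 0 or hint > self._remaining:
            hint = self._remaining
        data = self._raw.readline(hint) if hint else b""
        self._remaining -= len(data)
        return data

    def readlines(self, hint=-1):
        # The hint is ignored here; every remaining line is returned.
        return list(self)

    def __iter__(self):
        # Call readline() until it returns an empty bytestring.
        return iter(self.readline, b"")


# The declared body is 8 bytes; the bytes after it belong to something else.
body = LimitedInput(io.BytesIO(b"a=1\nb=2\nNEXT REQUEST"), 8)
first_line = body.readline()
rest = body.read()
```

Once the declared length is used up, every read returns an empty
bytestring, regardless of what else is waiting on the connection.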
 839 | 
 840 | 
 841 | The ``start_response()`` Callable
 842 | ---------------------------------
 843 | 
 844 | The second parameter passed to the application object is a callable
 845 | of the form ``start_response(status, response_headers, exc_info=None)``.
 846 | (As with all WSGI callables, the arguments must be supplied
 847 | positionally, not by keyword.)  The ``start_response`` callable is
 848 | used to begin the HTTP response, and it must return a
 849 | ``write(body_data)`` callable (see the `Buffering and Streaming`_
 850 | section, below).
 851 | 
 852 | The ``status`` argument is an HTTP "status" string like ``"200 OK"``
 853 | or ``"404 Not Found"``.  That is, it is a string consisting of a
 854 | Status-Code and a Reason-Phrase, in that order and separated by a
 855 | single space, with no surrounding whitespace or other characters.
 856 | (See RFC 2616, Section 6.1.1 for more information.)  The string
 857 | **must not** contain control characters, and must not be terminated
 858 | with a carriage return, linefeed, or combination thereof.
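
A server could enforce these rules with a check along the following
lines.  This is a sketch: ``check_status`` is a hypothetical helper,
and it assumes a three-digit Status-Code and a non-empty Reason-Phrase.

```python
import re

# Three-digit Status-Code, one space, then a Reason-Phrase made of
# anything except control characters.
_STATUS_RE = re.compile(r"[0-9]{3} [^\x00-\x1f\x7f]+")


def check_status(status):
    """Raise unless ``status`` looks like "200 OK"."""
    if type(status) is not str:
        raise TypeError("status must be a native string")
    if not _STATUS_RE.fullmatch(status):
        raise ValueError("malformed status string: %r" % status)
    reason = status[4:]
    if reason != reason.strip():
        # Catches a second separating space and trailing whitespace.
        raise ValueError("surrounding whitespace in status: %r" % status)
```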
 859 | 
 860 | The ``response_headers`` argument is a list of ``(header_name,
 861 | header_value)`` tuples.  It must be a Python list; i.e.
 862 | ``type(response_headers) is list``, and the server **may** change
 863 | its contents in any way it desires.  Each ``header_name`` must be a
 864 | valid HTTP header field-name (as defined by RFC 2616, Section 4.2),
 865 | without a trailing colon or other punctuation.
 866 | 
 867 | Each ``header_value`` **must not** include *any* control characters,
 868 | including carriage returns or linefeeds, either embedded or at the end.
 869 | (These requirements are to minimize the complexity of any parsing that
 870 | must be performed by servers, gateways, and intermediate response
 871 | processors that need to inspect or modify response headers.)
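
The header rules above might be enforced as in the following sketch.
``check_headers`` is a hypothetical helper, and the requirement that
names and values be native strings is an assumption carried over from
the rule for CGI variables.

```python
import re

# RFC 2616 "token": any US-ASCII character except controls and separators.
_TOKEN_RE = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


def check_headers(response_headers):
    """Raise if ``response_headers`` breaks the rules described above."""
    if type(response_headers) is not list:
        raise TypeError("response_headers must be a list")
    for item in response_headers:
        if type(item) is not tuple or len(item) != 2:
            raise TypeError("each header must be a (name, value) tuple")
        name, value = item
        if type(name) is not str or type(value) is not str:
            raise TypeError("header names and values must be native strings")
        if not _TOKEN_RE.fullmatch(name):
            # Rejects trailing colons, spaces, and other punctuation.
            raise ValueError("invalid header name: %r" % name)
        if _CONTROL_RE.search(value):
            raise ValueError("control character in value of %s" % name)
```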
 872 | 
 873 | In general, the server or gateway is responsible for ensuring that
 874 | correct headers are sent to the client: if the application omits
 875 | a header required by HTTP (or other relevant specifications that are in
 876 | effect), the server or gateway **must** add it.  For example, the HTTP
 877 | ``Date:`` and ``Server:`` headers would normally be supplied by the
 878 | server or gateway.
 879 | 
 880 | (A reminder for server/gateway authors: HTTP header names are
 881 | case-insensitive, so be sure to take that into consideration when
 882 | examining application-supplied headers!)
 883 | 
 884 | Applications and middleware are forbidden from using HTTP/1.1
 885 | "hop-by-hop" features or headers, any equivalent features in HTTP/1.0,
 886 | or any headers that would affect the persistence of the client's
 887 | connection to the web server.  These features are the
 888 | exclusive province of the actual web server, and a server or gateway
 889 | **should** consider it a fatal error for an application to attempt
 890 | sending them, and raise an error if they are supplied to
 891 | ``start_response()``.  (For more specifics on "hop-by-hop" features and
 892 | headers, please see the `Other HTTP Features`_ section below.)
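
The standard library's ``wsgiref.util.is_hop_by_hop()`` recognizes the
HTTP/1.1 hop-by-hop header names, so a server's ``start_response()``
could reject them roughly as follows.  The wrapper name
``reject_hop_by_hop`` is hypothetical.

```python
from wsgiref.util import is_hop_by_hop


def reject_hop_by_hop(response_headers):
    """Raise if the application supplied any hop-by-hop header."""
    for name, _value in response_headers:
        # is_hop_by_hop() compares names case-insensitively.
        if is_hop_by_hop(name):
            raise ValueError("hop-by-hop header not allowed: %s" % name)
```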
 893 | 
 894 | Servers **must** check for errors in the headers at the time
 895 | ``start_response`` is called, so that an error can be raised while
 896 | the application is still running.
 897 | 
 898 | However, the ``start_response`` callable **must not** actually transmit the
 899 | response headers.  Instead, it must store them for the server or
 900 | gateway to transmit **only** after the first iteration of the
 901 | application return value that yields a non-empty bytestring, or upon
 902 | the application's first invocation of the ``write()`` callable.  In
 903 | other words, response headers must not be sent until there is actual
 904 | body data available, or until the application's returned iterable is
 905 | exhausted.  (The only possible exception to this rule is if the
 906 | response headers explicitly include a ``Content-Length`` of zero.)
 907 | 
 908 | This delaying of response header transmission is to ensure that buffered
 909 | and asynchronous applications can replace their originally intended
 910 | output with error output, up until the last possible moment.  For
 911 | example, the application may need to change the response status from
 912 | "200 OK" to "500 Internal Error", if an error occurs while the body is
 913 | being generated within an application buffer.
 914 | 
 915 | The ``exc_info`` argument, if supplied, must be a Python
 916 | ``sys.exc_info()`` tuple.  This argument should be supplied by the
 917 | application only if ``start_response`` is being called by an error
 918 | handler.  If ``exc_info`` is supplied, and no HTTP headers have been
 919 | output yet, ``start_response`` should replace the currently-stored
 920 | HTTP response headers with the newly-supplied ones, thus allowing the
 921 | application to "change its mind" about the output when an error has
 922 | occurred.
 923 | 
 924 | However, if ``exc_info`` is provided, and the HTTP headers have already
 925 | been sent, ``start_response`` **must** raise an error, and **should**
 926 | re-raise using the ``exc_info`` tuple.  That is::
 927 | 
 928 |     raise exc_info[1].with_traceback(exc_info[2])
 929 | 
 930 | This will re-raise the exception trapped by the application, and in
 931 | principle should abort the application.  (It is not safe for the
 932 | application to attempt error output to the browser once the HTTP
 933 | headers have already been sent.)  The application **must not** trap
 934 | any exceptions raised by ``start_response``, if it called
 935 | ``start_response`` with ``exc_info``.  Instead, it should allow
 936 | such exceptions to propagate back to the server or gateway.  See
 937 | `Error Handling`_ below, for more details.
 938 | 
 939 | The application **may** call ``start_response`` more than once, if and
 940 | only if the ``exc_info`` argument is provided.  More precisely, it is
 941 | a fatal error to call ``start_response`` without the ``exc_info``
 942 | argument if ``start_response`` has already been called within the
 943 | current invocation of the application.  This includes the case where
 944 | the first call to ``start_response`` raised an error.  (See the example
 945 | CGI gateway above for an illustration of the correct logic.)
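
The rules in this section can be summarized by the following sketch of
the server side.  The class name ``ResponseState`` and its ``transmit``
argument are assumptions made for this example, and the header block it
writes is deliberately simplistic.

```python
class ResponseState:
    """Per-request response bookkeeping for a hypothetical server."""

    def __init__(self, transmit):
        self._transmit = transmit   # hypothetical: sends bytes to the client
        self._called = False
        self.headers_set = None     # (status, headers), stored but unsent
        self.headers_sent = False

    def start_response(self, status, response_headers, exc_info=None):
        if exc_info:
            try:
                if self.headers_sent:
                    # Too late to replace the response: re-raise the
                    # exception the application trapped.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None     # avoid a circular reference
        elif self._called:
            raise AssertionError("start_response() was already called")
        self._called = True
        # Header validation would go here, before anything is stored.
        self.headers_set = (status, response_headers)
        return self.write

    def write(self, data):
        if self.headers_set is None:
            raise AssertionError("write() called before start_response()")
        if not self.headers_sent:
            # Headers go out only now, with the first body data.
            status, headers = self.headers_set
            lines = ["HTTP/1.1 " + status]
            lines += ["%s: %s" % (name, value) for name, value in headers]
            head = "\r\n".join(lines) + "\r\n\r\n"
            self._transmit(head.encode("iso-8859-1"))
            self.headers_sent = True
        self._transmit(data)
```

Note that ``_called`` is set before validation, so a second call
without ``exc_info`` is rejected even if the first call raised.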
 946 | 
 947 | Note: servers, gateways, or middleware implementing ``start_response``
 948 | **should** ensure that no reference is held to the ``exc_info``
 949 | parameter beyond the duration of the function's execution, to avoid
 950 | creating a circular reference through the traceback and frames
 951 | involved.  The simplest way to do this is something like::
 952 | 
 953 |     def start_response(status, response_headers, exc_info=None):
 954 |         if exc_info:
 955 |             try:
 956 |                 pass  # do stuff w/exc_info here
 957 |             finally:
 958 |                 exc_info = None    # Avoid circular ref.
 959 | 
 960 | The example CGI gateway provides another illustration of this
 961 | technique.
 962 | 
 963 | 
 964 | Handling the ``Content-Length`` Header
 965 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 966 | 
 967 | If the application supplies a ``Content-Length`` header, the server
 968 | **should not** transmit more bytes to the client than the header
 969 | allows, and **must** stop iterating over the response when enough
 970 | data has been sent, or raise an error if the application tries to
 971 | ``write()`` past that point.  (Of course, if the application does
 972 | not provide *enough* data to meet its stated ``Content-Length``,
 973 | the server **should** close the connection and log or otherwise
 974 | report the error.)
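
A server that streams the iterable could enforce a declared
``Content-Length`` with a wrapper such as the one sketched below.
``limit_body`` is a hypothetical name, and raising ``OSError`` for a
short body is just one possible way of reporting the error.

```python
def limit_body(result, content_length):
    """Yield at most ``content_length`` bytes from ``result``."""
    remaining = content_length
    if remaining == 0:
        return
    for data in result:
        data = data[:remaining]     # drop anything past the declared length
        remaining -= len(data)
        if data:
            yield data
        if remaining == 0:
            return                  # enough data sent: stop iterating
    raise OSError("application sent %d byte(s) fewer than its "
                  "Content-Length" % remaining)
```

The caller remains responsible for calling ``close()`` on the original
iterable once the request is finished.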
 975 | 
 976 | If the application does not supply a ``Content-Length`` header, a
 977 | server or gateway may choose one of several approaches to handling
 978 | it.  The simplest of these is to close the client connection when
 979 | the response is completed.
 980 | 
 981 | Under some circumstances, however, the server or gateway may be
 982 | able to either generate a ``Content-Length`` header, or at least
 983 | avoid the need to close the client connection.  If the application
 984 | does *not* call the ``write()`` callable, and returns an iterable
 985 | whose ``len()`` is 1, then the server can automatically determine
 986 | ``Content-Length`` by taking the length of the first bytestring yielded
 987 | by the iterable.
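
A sketch of that shortcut follows.  The helper name
``maybe_add_content_length`` is hypothetical; it is meant to be called
just before the headers are sent, with the first bytestring the
iterable yielded, and only if ``write()`` was never called.

```python
def maybe_add_content_length(result, first_block, response_headers):
    """Append a Content-Length header if it can be derived safely."""
    if any(name.lower() == "content-length" for name, _ in response_headers):
        return                      # the application already supplied one
    try:
        blocks = len(result)
    except TypeError:
        return                      # e.g. a generator: length unknown
    if blocks == 1:
        # The single block is the entire body.
        response_headers.append(("Content-Length", str(len(first_block))))
```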
 988 | 
 989 | And, if the server and client both support HTTP/1.1 "chunked
 990 | encoding" [3]_, then the server **may** use chunked encoding to send
 991 | a chunk for each ``write()`` call or bytestring yielded by the
 992 | iterable, thus not using a ``Content-Length`` header at all.  This
 993 | allows the server to keep the client connection alive, if it wishes
 994 | to do so.  Note that the server **must** comply fully with RFC 2616
 995 | when doing this, or else fall back to one of the other strategies for
 996 | dealing with the absence of ``Content-Length``.
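
For reference, the framing of a chunked body is simple enough to sketch
in a few lines.  ``chunked`` is a hypothetical helper; a real server
would also have to send the ``Transfer-Encoding: chunked`` header
itself and confirm that the client speaks HTTP/1.1.

```python
def chunked(blocks):
    """Yield ``blocks`` framed as HTTP/1.1 chunks, then the final chunk."""
    for data in blocks:
        if data:    # a zero-length chunk would end the body early
            # Chunk size in hexadecimal, CRLF, the data, CRLF.
            yield b"%x\r\n%s\r\n" % (len(data), data)
    # The zero-length chunk terminates the body (no trailers are sent).
    yield b"0\r\n\r\n"
```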
 997 | 
 998 | (Note: applications and middleware **must not** apply any kind of
 999 | ``Transfer-Encoding`` to their output, such as chunking or gzipping;
1000 | as "hop-by-hop" operations, these encodings are the province of the
1001 | actual web server/gateway.  See `Other HTTP Features`_ below, for
1002 | more details.)
1003 | 
1004 | 
1005 | Buffering and Streaming
1006 | -----------------------
1007 | 
1008 | Generally speaking, applications will achieve the best throughput
1009 | by buffering their (modestly-sized) output and sending it all at
1010 | once.  This is a common approach in existing frameworks such as
1011 | Zope: the output is buffered in a StringIO or similar object, then
1012 | transmitted all at once, along with the response headers.
1013 | 
1014 | The corresponding approach in WSGI is for the application to simply
1015 | return a single-element iterable (such as a list) containing the
1016 | response body as a single bytestring.  This is the recommended approach
1017 | for the vast majority of application functions that render
1018 | HTML pages whose text easily fits in memory.
1019 | 
1020 | For large files, however, or for specialized uses of HTTP streaming
1021 | (such as multipart "server push"), an application may need to provide
1022 | output in smaller blocks (e.g. to avoid loading a large file into
1023 | memory).  It's also sometimes the case that part of a response may
1024 | be time-consuming to produce, but it would be useful to send ahead the
1025 | portion of the response that precedes it.
1026 | 
1027 | In these cases, applications will usually return an iterator (often
1028 | a generator-iterator) that produces the output in a block-by-block
1029 | fashion.  These blocks may be broken to coincide with multipart
1030 | boundaries (for "server push"), or just before time-consuming
1031 | tasks (such as reading another block of an on-disk file).
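
As an illustration of the block-by-block approach, here is a minimal sketch of an application that streams a file through a generator.  The ``example.file_path`` key and the 8192-byte block size are assumptions made for this example only; neither is defined by this specification.

```python
def stream_file_app(environ, start_response):
    """Hypothetical application that streams a file in fixed-size blocks."""
    # 'example.file_path' is an illustrative key, not part of WSGI.
    path = environ["example.file_path"]
    block_size = 8192  # arbitrary choice for this sketch

    def blocks():
        # Each block is handed to the server as soon as it is read, so the
        # whole file never has to be held in memory at once.
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                yield block

    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    return blocks()
```

Because the body is a generator, the file is opened only when the server begins iterating, and it is closed when iteration finishes.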
1032 | 
1033 | WSGI servers, gateways, and middleware **must not** delay the
1034 | transmission of any block; they **must** either fully transmit
1035 | the block to the client, or guarantee that they will continue
1036 | transmission even while the application is producing its next block.
1037 | A server/gateway or middleware may provide this guarantee in one of
1038 | three ways:
1039 | 
1040 | 1. Send the entire block to the operating system (and request
1041 |    that any O/S buffers be flushed) before returning control
1042 |    to the application, OR
1043 | 
1044 | 2. Use a different thread to ensure that the block continues
1045 |    to be transmitted while the application produces the next
1046 |    block, OR
1047 | 
1048 | 3. (Middleware only) Send the entire block to its parent
1049 |    gateway/server.
1050 | 
1051 | By providing this guarantee, WSGI allows applications to ensure
1052 | that transmission will not become stalled at an arbitrary point
1053 | in their output data.  This is critical for proper functioning
1054 | of e.g. multipart "server push" streaming, where data between
1055 | multipart boundaries should be transmitted in full to the client.
1056 | 
1057 | 
1058 | Middleware Handling of Block Boundaries
1059 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1060 | 
1061 | In order to better support asynchronous applications and servers,
1062 | middleware components **must not** block iteration waiting for
1063 | multiple values from an application iterable.  If the middleware
1064 | needs to accumulate more data from the application before it can
1065 | produce any output, it **must** yield an empty bytestring.
1066 | 
1067 | To put this requirement another way, a middleware component **must
1068 | yield at least one value** each time its underlying application
1069 | yields a value.  If the middleware cannot yield any other value,
1070 | it must yield an empty bytestring.
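
One way to satisfy this rule is sketched below.  ``LineBufferingMiddleware`` is a hypothetical component, invented for this example, that re-emits the response body one complete line at a time.  Whenever it has not yet seen a newline it yields an empty bytestring instead of waiting for more input.

```python
class LineBufferingMiddleware:
    """Hypothetical middleware that emits only complete lines of output."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        app_iter = self.app(environ, start_response)
        return self._relay(app_iter)

    def _relay(self, app_iter):
        pending = b""
        try:
            for block in app_iter:
                pending += block
                if b"\n" in pending:
                    # Emit everything up to and including the last newline.
                    complete, _, pending = pending.rpartition(b"\n")
                    yield complete + b"\n"
                else:
                    # Not enough data yet: yield an empty bytestring rather
                    # than blocking until the application produces more.
                    yield b""
            if pending:
                yield pending  # trailing data without a final newline
        finally:
            # Propagate close() to the wrapped iterable, if it has one.
            if hasattr(app_iter, "close"):
                app_iter.close()
```

The sketch yields exactly one value for every value the wrapped application yields, plus at most one final value for leftover data.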
1071 | 
1072 | This requirement ensures that asynchronous applications and servers
1073 | can conspire to reduce the number of threads that are required
1074 | to run a given number of application instances simultaneously.
1075 | 
1076 | Note also that this requirement means that middleware **must**
1077 | return an iterable as soon as its underlying application returns
1078 | an iterable.  It is also forbidden for middleware to use the
1079 | ``write()`` callable to transmit data that is yielded by an
1080 | underlying application.  Middleware may only use their parent
1081 | server's ``write()`` callable to transmit data that the
1082 | underlying application sent using a middleware-provided ``write()``
1083 | callable.
1084 | 
1085 | 
1086 | The ``write()`` Callable
1087 | ~~~~~~~~~~~~~~~~~~~~~~~~
1088 | 
1089 | Some existing application framework APIs support unbuffered
1090 | output in a different manner than WSGI.  Specifically, they
1091 | provide a "write" function or method of some kind to write
1092 | an unbuffered block of data, or else they provide a buffered
1093 | "write" function and a "flush" mechanism to flush the buffer.
1094 | 
1095 | Unfortunately, such APIs cannot be implemented in terms of
1096 | WSGI's "iterable" application return value, unless threads
1097 | or other special mechanisms are used.
1098 | 
1099 | Therefore, to allow these frameworks to continue using an
1100 | imperative API, WSGI includes a special ``write()`` callable,
1101 | returned by the ``start_response`` callable.
1102 | 
1103 | New WSGI applications and frameworks **should not** use the
1104 | ``write()`` callable if it is possible to avoid doing so.  The
1105 | ``write()`` callable is strictly a hack to support imperative
1106 | streaming APIs.  In general, applications should produce their
1107 | output via their returned iterable, as this makes it possible
1108 | for web servers to interleave other tasks in the same Python thread,
1109 | potentially providing better throughput for the server as a whole.
1110 | 
1111 | The ``write()`` callable is returned by the ``start_response()``
1112 | callable, and it accepts a single parameter:  a bytestring to be
1113 | written as part of the HTTP response body, that is treated exactly
1114 | as though it had been yielded by the output iterable.  In other
1115 | words, before ``write()`` returns, it must guarantee that the
1116 | passed-in bytestring was either completely sent to the client, or
1117 | that it is buffered for transmission while the application
1118 | proceeds onward.
1119 | 
1120 | An application **must** return an iterable object, even if it
1121 | uses ``write()`` to produce all or part of its response body.
1122 | The returned iterable **may** be empty (i.e. yield no non-empty
1123 | bytestrings), but if it *does* yield non-empty bytestrings, that output
1124 | must be treated normally by the server or gateway (i.e., it must be
1125 | sent or queued immediately).  Applications **must not** invoke
1126 | ``write()`` from within their return iterable, and therefore any
1127 | bytestrings yielded by the iterable are transmitted after all bytestrings
1128 | passed to ``write()`` have been sent to the client.
1129 | 
1130 | 
1131 | Unicode Issues
1132 | --------------
1133 | 
1134 | HTTP does not directly support Unicode, and neither does this
1135 | interface.  All encoding/decoding must be handled by the application;
1136 | all strings passed to or from the server must be of type ``str`` or
1137 | ``bytes``, never ``unicode``.  The result of using a ``unicode``
1138 | object where a string object is required, is undefined.
1139 | 
1140 | Note also that strings passed to ``start_response()`` as a status or
1141 | as response headers **must** follow RFC 2616 with respect to encoding.
1142 | That is, they must either be ISO-8859-1 characters, or use RFC 2047
1143 | MIME encoding.
1144 | 
1145 | On Python platforms where the ``str`` or ``StringType`` type is in
1146 | fact Unicode-based (e.g. Jython, IronPython, Python 3, etc.), all
1147 | "strings" referred to in this specification must contain only
1148 | code points representable in ISO-8859-1 encoding (``\u0000`` through
1149 | ``\u00FF``, inclusive).  It is a fatal error for an application to
1150 | supply strings containing any other Unicode character or code point.
1151 | Similarly, servers and gateways **must not** supply
1152 | strings to an application containing any other Unicode characters.
1153 | 
1154 | Again, all objects referred to in this specification as "strings"
1155 | **must** be of type ``str`` or ``StringType``, and **must not** be
1156 | of type ``unicode`` or ``UnicodeType``.  And, even if a given platform
1157 | allows for more than 8 bits per character in ``str``/``StringType``
1158 | objects, only the lower 8 bits may be used, for any value referred
1159 | to in this specification as a "string".
1160 | 
1161 | For values referred to in this specification as "bytestrings"
1162 | (i.e., values read from ``wsgi.input``, passed to ``write()``
1163 | or yielded by the application), the value **must** be of type
1164 | ``bytes`` under Python 3, and ``str`` in earlier versions of
1165 | Python.
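
The ISO-8859-1 restriction on "strings" can be checked mechanically.  The helpers below are a sketch for Python 3 only; their names are hypothetical, and the choice of UTF-8 as the application-level encoding is an assumption of this example, not a requirement of this specification.

```python
def is_wsgi_string(value):
    """Return True if value is a str containing only ISO-8859-1 code points."""
    if not isinstance(value, str):
        return False
    try:
        value.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


def to_wsgi_string(text, encoding="utf-8"):
    """Encode text, then present the resulting bytes as an ISO-8859-1 str."""
    return text.encode(encoding).decode("iso-8859-1")


def from_wsgi_string(value, encoding="utf-8"):
    """Recover the original text from a str produced by to_wsgi_string()."""
    return value.encode("iso-8859-1").decode(encoding)
```

Since every byte value maps to exactly one code point in ISO-8859-1, the round trip through ``to_wsgi_string()`` and ``from_wsgi_string()`` is lossless.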
1166 | 
1167 | 
1168 | Error Handling
1169 | --------------
1170 | 
1171 | In general, applications **should** try to trap their own, internal
1172 | errors, and display a helpful message in the browser.  (It is up
1173 | to the application to decide what "helpful" means in this context.)
1174 | 
1175 | However, to display such a message, the application must not have
1176 | actually sent any data to the browser yet, or else it risks corrupting
1177 | the response.  WSGI therefore provides a mechanism to either allow the
1178 | application to send its error message, or be automatically aborted:
1179 | the ``exc_info`` argument to ``start_response``.  Here is an example
1180 | of its use::
1181 | 
1182 |     try:
1183 |         # regular application code here
1184 |         status = "200 Froody"
1185 |         response_headers = [("content-type", "text/plain")]
1186 |         start_response(status, response_headers)
1187 |         return [b"normal body goes here"]
1188 |     except:
1189 |         # XXX should trap runtime issues like MemoryError, KeyboardInterrupt
1190 |         #     in a separate handler before this bare 'except:'...
1191 |         status = "500 Oops"
1192 |         response_headers = [("content-type", "text/plain")]
1193 |         start_response(status, response_headers, sys.exc_info())
1194 |         return [b"error body goes here"]
1195 | 
1196 | If no output has been written when an exception occurs, the call to
1197 | ``start_response`` will return normally, and the application will
1198 | return an error body to be sent to the browser.  However, if any output
1199 | has already been sent to the browser, ``start_response`` will reraise
1200 | the provided exception.  This exception **should not** be trapped by
1201 | the application, and so the application will abort.  The server or
1202 | gateway can then trap this (fatal) exception and abort the response.
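
Seen from the server side, the behaviour just described might be sketched as follows.  This is a minimal Python 3 illustration, not a complete server: ``make_start_response`` and the per-request ``state`` dictionary are hypothetical names, and real transmission of headers and data is replaced by bookkeeping.

```python
def make_start_response(state):
    """Build a start_response callable around a hypothetical state dict."""
    state.setdefault("headers_sent", False)
    state.setdefault("pending", None)
    state.setdefault("out", [])

    def write(data):
        # A real server would transmit the pending headers, then the data.
        state["headers_sent"] = True
        state["out"].append(data)

    def start_response(status, response_headers, exc_info=None):
        if exc_info is not None:
            try:
                if state["headers_sent"]:
                    # Output has already reached the client: re-raise the
                    # application's original exception to abort the request.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid keeping a traceback reference cycle
        elif state["pending"] is not None:
            raise AssertionError("start_response called twice without exc_info")
        # No output sent yet: the new status and headers replace the old ones.
        state["pending"] = (status, list(response_headers))
        return write

    return start_response
```

If nothing has been sent, a call with ``exc_info`` simply replaces the pending status and headers, which is what allows the application to return its error body.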
1203 | 
1204 | Servers **should** trap and log any exception that aborts an
1205 | application or the iteration of its return value.  If a partial
1206 | response has already been written to the browser when an application
1207 | error occurs, the server or gateway **may** attempt to add an error
1208 | message to the output, if the already-sent headers indicate a
1209 | ``text/*`` content type that the server knows how to modify cleanly.
1210 | 
1211 | Some middleware may wish to provide additional exception handling
1212 | services, or intercept and replace application error messages.  In
1213 | such cases, middleware may choose to **not** re-raise the ``exc_info``
1214 | supplied to ``start_response``, but instead raise a middleware-specific
1215 | exception, or simply return without an exception after storing the
1216 | supplied arguments.  This will then cause the application to return
1217 | its error body iterable (or invoke ``write()``), allowing the middleware
1218 | to capture and modify the error output.  These techniques will work as
1219 | long as application authors:
1220 | 
1221 | 1. Always provide ``exc_info`` when beginning an error response
1222 | 
1223 | 2. Never trap errors raised by ``start_response`` when ``exc_info`` is
1224 |    being provided
1225 | 
1226 | 
1227 | HTTP 1.1 Expect/Continue
1228 | ------------------------
1229 | 
1230 | Servers and gateways that implement HTTP 1.1 **must** provide
1231 | transparent support for HTTP 1.1's "expect/continue" mechanism.  This
1232 | may be done in any of several ways:
1233 | 
1234 | 1. Respond to requests containing an ``Expect: 100-continue`` header
1235 |    with an immediate "100 Continue" response, and proceed normally.
1236 | 
1237 | 2. Proceed with the request normally, but provide the application
1238 |    with a ``wsgi.input`` stream that will send the "100 Continue"
1239 |    response if/when the application first attempts to read from the
1240 |    input stream.  The read request must then remain blocked until the
1241 |    client responds.
1242 | 
1243 | 3. Wait until the client decides that the server does not support
1244 |    expect/continue, and sends the request body on its own.  (This
1245 |    is suboptimal, and is not recommended.)
1246 | 
1247 | Note that these behavior restrictions do not apply for HTTP 1.0
1248 | requests, or for requests that are not directed to an application
1249 | object.  For more information on HTTP 1.1 Expect/Continue, see RFC
1250 | 2616, sections 8.2.3 and 10.1.1.
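
The second strategy above (sending "100 Continue" only when the application first reads the request body) could be sketched like this.  ``ContinueOnReadInput`` and its ``send_continue`` callback are hypothetical names for this example; how the interim response is actually written to the client is left to the server.

```python
class ContinueOnReadInput:
    """Hypothetical wsgi.input wrapper that sends "100 Continue" lazily."""

    def __init__(self, raw_input, send_continue):
        self._raw = raw_input                # underlying request-body stream
        self._send_continue = send_continue  # writes the interim response
        self._sent = False

    def _ensure_continue(self):
        # Send the interim response once, on the first attempt to read.
        if not self._sent:
            self._sent = True
            self._send_continue()

    def read(self, size=-1):
        self._ensure_continue()
        return self._raw.read(size)

    def readline(self, size=-1):
        self._ensure_continue()
        return self._raw.readline(size)

    def readlines(self, hint=-1):
        self._ensure_continue()
        return self._raw.readlines(hint)

    def __iter__(self):
        self._ensure_continue()
        return iter(self._raw)
```

An application that rejects the request without reading the body never triggers the interim response, so the client is not invited to send a body that will be ignored.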
1251 | 
1252 | 
1253 | Other HTTP Features
1254 | -------------------
1255 | 
1256 | In general, servers and gateways should "play dumb" and allow the
1257 | application complete control over its output.  They should only make
1258 | changes that do not alter the effective semantics of the application's
1259 | response.  It is always possible for the application developer to add
1260 | middleware components to supply additional features, so server/gateway
1261 | developers should be conservative in their implementation.  In a sense,
1262 | a server should consider itself to be like an HTTP "gateway server",
1263 | with the application being an HTTP "origin server".  (See RFC 2616,
1264 | section 1.3, for the definition of these terms.)
1265 | 
1266 | However, because WSGI servers and applications do not communicate via
1267 | HTTP, what RFC 2616 calls "hop-by-hop" headers do not apply to WSGI
1268 | internal communications.  WSGI applications **must not** generate any
1269 | "hop-by-hop" headers [4]_, attempt to use HTTP features that would
1270 | require them to generate such headers, or rely on the content of
1271 | any incoming "hop-by-hop" headers in the ``environ`` dictionary.
1272 | WSGI servers **must** handle any supported inbound "hop-by-hop" headers
1273 | on their own, such as by decoding any inbound ``Transfer-Encoding``,
1274 | including chunked encoding if applicable.
1275 | 
1276 | Applying these principles to a variety of HTTP features, it should be
1277 | clear that a server **may** handle cache validation via the
1278 | ``If-None-Match`` and ``If-Modified-Since`` request headers and the
1279 | ``Last-Modified`` and ``ETag`` response headers.  However, it is
1280 | not required to do this, and the application **should** perform its
1281 | own cache validation if it wants to support that feature, since
1282 | the server/gateway is not required to do such validation.
1283 | 
1284 | Similarly, a server **may** re-encode or transport-encode an
1285 | application's response, but the application **should** use a
1286 | suitable content encoding on its own, and **must not** apply a
1287 | transport encoding.  A server **may** transmit byte ranges of the
1288 | application's response if requested by the client, and the
1289 | application doesn't natively support byte ranges.  Again, however,
1290 | the application **should** perform this function on its own if desired.
1291 | 
1292 | Note that these restrictions on applications do not necessarily mean
1293 | that every application must reimplement every HTTP feature; many HTTP
1294 | features can be partially or fully implemented by middleware
1295 | components, thus freeing both server and application authors from
1296 | implementing the same features over and over again.
1297 | 
1298 | 
1299 | Thread Support
1300 | --------------
1301 | 
1302 | Thread support, or lack thereof, is also server-dependent.
1303 | Servers that can run multiple requests in parallel **should** also
1304 | provide the option of running an application in a single-threaded
1305 | fashion, so that applications or frameworks that are not thread-safe
1306 | may still be used with that server.
1307 | 
1308 | 
1309 | 
1310 | Implementation/Application Notes
1311 | ================================
1312 | 
1313 | 
1314 | Server Extension APIs
1315 | ---------------------
1316 | 
1317 | Some server authors may wish to expose more advanced APIs that
1318 | application or framework authors can use for specialized purposes.
1319 | For example, a gateway based on ``mod_python`` might wish to expose
1320 | part of the Apache API as a WSGI extension.
1321 | 
1322 | In the simplest case, this requires nothing more than defining an
1323 | ``environ`` variable, such as ``mod_python.some_api``.  But, in many
1324 | cases, the possible presence of middleware can make this difficult.
1325 | For example, an API that offers access to the same HTTP headers that
1326 | are found in ``environ`` variables, might return different data if
1327 | ``environ`` has been modified by middleware.
1328 | 
1329 | In general, any extension API that duplicates, supplants, or bypasses
1330 | some portion of WSGI functionality runs the risk of being incompatible
1331 | with middleware components.  Server/gateway developers should *not*
1332 | assume that nobody will use middleware, because some framework
1333 | developers specifically intend to organize or reorganize their
1334 | frameworks to function almost entirely as middleware of various kinds.
1335 | 
1336 | So, to provide maximum compatibility, servers and gateways that
1337 | provide extension APIs that replace some WSGI functionality, **must**
1338 | design those APIs so that they are invoked using the portion of the
1339 | API that they replace.  For example, an extension API to access HTTP
1340 | request headers must require the application to pass in its current
1341 | ``environ``, so that the server/gateway may verify that HTTP headers
1342 | accessible via the API have not been altered by middleware.  If the
1343 | extension API cannot guarantee that it will always agree with
1344 | ``environ`` about the contents of HTTP headers, it must refuse service
1345 | to the application, e.g. by raising an error, returning ``None``
1346 | instead of a header collection, or whatever is appropriate to the API.
1347 | 
1348 | Similarly, if an extension API provides an alternate means of writing
1349 | response data or headers, it should require the ``start_response``
1350 | callable to be passed in, before the application can obtain the
1351 | extended service.  If the object passed in is not the same one that
1352 | the server/gateway originally supplied to the application, it cannot
1353 | guarantee correct operation and must refuse to provide the extended
1354 | service to the application.
1355 | 
1356 | These guidelines also apply to middleware that adds information such
1357 | as parsed cookies, form variables, sessions, and the like to
1358 | ``environ``.  Specifically, such middleware should provide these
1359 | features as functions which operate on ``environ``, rather than simply
1360 | stuffing values into ``environ``.  This helps ensure that information
1361 | is calculated from ``environ`` *after* any middleware has done any URL
1362 | rewrites or other ``environ`` modifications.
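
For example, cookie-parsing support might be offered as a function of ``environ`` rather than as a precomputed value.  The sketch below uses the standard library's ``http.cookies`` module (Python 3); ``get_cookies`` and the ``example.cookies`` cache key are hypothetical names chosen for this illustration.

```python
from http.cookies import SimpleCookie

_CACHE_KEY = "example.cookies"  # hypothetical key, used only as a cache


def get_cookies(environ):
    """Return {name: value} parsed from the current HTTP_COOKIE header."""
    header = environ.get("HTTP_COOKIE", "")
    cached = environ.get(_CACHE_KEY)
    # Reuse the cached result only if the header has not changed since.
    if cached is not None and cached[0] == header:
        return cached[1]
    cookies = {name: morsel.value
               for name, morsel in SimpleCookie(header).items()}
    environ[_CACHE_KEY] = (header, cookies)
    return cookies
```

Because the cached result is stored alongside the header it was parsed from, a later change to ``HTTP_COOKIE`` by other middleware causes the cookies to be parsed again instead of a stale value being returned.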
1363 | 
1364 | It is very important that these "safe extension" rules be followed by
1365 | both server/gateway and middleware developers, in order to avoid a
1366 | future in which middleware developers are forced to delete any and all
1367 | extension APIs from ``environ`` to ensure that their mediation isn't
1368 | being bypassed by applications using those extensions!
1369 | 
1370 | 
1371 | Application Configuration
1372 | -------------------------
1373 | 
1374 | This specification does not define how a server selects or obtains an
1375 | application to invoke.  These and other configuration options are
1376 | highly server-specific matters.  It is expected that server/gateway
1377 | authors will document how to configure the server to execute a
1378 | particular application object, and with what options (such as
1379 | threading options).
1380 | 
1381 | Framework authors, on the other hand, should document how to create an
1382 | application object that wraps their framework's functionality.  The
1383 | user, who has chosen both the server and the application framework,
1384 | must connect the two together.  However, since both the framework and
1385 | the server now have a common interface, this should be merely a
1386 | mechanical matter, rather than a significant engineering effort for
1387 | each new server/framework pair.
1388 | 
1389 | Finally, some applications, frameworks, and middleware may wish to
1390 | use the ``environ`` dictionary to receive simple string configuration
1391 | options.  Servers and gateways **should** support this by allowing
1392 | an application's deployer to specify name-value pairs to be placed in
1393 | ``environ``.  In the simplest case, this support can consist merely of
1394 | copying all operating system-supplied environment variables from
1395 | ``os.environ`` into the ``environ`` dictionary, since the deployer in
1396 | principle can configure these externally to the server, or in the
1397 | CGI case they may be able to be set via the server's configuration
1398 | files.
1399 | 
1400 | Applications **should** try to keep such required variables to a
1401 | minimum, since not all servers will support easy configuration of
1402 | them.  Of course, even in the worst case, persons deploying an
1403 | application can create a script to supply the necessary configuration
1404 | values::
1405 | 
1406 |    from the_app import application
1407 | 
1408 |    def new_app(environ, start_response):
1409 |        environ['the_app.configval1'] = 'something'
1410 |        return application(environ, start_response)
1411 | 
1412 | But, most existing applications and frameworks will probably only need
1413 | a single configuration value from ``environ``, to indicate the location
1414 | of their application or framework-specific configuration file(s).  (Of
1415 | course, applications should cache such configuration, to avoid having
1416 | to re-read it upon each invocation.)
1417 | 
1418 | 
1419 | URL Reconstruction
1420 | ------------------
1421 | 
1422 | If an application wishes to reconstruct a request's complete URL, it
1423 | may do so using the following algorithm, contributed by Ian Bicking::
1424 | 
1425 |     from urllib.parse import quote  # on Python 2: from urllib import quote
1426 |     url = environ['wsgi.url_scheme']+'://'
1427 | 
1428 |     if environ.get('HTTP_HOST'):
1429 |         url += environ['HTTP_HOST']
1430 |     else:
1431 |         url += environ['SERVER_NAME']
1432 | 
1433 |         if environ['wsgi.url_scheme'] == 'https':
1434 |             if environ['SERVER_PORT'] != '443':
1435 |                 url += ':' + environ['SERVER_PORT']
1436 |         else:
1437 |             if environ['SERVER_PORT'] != '80':
1438 |                 url += ':' + environ['SERVER_PORT']
1439 | 
1440 |     url += quote(environ.get('SCRIPT_NAME', ''))
1441 |     url += quote(environ.get('PATH_INFO', ''))
1442 |     if environ.get('QUERY_STRING'):
1443 |         url += '?' + environ['QUERY_STRING']
1444 | 
1445 | Note that such a reconstructed URL may not be precisely the same URI
1446 | as requested by the client.  Server rewrite rules, for example, may
1447 | have modified the client's originally requested URL to place it in a
1448 | canonical form.
1449 | 
1450 | 
1451 | Optional Platform-Specific File Handling
1452 | ----------------------------------------
1453 | 
1454 | Some operating environments provide special high-performance file-
1455 | transmission facilities, such as the Unix ``sendfile()`` call.
1456 | Servers and gateways **may** expose this functionality via an optional
1457 | ``wsgi.file_wrapper`` key in the ``environ``.  An application
1458 | **may** use this "file wrapper" to convert a file or file-like object
1459 | into an iterable that it then returns, e.g.::
1460 | 
1461 |     if 'wsgi.file_wrapper' in environ:
1462 |         return environ['wsgi.file_wrapper'](filelike, block_size)
1463 |     else:
1464 |         return iter(lambda: filelike.read(block_size), b'')
1465 | 
1466 | If the server or gateway supplies ``wsgi.file_wrapper``, it must be
1467 | a callable that accepts one required positional parameter, and one
1468 | optional positional parameter.  The first parameter is the file-like
1469 | object to be sent, and the second parameter is an optional block
1470 | size "suggestion" (which the server/gateway need not use).  The
1471 | callable **must** return an iterable object, and **must not** perform
1472 | any data transmission until and unless the server/gateway actually
1473 | receives the iterable as a return value from the application.
1474 | (To do otherwise would prevent middleware from being able to interpret
1475 | or override the response data.)
1476 | 
1477 | To be considered "file-like", the object supplied by the application
1478 | must have a ``read()`` method that takes an optional size argument.
1479 | It **may** have a ``close()`` method, and if so, the iterable returned
1480 | by ``wsgi.file_wrapper`` **must** have a ``close()`` method that
1481 | invokes the original file-like object's ``close()`` method.  If the
1482 | "file-like" object has any other methods or attributes with names
1483 | matching those of Python built-in file objects (e.g. ``fileno()``),
1484 | the ``wsgi.file_wrapper`` **may** assume that these methods or
1485 | attributes have the same semantics as those of a built-in file object.
1486 | 
1487 | The actual implementation of any platform-specific file handling
1488 | must occur **after** the application returns, and the server or
1489 | gateway checks to see if a wrapper object was returned.  (Again,
1490 | because of the presence of middleware, error handlers, and the like,
1491 | it is not guaranteed that any wrapper created will actually be used.)
1492 | 
1493 | Apart from the handling of ``close()``, the semantics of returning a
1494 | file wrapper from the application should be the same as if the
1495 | application had returned ``iter(filelike.read, b'')``.  In other words,
1496 | transmission should begin at the current position within the "file"
1497 | at the time that transmission begins, and continue until the end is
1498 | reached, or until ``Content-Length`` bytes have been written.  (If
1499 | the application doesn't supply a ``Content-Length``, the server **may**
1500 | generate one from the file using its knowledge of the underlying file
1501 | implementation.)
1502 | 
1503 | Of course, platform-specific file transmission APIs don't usually
1504 | accept arbitrary "file-like" objects.  Therefore, a
1505 | ``wsgi.file_wrapper`` has to introspect the supplied object for
1506 | things such as a ``fileno()`` (Unix-like OSes) or a
1507 | ``java.nio.FileChannel`` (under Jython) in order to determine if
1508 | the file-like object is suitable for use with the platform-specific
1509 | API it supports.
1510 | 
1511 | Note that even if the object is *not* suitable for the platform API,
1512 | the ``wsgi.file_wrapper`` **must** still return an iterable that wraps
1513 | ``read()`` and ``close()``, so that applications using file wrappers
1514 | are portable across platforms.  Here's a simple platform-agnostic
1515 | file wrapper class, suitable for old (pre 2.2) and new Pythons alike::
1516 | 
1517 |     class FileWrapper:
1518 | 
1519 |         def __init__(self, filelike, blksize=8192):
1520 |             self.filelike = filelike
1521 |             self.blksize = blksize
1522 |             if hasattr(filelike, 'close'):
1523 |                 self.close = filelike.close
1524 | 
1525 |         def __getitem__(self, key):
1526 |             data = self.filelike.read(self.blksize)
1527 |             if data:
1528 |                 return data
1529 |             raise IndexError
1530 | 
1531 | and here is a snippet from a server/gateway that uses it to provide
1532 | access to a platform-specific API::
1533 | 
1534 |     environ['wsgi.file_wrapper'] = FileWrapper
1535 |     result = application(environ, start_response)
1536 | 
1537 |     try:
1538 |         if isinstance(result, FileWrapper):
1539 |             # check if result.filelike is usable w/platform-specific
1540 |             # API, and if so, use that API to transmit the result.
1541 |             # If not, fall through to normal iterable handling
1542 |             # loop below.
1543 | 
1544 |         for data in result:
1545 |             # etc.
1546 | 
1547 |     finally:
1548 |         if hasattr(result, 'close'):
1549 |             result.close()
1550 | 
1551 | 
1552 | Questions and Answers
1553 | =====================
1554 | 
1555 | 1. Why must ``environ`` be a dictionary?  What's wrong with using a
1556 |    subclass?
1557 | 
1558 |    The rationale for requiring a dictionary is to maximize portability
1559 |    between servers.  The alternative would be to define some subset of
1560 |    a dictionary's methods as being the standard and portable
1561 |    interface.  In practice, however, most servers will probably find a
1562 |    dictionary adequate to their needs, and thus framework authors will
1563 |    come to expect the full set of dictionary features to be available,
1564 |    since they will be there more often than not.  But, if some server
1565 |    chooses *not* to use a dictionary, then there will be
1566 |    interoperability problems despite that server's "conformance" to
1567 |    spec.  Therefore, making a dictionary mandatory simplifies the
1568 |    specification and guarantees interoperability.
1569 | 
1570 |    Note that this does not prevent server or framework developers from
1571 |    offering specialized services as custom variables *inside* the
1572 |    ``environ`` dictionary.  This is the recommended approach for
1573 |    offering any such value-added services.
1574 | 
1575 | 2. Why can you call ``write()`` *and* yield bytestrings/return an
1576 |    iterable?  Shouldn't we pick just one way?
1577 | 
1578 |    If we supported only the iteration approach, then current
1579 |    frameworks that assume the availability of "push" suffer.  But, if
1580 |    we only support pushing via ``write()``, then server performance
1581 |    suffers for transmission of e.g. large files (if a worker thread
1582 |    can't begin work on a new request until all of the output has been
1583 |    sent).  Thus, this compromise allows an application framework to
1584 |    support both approaches, as appropriate, but with only a little
1585 |    more burden to the server implementor than a push-only approach
1586 |    would require.
1587 | 
1588 | 3. What's the ``close()`` for?
1589 | 
1590 |    When writes are done during the execution of an application
1591 |    object, the application can ensure that resources are released
1592 |    using a try/finally block.  But, if the application returns an
1593 |    iterable, any resources used will not be released until the
1594 |    iterable is garbage collected.  The ``close()`` idiom allows an
1595 |    application to release critical resources at the end of a request,
1596 |    and it's forward-compatible with the support for try/finally in
1597 |    generators that's proposed by PEP 325.
1598 | 
1599 | 4. Why is this interface so low-level?  I want feature X!  (e.g.
1600 |    cookies, sessions, persistence, ...)
1601 | 
1602 |    This isn't Yet Another Python Web Framework.  It's just a way for
1603 |    frameworks to talk to web servers, and vice versa.  If you want
1604 |    these features, you need to pick a web framework that provides the
1605 |    features you want.  And if that framework lets you create a WSGI
1606 |    application, you should be able to run it in most WSGI-supporting
1607 |    servers.  Also, some WSGI servers may offer additional services via
1608 |    objects provided in their ``environ`` dictionary; see the
1609 |    applicable server documentation for details.  (Of course,
1610 |    applications that use such extensions will not be portable to other
1611 |    WSGI-based servers.)
1612 | 
1613 | 5. Why use CGI variables instead of good old HTTP headers?  And why
1614 |    mix them in with WSGI-defined variables?
1615 | 
1616 |    Many existing web frameworks are built heavily upon the CGI spec,
1617 |    and existing web servers know how to generate CGI variables.  In
1618 |    contrast, alternative ways of representing inbound HTTP information
1619 |    are fragmented and lack market share.  Thus, using the CGI
1620 |    "standard" seems like a good way to leverage existing
1621 |    implementations.  As for mixing them with WSGI variables,
1622 |    separating them would just require two dictionary arguments to be
1623 |    passed around, while providing no real benefits.
1624 | 
1625 | 6. What about the status string?  Can't we just use the number,
1626 |    passing in ``200`` instead of ``"200 OK"``?
1627 | 
1628 |    Doing this would complicate the server or gateway, by requiring
1629 |    them to have a table of numeric statuses and corresponding
1630 |    messages.  By contrast, it is easy for an application or framework
1631 |    author to type the extra text to go with the specific response code
1632 |    they are using, and existing frameworks often already have a table
1633 |    containing the needed messages.  So, on balance it seems better to
1634 |    make the application/framework responsible, rather than the server
1635 |    or gateway.
1636 | 
1637 | 7. Why is ``wsgi.run_once`` not guaranteed to run the app only once?
1638 | 
1639 |    Because it's merely a suggestion to the application that it should
1640 |    "rig for infrequent running".  This is intended for application
1641 |    frameworks that have multiple modes of operation for caching,
1642 |    sessions, and so forth.  In a "multiple run" mode, such frameworks
1643 |    may preload caches, and may not write e.g. logs or session data to
1644 |    disk after each request.  In "single run" mode, such frameworks
1645 |    avoid preloading and flush all necessary writes after each request.
1646 | 
1647 |    However, in order to test an application or framework to verify
1648 |    correct operation in the latter mode, it may be necessary (or at
1649 |    least expedient) to invoke it more than once.  Therefore, an
1650 |    application should not assume that it will definitely not be run
1651 |    again, just because it is called with ``wsgi.run_once`` set to
1652 |    ``True``.
1653 | 
1654 | 8. Feature X (dictionaries, callables, etc.) is ugly for use in
1655 |    application code; why don't we use objects instead?
1656 | 
1657 |    All of these implementation choices of WSGI are specifically
1658 |    intended to *decouple* features from one another; recombining these
1659 |    features into encapsulated objects makes it somewhat harder to
1660 |    write servers or gateways, and an order of magnitude harder to
1661 |    write middleware that replaces or modifies only small portions of
1662 |    the overall functionality.
1663 | 
1664 |    In essence, middleware wants to have a "Chain of Responsibility"
1665 |    pattern, whereby it can act as a "handler" for some functions,
1666 |    while allowing others to remain unchanged.  This is difficult to do
1667 |    with ordinary Python objects, if the interface is to remain
1668 |    extensible.  For example, one must use ``__getattr__`` or
1669 |    ``__getattribute__`` overrides, to ensure that extensions (such as
1670 |    attributes defined by future WSGI versions) are passed through.
1671 | 
1672 |    This type of code is notoriously difficult to get 100% correct, and
1673 |    few people will want to write it themselves.  They will therefore
1674 |    copy other people's implementations, but fail to update them when
1675 |    the person they copied from corrects yet another corner case.
1676 | 
1677 |    Further, this necessary boilerplate would be pure excise, a
1678 |    developer tax paid by middleware developers to support a slightly
1679 |    prettier API for application framework developers.  But,
1680 |    application framework developers will typically only be updating
1681 |    *one* framework to support WSGI, and in a very limited part of
1682 |    their framework as a whole.  It will likely be their first (and
1683 |    maybe their only) WSGI implementation, and thus they will likely
1684 |    implement with this specification ready to hand.  Thus, the effort
1685 |    of making the API "prettier" with object attributes and suchlike
1686 |    would likely be wasted for this audience.
1687 | 
1688 |    We encourage those who want a prettier (or otherwise improved) WSGI
1689 |    interface for use in direct web application programming (as opposed
1690 |    to web framework development) to develop APIs or frameworks that
1691 |    wrap WSGI for convenient use by application developers.  In this
1692 |    way, WSGI can remain conveniently low-level for server and
1693 |    middleware authors, while not being "ugly" for application
1694 |    developers.
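
As a rough illustration of the trade-off in question 6 above, an application or framework can build the complete status string itself from a table of reason phrases. The sketch below assumes Python 3 and uses the standard library's ``http.client.responses`` mapping as that table; the ``status_line`` helper and the ``app`` callable are hypothetical names chosen for this example and are not defined by this specification.

```python
from http.client import responses  # stdlib mapping: status code -> reason phrase


def status_line(code):
    """Return a WSGI status string such as "200 OK" for a numeric code."""
    # Fall back to a generic phrase for codes missing from the table.
    return "%d %s" % (code, responses.get(code, "Unknown"))


def app(environ, start_response):
    """Minimal application that supplies the complete status string."""
    body = b"Hello, world!\n"
    start_response(status_line(200), [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The lookup table lives on the application side, so the server or gateway only has to pass the string it was given on to the client.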
1695 | 
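The "rig for infrequent running" hint from question 7 above might be consumed roughly as follows. This is only a sketch under stated assumptions: the ``Framework`` class, its ``preload_cache`` and ``flush`` methods, and the ``pending`` and ``flushed`` lists are hypothetical stand-ins for whatever caching and session machinery a real framework provides.

```python
class Framework:
    """Hypothetical framework that adapts to the ``wsgi.run_once`` hint."""

    def __init__(self):
        self.cache_loaded = False
        self.pending = []   # buffered writes (logs, session data, ...)
        self.flushed = []   # stand-in for data written to disk

    def preload_cache(self):
        self.cache_loaded = True

    def flush(self):
        self.flushed.extend(self.pending)
        self.pending.clear()

    def __call__(self, environ, start_response):
        run_once = environ.get("wsgi.run_once", False)
        if not run_once and not self.cache_loaded:
            # "Multiple run" mode: pay the preloading cost up front.
            self.preload_cache()
        self.pending.append("handled " + environ.get("PATH_INFO", "/"))
        if run_once:
            # "Single run" mode: flush after every request, but do not
            # assume this is the last call; the app may be invoked again.
            self.flush()
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"ok\n"]
```

Because the flag only selects a mode and nothing is torn down after a request, calling the same object again with ``wsgi.run_once`` set to ``True`` still works, which is what a test harness would rely on.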
1696 | 
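The decoupling argument in question 8 above can be seen in a small piece of middleware. The following is a minimal sketch, not a reference implementation: ``add_header_middleware`` and ``simple_app`` are hypothetical names, and the middleware changes only the header list while passing ``environ``, the status, ``exc_info``, and the response iterable through untouched.

```python
def add_header_middleware(app, name, value):
    """Wrap ``app`` so that every response carries one extra header."""

    def middleware(environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Handle only the part we care about (the header list) ...
            headers = list(headers) + [(name, value)]
            # ... and delegate everything else to the server untouched.
            return start_response(status, headers, exc_info)

        # ``environ`` is passed through as-is, so any server extension
        # keys reach the wrapped application unchanged.
        return app(environ, custom_start_response)

    return middleware


def simple_app(environ, start_response):
    """Trivial application used to exercise the middleware."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world!\n"]
```

No ``__getattr__`` forwarding is needed here: unknown ``environ`` keys simply stay in the dictionary, and the iterable returned by the application (including any ``close()`` method) is handed back as it is.
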
1697 | Proposed/Under Discussion
1698 | =========================
1699 | 
1700 | These items are currently being discussed on the Web-SIG and elsewhere,
1701 | or are on the PEP author's "to-do" list:
1702 | 
1703 | * Should ``wsgi.input`` be an iterator instead of a file?  This would
1704 |   help for asynchronous applications and chunked-encoding input
1705 |   streams.
1706 | 
1707 | * Optional extensions are being discussed for pausing iteration of an
1708 |   application's output until input is available or until a callback
1709 |   occurs.
1710 | 
1711 | * Add a section about synchronous vs. asynchronous apps and servers,
1712 |   the relevant threading models, and issues/design goals in these
1713 |   areas.
1714 | 
1715 | 
1716 | Acknowledgements
1717 | ================
1718 | 
1719 | Thanks go to the many folks on the Web-SIG mailing list whose
1720 | thoughtful feedback made this revised draft possible.  Especially:
1721 | 
1722 | * Gregory "Grisha" Trubetskoy, author of ``mod_python``, who beat up
1723 |   on the first draft as not offering any advantages over "plain old
1724 |   CGI", thus encouraging me to look for a better approach.
1725 | 
1726 | * Ian Bicking, who helped nag me into properly specifying the
1727 |   multithreading and multiprocess options, as well as badgering me to
1728 |   provide a mechanism for servers to supply custom extension data to
1729 |   an application.
1730 | 
1731 | * Tony Lownds, who came up with the concept of a ``start_response``
1732 |   function that took the status and headers, returning a ``write``
1733 |   function.  His input also guided the design of the exception handling
1734 |   facilities, especially in the area of allowing for middleware that
1735 |   overrides application error messages.
1736 | 
1737 | * Alan Kennedy, whose courageous attempts to implement WSGI-on-Jython
1738 |   (well before the spec was finalized) helped to shape the "supporting
1739 |   older versions of Python" section, as well as the optional
1740 |   ``wsgi.file_wrapper`` facility, and some of the early bytes/unicode
1741 |   decisions.
1742 | 
1743 | * Mark Nottingham, who reviewed the spec extensively for issues with
1744 |   HTTP RFC compliance, especially with regard to HTTP/1.1 features that
1745 |   I didn't even know existed until he pointed them out.
1746 | 
1747 | * Graham Dumpleton, who worked tirelessly (even in the face of my laziness
1748 |   and stupidity) to get some sort of Python 3 version of WSGI out, who
1749 |   proposed the "native strings" vs. "byte strings" concept, and thoughtfully
1750 |   wrestled through a great many HTTP, ``wsgi.input``, and other
1751 |   amendments.  Most, if not all, of the credit for this new PEP
1752 |   belongs to him.
1753 | 
1754 | 
1755 | References
1756 | ==========
1757 | 
1758 | .. [1] The Python Wiki "Web Programming" topic
1759 |    (http://www.python.org/cgi-bin/moinmoin/WebProgramming)
1760 | 
1761 | .. [2] The Common Gateway Interface Specification, v 1.1, 3rd Draft
1762 |    (http://ken.coar.org/cgi/draft-coar-cgi-v11-03.txt)
1763 | 
1764 | .. [3] "Chunked Transfer Coding" -- HTTP/1.1, Section 3.6.1
1765 |    (http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1)
1766 | 
1767 | .. [4] "End-to-end and Hop-by-hop Headers" -- HTTP/1.1, Section 13.5.1
1768 |    (http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.5.1)
1769 | 
1770 | .. [5] mod_ssl Reference, "Environment Variables"
1771 |    (http://www.modssl.org/docs/2.8/ssl_reference.html#ToC25)
1772 | 
1773 | .. [6] Procedural issues regarding modifications to PEP \333
1774 |    (http://mail.python.org/pipermail/python-dev/2010-September/104114.html)
1775 | 
1776 | .. [7] SVN revision history for PEP \3333, showing differences from PEP 333
1777 |    (http://svn.python.org/view/peps/trunk/pep-3333.txt?r1=84854&r2=HEAD)
1778 | 
1779 | Copyright
1780 | =========
1781 | 
1782 | This document has been placed in the public domain.
1783 | 
1784 | 
1785 | 
1786 | ..
1787 |    Local Variables:
1788 |    mode: indented-text
1789 |    indent-tabs-mode: nil
1790 |    sentence-end-double-space: t
1791 |    fill-column: 70
1792 |    End:
1793 | 


--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
 1 | [metadata]
 2 | name = wsging
 3 | author = Robert Collins
 4 | author-email = rbtcollins@hp.com
 5 | summary = WSGI revamp for HTTP/2, websockets, etc
 6 | description-file =
 7 |     README.rst
 8 | home-page = https://github.com/python-web-sig/wsgi-ng
 9 | classifier =
10 |     Development Status :: 4 - Beta
11 |     Intended Audience :: Developers
12 |     License :: OSI Approved :: Apache Software License
13 |     Operating System :: OS Independent
14 |     Programming Language :: Python
15 |     Programming Language :: Python :: 2
16 |     Programming Language :: Python :: 2.7
17 |     Programming Language :: Python :: 3
18 |     Programming Language :: Python :: 3.3
19 | 
20 | [files]
21 | packages =
22 |     wsging
23 | 
24 | [global]
25 | setup-hooks =
26 |     pbr.hooks.setup_hook
27 | 
28 | [build_sphinx]
29 | source-dir = doc/source
30 | build-dir = doc/build
31 | all_files = 1
32 | 
33 | [upload_sphinx]
34 | upload-dir = doc/build/html
35 | 
36 | [wheel]
37 | universal = 1
38 | 


--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
 1 | #!/usr/bin/env python
 2 | # Copyright (c) 2014 Hewlett-Packard Development Company, L.P.
 3 | #
 4 | # Licensed under the Apache License, Version 2.0 (the "License");
 5 | # you may not use this file except in compliance with the License.
 6 | # You may obtain a copy of the License at
 7 | #
 8 | #    http://www.apache.org/licenses/LICENSE-2.0
 9 | #
10 | # Unless required by applicable law or agreed to in writing, software
11 | # distributed under the License is distributed on an "AS IS" BASIS,
12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
13 | # implied.
14 | # See the License for the specific language governing permissions and
15 | # limitations under the License.
16 | 
17 | import setuptools
18 | setuptools.setup(setup_requires=['pbr'], pbr=True)
19 | 


--------------------------------------------------------------------------------
/wsging/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/python-web-sig/wsgi-ng/efddbc5a8b043a0e52623c599de53f45a557c508/wsging/__init__.py


--------------------------------------------------------------------------------