├── .gitignore ├── README.md ├── client.py ├── config.py ├── requirements.txt └── server.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Async, non-blocking Flask & SQLAlchemy example 2 | ============================================== 3 | 4 | > [!WARNING] 5 | > This code is really old at this point. Use it for edification but not production! 6 | 7 | ## Overview 8 | 9 | This code shows how to use the following menagerie of components 10 | together in a completely non-blocking manner: 11 | 12 | * [Flask](http://flask.pocoo.org/), for the web application framework; 13 | * [SQLAlchemy](http://www.sqlalchemy.org/), for the object relational mapper (via [Flask-SQLAlchemy](https://github.com/mitsuhiko/flask-sqlalchemy)); 14 | * [Postgresql](http://www.postgresql.org/), for the database; 15 | * [Psycopg2](http://initd.org/psycopg/), for the SQLAlchemy-Postgresql adapter; 16 | * [Gunicorn](http://gunicorn.org/), for the WSGI server; and, 17 | * [Gevent](http://www.gevent.org/), for the networking library. 18 | 19 | The file `server.py` defines a small Flask application that has 20 | two routes: one that triggers a `time.sleep(5)` in Python and one that 21 | triggers a `pg_sleep(5)` in Postgres. Both of these sleeps are normally 22 | blocking operations. By running the server using the Gevent 23 | worker for Gunicorn, we can make the Python sleep non-blocking. 24 | By configuring Psycopg2's coroutine support (via 25 | [psycogreen](https://bitbucket.org/dvarrazzo/psycogreen)), we 26 | can make the Postgres sleep non-blocking.
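The overlap described above can be sketched with a stdlib-only toy (not from this repo): threads stand in for gevent's greenlets, and `time.sleep()` stands in for the blocking route handlers in `server.py`. The delay and request count are made up for illustration.

```python
# A stdlib-only sketch of the idea above: five blocking 0.2s
# "requests" take ~1s when issued one at a time, but only ~0.2s
# when issued concurrently, because the sleeps overlap.
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay=0.2):
    """Stand-in for one HTTP request whose handler blocks in a sleep."""
    time.sleep(delay)
    return 200  # pretend status code


def run_sequential(n=5):
    """Issue n fake requests back to back; total time is ~n * delay."""
    t0 = time.time()
    statuses = [fake_request() for _ in range(n)]
    return statuses, time.time() - t0


def run_concurrent(n=5):
    """Issue n fake requests at once; total time is ~1 * delay."""
    t0 = time.time()
    with ThreadPoolExecutor(max_workers=n) as pool:
        statuses = list(pool.map(lambda _: fake_request(), range(n)))
    return statuses, time.time() - t0


if __name__ == "__main__":
    _, seq = run_sequential()
    _, conc = run_concurrent()
    print("sequential: %.2fs, concurrent: %.2fs" % (seq, conc))
```

The timings in the "Running the code" section below show the same effect against the real server, with greenlets rather than threads providing the overlap.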
27 | 28 | 29 | ## Installation 30 | 31 | Clone the repo: 32 | 33 | git clone https://github.com/kljensen/async-flask-sqlalchemy-example.git 34 | 35 | Install the requirements 36 | 37 | pip install -r requirements.txt 38 | 39 | Make sure you've got the required database 40 | 41 | createdb fsppgg_test 42 | 43 | Create the required tables in this database 44 | 45 | python ./server.py -c 46 | 47 | 48 | ## Running the code 49 | 50 | You can test three situations with this code: 51 | * Gunicorn blocking with SQLAlchemy/Psycopg2 blocking; 52 | * Gunicorn non-blocking with SQLAlchemy/Psycopg2 blocking; and, 53 | * Gunicorn non-blocking with SQLAlchemy/Psycopg2 non-blocking. 54 | 55 | ### Gunicorn blocking with SQLAlchemy blocking 56 | 57 | Run the server (which is the Flask application) like 58 | 59 | gunicorn server:app 60 | 61 | Then, in a separate shell, run the client like 62 | 63 | python ./client.py 64 | 65 | You should see output like 66 | 67 | Sending 5 requests for http://localhost:8000/sleep/python/... 68 | @ 5.05s got response [200] 69 | @ 10.05s got response [200] 70 | @ 15.07s got response [200] 71 | @ 20.07s got response [200] 72 | @ 25.08s got response [200] 73 | = 25.09s TOTAL 74 | Sending 5 requests for http://localhost:8000/sleep/postgres/... 75 | @ 5.02s got response [200] 76 | @ 10.02s got response [200] 77 | @ 15.03s got response [200] 78 | @ 20.04s got response [200] 79 | @ 25.05s got response [200] 80 | = 25.05s TOTAL 81 | ------------------------------------------ 82 | SUM TOTAL = 50.15s 83 | 84 | 85 | ### Gunicorn non-blocking with SQLAlchemy blocking 86 | 87 | Run the server like 88 | 89 | gunicorn server:app -k gevent 90 | 91 | and run the client again. You should see output like 92 | 93 | Sending 5 requests for http://localhost:8000/sleep/python/... 
94 | @ 5.05s got response [200] 95 | @ 5.06s got response [200] 96 | @ 5.06s got response [200] 97 | @ 5.06s got response [200] 98 | @ 5.07s got response [200] 99 | = 5.08s TOTAL 100 | Sending 5 requests for http://localhost:8000/sleep/postgres/... 101 | @ 5.01s got response [200] 102 | @ 10.02s got response [200] 103 | @ 15.04s got response [200] 104 | @ 20.05s got response [200] 105 | @ 25.06s got response [200] 106 | = 25.06s TOTAL 107 | ------------------------------------------ 108 | SUM TOTAL = 30.14s 109 | 110 | 111 | ### Gunicorn non-blocking with SQLAlchemy non-blocking 112 | 113 | Run the server like 114 | 115 | PSYCOGREEN=true gunicorn server:app -k gevent 116 | 117 | and run the client again. You should see output like 118 | 119 | Sending 5 requests for http://localhost:8000/sleep/python/... 120 | @ 5.03s got response [200] 121 | @ 5.03s got response [200] 122 | @ 5.03s got response [200] 123 | @ 5.04s got response [200] 124 | @ 5.03s got response [200] 125 | = 5.04s TOTAL 126 | Sending 5 requests for http://localhost:8000/sleep/postgres/... 127 | @ 5.02s got response [200] 128 | @ 5.03s got response [200] 129 | @ 5.03s got response [200] 130 | @ 5.03s got response [200] 131 | @ 5.03s got response [200] 132 | = 5.03s TOTAL 133 | ------------------------------------------ 134 | SUM TOTAL = 10.07s 135 | 136 | 137 | ## Warnings (I lied, it actually does block) 138 | 139 | If you increase the number of requests made in `client.py`, you'll notice 140 | that SQLAlchemy/Psycopg2 start to block again. Try, e.g., 141 | 142 | python ./client.py 100 143 | 144 | when running the server in fully non-blocking mode. You'll notice the `/sleep/postgres/` 145 | responses come back in sets of 15. (Well, probably 15; your 146 | environment could be configured differently than mine.)
This is because SQLAlchemy uses 147 | [connection pooling](http://docs.sqlalchemy.org/en/latest/core/pooling.html) 148 | and, by default, the [QueuePool](http://docs.sqlalchemy.org/en/latest/core/pooling.html#sqlalchemy.pool.QueuePool), 149 | which limits the number of connections to some configuration parameter 150 | `pool_size` plus a possible "burst" of `max_overflow`. (If you're using 151 | the [Flask-SQLAlchemy](https://github.com/mitsuhiko/flask-sqlalchemy) 152 | extension, `pool_size` is set by your Flask app's configuration variable 153 | `SQLALCHEMY_POOL_SIZE`. It is 5 by default. `max_overflow` is 10 by 154 | default and cannot be set by a Flask configuration variable; you need 155 | to set it on the pool yourself.) Once you need more than 156 | `pool_size + max_overflow` connections, the SQLAlchemy operations 157 | will block. You can get around this by disabling pooling via 158 | [SQLAlchemy's NullPool](http://docs.sqlalchemy.org/en/latest/core/pooling.html#sqlalchemy.pool.NullPool); 159 | however, you probably don't want to do that, for two reasons. 160 | 161 | 1. Postgresql has a configuration parameter `max_connections` that, drumroll, limits the 162 | number of connections. If `pool_size + max_overflow` exceeds `max_connections`, 163 | any new connection requests will be rejected by your Postgresql instance. 164 | Each unique connection causes Postgresql to use a non-trivial amount of 165 | RAM, so unless you have a ton of RAM, you should keep `max_connections` 166 | at some reasonable value. 167 | 168 | 2. If you used the `NullPool`, you'd create a new TCP connection every 169 | time you used SQLAlchemy to talk to the database, and thus incur the 170 | overhead of the TCP handshake, etc. 171 | 172 | So, in effect, the concurrency for Postgresql operations is always 173 | limited by `max_connections` and how much RAM you have. 174 | 175 | 176 | ## Results 177 | 178 | Stuff gets faster, shizzle works fine.
Your mileage may vary in production. 179 | 180 | 181 | ## License (MIT) 182 | 183 | Copyright (c) 2013 Kyle L. Jensen (kljensen@gmail.com) 184 | 185 | Permission is hereby granted, free of charge, to any person obtaining 186 | a copy of this software and associated documentation files (the 187 | "Software"), to deal in the Software without restriction, including 188 | without limitation the rights to use, copy, modify, merge, publish, 189 | distribute, sublicense, and/or sell copies of the Software, and to 190 | permit persons to whom the Software is furnished to do so, subject to 191 | the following conditions: 192 | 193 | The above copyright notice and this permission notice shall be 194 | included in all copies or substantial portions of the Software. 195 | 196 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 197 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 198 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 199 | IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY 200 | CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, 201 | TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE 202 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 203 | -------------------------------------------------------------------------------- /client.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import gevent 3 | import time 4 | from gevent import monkey 5 | monkey.patch_all() 6 | import urllib2 7 | 8 | 9 | def fetch_url(url): 10 | """ Fetch a URL and return the total amount of time required. 
11 | """ 12 | t0 = time.time() 13 | try: 14 | resp = urllib2.urlopen(url) 15 | resp_code = resp.code 16 | except urllib2.HTTPError, e: 17 | resp_code = e.code 18 | 19 | t1 = time.time() 20 | print("\t@ %5.2fs got response [%d]" % (t1 - t0, resp_code)) 21 | return t1 - t0 22 | 23 | 24 | def time_fetch_urls(url, num_jobs): 25 | """ Fetch a URL `num_jobs` times in parallel and return the 26 | total amount of time required. 27 | """ 28 | print("Sending %d requests for %s..." % (num_jobs, url)) 29 | t0 = time.time() 30 | jobs = [gevent.spawn(fetch_url, url) for i in range(num_jobs)] 31 | gevent.joinall(jobs) 32 | t1 = time.time() 33 | print("\t= %5.2fs TOTAL" % (t1 - t0)) 34 | return t1 - t0 35 | 36 | 37 | if __name__ == '__main__': 38 | 39 | try: 40 | num_requests = int(sys.argv[1]) 41 | except IndexError: 42 | num_requests = 5 43 | 44 | # Fetch the URL that blocks with a `time.sleep` 45 | t0 = time_fetch_urls("http://localhost:8000/sleep/python/", num_requests) 46 | 47 | # Fetch the URL that blocks with a `pg_sleep` 48 | t1 = time_fetch_urls("http://localhost:8000/sleep/postgres/", num_requests) 49 | 50 | print("------------------------------------------") 51 | print("SUM TOTAL = %.2fs" % (t0 + t1)) 52 | -------------------------------------------------------------------------------- /config.py: -------------------------------------------------------------------------------- 1 | SQLALCHEMY_DATABASE_URI = 'postgresql+psycopg2://localhost/fsppgg_test' 2 | SQLALCHEMY_ECHO = False 3 | SECRET_KEY = '\xfb\x12\xdf\xa1@i\xd6>V\xc0\xbb\x8fp\x16#Z\x0b\x81\xeb\x16' 4 | DEBUG = True 5 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | Flask-SQLAlchemy==0.16 2 | psycopg2==2.4.6 3 | psycogreen==1.0 4 | gevent==0.13.8 5 | gunicorn==0.17.2 -------------------------------------------------------------------------------- /server.py: 
-------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import time 4 | from flask import Flask, jsonify 5 | from flask.ext.sqlalchemy import SQLAlchemy 6 | 7 | 8 | # Optionally, set up psycopg2 & SQLAlchemy to be greenlet-friendly. 9 | # Note: psycogreen does not really monkey patch psycopg2 in the 10 | # manner that gevent monkey patches socket. 11 | # 12 | if "PSYCOGREEN" in os.environ: 13 | 14 | # Do our monkey patching 15 | # 16 | from gevent.monkey import patch_all 17 | patch_all() 18 | from psycogreen.gevent import patch_psycopg 19 | patch_psycopg() 20 | 21 | using_gevent = True 22 | else: 23 | using_gevent = False 24 | 25 | 26 | # Create our Flask app 27 | # 28 | app = Flask(__name__) 29 | app.config.from_pyfile('config.py') 30 | 31 | 32 | # Create our Flask-SQLAlchemy instance 33 | # 34 | db = SQLAlchemy(app) 35 | if using_gevent: 36 | 37 | # Assuming that gevent monkey patched the builtin 38 | # threading library, we're likely good to use 39 | # SQLAlchemy's QueuePool, which is the default 40 | # pool class. However, we need to make it use 41 | # threadlocal connections 42 | # 43 | # 44 | db.engine.pool._use_threadlocal = True 45 | 46 | 47 | class Todo(db.Model): 48 | """ Small example model just to show you that SQLAlchemy is 49 | doing everything it should be doing. 50 | """ 51 | id = db.Column(db.Integer, primary_key=True) 52 | title = db.Column(db.String(60)) 53 | done = db.Column(db.Boolean) 54 | priority = db.Column(db.Integer) 55 | 56 | def as_dict(self): 57 | """ Return an individual Todo as a dictionary. 58 | """ 59 | return { 60 | 'id': self.id, 61 | 'title': self.title, 62 | 'done': self.done, 63 | 'priority': self.priority 64 | } 65 | 66 | @classmethod 67 | def jsonify_all(cls): 68 | """ Returns all Todo instances in a JSON 69 | Flask response. 
70 | """ 71 | return jsonify(todos=[todo.as_dict() for todo in cls.query.all()]) 72 | 73 | 74 | @app.route('/sleep/postgres/') 75 | def sleep_postgres(): 76 | """ This handler asks Postgres to sleep for 5s and will 77 | block for 5s unless psycopg2 is set up (above) to be 78 | gevent-friendly. 79 | """ 80 | db.session.execute('SELECT pg_sleep(5)') 81 | return Todo.jsonify_all() 82 | 83 | 84 | @app.route('/sleep/python/') 85 | def sleep_python(): 86 | """ This handler sleeps for 5s and will block for 5s unless 87 | gunicorn is using the gevent worker class. 88 | """ 89 | time.sleep(5) 90 | return Todo.jsonify_all() 91 | 92 | 93 | # Create the tables and populate them with some dummy data 94 | # 95 | def create_data(): 96 | """ A helper function to create our tables and some Todo objects. 97 | """ 98 | db.create_all() 99 | todos = [] 100 | for i in range(50): 101 | todo = Todo( 102 | title="Slave for the man {0}".format(i), 103 | done=(i % 2 == 0), 104 | priority=(i % 5) 105 | ) 106 | todos.append(todo) 107 | db.session.add_all(todos) 108 | db.session.commit() 109 | 110 | 111 | if __name__ == '__main__': 112 | 113 | if '-c' in sys.argv: 114 | create_data() 115 | else: 116 | app.run() 117 | --------------------------------------------------------------------------------