├── .editorconfig
├── .gitignore
├── .travis.yml
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── README.rst
├── pymjq
│   ├── __init__.py
│   ├── jobqueue.py
│   └── test.py
├── requirements.txt
└── setup.py

/.editorconfig:
--------------------------------------------------------------------------------
;
; Global Editor Config
;
; This is an ini style configuration. See http://editorconfig.org/ for more information on this file.
;
; Top level editor config.
root = true
; Always use Unix style new lines with new line ending on every file and trim whitespace
[*]
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
; Python: PEP8 defines 4 spaces for indentation
[*.py]
indent_style = space
indent_size = 4
; YAML format, 2 spaces
[*.{yaml,yml}]
indent_style = space
indent_size = 2

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
*.py[cod]

# C extensions
*.so

# Packages
*.egg
*.egg-info
dist
build
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64
MANIFEST

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
.tox
nosetests.xml
tmp

# Translations
*.mo

# Code editor
*~
venv/*
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
language: python
python:
  - "2.7"
services: mongodb
# command to install dependencies
install: "pip install -r requirements.txt"
# command to run tests
script:
  - coverage run pymjq/test.py

after_success: codecov

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# How to contribute

This project operates with the open contributor model where anyone is welcome to contribute towards development in the form of code review, testing and submitting fixes/patches. If you have a new feature you would like to propose, please create an issue first so we can discuss it together.

## Submitting a Change

To submit a change, please do the following:

* Fork this repo
* Create a new branch on your fork
* Push commits to your branch
* Submit a new Pull Request

Please write clear/coherent messages for your commits:

* For small changes you can do `$ git commit -m "your brief summary"`
* For larger changes please use the subject, body, footer style. A great resource for writing good commit messages can be found [here](http://chris.beams.io/posts/git-commit/).

### Testing

Anyone submitting code to the project is strongly encouraged to also include tests. The existing unit tests live in `pymjq/test.py`. If you have a better idea, please feel free to open an issue and discuss.

## Coding Conventions

We use [EditorConfig](http://editorconfig.org/) to maintain a consistent coding style between different editors. Basically it just ensures that we follow the `pep8` indentation and whitespace rules in our Python files across the project.

If you don't want to use EditorConfig, just make sure to:

* Indent using four spaces for Python
* Indent using two spaces for YAML

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
The MIT License (MIT)

Copyright (c) 2016 Discogs

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# pymongo-job-queue [![travis build](https://img.shields.io/travis/discogs/pymongo-job-queue.svg)](https://travis-ci.org/discogs/pymongo-job-queue) [![Codecov](https://img.shields.io/codecov/c/github/discogs/pymongo-job-queue.svg)](https://codecov.io/github/discogs/pymongo-job-queue) [![version](https://img.shields.io/pypi/v/pymjq.svg)](https://pypi.python.org/pypi/pymjq)

This package (`pymjq`) is a simple MongoDB-based job queue for Python. By using capped collections and tailable cursors, you can queue up data to be consumed by a service worker in order to process your long-running tasks asynchronously.

This is currently used to send notifications on the Meta sites (a.k.a. Vinylhub, Bibliogs, Filmogs, Gearogs, Comicogs and the Reference Wiki).

#### Dependencies
* mongodb 2.6
* pymongo 2.7.2
* python 2.7

### Install

```
$ pip install pymjq
```

### Examples

```
>>> from pymongo import MongoClient
>>> from pymjq import JobQueue
>>> client = MongoClient("localhost", 27017)
>>> db = client.job_queue
>>> jobqueue = JobQueue(db)
Creating jobqueue collection.
>>> jobqueue.valid()
True
>>> jobqueue.pub({"message": "hello world!"}) # add a job to queue
True
>>> for j in jobqueue:
...     print (j)
...     print (j["data"]["message"])
...
---
Working on job:
{u'status': u'waiting', u'_id': ObjectId('568d963d2c69a1e3ef34da84'),
u'data': {u'message': u'hello world!'}...
hello world!
waiting!
waiting!
waiting!
...
^C Keyboard Interrupt
>>> jobqueue.pub({"message": "hello again!"}) # add another job to queue
True
>>> j = jobqueue.next()
>>> print (j["data"]["message"])
hello again!
>>> print (j)
{u'status': u'waiting', u'_id': ObjectId('568d963d2c69a1e3ef34da84'),
u'data': {u'message': u'hello again!'}...
>>>

```

### How It Works

* [Capped collections](http://docs.mongodb.org/manual/core/capped-collections/) ensure that documents are accessed in the natural order they are inserted into the collection.
* [Tailable cursors](http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/) give us a cursor which will stay open and wait for new documents to process if the job queue is empty.
* The `JobQueue` class has an iterator that yields a document from our queue. The iterator will update a doc's status to 'working' and then 'done' once the worker has completed its task.

#### Jobs

A job document, when added to the queue, has the following structure:

```python

{
    'ts': {
        'created': datetime,
        'started': datetime,
        'done': datetime
    },
    'status': 'string',
    'data': 'Your job data goes here! Define whatever structure you want.'
}

```
In the `data` field, the `JobQueue.pub()` method will add whatever data you pass as a parameter. The `ts` attributes will be updated as the document is worked on.

## Contributing

Want to hack on this? Check out the "Submitting a Change" section in the [CONTRIBUTING](https://github.com/discogs/pymongo-job-queue/blob/master/CONTRIBUTING.md) doc.
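
### Job Lifecycle Sketch

The status transitions described under "How It Works" can be sketched without a database. What follows is a minimal in-memory illustration, not the library's implementation: `InMemoryQueue`, `make_job` and `work` are hypothetical names, a plain list stands in for the capped collection, and `started`/`done` start out as `None` purely for readability.

```python
from datetime import datetime


def make_job(data):
    """Build a job document shaped like the structure shown under "Jobs"."""
    return {
        'ts': {'created': datetime.now(), 'started': None, 'done': None},
        'status': 'waiting',
        'data': data,
    }


class InMemoryQueue(object):
    """Hypothetical stand-in for the capped collection; not part of pymjq."""

    def __init__(self):
        # A list keeps insertion order, like a capped collection's natural order.
        self.docs = []

    def pub(self, data):
        """Append a new job in the 'waiting' state."""
        self.docs.append(make_job(data))
        return True

    def work(self):
        """Yield each waiting job once, marking it 'working' and then 'done'."""
        for doc in self.docs:
            if doc['status'] != 'waiting':
                continue
            doc['status'] = 'working'
            doc['ts']['started'] = datetime.now()
            yield doc
            # Execution resumes here when the consumer asks for the next job.
            doc['status'] = 'done'
            doc['ts']['done'] = datetime.now()
```

Iterating over `q.work()` hands out jobs in insertion order. A job is only marked `done` when the consumer requests the next one, which mirrors how a generator-based iterator resumes after its `yield`.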

### License

[MIT](https://github.com/discogs/pymongo-job-queue/blob/master/LICENSE) Copyright (c) 2016 Discogs

--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
pymongo-job-queue |travis build| |Codecov| |version|
====================================================

This package (``pymjq``) is a simple MongoDB-based job queue for Python.
By using capped collections and tailable cursors, you can queue up data
to be consumed by a service worker in order to process your long-running
tasks asynchronously.

This is currently used to send notifications on the Meta sites (a.k.a.
Vinylhub, Bibliogs, Filmogs, Gearogs, Comicogs and the Reference Wiki).

Dependencies
^^^^^^^^^^^^

-  pymongo 2.7.2

Install
~~~~~~~

::

    $ pip install pymjq

Examples
~~~~~~~~

::

    >>> from pymongo import MongoClient
    >>> from pymjq import JobQueue
    >>> client = MongoClient("localhost", 27017)
    >>> db = client.job_queue
    >>> jobqueue = JobQueue(db)
    Creating jobqueue collection.
    >>> jobqueue.valid()
    True
    >>> jobqueue.pub({"message": "hello world!"}) # add a job to queue
    True
    >>> for j in jobqueue:
    ...     print (j)
    ...     print (j["data"]["message"])
    ...
    ---
    Working on job:
    {u'status': u'waiting', u'_id': ObjectId('568d963d2c69a1e3ef34da84'),
    u'data': {u'message': u'hello world!'}...
    hello world!
    waiting!
    waiting!
    waiting!
    waiting!
    waiting!
    waiting!
    ...
    ^C Keyboard Interrupt
    >>> jobqueue.pub({"message": "hello again!"}) # add another job to queue
    True
    >>> j = jobqueue.next()
    >>> print (j["data"]["message"])
    hello again!
    >>> print (j)
    {u'status': u'waiting', u'_id': ObjectId('568d963d2c69a1e3ef34da84'),
    u'data': {u'message': u'hello again!'}...
    >>>

How It Works
~~~~~~~~~~~~

-  `Capped
   collections <http://docs.mongodb.org/manual/core/capped-collections/>`__
   ensure that documents are accessed in the natural order they are
   inserted into the collection.
-  `Tailable
   cursors <http://docs.mongodb.org/manual/tutorial/create-tailable-cursor/>`__
   give us a cursor which will stay open and wait for new documents to
   process if the job queue is empty.
-  The **JobQueue** class has an iterator that yields a document from
   our queue. The iterator will update a doc's status to 'working' and
   then 'done' once the worker has completed its task.

Jobs
^^^^

A job document, when added to the queue, has the following structure:

.. code:: python


    {
        'ts': {
            'created': datetime,
            'started': datetime,
            'done': datetime
        },
        'status': 'string',
        'site': 'string',
        'data': 'Your job data goes here! Define whatever structure you want.'
    }

In the ``data`` field, the ``JobQueue.pub()`` method will add whatever
data you pass as a parameter. The ``ts`` attributes will be updated as
the document is worked on.

License
~~~~~~~

MIT Copyright (c) 2016 Discogs

.. |travis build| image:: https://img.shields.io/travis/discogs/pymongo-job-queue.svg
   :target: https://travis-ci.org/discogs/pymongo-job-queue
.. |Codecov| image:: https://img.shields.io/codecov/c/github/discogs/pymongo-job-queue.svg
   :target: https://codecov.io/github/discogs/pymongo-job-queue
..
|version| image:: https://img.shields.io/pypi/v/pymjq.svg
   :target: https://pypi.python.org/pypi/pymjq

--------------------------------------------------------------------------------
/pymjq/__init__.py:
--------------------------------------------------------------------------------
"""MongoDB based message queue using pymongo"""

__version__ = '1.1.0'
__url__ = 'https://github.com/discogs/pymongo-job-queue'
__author__ = 'Andy Craze'
__email__ = 'acraze@discogsinc.com'

# Explicit relative import, so the package does not depend on implicit
# relative imports.
from .jobqueue import JobQueue

--------------------------------------------------------------------------------
/pymjq/jobqueue.py:
--------------------------------------------------------------------------------
import pymongo
from datetime import datetime
import time


class JobQueue:

    def __init__(self, db, silent=False):
        """ Return an instance of a JobQueue.
        Initialization requires one argument, the database,
        since we use one jobqueue collection to cover all
        sites in an installation/database. The second
        argument, silent, suppresses the status message
        printed while waiting for a new job when set to
        True; the default value is False. """
        self.db = db
        self.silent = silent
        if not self._exists():
            print ('Creating jobqueue collection.')
            self._create()
        self.q = self.db['jobqueue']

    def _create(self, capped=True):
        """ Creates a Capped Collection. """
        # `size` is the maximum size of the collection in bytes;
        # `max` is the maximum number of documents it may hold.
        try:
            self.db.create_collection('jobqueue',
                                      capped=capped, max=100000,
                                      size=100000, autoIndexId=True)
        except Exception:
            raise Exception('Collection "jobqueue" already created')

    def _exists(self):
        """ Ensures that the jobqueue collection exists in the DB. """
        return 'jobqueue' in self.db.collection_names()

    def valid(self):
        """ Checks to see if the jobqueue is a capped collection.
""" 38 | opts = self.db['jobqueue'].options() 39 | if opts.get('capped', False): 40 | return True 41 | return False 42 | 43 | def next(self): 44 | """ Runs the next job in the queue. """ 45 | cursor = self.q.find({'status': 'waiting'}, 46 | tailable=True) 47 | if cursor: 48 | row = cursor.next() 49 | row['status'] = 'done' 50 | row['ts']['started'] = datetime.now() 51 | row['ts']['done'] = datetime.now() 52 | self.q.save(row) 53 | try: 54 | return row 55 | except: 56 | raise Exception('There are no jobs in the queue') 57 | 58 | def pub(self, data=None): 59 | """ Publishes a doc to the work queue. """ 60 | doc = dict( 61 | ts={'created': datetime.now(), 62 | 'started': datetime.now(), 63 | 'done': datetime.now()}, 64 | status='waiting', 65 | data=data) 66 | try: 67 | self.q.insert(doc, manipulate=False) 68 | except: 69 | raise Exception('could not add to queue') 70 | return True 71 | 72 | def __iter__(self): 73 | """ Iterates through all docs in the queue 74 | andw aits for new jobs when queue is empty. """ 75 | cursor = self.q.find({'status': 'waiting'}, tailable=True) 76 | while 1: 77 | try: 78 | row = cursor.next() 79 | try: 80 | result = self.q.update({'_id': row['_id'], 81 | 'status': 'waiting'}, 82 | {'$set': { 83 | 'status': 'working', 84 | 'ts.started': datetime.now() 85 | } 86 | }) 87 | except OperationFailure: 88 | print ('Job Failed!!') 89 | continue 90 | print ('---') 91 | print ('Working on job:') 92 | yield row 93 | row['status'] = 'done' 94 | row['ts']['done'] = datetime.now() 95 | self.q.save(row) 96 | except: 97 | time.sleep(5) 98 | if not self.silent: 99 | print ('waiting!') 100 | 101 | def queue_count(self): 102 | """ Returns the number of jobs waiting in the queue. """ 103 | cursor = self.q.find({'status': 'waiting'}) 104 | if cursor: 105 | return cursor.count() 106 | 107 | def clear_queue(self): 108 | """ Drops the queue collection. 
""" 109 | self.q.drop() 110 | -------------------------------------------------------------------------------- /pymjq/test.py: -------------------------------------------------------------------------------- 1 | from pymongo import MongoClient 2 | from jobqueue import JobQueue 3 | import unittest 4 | 5 | host = 'localhost' 6 | port = 27017 7 | pair = '%s:%d' % (host, port) 8 | 9 | 10 | class TestJobQueue(unittest.TestCase): 11 | 12 | @classmethod 13 | def setUpClass(cls): 14 | client = MongoClient(host, port) 15 | client.pymongo_test.jobqueue.drop() 16 | cls.db = client.pymongo_test 17 | 18 | def test_init(self): 19 | jq = JobQueue(self.db) 20 | self.assertTrue(jq.valid()) 21 | self.assertRaises(Exception, jq._create) 22 | jq.clear_queue() 23 | 24 | def test_valid(self): 25 | jq = JobQueue(self.db) 26 | jq.db['jobqueue'].drop() 27 | jq._create(capped=False) 28 | self.assertFalse(jq.valid()) 29 | self.assertRaises(Exception, jq._create) 30 | jq.clear_queue() 31 | 32 | def test_publish(self): 33 | jq = JobQueue(self.db) 34 | job = {'message': 'hello world!'} 35 | jq.pub(job) 36 | self.assertEquals(jq.queue_count(), 1) 37 | jq.clear_queue() 38 | jq.q = None # erase the queue 39 | self.assertRaises(Exception, jq.pub, job) 40 | 41 | def test_next(self): 42 | jq = JobQueue(self.db) 43 | self.assertRaises(Exception, jq.next) 44 | job = {'message': 'hello world!'} 45 | jq.pub(job) 46 | row = jq.next() 47 | self.assertEquals(row['data']['message'], 'hello world!') 48 | jq.clear_queue() 49 | 50 | # def test_iter(self): 51 | # jq = JobQueue(self.db) 52 | # job = {'message': 'hello world!'} 53 | # jq.pub(job) 54 | # for job in jq: 55 | # if job: 56 | # self.assertTrue(True, "Found job") 57 | # jq.clear_queue() 58 | # return 59 | # self.assertEquals(False, "No jobs found!") 60 | # jq.clear_queue() 61 | 62 | 63 | if __name__ == '__main__': 64 | unittest.main() 65 | -------------------------------------------------------------------------------- /requirements.txt: 
--------------------------------------------------------------------------------
pymongo==2.7.2
coverage
codecov

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
from distutils.core import setup
from codecs import open  # To use a consistent encoding
from os import path

# here = path.abspath(path.dirname(__file__))

# # Get the long description from the relevant file
# with open(path.join(here, 'README.rst'), encoding='utf-8') as f:
#     long_description = f.read()

setup(
    name = 'pymjq',
    packages = ['pymjq'],  # this must be the same as the name above
    version = '1.2.0',
    description = 'Simple MongoDB based job queue',
    # long_description=long_description,
    license = 'MIT',
    author = 'Andy Craze',
    author_email = 'acraze@discogsinc.com',
    url = 'https://github.com/discogs/pymongo-job-queue',  # use the URL to the github repo
    download_url = 'https://github.com/discogs/pymongo-job-queue/tarball/1.2.0',
    keywords = ['queue', 'pymongo', 'mongodb', 'job', 'async', 'worker', 'tail'],  # arbitrary keywords
    classifiers = [
        # Indicate who the project is intended for
        'Intended Audience :: Developers',
        'Topic :: Software Development :: Libraries :: Python Modules',

        # Pick your license as you wish (should match "license" above)
        'License :: OSI Approved :: MIT License',

        # Specify the Python versions you support here. In particular, ensure
        # that you indicate whether you support Python 2, Python 3 or both.
        'Programming Language :: Python :: 2.7'
    ]
)

--------------------------------------------------------------------------------