├── .gitignore ├── .travis.yml ├── HISTORY.md ├── LICENSE ├── MANIFEST ├── README.rst ├── docs ├── Makefile ├── api.rst ├── conf.py ├── examples.rst └── index.rst ├── redset ├── __init__.py ├── exceptions.py ├── interfaces.py ├── locks.py ├── serializers.py └── sets.py ├── setup.py └── tests ├── __init__.py ├── test_concurrency.py ├── test_serializers.py └── test_sets.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | docs/_build 3 | 4 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: python 2 | python: 3 | - "2.7" 4 | - "3.4" 5 | # requires redis-server for multiprocess tests 6 | services: 7 | - redis-server 8 | # command to install dependencies 9 | install: 10 | - "pip install redis coveralls coverage nose-cov --use-mirrors" 11 | - "pip install ." 12 | # command to run tests 13 | script: 14 | - nosetests --with-cov --cov redset tests 15 | # push coverage info to coveralls.io 16 | after_success: 17 | - coveralls 18 | 19 | -------------------------------------------------------------------------------- /HISTORY.md: -------------------------------------------------------------------------------- 1 | # History 2 | 3 | ## 0.5.1 4 | 5 | - Add `available` method on ScheduledSet 6 | 7 | ## 0.5 8 | 9 | - Make compatible with Python 3 10 | 11 | ## 0.4.1 12 | 13 | - Introduce `position` kwarg to peek() (by @ynsnyc) 14 | 15 | ## 0.4 16 | 17 | - Reintroduce KeyError on empty pop() 18 | 19 | ## 0.3.3 20 | 21 | - ScheduledSet respects limit 22 | - Make removal compatible with versions of redis <2.4 23 | 24 | ## 0.3.2 25 | 26 | - Add `redset.ScheduledSet` for easy scheduled task processing 27 | 28 | ## 0.3.1 29 | 30 | - Use `redis.Redis.pipeline` for doing atomic set operations, batching multiple 31 | pops. 
32 | 33 | ## 0.3 34 | 35 | - Add builtin serializer `NamedtupleSerializer` 36 | - Improve lock re: redis' spotty timestamp precision 37 | - Increase test coverage 38 | - Add coveralls to travis-CI 39 | 40 | ## 0.2.3 41 | 42 | - Documentation updates 43 | 44 | ## 0.2.2 45 | 46 | - Use `setuptools` now that distribute has been merged back into it 47 | 48 | ## 0.2.1 49 | 50 | - Change serializer interface to match `json` (dump, load `->` dumps, loads) 51 | 52 | ## 0.2.0 53 | 54 | - Spiked documentation with sphinx. 55 | - Converted docstrings to ReST. 56 | 57 | ## 0.1.2 58 | 59 | - Removed use of `mockredispy` because of its inconsistency with how redis 60 | actually behaves. 61 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Copyright (c) 2012, Percolate 2 | All rights reserved. 3 | 4 | Redistribution and use in source and binary forms, with or without 5 | modification, are permitted provided that the following conditions are met: 6 | 7 | 1. Redistributions of source code must retain the above copyright notice, this 8 | list of conditions and the following disclaimer. 9 | 2. Redistributions in binary form must reproduce the above copyright notice, 10 | this list of conditions and the following disclaimer in the documentation 11 | and/or other materials provided with the distribution. 12 | 13 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND 14 | ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 15 | WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 16 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR 17 | ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES 18 | (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 19 | LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND 20 | ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT 21 | (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS 22 | SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 23 | 24 | The views and conclusions contained in the software and documentation are those 25 | of the authors and should not be interpreted as representing official policies, 26 | either expressed or implied, of the FreeBSD Project. 27 | -------------------------------------------------------------------------------- /MANIFEST: -------------------------------------------------------------------------------- 1 | # file GENERATED by distutils, do NOT edit 2 | setup.py 3 | redset/__init__.py 4 | redset/exceptions.py 5 | redset/interfaces.py 6 | redset/locks.py 7 | redset/sets.py 8 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | redset 2 | ====== 3 | 4 | |PyPI version| |build status| |Coverage Status| 5 | 6 | You may not need heavyweights like Celery or RQ. Maintaining an AMQP server 7 | might be overkill. There's a simpler, easier way to distribute work. 8 | 9 | Redset provides simple, generic sorted sets backed by Redis that can be used to 10 | coordinate distributed systems and parcel out work. Unlike more common 11 | distribution libraries like Celery or RQ, redset avoids duplicate work for 12 | certain use-cases by maintaining a set of tasks instead of a list or queue. 13 | And it does so with a dead-simple interface that feels natural for Python. 
14 | 15 | Redset is currently used in the wild to do things like 16 | 17 | - maintain a high-throughput work queue of streaming updates to be processed 18 | - power a multi-producer, multi-consumer scraping architecture that won't do 19 | the same work twice 20 | - maintain a simple, cross-process set of "seen" items that each have a 21 | TTL 22 | - schedule non-duplicate, periodic polling of analytics on social services 23 | 24 | 25 | Features 26 | -------- 27 | 28 | - No worker daemons to run, no heavy AMQP service to monitor 29 | - Safe for multiple producers and consumers 30 | - Seamless, simple use with Python objects using serializers 31 | - Zero dependencies: you provide an object that implements the 32 | ``redis.client.Redis`` interface, we don't ask any questions. 33 | - Simple, easy-to-read implementation 34 | - Mimics Python's native ``set`` interface 35 | - Battle-tested 36 | - Python 3 compatible 37 | 38 | Simple example 39 | -------------- 40 | 41 | .. code:: python 42 | 43 | import json 44 | import redis 45 | 46 | from redset import TimeSortedSet 47 | 48 | r = redis.Redis() 49 | ss = TimeSortedSet(r, 'important_json_biz', serializer=json) 50 | 51 | ss.add({'foo': 'bar1'}) 52 | ss.add({'foo': 'bar2'}) 53 | 54 | ss.add({'foo': 'bar3'}) 55 | ss.add({'foo': 'bar3'}) 56 | 57 | len(ss) 58 | # 3 59 | 60 | 61 | # ...some other process A 62 | 63 | ss.peek() 64 | # {'foo': 'bar1'} 65 | 66 | ss.pop() 67 | # {'foo': 'bar1'} 68 | 69 | 70 | # ...meanwhile in process B (at exactly the same time as A's pop) 71 | 72 | ss.take(2) 73 | # [{'foo': 'bar2'}, {'foo': 'bar3'}] 74 | 75 | Docs 76 | ---- 77 | 78 | `Here `__ 79 | 80 | About 81 | ----- 82 | 83 | This software was developed at `Percolate `__, 84 | where we use it for all sorts of things that involve maintaining 85 | synchronized sets of things across process boundaries. A common use-case 86 | is to use redset for coordinating time-sensitive tasks where duplicate 87 | requests may be generated. 
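A Redis sorted set deduplicates by the exact member string, so the deduplication described above only works if equal items always serialize to the same bytes. The "Simple example" passes the ``json`` module itself as the serializer; a custom serializer is just any object with the same ``dumps``/``loads`` pair. A minimal sketch of a deterministic one (``DictSerializer`` is a hypothetical name, not part of redset):

```python
import json


class DictSerializer(object):
    """Hypothetical serializer with the json-style dumps/loads pair
    that redset expects."""

    def dumps(self, obj):
        # sort_keys makes serialization deterministic, so equal dicts
        # always map to the same sorted-set member in redis
        return json.dumps(obj, sort_keys=True)

    def loads(self, s):
        return json.loads(s)


serializer = DictSerializer()
packed = serializer.dumps({'b': 2, 'a': 1})

# key order in the source dict doesn't change the stored string
assert packed == serializer.dumps({'a': 1, 'b': 2})
assert serializer.loads(packed) == {'a': 1, 'b': 2}
```

Without ``sort_keys=True``, two dicts with the same contents could serialize to different strings and would then be stored as two separate members, defeating the set semantics above.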
88 | 89 | Redset is unopinionated about how consumers look or behave. Want to have 90 | a plain ol' Python consumer managed by supervisor? Fine. Want to be able 91 | to pop off items from within a celery job? Great. Redset has no say in 92 | where or how it is used: mechanism, not policy. 93 | 94 | Usage concepts 95 | -------------- 96 | 97 | ``redset.SortedSet`` and its subclasses can be instantiated with a few 98 | notable parameters. 99 | 100 | Specifying a serializer 101 | ~~~~~~~~~~~~~~~~~~~~~~~ 102 | 103 | Since Redis only stores primitive numbers and strings, handling 104 | serialization and deserialization is a key part of making redset 105 | usage simple in Python. 106 | 107 | A ``serializer`` instance (which adheres to the 108 | ``redset.interfaces.Serializer`` interface, though it need not subclass 109 | it) can be passed to automatically handle packing and unpacking items 110 | managed with redset. 111 | 112 | Specifying a scorer 113 | ~~~~~~~~~~~~~~~~~~~ 114 | 115 | A callable that specifies how to generate a score for items being added 116 | can also be passed to SortedSet's constructor as ``scorer``. This 117 | callable takes one argument, which is the item *object* (i.e. the item 118 | before serialization) to be "scored." 119 | 120 | Related projects 121 | ---------------- 122 | 123 | - `redis-py `__ 124 | - `celery `__ 125 | - `RQ `__ 126 | 127 | .. |PyPI version| image:: https://badge.fury.io/py/redset.png 128 | :target: http://badge.fury.io/py/redset 129 | .. |build status| image:: https://travis-ci.org/percolate/redset.png?branch=master 130 | :target: https://travis-ci.org/percolate/redset 131 | .. 
|Coverage Status| image:: https://coveralls.io/repos/percolate/redset/badge.png?branch=master 132 | :target: https://coveralls.io/r/percolate/redset?branch=master 133 | -------------------------------------------------------------------------------- /docs/Makefile: -------------------------------------------------------------------------------- 1 | # Makefile for Sphinx documentation 2 | # 3 | 4 | # You can set these variables from the command line. 5 | SPHINXOPTS = 6 | SPHINXBUILD = sphinx-build 7 | PAPER = 8 | BUILDDIR = _build 9 | 10 | # Internal variables. 11 | PAPEROPT_a4 = -D latex_paper_size=a4 12 | PAPEROPT_letter = -D latex_paper_size=letter 13 | ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . 14 | # the i18n builder cannot share the environment and doctrees with the others 15 | I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) . 16 | 17 | .PHONY: help clean html dirhtml singlehtml pickle json htmlhelp qthelp devhelp epub latex latexpdf text man changes linkcheck doctest gettext 18 | 19 | help: 20 | @echo "Please use \`make <target>' where <target> is one of" 21 | @echo " html to make standalone HTML files" 22 | @echo " dirhtml to make HTML files named index.html in directories" 23 | @echo " singlehtml to make a single large HTML file" 24 | @echo " pickle to make pickle files" 25 | @echo " json to make JSON files" 26 | @echo " htmlhelp to make HTML files and a HTML help project" 27 | @echo " qthelp to make HTML files and a qthelp project" 28 | @echo " devhelp to make HTML files and a Devhelp project" 29 | @echo " epub to make an epub" 30 | @echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter" 31 | @echo " latexpdf to make LaTeX files and run them through pdflatex" 32 | @echo " text to make text files" 33 | @echo " man to make manual pages" 34 | @echo " texinfo to make Texinfo files" 35 | @echo " info to make Texinfo files and run them through makeinfo" 36 | @echo " gettext to make PO message catalogs" 37 | @echo " changes 
to make an overview of all changed/added/deprecated items" 38 | @echo " linkcheck to check all external links for integrity" 39 | @echo " doctest to run all doctests embedded in the documentation (if enabled)" 40 | 41 | clean: 42 | -rm -rf $(BUILDDIR)/* 43 | 44 | html: 45 | $(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html 46 | @echo 47 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/html." 48 | 49 | dirhtml: 50 | $(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml 51 | @echo 52 | @echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml." 53 | 54 | singlehtml: 55 | $(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml 56 | @echo 57 | @echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml." 58 | 59 | pickle: 60 | $(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle 61 | @echo 62 | @echo "Build finished; now you can process the pickle files." 63 | 64 | json: 65 | $(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json 66 | @echo 67 | @echo "Build finished; now you can process the JSON files." 68 | 69 | htmlhelp: 70 | $(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp 71 | @echo 72 | @echo "Build finished; now you can run HTML Help Workshop with the" \ 73 | ".hhp project file in $(BUILDDIR)/htmlhelp." 74 | 75 | qthelp: 76 | $(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp 77 | @echo 78 | @echo "Build finished; now you can run "qcollectiongenerator" with the" \ 79 | ".qhcp project file in $(BUILDDIR)/qthelp, like this:" 80 | @echo "# qcollectiongenerator $(BUILDDIR)/qthelp/redset.qhcp" 81 | @echo "To view the help file:" 82 | @echo "# assistant -collectionFile $(BUILDDIR)/qthelp/redset.qhc" 83 | 84 | devhelp: 85 | $(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp 86 | @echo 87 | @echo "Build finished." 
88 | @echo "To view the help file:" 89 | @echo "# mkdir -p $$HOME/.local/share/devhelp/redset" 90 | @echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/redset" 91 | @echo "# devhelp" 92 | 93 | epub: 94 | $(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub 95 | @echo 96 | @echo "Build finished. The epub file is in $(BUILDDIR)/epub." 97 | 98 | latex: 99 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 100 | @echo 101 | @echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex." 102 | @echo "Run \`make' in that directory to run these through (pdf)latex" \ 103 | "(use \`make latexpdf' here to do that automatically)." 104 | 105 | latexpdf: 106 | $(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex 107 | @echo "Running LaTeX files through pdflatex..." 108 | $(MAKE) -C $(BUILDDIR)/latex all-pdf 109 | @echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex." 110 | 111 | text: 112 | $(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text 113 | @echo 114 | @echo "Build finished. The text files are in $(BUILDDIR)/text." 115 | 116 | man: 117 | $(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man 118 | @echo 119 | @echo "Build finished. The manual pages are in $(BUILDDIR)/man." 120 | 121 | texinfo: 122 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 123 | @echo 124 | @echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo." 125 | @echo "Run \`make' in that directory to run these through makeinfo" \ 126 | "(use \`make info' here to do that automatically)." 127 | 128 | info: 129 | $(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo 130 | @echo "Running Texinfo files through makeinfo..." 131 | make -C $(BUILDDIR)/texinfo info 132 | @echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo." 133 | 134 | gettext: 135 | $(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale 136 | @echo 137 | @echo "Build finished. The message catalogs are in $(BUILDDIR)/locale." 
138 | 139 | changes: 140 | $(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes 141 | @echo 142 | @echo "The overview file is in $(BUILDDIR)/changes." 143 | 144 | linkcheck: 145 | $(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck 146 | @echo 147 | @echo "Link check complete; look for any errors in the above output " \ 148 | "or in $(BUILDDIR)/linkcheck/output.txt." 149 | 150 | doctest: 151 | $(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest 152 | @echo "Testing of doctests in the sources finished, look at the " \ 153 | "results in $(BUILDDIR)/doctest/output.txt." 154 | -------------------------------------------------------------------------------- /docs/api.rst: -------------------------------------------------------------------------------- 1 | 2 | .. _api: 3 | 4 | API 5 | === 6 | 7 | .. module:: redset 8 | 9 | Introduction 10 | ------------ 11 | 12 | Redset offers a narrow interface consisting of a few objects. Most often, 13 | you'll be using an object that resembles a set. 14 | 15 | There are two interesting components to the sets in redset: *scorers* and 16 | *serializers*. 17 | 18 | Scorers determine what score the inserted item will be assigned (lower means 19 | popped sooner). Serializers determine what transformations happen on an item 20 | going into and coming out of redis. 21 | 22 | .. note:: 23 | If an exception is raised while deserialization is attempted on a 24 | particular item, ``None`` will be returned in its stead 25 | and the exception will be logged. In the case of :func:`SortedSet.take`, 26 | the failed item will simply be filtered from the returned list. 27 | 28 | 29 | Sets 30 | ---- 31 | 32 | .. autoclass:: SortedSet 33 | :members: 34 | 35 | .. automethod:: __init__ 36 | .. automethod:: __len__ 37 | .. automethod:: __contains__ 38 | 39 | 40 | Specialized sets 41 | ---------------- 42 | 43 | The only builtin concrete subclasses of :class:`SortedSet <redset.SortedSet>` are 44 | sorted sets relating to time. 
One class maintains order based on time 45 | (:class:`TimeSortedSet <redset.TimeSortedSet>`) and the other does the same, but 46 | won't return items until their score is less than or equal to now 47 | (:class:`ScheduledSet <redset.ScheduledSet>`). 48 | 49 | 50 | .. autoclass:: TimeSortedSet 51 | :show-inheritance: 52 | :members: 53 | 54 | .. automethod:: __init__ 55 | 56 | 57 | :class:`ScheduledSet` allows you to schedule items to be processed strictly 58 | in the future, which makes it easy to implement backoffs for expensive 59 | tasks that can't be repeated continuously. 60 | 61 | 62 | .. autoclass:: ScheduledSet 63 | :show-inheritance: 64 | :members: 65 | 66 | .. automethod:: __init__ 67 | 68 | 69 | 70 | Interfaces 71 | ---------- 72 | 73 | The :class:`Serializer` interface is included as a guideline for end-users. 74 | It need not be subclassed for concrete serializers. 75 | 76 | .. autoclass:: redset.interfaces.Serializer 77 | :members: 78 | 79 | 80 | .. module:: redset.serializers 81 | 82 | 83 | Builtin serializers 84 | ------------------- 85 | 86 | One serializer is included for convenience, and that's 87 | :class:`redset.serializers.NamedtupleSerializer`, which allows seamless use of 88 | namedtuples. 89 | 90 | .. autoclass:: redset.serializers.NamedtupleSerializer 91 | :members: 92 | 93 | .. automethod:: __init__ 94 | .. automethod:: dumps 95 | .. automethod:: loads 96 | 97 | 98 | 99 | -------------------------------------------------------------------------------- /docs/conf.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | # 3 | # redset documentation build configuration file, created by 4 | # sphinx-quickstart on Thu Sep 19 21:34:43 2013. 5 | # 6 | # This file is execfile()d with the current directory set to its containing 7 | # dir. 8 | # 9 | # Note that not all possible configuration values are present in this 10 | # autogenerated file. 
11 | # 12 | # All configuration values have a default; values that are commented out 13 | # serve to show the default. 14 | 15 | import sys 16 | import os 17 | 18 | sys.path.append(os.path.join(os.path.dirname(__file__), '..')) 19 | 20 | import redset 21 | 22 | # If extensions (or modules to document with autodoc) are in another directory, 23 | # add these directories to sys.path here. If the directory is relative to the 24 | # documentation root, use os.path.abspath to make it absolute, like shown here. 25 | #sys.path.insert(0, os.path.abspath('.')) 26 | 27 | # -- General configuration ---------------------------------------------------- 28 | 29 | # If your documentation needs a minimal Sphinx version, state it here. 30 | #needs_sphinx = '1.0' 31 | 32 | # Add any Sphinx extension module names here, as strings. They can be 33 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom ones. 34 | extensions = ['sphinx.ext.autodoc', 35 | 'sphinx.ext.doctest', 36 | 'sphinx.ext.intersphinx', 37 | 'sphinx.ext.todo', 38 | 'sphinx.ext.coverage', 39 | 'sphinx.ext.ifconfig', 40 | 'sphinx.ext.viewcode'] 41 | 42 | # Add any paths that contain templates here, relative to this directory. 43 | templates_path = ['_templates'] 44 | 45 | # The suffix of source filenames. 46 | source_suffix = '.rst' 47 | 48 | # The encoding of source files. 49 | #source_encoding = 'utf-8-sig' 50 | 51 | # The master toctree document. 52 | master_doc = 'index' 53 | 54 | # General information about the project. 55 | project = u'redset' 56 | copyright = u'2013, jamesob, thekantian' 57 | 58 | # The version info for the project you're documenting, acts as replacement for 59 | # |version| and |release|, also used in various other places throughout the 60 | # built documents. 61 | # 62 | # The short X.Y version. 63 | version = redset.__version__ 64 | # The full version, including alpha/beta/rc tags. 65 | release = redset.__version__ 66 | 67 | # The language for content autogenerated by Sphinx. 
Refer to documentation 68 | # for a list of supported languages. 69 | #language = None 70 | 71 | # There are two options for replacing |today|: either, you set today to some 72 | # non-false value, then it is used: 73 | #today = '' 74 | # Else, today_fmt is used as the format for a strftime call. 75 | #today_fmt = '%B %d, %Y' 76 | 77 | # List of patterns, relative to source directory, that match files and 78 | # directories to ignore when looking for source files. 79 | exclude_patterns = ['_build'] 80 | 81 | # The reST default role (used for this markup: `text`) to use for all 82 | # documents. 83 | #default_role = None 84 | 85 | # If true, '()' will be appended to :func: etc. cross-reference text. 86 | #add_function_parentheses = True 87 | 88 | # If true, the current module name will be prepended to all description 89 | # unit titles (such as .. function::). 90 | #add_module_names = True 91 | 92 | # If true, sectionauthor and moduleauthor directives will be shown in the 93 | # output. They are ignored by default. 94 | #show_authors = False 95 | 96 | # The name of the Pygments (syntax highlighting) style to use. 97 | pygments_style = 'sphinx' 98 | 99 | # A list of ignored prefixes for module index sorting. 100 | #modindex_common_prefix = [] 101 | 102 | 103 | # -- Options for HTML output -------------------------------------------------- 104 | 105 | # The theme to use for HTML and HTML Help pages. See the documentation for 106 | # a list of builtin themes. 107 | html_theme = 'default' 108 | 109 | # Theme options are theme-specific and customize the look and feel of a theme 110 | # further. For a list of options available for each theme, see the 111 | # documentation. 112 | #html_theme_options = {} 113 | 114 | # Add any paths that contain custom themes here, relative to this directory. 115 | #html_theme_path = [] 116 | 117 | # The name for this set of Sphinx documents. If None, it defaults to 118 | # " v documentation". 
119 | #html_title = None 120 | 121 | # A shorter title for the navigation bar. Default is the same as html_title. 122 | #html_short_title = None 123 | 124 | # The name of an image file (relative to this directory) to place at the top 125 | # of the sidebar. 126 | #html_logo = None 127 | 128 | # The name of an image file (within the static path) to use as favicon of the 129 | # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 130 | # pixels large. 131 | #html_favicon = None 132 | 133 | # Add any paths that contain custom static files (such as style sheets) here, 134 | # relative to this directory. They are copied after the builtin static files, 135 | # so a file named "default.css" will overwrite the builtin "default.css". 136 | html_static_path = ['_static'] 137 | 138 | # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, 139 | # using the given strftime format. 140 | #html_last_updated_fmt = '%b %d, %Y' 141 | 142 | # If true, SmartyPants will be used to convert quotes and dashes to 143 | # typographically correct entities. 144 | #html_use_smartypants = True 145 | 146 | # Custom sidebar templates, maps document names to template names. 147 | #html_sidebars = {} 148 | 149 | # Additional templates that should be rendered to pages, maps page names to 150 | # template names. 151 | #html_additional_pages = {} 152 | 153 | # If false, no module index is generated. 154 | #html_domain_indices = True 155 | 156 | # If false, no index is generated. 157 | #html_use_index = True 158 | 159 | # If true, the index is split into individual pages for each letter. 160 | #html_split_index = False 161 | 162 | # If true, links to the reST sources are added to the pages. 163 | #html_show_sourcelink = True 164 | 165 | # If true, "Created using Sphinx" is shown in the HTML footer. Default is True. 166 | #html_show_sphinx = True 167 | 168 | # If true, "(C) Copyright ..." is shown in the HTML footer. Default is True. 
169 | #html_show_copyright = True 170 | 171 | # If true, an OpenSearch description file will be output, and all pages will 172 | # contain a tag referring to it. The value of this option must be the 173 | # base URL from which the finished HTML is served. 174 | #html_use_opensearch = '' 175 | 176 | # This is the file name suffix for HTML files (e.g. ".xhtml"). 177 | #html_file_suffix = None 178 | 179 | # Output file base name for HTML help builder. 180 | htmlhelp_basename = 'redsetdoc' 181 | 182 | 183 | # -- Options for LaTeX output ------------------------------------------------- 184 | 185 | latex_elements = { 186 | # The paper size ('letterpaper' or 'a4paper'). 187 | #'papersize': 'letterpaper', 188 | 189 | # The font size ('10pt', '11pt' or '12pt'). 190 | #'pointsize': '10pt', 191 | 192 | # Additional stuff for the LaTeX preamble. 193 | #'preamble': '', 194 | } 195 | 196 | # Grouping the document tree into LaTeX files. List of tuples 197 | # (source start file, target name, title, author, 198 | # documentclass [howto/manual]). 199 | latex_documents = [('index', 200 | 'redset.tex', 201 | u'redset Documentation', 202 | u'jamesob, thekantian', 203 | 'manual'), ] 204 | 205 | # The name of an image file (relative to this directory) to place at the top of 206 | # the title page. 207 | #latex_logo = None 208 | 209 | # For "manual" documents, if this is true, then toplevel headings are parts, 210 | # not chapters. 211 | #latex_use_parts = False 212 | 213 | # If true, show page references after internal links. 214 | #latex_show_pagerefs = False 215 | 216 | # If true, show URL addresses after external links. 217 | #latex_show_urls = False 218 | 219 | # Documents to append as an appendix to all manuals. 220 | #latex_appendices = [] 221 | 222 | # If false, no module index is generated. 223 | #latex_domain_indices = True 224 | 225 | 226 | # -- Options for manual page output ------------------------------------------- 227 | 228 | # One entry per manual page. 
List of tuples 229 | # (source start file, name, description, authors, manual section). 230 | man_pages = [ 231 | ('index', 'redset', u'redset Documentation', 232 | [u'jamesob, thekantian'], 1) 233 | ] 234 | 235 | # If true, show URL addresses after external links. 236 | #man_show_urls = False 237 | 238 | 239 | # -- Options for Texinfo output ----------------------------------------------- 240 | 241 | # Grouping the document tree into Texinfo files. List of tuples 242 | # (source start file, target name, title, author, 243 | # dir menu entry, description, category) 244 | texinfo_documents = [ 245 | ('index', 'redset', u'redset Documentation', u'jamesob, thekantian', 246 | 'redset', 'One line description of project.', 'Miscellaneous'), 247 | ] 248 | 249 | # Documents to append as an appendix to all manuals. 250 | #texinfo_appendices = [] 251 | 252 | # If false, no module index is generated. 253 | #texinfo_domain_indices = True 254 | 255 | # How to display URL addresses: 'footnote', 'no', or 'inline'. 256 | #texinfo_show_urls = 'footnote' 257 | 258 | 259 | # -- Options for Epub output -------------------------------------------------- 260 | 261 | # Bibliographic Dublin Core info. 262 | epub_title = u'redset' 263 | epub_author = u'jamesob, thekantian' 264 | epub_publisher = u'jamesob, thekantian' 265 | epub_copyright = u'2013, jamesob, thekantian' 266 | 267 | # The language of the text. It defaults to the language option 268 | # or en if the language is not set. 269 | #epub_language = '' 270 | 271 | # The scheme of the identifier. Typical schemes are ISBN or URL. 272 | #epub_scheme = '' 273 | 274 | # The unique identifier of the text. This can be a ISBN number 275 | # or the project homepage. 276 | #epub_identifier = '' 277 | 278 | # A unique identification for the text. 279 | #epub_uid = '' 280 | 281 | # A tuple containing the cover image and cover page html template filenames. 
282 | #epub_cover = () 283 | 284 | # HTML files that should be inserted before the pages created by sphinx. 285 | # The format is a list of tuples containing the path and title. 286 | #epub_pre_files = [] 287 | 288 | # HTML files that should be inserted after the pages created by sphinx. 289 | # The format is a list of tuples containing the path and title. 290 | #epub_post_files = [] 291 | 292 | # A list of files that should not be packed into the epub file. 293 | #epub_exclude_files = [] 294 | 295 | # The depth of the table of contents in toc.ncx. 296 | #epub_tocdepth = 3 297 | 298 | # Allow duplicate toc entries. 299 | #epub_tocdup = True 300 | 301 | 302 | # Example configuration for intersphinx: refer to the Python standard library. 303 | intersphinx_mapping = {'http://docs.python.org/': None} 304 | -------------------------------------------------------------------------------- /docs/examples.rst: -------------------------------------------------------------------------------- 1 | 2 | .. _examples: 3 | 4 | Examples 5 | ======== 6 | 7 | Task system 8 | ------------- 9 | 10 | Here's an example that shows how to construct a very basic multi-producer, 11 | multi-consumer prioritized task system. 
12 | 13 | :: 14 | 15 | from time import sleep 16 | from collections import namedtuple 17 | import redset, redis 18 | from redset.serializers import NamedtupleSerializer 19 | 20 | Task = namedtuple('Task', 'foo,bar,priority') 21 | 22 | task_set = redset.SortedSet( 23 | redis.Redis(), 24 | 'tasks', 25 | scorer=lambda task: task.priority, 26 | serializer=NamedtupleSerializer(Task), 27 | ) 28 | 29 | Now we can produce from anywhere: 30 | 31 | :: 32 | 33 | task_set.add(Task('yo', 'baz', 1)) 34 | task_set.add(Task('hey', 'roar', 0)) 35 | 36 | And maybe have a daemon that consumes: 37 | 38 | :: 39 | 40 | def process_tasks(): 41 | while True: 42 | for task in task_set.take(10): 43 | do_work_on_task(task) 44 | sleep(1) 45 | 46 | -------------------------------------------------------------------------------- /docs/index.rst: -------------------------------------------------------------------------------- 1 | .. redset documentation master file, created by 2 | sphinx-quickstart on Thu Sep 19 21:34:43 2013. 3 | You can adapt this file completely to your liking, but it should at least 4 | contain the root `toctree` directive. 5 | 6 | Redset: Redis-backed sorted sets for Python 7 | =========================================== 8 | 9 | Introduction 10 | ------------ 11 | 12 | Redset offers simple objects that mimic Python's builtin set, but are backed by 13 | Redis and safe to use concurrently across process boundaries. Time-sorted 14 | sets come included and are particularly interesting for use when 15 | time-sensitive and duplicate tasks are being generated from multiple sources. 16 | 17 | Traditional queuing solutions in Python don't easily allow users to ensure that 18 | messages are unique; this is especially important when you're generating a lot 19 | of time-consuming tasks that may have overlap. Redset can help with this, 20 | among other things. 21 | 22 | Redset doesn't mandate the use of any particular consumer process or client. 
23 | Production and consumption can happen easily from any piece of Python, making 24 | usage flexible and lightweight. 25 | 26 | Quick example 27 | ------------- 28 | 29 | :: 30 | 31 | >>> import json, redis 32 | >>> from redset import TimeSortedSet 33 | >>> ss = TimeSortedSet(redis.Redis(), 'json_biz', serializer=json) 34 | 35 | >>> ss.add({'foo': 'bar1'}, score=123) 36 | 123 37 | 38 | >>> {'foo': 'bar1'} in ss 39 | True 40 | 41 | >>> ss.score({'foo': 'bar1'}) 42 | 123 43 | 44 | >>> ss.pop() 45 | {'foo': 'bar1'} 46 | 47 | 48 | Redset is designed to be simple and pleasant to use. It was 49 | written at `Percolate `_ where it's used for all sorts of 50 | things, especially storing sets of time-sensitive tasks. 51 | 52 | 53 | Contents 54 | -------- 55 | 56 | .. toctree:: 57 | :maxdepth: 2 58 | 59 | examples 60 | api 61 | 62 | 63 | Authors 64 | ------- 65 | 66 | Written by `jamesob `_ and 67 | `thekantian `_. 68 | 69 | -------------------------------------------------------------------------------- /redset/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | """ 3 | redset 4 | --------- 5 | 6 | Simple redis-backed, distributed sorted sets. 7 | 8 | """ 9 | 10 | __version__ = '0.5.1' 11 | 12 | from redset.sets import * 13 | -------------------------------------------------------------------------------- /redset/exceptions.py: -------------------------------------------------------------------------------- 1 | 2 | 3 | class LockTimeout(Exception): 4 | """ 5 | Raised when waiting too long on a set lock. 6 | 7 | """ 8 | pass 9 | -------------------------------------------------------------------------------- /redset/interfaces.py: -------------------------------------------------------------------------------- 1 | 2 | import abc 3 | 4 | 5 | class Serializer(object): 6 | """ 7 | This is a guideline for implementing a serializer for redset. Serializers 8 | need not subclass this directly, but should match the interface defined 9 | here. 
10 | 11 | """ 12 | __metaclass__ = abc.ABCMeta 13 | 14 | @abc.abstractmethod 15 | def loads(self, str_from_redis): 16 | """ 17 | Deserialize a str item from redis into a Python object. 18 | 19 | :param str_from_redis: the str corresponding with an item in redis 20 | :type str_from_redis: str 21 | :returns: object 22 | 23 | """ 24 | 25 | @abc.abstractmethod 26 | def dumps(self, obj): 27 | """ 28 | Serialize a Python object into a `str`. 29 | 30 | :param obj: the Python object to be stored in a sorted set 31 | :returns: str 32 | 33 | """ 34 | -------------------------------------------------------------------------------- /redset/locks.py: -------------------------------------------------------------------------------- 1 | """ 2 | Locks used to synchronize mutations on queues. 3 | 4 | """ 5 | 6 | import time 7 | 8 | from redset.exceptions import LockTimeout 9 | 10 | __all__ = ( 11 | 'Lock', 12 | ) 13 | 14 | 15 | # redis or redis-py truncates timestamps to the hundredth 16 | REDIS_TIME_PRECISION = 0.01 17 | 18 | 19 | class Lock(object): 20 | """ 21 | Context manager that implements a distributed lock with redis. 22 | 23 | Based on Chris Lamb's version 24 | (https://chris-lamb.co.uk/posts/distributing-locking-python-and-redis) 25 | 26 | """ 27 | def __init__(self, 28 | redis, 29 | key, 30 | expires=None, 31 | timeout=None, 32 | poll_interval=None, 33 | ): 34 | """ 35 | Distributed locking using Redis SETNX and GETSET. 36 | 37 | Usage:: 38 | 39 | with Lock(redis.Redis(), 'my_lock'): 40 | print('Critical section') 41 | 42 | :param redis: the redis client 43 | :param key: the key the lock is labeled with 44 | :param timeout: If another client has already obtained the lock, 45 | sleep for a maximum of ``timeout`` seconds before 46 | giving up. A value of 0 means we never wait. Defaults to 10. 47 | :param expires: We consider any existing lock older than 48 | ``expires`` seconds to be invalid in order to 49 | detect crashed clients.
This value must be longer 50 | than the critical section takes to execute. Defaults to 20. 51 | :param poll_interval: How often we should poll for lock acquisition. 52 | Note that poll intervals below 0.01 don't make sense since 53 | timestamps stored in redis are truncated to the hundredth. 54 | Defaults to 0.2. 55 | :raises: LockTimeout 56 | 57 | """ 58 | self.redis = redis 59 | self.key = key 60 | self.timeout = timeout or 10 61 | self.expires = expires or 20 62 | self.poll_interval = poll_interval or 0.2 63 | 64 | def __enter__(self): 65 | timeout = self.timeout 66 | 67 | while timeout >= 0: 68 | expires = time.time() + self.expires 69 | 70 | if self.redis.setnx(self.key, expires): 71 | # We gained the lock; enter critical section 72 | return 73 | 74 | current_value = self.redis.get(self.key) 75 | 76 | # We found an expired lock and nobody raced us to replace it 77 | has_expired = ( 78 | current_value and 79 | # bump the retrieved time by redis' precision so that we don't 80 | # erroneously consider a recently acquired lock as expired 81 | (float(current_value) + REDIS_TIME_PRECISION) < time.time() and 82 | self.redis.getset(self.key, expires) == current_value 83 | ) 84 | if has_expired: 85 | return 86 | 87 | timeout -= self.poll_interval 88 | time.sleep(self.poll_interval) 89 | 90 | raise LockTimeout("Timeout while waiting for lock '%s'" % self.key) 91 | 92 | def __exit__(self, exc_type, exc_value, traceback): 93 | self.redis.delete(self.key) 94 | -------------------------------------------------------------------------------- /redset/serializers.py: -------------------------------------------------------------------------------- 1 | """ 2 | Builtin serializers. 3 | 4 | """ 5 | 6 | import json 7 | 8 | from redset.interfaces import Serializer 9 | 10 | 11 | class NamedtupleSerializer(Serializer): 12 | """ 13 | Serialize namedtuple classes.
14 | 15 | """ 16 | def __init__(self, NTClass): 17 | """ 18 | :param NTClass: the namedtuple class that you'd like to marshal to and 19 | from. 20 | :type NTClass: type 21 | 22 | """ 23 | self.NTClass = NTClass 24 | 25 | def loads(self, str_from_redis): 26 | return self.NTClass(**json.loads(str_from_redis)) 27 | 28 | def dumps(self, nt_instance): 29 | return json.dumps(nt_instance._asdict()) 30 | -------------------------------------------------------------------------------- /redset/sets.py: -------------------------------------------------------------------------------- 1 | 2 | import time 3 | 4 | from redset.interfaces import Serializer 5 | from redset.locks import Lock 6 | 7 | import logging 8 | log = logging.getLogger(__name__) 9 | 10 | 11 | __all__ = ( 12 | 'SortedSet', 13 | 'TimeSortedSet', 14 | 'ScheduledSet', 15 | ) 16 | 17 | 18 | class SortedSet(object): 19 | """ 20 | A Redis-backed sorted set safe for multiprocess consumption. 21 | 22 | By default, items are stored and returned as str. Scores default to 0. 23 | 24 | A serializer can be specified to ease packing/unpacking of items. 25 | Otherwise, items are cast to and returned as strings. 26 | 27 | """ 28 | def __init__(self, 29 | redis_client, 30 | name, 31 | scorer=None, 32 | serializer=None, 33 | lock_timeout=None, 34 | lock_expires=None, 35 | ): 36 | """ 37 | :param redis_client: an object matching the interface of the 38 | redis.Redis client. Used to communicate with a Redis server. 39 | :type redis_client: redis.Redis instance 40 | :param name: used to identify the storage location for this 41 | set. 42 | :type name: str 43 | :param scorer: takes in a single argument, which is 44 | the item to be stored, and returns a score which will be used 45 | for the item. 46 | :type scorer: Callable, arity 1 47 | :param serializer: must match the interface defined 48 | by `redset.interfaces.Serializer`. 49 | Defines how objects are marshalled into redis. 
50 | :type serializer: :class:`interfaces.Serializer 51 | ` 52 | :param lock_timeout: maximum time we should wait on a lock in seconds. 53 | Defaults to value set in :class:`locks.Lock ` 54 | :type lock_timeout: Number 55 | :param lock_expires: maximum time we should hold the lock in seconds. 56 | Defaults to value set in :class:`locks.Lock ` 57 | :type lock_expires: Number 58 | 59 | """ 60 | self._name = name 61 | self.redis = redis_client 62 | self.scorer = scorer or _default_scorer 63 | self.serializer = serializer or _DefaultSerializer() 64 | self.lock = Lock( 65 | self.redis, 66 | '%s__lock' % self.name, 67 | expires=lock_expires, 68 | timeout=lock_timeout) 69 | 70 | def __repr__(self): 71 | return ( 72 | "<%s name='%s', length=%s>" % 73 | (self.__class__.__name__, self.name, len(self)) 74 | ) 75 | 76 | __str__ = __repr__ 77 | 78 | def __len__(self): 79 | """ 80 | How many values are in the set? 81 | 82 | :returns: int 83 | 84 | """ 85 | return int(self.redis.zcard(self.name)) 86 | 87 | def __contains__(self, item): 88 | return (self.score(item) is not None) 89 | 90 | @property 91 | def name(self): 92 | """ 93 | The name of this set and the string that identifies the redis key 94 | where this set is stored. 95 | 96 | :returns: str 97 | 98 | """ 99 | return self._name 100 | 101 | def add(self, item, score=None): 102 | """ 103 | Add the item to the set. If the item is already in the set, update its 104 | score. 105 | 106 | :param item: 107 | :type item: str 108 | :param score: optionally specify the score for the item 109 | to be added.
110 | :type score: Number 111 | 112 | :returns: Number -- score the item was added with 113 | 114 | """ 115 | score = score or self.scorer(item) 116 | 117 | log.debug( 118 | 'Adding %s to set %s with score: %s' % (item, self.name, score) 119 | ) 120 | self.redis.zadd(self.name, self._dump_item(item), score) 121 | 122 | return score 123 | 124 | def pop(self): 125 | """ 126 | Atomically remove and return the next item eligible for processing in 127 | the set. 128 | 129 | If deserializing the next item fails, it is deleted from redis and 130 | a KeyError is raised as though the set were empty. 131 | 132 | :raises: KeyError -- if no items left 133 | 134 | :returns: object. 135 | 136 | """ 137 | item = self._pop_item() 138 | 139 | if not item: 140 | raise KeyError('%s is empty' % self) 141 | 142 | return item 143 | 144 | def take(self, num): 145 | """ 146 | Atomically remove and return the next ``num`` items for processing in 147 | the set. 148 | 149 | Will return at most ``min(num, len(self))`` items. Items that fail 150 | to deserialize are filtered out of the result. 151 | 152 | :returns: list of objects 153 | 154 | """ 155 | num = int(num) 156 | 157 | if num < 1: 158 | return [] 159 | 160 | return self._pop_items(num) 161 | 162 | def clear(self): 163 | """ 164 | Empty the set of all scores and ID strings. 165 | 166 | :returns: bool 167 | 168 | """ 169 | log.debug('Flushing set %s' % self.name) 170 | return self.redis.delete(self.name) 171 | 172 | def discard(self, item): 173 | """ 174 | Remove a given item from the set. 175 | 176 | :param item: 177 | :type item: object 178 | :returns: bool -- success of removal 179 | 180 | """ 181 | return self._discard_by_str(self._dump_item(item)) 182 | 183 | def peek(self, position=0): 184 | """ 185 | Return an item without removing it.
186 | 187 | :param position: 188 | :type position: int 189 | :raises: KeyError -- if no items found at the specified position 190 | :returns: object 191 | 192 | """ 193 | return self._load_item(self._peek_str(position)) 194 | 195 | def score(self, item): 196 | """ 197 | See what the score for an item is. 198 | 199 | :returns: Number or None. 200 | 201 | """ 202 | return self.redis.zscore(self.name, self._dump_item(item)) 203 | 204 | def peek_score(self): 205 | """ 206 | What is the score of the next item to be processed? This is useful 207 | if you are trying to keep consumption of your set near real-time. 208 | 209 | :returns: Number 210 | 211 | """ 212 | res = self._get_next_item(with_score=True) 213 | 214 | return res[0][1] if res else None 215 | 216 | def _peek_str(self, position=0): 217 | """ 218 | Internal peek to allow peeking by str. 219 | 220 | """ 221 | results = self._get_item(position) 222 | 223 | if not results: 224 | raise KeyError("no item at position %d in %s" % (position, self.name)) 225 | 226 | return results[0] 227 | 228 | def _pop_item(self): 229 | """ 230 | Internal method for returning the next item without locking. 231 | 232 | """ 233 | res_list = self._pop_items(1) 234 | return res_list[0] if res_list else None 235 | 236 | def _pop_items(self, num_items): 237 | """ 238 | Internal method for popping items atomically from redis. 239 | 240 | :returns: [loaded_item, ...]. if we can't deserialize a particular 241 | item, just skip it. 242 | 243 | """ 244 | res = [] 245 | 246 | item_strs = self._get_and_remove_items(num_items) 247 | 248 | for item_str in item_strs: 249 | item_str = _py3_compat_decode(item_str) 250 | try: 251 | res.append(self._load_item(item_str)) 252 | except Exception: 253 | log.exception("Could not deserialize '%s'" % item_str) 254 | 255 | return res 256 | 257 | def _get_and_remove_items(self, num_items): 258 | """ 259 | Get and remove items from the redis store. 260 | 261 | :returns: [str, ...]
262 | 263 | """ 264 | pipe = self.redis.pipeline() 265 | 266 | (pipe 267 | .zrange( 268 | self.name, 269 | 0, 270 | num_items - 1, 271 | withscores=False) 272 | .zremrangebyrank( 273 | self.name, 274 | 0, 275 | num_items - 1) 276 | ) 277 | 278 | return pipe.execute()[0] 279 | 280 | def _discard_by_str(self, *item_strs): 281 | """ 282 | Internal discard to allow discarding by the str representation of 283 | an item. 284 | 285 | """ 286 | pipe = self.redis.pipeline() 287 | 288 | for item in item_strs: 289 | pipe.zrem(self.name, item) 290 | 291 | return all(pipe.execute()) 292 | 293 | def _get_item(self, position, with_score=False): 294 | """ 295 | Returns a specific element from the redis store 296 | 297 | :param position: 298 | :type position: int 299 | :returns: [str] or [str, float]. item optionally with score, without 300 | removing it. 301 | """ 302 | return self.redis.zrange( 303 | self.name, 304 | position, 305 | position, 306 | withscores=with_score, 307 | ) 308 | 309 | def _get_next_item(self, with_score=False): 310 | """ 311 | :returns: [str] or [str, float]. item optionally with score, without 312 | removing it. 313 | 314 | """ 315 | return self._get_item(0, with_score) 316 | 317 | def _load_item(self, item): 318 | """ 319 | Conditionally deserialize if a routine was specified. 320 | 321 | """ 322 | try: 323 | self.serializer.loads 324 | except AttributeError: 325 | return _py3_compat_decode(item) 326 | 327 | return self.serializer.loads(item) 328 | 329 | def _dump_item(self, item): 330 | """ 331 | Conditionally serialize if a routine was specified. 332 | 333 | """ 334 | try: 335 | self.serializer.dumps 336 | except AttributeError: 337 | return item 338 | 339 | return self.serializer.dumps(item) 340 | 341 | 342 | class TimeSortedSet(SortedSet): 343 | """ 344 | A distributed, FIFO-by-default, time-sorted set that's safe for 345 | multiprocess consumption. 346 | 347 | Implemented in terms of a redis ZSET where UNIX timestamps are used as 348 | the score. 
349 | 350 | """ 351 | def __init__(self, *args, **kwargs): 352 | """ 353 | See `redset.sets.SortedSet`. Default scorer will return the current 354 | time when an item is added. 355 | 356 | """ 357 | if not kwargs.get('scorer'): 358 | kwargs['scorer'] = lambda i: time.time() 359 | 360 | super(TimeSortedSet, self).__init__(*args, **kwargs) 361 | 362 | 363 | class ScheduledSet(TimeSortedSet): 364 | """ 365 | A distributed, FIFO-by-default, time-sorted set that's safe for 366 | multiprocess consumption. Supports scheduling item consumption for the 367 | future. 368 | 369 | Implemented in terms of a redis ZSET where UNIX timestamps are used as 370 | the score. 371 | 372 | A ScheduledSet will only return results with a score less than 373 | time.time(), so you can schedule jobs for the future and let redis 374 | do the work of deferring them until they are ready for consumption. 375 | 376 | """ 377 | def _get_and_remove_items(self, num_items): 378 | with self.lock: 379 | item_strs = self.redis.zrangebyscore( 380 | self.name, 381 | '-inf', 382 | time.time(), 383 | start=0, 384 | num=num_items, 385 | withscores=False 386 | ) 387 | 388 | if item_strs: 389 | self._discard_by_str(*item_strs) 390 | 391 | return item_strs 392 | 393 | def _get_item(self, position=0, with_score=False): 394 | return self.redis.zrangebyscore( 395 | self.name, 396 | '-inf', 397 | time.time(), 398 | # start/num fetches only the single item at `position` 399 | start=position, 400 | num=1, 401 | withscores=with_score, 402 | ) 403 | 404 | def _get_next_item(self, with_score=False): 405 | return self._get_item(with_score=with_score) 406 | 407 | def available(self): 408 | """ 409 | The count of items with a score less than now.
410 | 411 | """ 412 | return self.redis.zcount(self.name, '-inf', time.time()) 413 | 414 | 415 | class _DefaultSerializer(Serializer): 416 | 417 | loads = lambda self, i: _py3_compat_decode(i) 418 | dumps = lambda self, i: i 419 | 420 | 421 | _default_scorer = lambda i: 0 422 | 423 | 424 | def _py3_compat_decode(item_out_of_redis): 425 | """Py3 redis returns bytes, so we must handle the decode.""" 426 | if not isinstance(item_out_of_redis, str): 427 | return item_out_of_redis.decode('utf-8') 428 | return item_out_of_redis 429 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | from setuptools import setup 3 | 4 | import redset 5 | 6 | setup( 7 | name='redset', 8 | version=redset.__version__, 9 | author='thekantian, jamesob', 10 | author_email='zach@percolate.com, jamesob@percolate.com', 11 | packages=['redset'], 12 | url='https://github.com/percolate/redset', 13 | license='see LICENSE', 14 | description='Simple, distributed sorted sets with redis', 15 | long_description=open('README.rst').read(), 16 | tests_require=[ 17 | 'redis', 18 | ], 19 | classifiers=[ 20 | 'Intended Audience :: Developers', 21 | 'License :: OSI Approved :: BSD License', 22 | 'Operating System :: OS Independent', 23 | 'Programming Language :: Python :: 2.7', 24 | 'Programming Language :: Python :: 3', 25 | 'Topic :: Software Development :: Libraries :: Python Modules', 26 | ], 27 | ) 28 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/percolate/redset/3ff84935b9a2974a78ddc996007c3da1bc9b3757/tests/__init__.py -------------------------------------------------------------------------------- /tests/test_concurrency.py: 
-------------------------------------------------------------------------------- 1 | """ 2 | Test the use of redset from multiple processes. 3 | 4 | """ 5 | 6 | import unittest 7 | import multiprocessing 8 | import itertools 9 | 10 | import redis 11 | 12 | from redset import SortedSet, ScheduledSet 13 | from redset.exceptions import LockTimeout 14 | 15 | client = redis.Redis() 16 | 17 | 18 | class MultiprocessTest(unittest.TestCase): 19 | """ 20 | Ensure that we can bang on the sorted set from multiple processes without 21 | trouble. 22 | 23 | """ 24 | 25 | def setUp(self): 26 | self.r = redis.Redis() 27 | self.set_name = 'MultiprocessTest' 28 | self.ss = self._make_ss() 29 | 30 | def _make_ss(self): 31 | class Serializer(object): 32 | loads = int 33 | dumps = str 34 | 35 | return SortedSet( 36 | redis.Redis(), self.set_name, serializer=Serializer(), 37 | ) 38 | 39 | def tearDown(self): 40 | self.ss.clear() 41 | 42 | def test_multiprocessing(self): 43 | """ 44 | Add a bunch of items to a sorted set; attempt to take simultaneously 45 | from multiple processes. Ensure that we end up taking all elements 46 | added without duplication.
47 | 48 | """ 49 | num_procs = 10 50 | num_items = num_procs * 100 51 | 52 | # prime the sortedset with stuff 53 | for i in range(num_items): 54 | self.ss.add(i) 55 | 56 | # utility for running functions within TestCase using multiprocess 57 | def parmap(f, X): 58 | def spawn(f): 59 | def fun(pipe, x): 60 | pipe.send(f(x)) 61 | pipe.close() 62 | return fun 63 | 64 | pipe = [multiprocessing.Pipe() for x in X] 65 | proc = [multiprocessing.Process(target=spawn(f), args=(c, x)) 66 | for x, (p, c) in zip(X, pipe)] 67 | [p.start() for p in proc] 68 | [p.join() for p in proc] 69 | return [p.recv() for (p, c) in pipe] 70 | 71 | def take_subset(process_num): 72 | ss = self._make_ss() 73 | return ss.take(int(num_items / num_procs)) 74 | 75 | taken_items = list(itertools.chain.from_iterable( 76 | parmap(take_subset, range(num_procs)))) 77 | 78 | # assert that no duplicates resulted from taking from the set 79 | # concurrently 80 | self.assertEquals( 81 | list(sorted(taken_items)), 82 | list(range(num_items)), 83 | ) 84 | 85 | self.assertEquals( 86 | 0, 87 | len(self.ss), 88 | ) 89 | 90 | 91 | class ScheduledMultiprocessTest(unittest.TestCase): 92 | """ 93 | ScheduledSet has slightly different concurrency semantics. 94 | 95 | """ 96 | def _make_ss(self): 97 | class Serializer(object): 98 | loads = int 99 | dumps = str 100 | 101 | return ScheduledSet( 102 | redis.Redis(), self.set_name, serializer=Serializer(), 103 | ) 104 | 105 | 106 | class LockExpiryTest(unittest.TestCase): 107 | """ 108 | Ensure that lock timeouts and expiry are respected when multiple 109 | clients contend for the same set lock.
110 | 111 | """ 112 | 113 | def setUp(self): 114 | self.r = redis.Redis() 115 | self.set_name = self.__class__.__name__ 116 | self.timeout_length = 0.001 117 | self.holder = SortedSet(redis.Redis(), self.set_name, lock_expires=10) 118 | self.chump = SortedSet( 119 | redis.Redis(), 120 | self.set_name, 121 | lock_timeout=self.timeout_length, 122 | lock_expires=self.timeout_length, 123 | ) 124 | 125 | def tearDown(self): 126 | self.holder.clear() 127 | self.chump.clear() 128 | 129 | def test_lock_timeout(self): 130 | """ 131 | One process holds the lock while the other times out. 132 | 133 | """ 134 | with self.holder.lock: 135 | with self.assertRaises(LockTimeout): 136 | with self.chump.lock: 137 | assert False, "shouldn't be able to acquire the lock" 138 | 139 | def test_lock_expires(self): 140 | """ 141 | One process holds the lock, times out, and the other scoops the lock. 142 | 143 | """ 144 | got_the_lock = False 145 | 146 | # chump's lock expires almost immediately, so holder can scoop it up 147 | with self.chump.lock: 148 | with self.holder.lock: 149 | got_the_lock = True 150 | 151 | assert got_the_lock, "`holder` should have acquired the lock" 152 | -------------------------------------------------------------------------------- /tests/test_serializers.py: -------------------------------------------------------------------------------- 1 | 2 | import unittest 3 | import redis 4 | from collections import namedtuple 5 | 6 | from redset import SortedSet 7 | from redset.serializers import NamedtupleSerializer 8 | 9 | 10 | DiscoTask = namedtuple('DiscoTask', 'tiger,woods') 11 | 12 | 13 | class TestNTSerializer(unittest.TestCase): 14 | 15 | def setUp(self): 16 | self.ss = SortedSet( 17 | redis.Redis(), 18 | self.__class__.__name__, 19 | serializer=NamedtupleSerializer(DiscoTask), 20 | ) 21 | 22 | def tearDown(self): 23 | self.ss.clear() 24 | 25 | def test_nt_serializer(self): 26 | dt = DiscoTask(tiger='larry', woods='david') 27 | 28 | self.ss.add(dt) 29 | 30 | assert
len(self.ss) == 1 31 | assert dt in self.ss 32 | 33 | self.assertEqual( 34 | dt, 35 | self.ss.pop(), 36 | ) 37 | -------------------------------------------------------------------------------- /tests/test_sets.py: -------------------------------------------------------------------------------- 1 | 2 | import unittest 3 | import time 4 | import json 5 | import redis 6 | 7 | from redset import SortedSet, TimeSortedSet, ScheduledSet 8 | from redset.interfaces import Serializer 9 | 10 | 11 | class SortedSetTest(unittest.TestCase): 12 | 13 | def setUp(self): 14 | self.key = 'ss_test' 15 | self.ss = SortedSet(redis.Redis(), self.key) 16 | 17 | def tearDown(self): 18 | self.ss.clear() 19 | 20 | def test_repr(self): 21 | """Just make sure it doesn't blow up.""" 22 | str(self.ss) 23 | 24 | def test_length(self): 25 | for i in range(5): 26 | self.ss.add(i) 27 | 28 | self.assertEquals( 29 | len(self.ss), 30 | 5, 31 | ) 32 | 33 | def test_add_with_score(self): 34 | item = 'samere' 35 | score = 123 36 | self.ss.add(item, score) 37 | 38 | assert self.ss.score(item) == score 39 | 40 | def test_add_and_update_score(self): 41 | item = 'samere' 42 | score = 123 43 | self.ss.add(item, score) 44 | 45 | new_score = 456 46 | self.ss.add(item, new_score) 47 | 48 | assert self.ss.score(item) == new_score 49 | 50 | def test_contains(self): 51 | for i in range(5): 52 | self.ss.add(i) 53 | 54 | self.assertTrue( 55 | 0 in self.ss 56 | ) 57 | 58 | self.assertFalse( 59 | -1 in self.ss 60 | ) 61 | 62 | def test_ordering(self): 63 | for i in range(5): 64 | self.ss.add(i, score=i) 65 | 66 | self.assertEquals( 67 | [str(i) for i in range(5)], 68 | [self.ss.pop() for __ in range(5)], 69 | ) 70 | 71 | def test_empty_pop(self): 72 | with self.assertRaises(KeyError): 73 | self.ss.pop() 74 | 75 | def test_empty_peek(self): 76 | with self.assertRaises(KeyError): 77 | self.ss.peek() 78 | 79 | def test_add_dup(self): 80 | for i in range(5): 81 | self.ss.add(i) 82 | 83 | dup_added_at = 10 84 |
self.ss.add(0, score=dup_added_at) 85 | 86 | self.assertEquals( 87 | len(self.ss), 88 | 5, 89 | ) 90 | 91 | self.assertEquals( 92 | int(self.ss.score(0)), 93 | int(dup_added_at), 94 | ) 95 | 96 | def test_clear(self): 97 | self.assertFalse(self.ss.clear()) 98 | 99 | for i in range(5): 100 | self.ss.add(i) 101 | 102 | self.assertTrue(self.ss.clear()) 103 | self.assertEquals( 104 | len(self.ss), 105 | 0, 106 | ) 107 | 108 | def test_discard(self): 109 | self.ss.add(0) 110 | 111 | self.assertTrue(self.ss.discard(0)) 112 | self.assertFalse(self.ss.discard(0)) 113 | 114 | def test_peek(self): 115 | with self.assertRaises(KeyError): 116 | self.ss.peek() 117 | 118 | self.ss.add(0) 119 | 120 | for __ in range(2): 121 | self.assertEquals( 122 | self.ss.peek(), 123 | '0', 124 | ) 125 | 126 | with self.assertRaises(KeyError): 127 | self.ss.peek(position=1) 128 | 129 | self.ss.add(1) 130 | 131 | for __ in range(2): 132 | self.assertEquals( 133 | self.ss.peek(position=1), 134 | '1', 135 | ) 136 | 137 | def test_take(self): 138 | for i in range(5): 139 | self.ss.add(i) 140 | 141 | self.assertEquals( 142 | set([str(i) for i in range(2)]), 143 | set(self.ss.take(2)), 144 | ) 145 | 146 | self.assertEquals( 147 | set([str(i + 2) for i in range(3)]), 148 | set(self.ss.take(100)), 149 | ) 150 | 151 | self.assertEquals( 152 | len(self.ss), 153 | 0, 154 | ) 155 | 156 | self.assertEquals( 157 | self.ss.take(0), 158 | [], 159 | ) 160 | 161 | self.assertEquals( 162 | self.ss.take(-1), 163 | [], 164 | ) 165 | 166 | 167 | class SerializerTest(unittest.TestCase): 168 | 169 | class FakeJsonSerializer(Serializer): 170 | """ 171 | Handles JSON serialization. 
172 | 173 | """ 174 | def dumps(self, item): 175 | return json.dumps(item) 176 | 177 | def loads(self, item): 178 | if 'uhoh' in item: 179 | raise Exception("omg unserializing failed!") 180 | return json.loads(item) 181 | 182 | def setUp(self): 183 | self.key = 'json_ss_test' 184 | self.ss = SortedSet( 185 | redis.Redis(), 186 | self.key, 187 | serializer=self.FakeJsonSerializer(), 188 | ) 189 | 190 | # has a bad serializer 191 | self.ss2 = SortedSet( 192 | redis.Redis(), 193 | self.key + '2', 194 | serializer=object(), 195 | ) 196 | 197 | def tearDown(self): 198 | self.ss.clear() 199 | self.ss2.clear() 200 | 201 | def test_add_and_pop(self): 202 | self.ss.add({'yo': 'json'}, score=1) 203 | self.ss.add({'yo': 'yaml'}, score=0) 204 | 205 | self.assertTrue( 206 | {'yo': 'json'} in self.ss 207 | ) 208 | 209 | self.assertEqual( 210 | self.ss.pop(), 211 | {'yo': 'yaml'}, 212 | ) 213 | 214 | self.assertEqual( 215 | self.ss.pop(), 216 | {'yo': 'json'}, 217 | ) 218 | 219 | self.assertEqual( 220 | 0, 221 | len(self.ss), 222 | ) 223 | 224 | def test_cant_deserialize(self): 225 | self.ss.add({'yo': 'foo'}, score=0) 226 | self.ss.add({'yo': 'uhoh!'}, score=1) 227 | self.ss.add({'yo': 'hey'}, score=2) 228 | 229 | self.assertEquals( 230 | self.ss.take(3), 231 | [{'yo': 'foo'}, 232 | {'yo': 'hey'}], 233 | ) 234 | 235 | def test_bad_serializer(self): 236 | self.ss2.add(1, score=0) 237 | self.ss2.add(2, score=1) 238 | 239 | assert '2' in self.ss2 240 | 241 | # gets deserialized as a str, not an int 242 | self.assertEquals( 243 | '1', 244 | self.ss2.pop(), 245 | ) 246 | 247 | 248 | class ScorerTest(unittest.TestCase): 249 | 250 | def setUp(self): 251 | self.key = 'scorer_ss_test' 252 | 253 | class Ser(Serializer): 254 | loads = int 255 | dumps = str 256 | 257 | self.ss = SortedSet( 258 | redis.Redis(), 259 | self.key, 260 | scorer=lambda i: i * -1, 261 | serializer=Ser(), 262 | ) 263 | 264 | def tearDown(self): 265 | self.ss.clear() 266 | 267 | def test_scorer(self): 268 | for i in
range(5): 269 | self.ss.add(i) 270 | 271 | self.assertEqual( 272 | [4, 3, 2, 1, 0], 273 | self.ss.take(5), 274 | ) 275 | 276 | 277 | class TimeSortedSetTest(unittest.TestCase): 278 | 279 | def setUp(self): 280 | self.key = 'tss_test' 281 | self.now = time.time() 282 | 283 | self.tss = TimeSortedSet(redis.Redis(), self.key) 284 | 285 | def tearDown(self): 286 | self.tss.clear() 287 | 288 | def test_length(self): 289 | for i in range(5): 290 | self.tss.add(i) 291 | 292 | self.assertEquals( 293 | len(self.tss), 294 | 5, 295 | ) 296 | 297 | def test_contains(self): 298 | for i in range(5): 299 | self.tss.add(i) 300 | 301 | self.assertTrue( 302 | 0 in self.tss 303 | ) 304 | 305 | self.assertFalse( 306 | -1 in self.tss 307 | ) 308 | 309 | def test_add_at(self): 310 | for i in range(5): 311 | self.tss.add(i, score=(self.now - i)) 312 | 313 | self.assertEquals( 314 | [str(i) for i in reversed(range(5))], 315 | [self.tss.pop() for __ in range(5)], 316 | ) 317 | 318 | def test_add_dup(self): 319 | for i in range(5): 320 | self.tss.add(i) 321 | 322 | dup_added_at = self.now + 10 323 | self.tss.add(0, score=dup_added_at) 324 | 325 | self.assertEquals( 326 | len(self.tss), 327 | 5, 328 | ) 329 | 330 | self.assertEquals( 331 | int(self.tss.score(0)), 332 | int(dup_added_at), 333 | ) 334 | 335 | def test_clear(self): 336 | self.assertFalse(self.tss.clear()) 337 | 338 | for i in range(5): 339 | self.tss.add(i) 340 | 341 | self.assertTrue(self.tss.clear()) 342 | self.assertEquals( 343 | len(self.tss), 344 | 0, 345 | ) 346 | 347 | def test_discard(self): 348 | self.tss.add(0) 349 | 350 | self.assertTrue(self.tss.discard(0)) 351 | self.assertFalse(self.tss.discard(0)) 352 | 353 | def test_peek(self): 354 | with self.assertRaises(KeyError): 355 | self.tss.peek() 356 | 357 | self.tss.add(0) 358 | 359 | for __ in range(2): 360 | self.assertEquals( 361 | self.tss.peek(), 362 | '0', 363 | ) 364 | 365 | with self.assertRaises(KeyError): 366 | self.tss.peek(position=1) 367 | 368 | 
self.tss.add(1) 369 | 370 | for __ in range(2): 371 | self.assertEquals( 372 | self.tss.peek(position=1), 373 | '1', 374 | ) 375 | 376 | def test_score(self): 377 | self.assertEquals( 378 | None, 379 | self.tss.score(0), 380 | ) 381 | 382 | self.tss.add(0, self.now) 383 | 384 | self.assertEquals( 385 | int(self.now), 386 | int(self.tss.score(0)), 387 | ) 388 | 389 | def test_oldest_time(self): 390 | self.assertEquals( 391 | None, 392 | self.tss.peek_score(), 393 | ) 394 | 395 | for i in range(3): 396 | self.tss.add(i, self.now - i) 397 | 398 | self.assertEquals( 399 | int(self.now - 2), 400 | int(self.tss.peek_score()), 401 | ) 402 | 403 | self.tss.pop() 404 | 405 | self.assertEquals( 406 | int(self.now - 1), 407 | int(self.tss.peek_score()), 408 | ) 409 | 410 | 411 | class ScheduledSetTest(unittest.TestCase): 412 | 413 | def setUp(self): 414 | self.key = 'scheduled_set_test' 415 | self.now = time.time() - 1 # offset to avoid having to sleep 416 | 417 | self.ss = ScheduledSet(redis.Redis(), self.key) 418 | 419 | def tearDown(self): 420 | self.ss.clear() 421 | 422 | def test_schedule(self): 423 | self.ss.add(1, self.now) 424 | self.ss.add(2, self.now + 1000) 425 | 426 | next_item = self.ss.pop() 427 | self.assertEquals(next_item, '1') 428 | 429 | with self.assertRaises(KeyError): 430 | self.ss.pop() 431 | 432 | self.assertEquals(len(self.ss), 1) 433 | 434 | def test_peek(self): 435 | with self.assertRaises(KeyError): 436 | self.ss.peek() 437 | 438 | self.ss.add(1, self.now - 1000) 439 | self.ss.add(2, self.now) 440 | self.ss.add(3, self.now + 1000) 441 | 442 | self.assertEquals( 443 | self.ss.peek(), 444 | '1', 445 | ) 446 | 447 | self.assertEquals( 448 | self.ss.peek(position=1), 449 | '2', 450 | ) 451 | 452 | with self.assertRaises(KeyError): 453 | self.ss.peek(position=2) 454 | 455 | self.ss.pop() 456 | self.ss.pop() 457 | 458 | with self.assertRaises(KeyError): 459 | self.ss.peek() 460 | 461 | self.assertEquals(len(self.ss), 1) 462 | 463 | def test_take(self): 
464 | self.ss.add('1', self.now - 3) 465 | self.ss.add('2', self.now - 2) 466 | self.ss.add('3', self.now - 1) 467 | 468 | items = self.ss.take(2) 469 | self.assertEquals(len(items), 2) 470 | self.assertEquals(['1', '2'], items) 471 | 472 | self.assertEquals(self.ss.pop(), '3') 473 | 474 | self.assertEquals( 475 | len(self.ss), 476 | 0, 477 | ) 478 | 479 | self.assertEquals( 480 | self.ss.take(0), 481 | [], 482 | ) 483 | 484 | self.assertEquals( 485 | self.ss.take(-1), 486 | [], 487 | ) 488 | 489 | def test_length(self): 490 | for i in range(2): 491 | self.ss.add(i, self.now + 50) 492 | 493 | for i in range(3): 494 | self.ss.add(i + 2, self.now - 50) 495 | 496 | self.assertEquals( 497 | len(self.ss), 498 | 5, 499 | ) 500 | 501 | def test_length_available(self): 502 | for i in range(2): 503 | self.ss.add(i, self.now + 50) 504 | 505 | for i in range(3): 506 | self.ss.add(i + 2, self.now - 50) 507 | 508 | self.assertEquals( 509 | self.ss.available(), 510 | 3, 511 | ) 512 | 513 | def test_contains(self): 514 | for i in range(5): 515 | self.ss.add(i) 516 | 517 | self.assertTrue( 518 | 0 in self.ss 519 | ) 520 | 521 | self.assertFalse( 522 | -1 in self.ss 523 | ) 524 | 525 | def test_add_dup(self): 526 | for i in range(5): 527 | self.ss.add(i) 528 | 529 | dup_added_at = 10 530 | self.ss.add(0, score=dup_added_at) 531 | 532 | self.assertEquals( 533 | len(self.ss), 534 | 5, 535 | ) 536 | 537 | self.assertEquals( 538 | int(self.ss.score(0)), 539 | int(dup_added_at), 540 | ) 541 | 542 | def test_clear(self): 543 | self.assertFalse(self.ss.clear()) 544 | 545 | for i in range(5): 546 | self.ss.add(i) 547 | 548 | self.assertTrue(self.ss.clear()) 549 | self.assertEquals( 550 | len(self.ss), 551 | 0, 552 | ) 553 | 554 | def test_discard(self): 555 | self.ss.add(0) 556 | self.ss.add(1, self.now + 50) 557 | 558 | self.assertTrue(self.ss.discard(0)) 559 | self.assertFalse(self.ss.discard(0)) 560 | 561 | self.assertTrue(self.ss.discard(1)) 562 | self.assertFalse(self.ss.discard(1)) 
563 | 564 | def test_peek_score(self): 565 | self.assertEquals( 566 | None, 567 | self.ss.peek_score(), 568 | ) 569 | 570 | for i in range(3): 571 | self.ss.add(i, self.now - i) 572 | 573 | self.assertEquals( 574 | int(self.now - 2), 575 | int(self.ss.peek_score()), 576 | ) 577 | 578 | self.ss.pop() 579 | 580 | self.assertEquals( 581 | int(self.now - 1), 582 | int(self.ss.peek_score()), 583 | ) 584 | --------------------------------------------------------------------------------
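The expired-lock takeover in `redset/locks.py` (SETNX to acquire, then GETSET to atomically steal a stale lock) can be sanity-checked without a running redis server. A minimal sketch, assuming a hypothetical in-memory `FakeRedis` stand-in for the three commands the lock relies on:

```python
import time

REDIS_TIME_PRECISION = 0.01  # redis timestamps are truncated to the hundredth


class FakeRedis(object):
    """Hypothetical in-memory stand-in for the commands Lock uses."""

    def __init__(self):
        self.store = {}

    def setnx(self, key, value):
        # set only if the key does not exist; return whether we set it
        if key in self.store:
            return False
        self.store[key] = value
        return True

    def get(self, key):
        return self.store.get(key)

    def getset(self, key, value):
        # swap in the new value, returning the previous one
        old = self.store.get(key)
        self.store[key] = value
        return old


r = FakeRedis()
key = 'my_set__lock'

# a crashed client left behind a lock whose expiry passed one second ago
r.setnx(key, time.time() - 1.0)

# a second client can't acquire the lock outright...
new_expiry = time.time() + 20
assert not r.setnx(key, new_expiry)

# ...but the expiry check lets it steal the stale lock: the stored
# timestamp (plus redis' precision slop) is in the past, and GETSET
# confirms nobody raced us to replace it
current = r.get(key)
stolen = (
    current is not None and
    (float(current) + REDIS_TIME_PRECISION) < time.time() and
    r.getset(key, new_expiry) == current
)
assert stolen
```

The final GETSET comparison is what keeps two waiting clients from both concluding they stole the lock: only the client that reads back the old expired timestamp wins; the loser sees the winner's fresh timestamp and keeps polling.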