├── .github
│   └── workflows
│       └── codeql-analysis.yml
├── .gitignore
├── .travis.yml
├── AUTHORS
├── LICENSE
├── MANIFEST
├── README.rst
├── queued_search
│   ├── __init__.py
│   ├── indexes.py
│   ├── management
│   │   ├── __init__.py
│   │   └── commands
│   │       ├── __init__.py
│   │       └── process_search_queue.py
│   ├── models.py
│   ├── signals.py
│   └── utils.py
├── runtests.py
├── setup.cfg
├── setup.py
├── tests
│   ├── __init__.py
│   ├── models.py
│   ├── requirements.txt
│   ├── search_indexes.py
│   └── tests.py
└── tox.ini
/.github/workflows/codeql-analysis.yml:
--------------------------------------------------------------------------------
name: "CodeQL"

on:
  push:
    branches: [master]
  pull_request:
    # The branches below must be a subset of the branches above
    branches: [master]
  schedule:
    - cron: '0 8 * * 4'

jobs:
  analyze:
    name: Analyze
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2
      with:
        # We must fetch at least the immediate parents so that if this is
        # a pull request then we can checkout the head.
        fetch-depth: 2

    # If this run was triggered by a pull request event, then checkout
    # the head of the pull request instead of the merge commit.
    - run: git checkout HEAD^2
      if: ${{ github.event_name == 'pull_request' }}

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v1

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v1
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
*.pyc
tests/whoosh_index
tests/haystack
tests/queues
env
dist
build
--------------------------------------------------------------------------------
/.travis.yml:
--------------------------------------------------------------------------------
language: python

python:
  - "2.6"
  - "2.7"
  - "3.3"

env:
  - DJANGO=https://github.com/django/django/archive/master.tar.gz
  - DJANGO=https://github.com/django/django/archive/stable/1.6.x.tar.gz
  - DJANGO="django>=1.5,<1.6"
  - DJANGO="django>=1.4,<1.5"

install:
  - pip install $DJANGO --use-mirrors
  - pip install -r tests/requirements.txt --use-mirrors

script:
  - python runtests.py

matrix:
  exclude:
    - python: "2.6"
      env: DJANGO=https://github.com/django/django/archive/master.tar.gz
    - python: "3.3"
      env: DJANGO="django>=1.4,<1.5"
  allow_failures:
    - python: "3.3"
--------------------------------------------------------------------------------
/AUTHORS:
--------------------------------------------------------------------------------
Primary:
========

* Daniel Lindsley


Contributors:
=============

* Sébastien Fievet (zyegfryed) for adding batching to the ``process_search_queue`` command.
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
Copyright (c) 2010, Daniel Lindsley.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice,
   this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
   notice, this list of conditions and the following disclaimer in the
   documentation and/or other materials provided with the distribution.

3. Neither the name of queued_search nor the names of its contributors may
   be used to endorse or promote products derived from this software without
   specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
--------------------------------------------------------------------------------
/MANIFEST:
--------------------------------------------------------------------------------
setup.py
queued_search/__init__.py
queued_search/models.py
queued_search/signals.py
queued_search/utils.py
queued_search/management/__init__.py
queued_search/management/commands/__init__.py
queued_search/management/commands/process_search_queue.py
--------------------------------------------------------------------------------
/README.rst:
--------------------------------------------------------------------------------
=============
queued_search
=============

Allows you to utilize a queue and shove updates/deletes for search into it,
keeping your pages fast and your index fresh.

For use with Haystack (http://haystacksearch.org/).

**WARNING!!!**

This project has been updated to be compatible with Haystack 2.0.x!
If you need ``queued_search`` for Haystack 1.2.X, please use the 1.0.4 tag
or ``pip install queued_search==1.0.4``!


Requirements
============

* Python 2.6+ or Python 3.3+
* Django 1.5+
* Haystack 2.0.X (http://github.com/toastdriven/django-haystack)
* Queues (http://code.google.com/p/queues/)

You also need to install your choice of one of the supported search engines for
Haystack and one of the supported queue backends for Queues.


Setup
=====

#. Add ``queued_search`` to ``INSTALLED_APPS``.
#. Set ``HAYSTACK_SIGNAL_PROCESSOR = 'queued_search.signals.QueuedSignalProcessor'`` in your settings (the old ``queued_search.indexes.QueuedSearchIndex`` approach is deprecated).
#. Ensure your queuing solution of choice is running.
#. Set up a cron job to run the ``process_search_queue`` management command.
#. PROFIT!
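The setup steps above can be sketched as a settings fragment. The ``HAYSTACK_SIGNAL_PROCESSOR`` path and the ``SEARCH_QUEUE_NAME`` default come from this package's ``signals.py`` and ``utils.py``; the Whoosh engine and the ``'dummy'`` queue backend simply mirror this repo's ``runtests.py`` and are assumptions — swap in whichever search engine and Queues backend you actually deploy:

```python
# settings.py -- a minimal sketch, not a complete Django settings module.
INSTALLED_APPS = [
    'haystack',
    'queued_search',
]

# Any supported Haystack backend works; Whoosh mirrors this repo's tests.
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
        'PATH': 'whoosh_index',
    }
}

# Route index writes into the queue instead of the request/response cycle.
HAYSTACK_SIGNAL_PROCESSOR = 'queued_search.signals.QueuedSignalProcessor'

# Queues backend; 'dummy' is what runtests.py uses for testing.
QUEUE_BACKEND = 'dummy'

# Optional override; 'haystack_search_queue' is the package default.
SEARCH_QUEUE_NAME = 'haystack_search_queue'
```

A crontab entry along the lines of ``*/5 * * * * /path/to/manage.py process_search_queue`` (the schedule and path are illustrative) then drains the queue on a regular basis.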
--------------------------------------------------------------------------------
/queued_search/__init__.py:
--------------------------------------------------------------------------------
__author__ = 'Daniel Lindsley'
__version__ = (2, 1, 0)
--------------------------------------------------------------------------------
/queued_search/indexes.py:
--------------------------------------------------------------------------------
raise DeprecationWarning("This module is no longer used. Please setup & use the ``QueuedSignalProcessor``.")
--------------------------------------------------------------------------------
/queued_search/management/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/django-haystack/queued_search/e2c565979228c897f40bb75d0ac2a58575c44382/queued_search/management/__init__.py
--------------------------------------------------------------------------------
/queued_search/management/commands/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/django-haystack/queued_search/e2c565979228c897f40bb75d0ac2a58575c44382/queued_search/management/commands/__init__.py
--------------------------------------------------------------------------------
/queued_search/management/commands/process_search_queue.py:
--------------------------------------------------------------------------------
import logging
from optparse import make_option
from queues import queues, QueueException
from django.conf import settings
from django.core.exceptions import ObjectDoesNotExist, MultipleObjectsReturned
from django.core.management.base import NoArgsCommand
from django.db.models.loading import get_model
from haystack import connections
from haystack.constants import DEFAULT_ALIAS
from haystack.exceptions import NotHandled
from queued_search.utils import get_queue_name


DEFAULT_BATCH_SIZE = None
LOG_LEVEL = getattr(settings, 'SEARCH_QUEUE_LOG_LEVEL', logging.ERROR)

logging.basicConfig(
    level=LOG_LEVEL
)

class Command(NoArgsCommand):
    help = "Consume any objects that have been queued for modification in search."
    can_import_settings = True
    base_options = (
        make_option('-b', '--batch-size', action='store', dest='batchsize',
            default=None, type='int',
            help='Number of items to index at once.'
        ),
        make_option("-u", "--using", action="store", type="string", dest="using", default=DEFAULT_ALIAS,
            help='If provided, chooses a connection to work with.'
        ),
    )
    option_list = NoArgsCommand.option_list + base_options

    def __init__(self, *args, **kwargs):
        super(Command, self).__init__(*args, **kwargs)
        self.log = logging.getLogger('queued_search')
        self.actions = {
            'update': set(),
            'delete': set(),
        }
        self.processed_updates = set()
        self.processed_deletes = set()

    def handle_noargs(self, **options):
        self.batchsize = options.get('batchsize', DEFAULT_BATCH_SIZE) or 1000
        self.using = options.get('using')
        # Setup the queue.
        self.queue = queues.Queue(get_queue_name())

        # Check if enough is there to process.
        if not len(self.queue):
            self.log.info("Not enough items in the queue to process.")

        self.log.info("Starting to process the queue.")

        # Consume the whole queue first so that we can group update/deletes
        # for efficiency.
        try:
            while True:
                message = self.queue.read()

                if not message:
                    break

                self.process_message(message)
        except QueueException:
            # We've run out of items in the queue.
            pass

        self.log.info("Queue consumed.")

        try:
            self.handle_updates()
            self.handle_deletes()
        except Exception as e:
            self.log.error('Exception seen during processing: %s' % e)
            self.requeue()
            raise e

        self.log.info("Processing complete.")

    def requeue(self):
        """
        On failure, requeue all unprocessed messages.
        """
        self.log.error('Requeuing unprocessed messages.')
        update_count = 0
        delete_count = 0

        for update in self.actions['update']:
            if not update in self.processed_updates:
                self.queue.write('update:%s' % update)
                update_count += 1

        for delete in self.actions['delete']:
            if not delete in self.processed_deletes:
                self.queue.write('delete:%s' % delete)
                delete_count += 1

        self.log.error('Requeued %d updates and %d deletes.' % (update_count, delete_count))

    def process_message(self, message):
        """
        Given a message added by the ``QueuedSearchIndex``, add it to either
        the updates or deletes for processing.
        """
        self.log.debug("Processing message '%s'..." % message)

        if not ':' in message:
            self.log.error("Unable to parse message '%s'. Moving on..." % message)
            return

        action, obj_identifier = message.split(':')
        self.log.debug("Saw '%s' on '%s'..." % (action, obj_identifier))

        if action == 'update':
            # Remove it from the delete list if it's present.
            # Since we process the queue in order, this could occur if an
            # object was deleted then readded, in which case we should ignore
            # the delete and just update the index.
            if obj_identifier in self.actions['delete']:
                self.actions['delete'].remove(obj_identifier)

            self.actions['update'].add(obj_identifier)
            self.log.debug("Added '%s' to the update list." % obj_identifier)
        elif action == 'delete':
            # Remove it from the update list if it's present.
            # Since we process the queue in order, this could occur if an
            # object was updated then deleted, in which case we should ignore
            # the update and just delete the document from the index.
            if obj_identifier in self.actions['update']:
                self.actions['update'].remove(obj_identifier)

            self.actions['delete'].add(obj_identifier)
            self.log.debug("Added '%s' to the delete list." % obj_identifier)
        else:
            self.log.error("Unrecognized action '%s'. Moving on..." % action)

    def split_obj_identifier(self, obj_identifier):
        """
        Break down the identifier representing the instance.

        Converts 'notes.note.23' into ('notes.note', 23).
        """
        bits = obj_identifier.split('.')

        if len(bits) < 2:
            self.log.error("Unable to parse object identifier '%s'. Moving on..." % obj_identifier)
            return (None, None)

        pk = bits[-1]
        # In case Django ever handles full paths...
        object_path = '.'.join(bits[:-1])
        return (object_path, pk)

    def get_model_class(self, object_path):
        """Fetch the model's class in a standardized way."""
        bits = object_path.split('.')
        app_name = '.'.join(bits[:-1])
        classname = bits[-1]
        model_class = get_model(app_name, classname)

        if model_class is None:
            self.log.error("Could not load model from '%s'. Moving on..." % object_path)
            return None

        return model_class

    def get_instance(self, model_class, pk):
        """Fetch the instance in a standardized way."""
        try:
            instance = model_class.objects.get(pk=pk)
        except ObjectDoesNotExist:
            self.log.error("Couldn't load model instance with pk #%s. Somehow it went missing?" % pk)
            return None
        except MultipleObjectsReturned:
            self.log.error("More than one object with pk #%s. Oops?" % pk)
            return None

        return instance

    def get_index(self, model_class):
        """Fetch the model's registered ``SearchIndex`` in a standardized way."""
        try:
            return connections['default'].get_unified_index().get_index(model_class)
        except NotHandled:
            self.log.error("Couldn't find a SearchIndex for %s." % model_class)
            return None

    def handle_updates(self):
        """
        Process through all updates.

        Updates are grouped by model class for maximum batching/minimized
        merging.
        """
        # For grouping same model classes for efficiency.
        updates = {}
        previous_path = None
        current_index = None

        for obj_identifier in self.actions['update']:
            (object_path, pk) = self.split_obj_identifier(obj_identifier)

            if object_path is None or pk is None:
                self.log.error("Skipping.")
                continue

            if object_path not in updates:
                updates[object_path] = []

            updates[object_path].append(pk)

        # We've got all updates grouped. Process them.
        for object_path, pks in updates.items():
            model_class = self.get_model_class(object_path)

            if object_path != previous_path:
                previous_path = object_path
                current_index = self.get_index(model_class)

            if not current_index:
                self.log.error("Skipping.")
                continue

            instances = [self.get_instance(model_class, pk) for pk in pks]

            # Filter out what we didn't find.
            instances = [instance for instance in instances if instance is not None]

            # Update the batch of instances for this class.
            # Use the backend instead of the index because we can batch the
            # instances.
            total = len(instances)
            self.log.debug("Indexing %d %s." % (total, object_path))

            for start in range(0, total, self.batchsize):
                end = min(start + self.batchsize, total)
                batch_instances = instances[start:end]

                self.log.debug(" indexing %s - %d of %d." % (start+1, end, total))
                current_index._get_backend(self.using).update(current_index, batch_instances)

                for updated in batch_instances:
                    self.processed_updates.add("%s.%s" % (object_path, updated.pk))

            self.log.debug("Updated objects for '%s': %s" % (object_path, ", ".join(pks)))

    def handle_deletes(self):
        """
        Process through all deletes.

        Deletes are grouped by model class for maximum batching.
        """
        deletes = {}
        previous_path = None
        current_index = None

        for obj_identifier in self.actions['delete']:
            (object_path, pk) = self.split_obj_identifier(obj_identifier)

            if object_path is None or pk is None:
                self.log.error("Skipping.")
                continue

            if object_path not in deletes:
                deletes[object_path] = []

            deletes[object_path].append(obj_identifier)

        # We've got all deletes grouped. Process them.
        for object_path, obj_identifiers in deletes.items():
            model_class = self.get_model_class(object_path)

            if object_path != previous_path:
                previous_path = object_path
                current_index = self.get_index(model_class)

            if not current_index:
                self.log.error("Skipping.")
                continue

            pks = []

            for obj_identifier in obj_identifiers:
                current_index.remove_object(obj_identifier, using=self.using)
                pks.append(self.split_obj_identifier(obj_identifier)[1])
                self.processed_deletes.add(obj_identifier)

            self.log.debug("Deleted objects for '%s': %s" % (object_path, ", ".join(pks)))
--------------------------------------------------------------------------------
/queued_search/models.py:
--------------------------------------------------------------------------------
# O HAI.
# Faking ``models.py`` so Django sees the app/management command.
--------------------------------------------------------------------------------
/queued_search/signals.py:
--------------------------------------------------------------------------------
from queues import queues
from django.db import models
from haystack.signals import BaseSignalProcessor
from haystack.utils import get_identifier
from queued_search.utils import get_queue_name


class QueuedSignalProcessor(BaseSignalProcessor):
    def setup(self):
        models.signals.post_save.connect(self.enqueue_save)
        models.signals.post_delete.connect(self.enqueue_delete)

    def teardown(self):
        models.signals.post_save.disconnect(self.enqueue_save)
        models.signals.post_delete.disconnect(self.enqueue_delete)

    def enqueue_save(self, sender, instance, **kwargs):
        return self.enqueue('update', instance)

    def enqueue_delete(self, sender, instance, **kwargs):
        return self.enqueue('delete', instance)

    def enqueue(self, action, instance):
        """
        Shoves a message about how to update the index into the queue.

        This is a standardized string, resembling something like::

            ``update:notes.note.23``
            # ...or...
            ``delete:weblog.entry.8``
        """
        message = "%s:%s" % (action, get_identifier(instance))
        queue = queues.Queue(get_queue_name())
        return queue.write(message)
--------------------------------------------------------------------------------
/queued_search/utils.py:
--------------------------------------------------------------------------------
from django.conf import settings


def get_queue_name():
    """
    Standardized way to fetch the queue name.

    Can be overridden by specifying ``SEARCH_QUEUE_NAME`` in your settings.

    Given that the queue name is used in disparate places, this is primarily
    for sanity.
    """
    return getattr(settings, 'SEARCH_QUEUE_NAME', 'haystack_search_queue')
--------------------------------------------------------------------------------
/runtests.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
import os
import sys
import logging

from django.conf import settings


settings.configure(
    DATABASES={
        'default': {'ENGINE': 'django.db.backends.sqlite3', 'NAME': ':memory:'}
    },
    INSTALLED_APPS=[
        'haystack',
        'queued_search',
        'tests',
    ],
    HAYSTACK_CONNECTIONS={
        'default': {
            'ENGINE': 'haystack.backends.whoosh_backend.WhooshEngine',
            'PATH': os.path.join(os.path.dirname(__file__), 'whoosh_index')
        }
    },
    HAYSTACK_SIGNAL_PROCESSOR='queued_search.signals.QueuedSignalProcessor',
    QUEUE_BACKEND='dummy',
    SEARCH_QUEUE_LOG_LEVEL=logging.DEBUG
)


def runtests(*test_args):
    import django.test.utils

    runner_class = django.test.utils.get_runner(settings)
    test_runner = runner_class(verbosity=1, interactive=True, failfast=False)
    failures = test_runner.run_tests(['tests'])
    sys.exit(failures)


if __name__ == '__main__':
    runtests()
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
[wheel]
universal = 1
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-
try:
    from setuptools import setup
except ImportError:
    from distutils.core import setup

setup(
    name='queued_search',
    version='2.1.0',
    description='A queuing setup for integration with Haystack.',
    author='Daniel Lindsley',
    author_email='daniel@toastdriven.com',
    url='http://github.com/toastdriven/queued_search',
    packages=[
        'queued_search',
        'queued_search.management',
        'queued_search.management.commands',
    ],
    classifiers=[
        'Development Status :: 5 - Production/Stable',
        'Environment :: Web Environment',
        'Framework :: Django',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: BSD License',
        'Operating System :: OS Independent',
        'Programming Language :: Python',
        'Programming Language :: Python :: 2',
        'Programming Language :: Python :: 3',
        'Topic :: Utilities'
    ],
)
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/django-haystack/queued_search/e2c565979228c897f40bb75d0ac2a58575c44382/tests/__init__.py
--------------------------------------------------------------------------------
/tests/models.py:
--------------------------------------------------------------------------------
import datetime
from django.db import models


# Ghetto app!
class Note(models.Model):
    title = models.CharField(max_length=128)
    content = models.TextField()
    author = models.CharField(max_length=64)
    created = models.DateTimeField(default=datetime.datetime.now)

    def __unicode__(self):
        return self.title
--------------------------------------------------------------------------------
/tests/requirements.txt:
--------------------------------------------------------------------------------
django-haystack>=2.1.0,<2.2
queues
Whoosh
--------------------------------------------------------------------------------
/tests/search_indexes.py:
--------------------------------------------------------------------------------
from haystack import indexes
from .models import Note


# Simplest possible subclass that could work.
class NoteIndex(indexes.SearchIndex, indexes.Indexable):
    text = indexes.CharField(document=True, model_attr='content')

    def get_model(self):
        return Note
--------------------------------------------------------------------------------
/tests/tests.py:
--------------------------------------------------------------------------------
import logging
from queues import queues, QueueException
from django.core.management import call_command
from django.test import TestCase
from haystack import connections
from haystack.query import SearchQuerySet
from queued_search.management.commands.process_search_queue import Command as ProcessSearchQueueCommand
from queued_search.utils import get_queue_name
from .models import Note


class AssertableHandler(logging.Handler):
    stowed_messages = []

    def emit(self, record):
        AssertableHandler.stowed_messages.append(record.getMessage())


assertable = AssertableHandler()
logging.getLogger('queued_search').addHandler(assertable)


class QueuedSearchIndexTestCase(TestCase):
    def setUp(self):
        super(QueuedSearchIndexTestCase, self).setUp()

        # Nuke the queue.
        queues.delete_queue(get_queue_name())

        # Nuke the index.
        call_command('clear_index', interactive=False, verbosity=0)

        # Get a queue connection so we can poke at it.
        self.queue = queues.Queue(get_queue_name())

    def test_update(self):
        self.assertEqual(len(self.queue), 0)

        note1 = Note.objects.create(
            title='A test note',
            content='Because everyone loves test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 1)

        note2 = Note.objects.create(
            title='Another test note',
            content='More test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 2)

        note3 = Note.objects.create(
            title='Final test note',
            content='The test data. All done.',
            author='Joe'
        )

        self.assertEqual(len(self.queue), 3)

        note3.title = 'Final test note FOR REAL'
        note3.save()

        self.assertEqual(len(self.queue), 4)

        # Pull the whole queue.
        messages = []

        try:
            while True:
                messages.append(self.queue.read())
        except QueueException:
            # We're out of queued bits.
            pass

        self.assertEqual(messages, [u'update:tests.note.1', u'update:tests.note.2', u'update:tests.note.3', u'update:tests.note.3'])

    def test_delete(self):
        note1 = Note.objects.create(
            title='A test note',
            content='Because everyone loves test data.',
            author='Daniel'
        )
        note2 = Note.objects.create(
            title='Another test note',
            content='More test data.',
            author='Daniel'
        )
        note3 = Note.objects.create(
            title='Final test note',
            content='The test data. All done.',
            author='Joe'
        )

        # Dump the queue in preparation for the deletes.
        queues.delete_queue(get_queue_name())
        self.queue = queues.Queue(get_queue_name())

        self.assertEqual(len(self.queue), 0)
        note1.delete()
        self.assertEqual(len(self.queue), 1)
        note2.delete()
        self.assertEqual(len(self.queue), 2)
        note3.delete()
        self.assertEqual(len(self.queue), 3)

        # Pull the whole queue.
        messages = []

        try:
            while True:
                messages.append(self.queue.read())
        except QueueException:
            # We're out of queued bits.
            pass

        self.assertEqual(messages, [u'delete:tests.note.1', u'delete:tests.note.2', u'delete:tests.note.3'])

    def test_complex(self):
        self.assertEqual(len(self.queue), 0)

        note1 = Note.objects.create(
            title='A test note',
            content='Because everyone loves test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 1)

        note2 = Note.objects.create(
            title='Another test note',
            content='More test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 2)

        note1.delete()
        self.assertEqual(len(self.queue), 3)

        note3 = Note.objects.create(
            title='Final test note',
            content='The test data. All done.',
            author='Joe'
        )

        self.assertEqual(len(self.queue), 4)

        note3.title = 'Final test note FOR REAL'
        note3.save()

        self.assertEqual(len(self.queue), 5)

        note3.delete()
        self.assertEqual(len(self.queue), 6)

        # Pull the whole queue.
        messages = []

        try:
            while True:
                messages.append(self.queue.read())
        except QueueException:
            # We're out of queued bits.
            pass

        self.assertEqual(messages, [u'update:tests.note.1', u'update:tests.note.2', u'delete:tests.note.1', u'update:tests.note.3', u'update:tests.note.3', u'delete:tests.note.3'])


class ProcessSearchQueueTestCase(TestCase):
    def setUp(self):
        super(ProcessSearchQueueTestCase, self).setUp()

        # Nuke the queue.
        queues.delete_queue(get_queue_name())

        # Nuke the index.
        call_command('clear_index', interactive=False, verbosity=0)

        # Get a queue connection so we can poke at it.
        self.queue = queues.Queue(get_queue_name())

        # Clear out and capture log messages.
        AssertableHandler.stowed_messages = []

        self.psqc = ProcessSearchQueueCommand()

    def test_process_message(self):
        self.assertEqual(self.psqc.actions, {'update': set([]), 'delete': set([])})

        self.psqc.process_message('update:tests.note.1')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.1']), 'delete': set([])})

        self.psqc.process_message('delete:tests.note.2')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.1']), 'delete': set(['tests.note.2'])})

        self.psqc.process_message('update:tests.note.2')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.1', 'tests.note.2']), 'delete': set([])})

        self.psqc.process_message('delete:tests.note.1')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.2']), 'delete': set(['tests.note.1'])})

        self.psqc.process_message('wtfmate:tests.note.1')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.2']), 'delete': set(['tests.note.1'])})

        self.psqc.process_message('just plain wrong')
        self.assertEqual(self.psqc.actions, {'update': set(['tests.note.2']), 'delete': set(['tests.note.1'])})

    def test_split_obj_identifier(self):
        self.assertEqual(self.psqc.split_obj_identifier('tests.note.1'), ('tests.note', '1'))
        self.assertEqual(self.psqc.split_obj_identifier('myproject.tests.note.73'), ('myproject.tests.note', '73'))
        self.assertEqual(self.psqc.split_obj_identifier('wtfmate.1'), ('wtfmate', '1'))
        self.assertEqual(self.psqc.split_obj_identifier('wtfmate'), (None, None))

    def test_processing(self):
        self.assertEqual(len(self.queue), 0)

        note1 = Note.objects.create(
            title='A test note',
            content='Because everyone loves test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 1)

        note2 = Note.objects.create(
            title='Another test note',
            content='More test data.',
            author='Daniel'
        )

        self.assertEqual(len(self.queue), 2)

        note1.delete()
        self.assertEqual(len(self.queue), 3)

        note3 = Note.objects.create(
            title='Final test note',
            content='The test data. All done.',
            author='Joe'
        )

        self.assertEqual(len(self.queue), 4)

        note3.title = 'Final test note FOR REAL'
        note3.save()

        self.assertEqual(len(self.queue), 5)

        note3.delete()
        self.assertEqual(len(self.queue), 6)

        self.assertEqual(AssertableHandler.stowed_messages, [])

        # Call the command.
258 | call_command('process_search_queue') 259 | 260 | self.assertEqual(AssertableHandler.stowed_messages, [ 261 | 'Starting to process the queue.', 262 | u"Processing message 'update:tests.note.1'...", 263 | u"Saw 'update' on 'tests.note.1'...", 264 | u"Added 'tests.note.1' to the update list.", 265 | u"Processing message 'update:tests.note.2'...", 266 | u"Saw 'update' on 'tests.note.2'...", 267 | u"Added 'tests.note.2' to the update list.", 268 | u"Processing message 'delete:tests.note.1'...", 269 | u"Saw 'delete' on 'tests.note.1'...", 270 | u"Added 'tests.note.1' to the delete list.", 271 | u"Processing message 'update:tests.note.3'...", 272 | u"Saw 'update' on 'tests.note.3'...", 273 | u"Added 'tests.note.3' to the update list.", 274 | u"Processing message 'update:tests.note.3'...", 275 | u"Saw 'update' on 'tests.note.3'...", 276 | u"Added 'tests.note.3' to the update list.", 277 | u"Processing message 'delete:tests.note.3'...", 278 | u"Saw 'delete' on 'tests.note.3'...", 279 | u"Added 'tests.note.3' to the delete list.", 280 | 'Queue consumed.', 281 | u'Indexing 1 tests.note.', 282 | ' indexing 1 - 1 of 1.', 283 | u"Updated objects for 'tests.note': 2", 284 | u"Deleted objects for 'tests.note': 1, 3", 285 | 'Processing complete.' 286 | ]) 287 | self.assertEqual(SearchQuerySet().all().count(), 1) 288 | 289 | def test_requeuing(self): 290 | self.assertEqual(len(self.queue), 0) 291 | 292 | note1 = Note.objects.create( 293 | title='A test note', 294 | content='Because everyone loves test data.', 295 | author='Daniel' 296 | ) 297 | 298 | self.assertEqual(len(self.queue), 1) 299 | 300 | # Write a failed message. 301 | self.queue.write('update:tests.note.abc') 302 | self.assertEqual(len(self.queue), 2) 303 | 304 | self.assertEqual(AssertableHandler.stowed_messages, []) 305 | 306 | try: 307 | # Call the command, which will fail. 
308 | call_command('process_search_queue') 309 | except Exception: 310 | # We don't care that it failed. We just want to examine the state 311 | # of things afterward. 312 | pass 313 | else: 314 | self.fail("The command ran successfully, which is incorrect behavior in this case.") 315 | self.assertEqual(len(self.queue), 2) 316 | 317 | # Pull the whole queue. 318 | messages = [] 319 | 320 | try: 321 | while True: 322 | messages.append(self.queue.read()) 323 | except QueueException: 324 | # We're out of queued bits. 325 | pass 326 | 327 | self.assertEqual(messages, [u'update:tests.note.1', 'update:tests.note.abc']) 328 | self.assertEqual(len(self.queue), 0) 329 | 330 | self.assertEqual(AssertableHandler.stowed_messages, [ 331 | 'Starting to process the queue.', 332 | u"Processing message 'update:tests.note.1'...", 333 | u"Saw 'update' on 'tests.note.1'...", 334 | u"Added 'tests.note.1' to the update list.", 335 | "Processing message 'update:tests.note.abc'...", 336 | "Saw 'update' on 'tests.note.abc'...", 337 | "Added 'tests.note.abc' to the update list.", 338 | 'Queue consumed.', 339 | "Exception seen during processing: invalid literal for int() with base 10: 'abc'", 340 | 'Requeuing unprocessed messages.', 341 | 'Requeued 2 updates and 0 deletes.' 342 | ]) 343 | 344 | # Start over. 345 | note1 = Note.objects.create( 346 | title='A test note', 347 | content='Because everyone loves test data.', 348 | author='Daniel' 349 | ) 350 | 351 | self.assertEqual(len(self.queue), 1) 352 | 353 | note2 = Note.objects.create( 354 | title='Another test note', 355 | content='Because everyone loves test data.', 356 | author='Daniel' 357 | ) 358 | 359 | self.assertEqual(len(self.queue), 2) 360 | 361 | # Now delete it. 362 | note2.delete() 363 | 364 | # Write a failed message.
365 | self.queue.write('delete:tests.note.abc') 366 | self.assertEqual(len(self.queue), 4) 367 | 368 | AssertableHandler.stowed_messages = [] 369 | self.assertEqual(AssertableHandler.stowed_messages, []) 370 | 371 | try: 372 | # Call the command, which will fail again. 373 | call_command('process_search_queue') 374 | except Exception: 375 | # We don't care that it failed. We just want to examine the state 376 | # of things afterward. 377 | pass 378 | else: 379 | self.fail("The command ran successfully, which is incorrect behavior in this case.") 380 | # Everything but the bad bit of data should have processed. 381 | self.assertEqual(len(self.queue), 1) 382 | 383 | # Pull the whole queue. 384 | messages = [] 385 | 386 | try: 387 | while True: 388 | messages.append(self.queue.read()) 389 | except QueueException: 390 | # We're out of queued bits. 391 | pass 392 | 393 | self.assertEqual(messages, ['delete:tests.note.abc']) 394 | self.assertEqual(len(self.queue), 0) 395 | 396 | self.assertEqual(AssertableHandler.stowed_messages, [ 397 | 'Starting to process the queue.', 398 | u"Processing message 'update:tests.note.2'...", 399 | u"Saw 'update' on 'tests.note.2'...", 400 | u"Added 'tests.note.2' to the update list.", 401 | u"Processing message 'update:tests.note.3'...", 402 | u"Saw 'update' on 'tests.note.3'...", 403 | u"Added 'tests.note.3' to the update list.", 404 | u"Processing message 'delete:tests.note.3'...", 405 | u"Saw 'delete' on 'tests.note.3'...", 406 | u"Added 'tests.note.3' to the delete list.", 407 | "Processing message 'delete:tests.note.abc'...", 408 | "Saw 'delete' on 'tests.note.abc'...", 409 | "Added 'tests.note.abc' to the delete list.", 410 | 'Queue consumed.', 411 | u'Indexing 1 tests.note.', 412 | ' indexing 1 - 1 of 1.', 413 | u"Updated objects for 'tests.note': 2", 414 | "Exception seen during processing: Provided string 'tests.note.abc' is not a valid identifier.", 415 | 'Requeuing unprocessed messages.', 416 | 'Requeued 0 updates and 1 deletes.'
417 | ]) 418 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = 3 | py26-django14, 4 | py26-django15, 5 | py26-django16, 6 | py27-django14, 7 | py27-django15, 8 | py27-django16, 9 | py27-django-master, 10 | py33-django15, 11 | py33-django16 12 | 13 | [testenv] 14 | deps= 15 | Whoosh 16 | django-haystack>=2.1.0,<2.2 17 | queues 18 | commands = 19 | python runtests.py 20 | 21 | 22 | [django14] 23 | deps = 24 | Django>=1.4,<1.5 25 | 26 | [django15] 27 | deps = 28 | Django>=1.5,<1.6 29 | 30 | [django16] 31 | deps = 32 | https://github.com/django/django/archive/stable/1.6.x.tar.gz 33 | 34 | [django-master] 35 | deps = 36 | https://github.com/django/django/archive/master.tar.gz 37 | 38 | 39 | [testenv:py26-django14] 40 | basepython = python2.6 41 | deps = 42 | {[testenv]deps} 43 | {[django14]deps} 44 | 45 | [testenv:py26-django15] 46 | basepython = python2.6 47 | deps = 48 | {[testenv]deps} 49 | {[django15]deps} 50 | 51 | [testenv:py26-django16] 52 | basepython = python2.6 53 | deps = 54 | {[testenv]deps} 55 | {[django16]deps} 56 | 57 | 58 | [testenv:py27-django14] 59 | basepython = python2.7 60 | deps = 61 | {[testenv]deps} 62 | {[django14]deps} 63 | 64 | [testenv:py27-django15] 65 | basepython = python2.7 66 | deps = 67 | {[testenv]deps} 68 | {[django15]deps} 69 | 70 | [testenv:py27-django16] 71 | basepython = python2.7 72 | deps = 73 | {[testenv]deps} 74 | {[django16]deps} 75 | 76 | [testenv:py27-django-master] 77 | basepython = python2.7 78 | deps = 79 | {[testenv]deps} 80 | {[django-master]deps} 81 | 82 | 83 | [testenv:py33-django15] 84 | basepython = python3.3 85 | deps = 86 | {[testenv]deps} 87 | {[django15]deps} 88 | 89 | [testenv:py33-django16] 90 | basepython = python3.3 91 | deps = 92 | {[testenv]deps} 93 | {[django16]deps} 94 | 95 | [testenv:py33-django-master] 96 | basepython = python3.3 97 | deps = 98 | 
{[testenv]deps} 99 | {[django-master]deps} 100 | --------------------------------------------------------------------------------