├── .gitignore ├── .project ├── .pydevproject ├── README.txt ├── __init__.py ├── django_cassandra ├── __init__.py └── db │ ├── __init__.py │ ├── base.py │ ├── compiler.py │ ├── creation.py │ ├── introspection.py │ ├── predicate.py │ └── utils.py ├── manage.py ├── settings.py ├── tests ├── __init__.py ├── admin.py ├── models.py ├── tests.py └── views.py └── urls.py /.gitignore: -------------------------------------------------------------------------------- 1 | *~ 2 | *.pyc 3 | *.pyo 4 | *.egg-info/ 5 | -------------------------------------------------------------------------------- /.project: -------------------------------------------------------------------------------- 1 | 2 | 3 | django_cassandra_backend 4 | 5 | 6 | 7 | 8 | 9 | org.python.pydev.PyDevBuilder 10 | 11 | 12 | 13 | 14 | 15 | org.python.pydev.pythonNature 16 | 17 | 18 | -------------------------------------------------------------------------------- /.pydevproject: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | Default 6 | python 2.6 7 | 8 | 9 | -------------------------------------------------------------------------------- /README.txt: -------------------------------------------------------------------------------- 1 | Introduction 2 | ============ 3 | This is an early development release of a Django backend for the Cassandra database. 4 | It has only been under development for a short time and there are almost certainly 5 | issues/bugs with this release -- see the end of this document for a list of known 6 | issues. Needless to say, you shouldn't use this release in a production setting, the 7 | format of the data stored in Cassandra may change in future versions, there's no 8 | promise of backwards compatibility with this version, and so on. 9 | 10 | Please let me know if you find any bugs or have any suggestions for how to improve 11 | the backend. You can contact me at: rob.vaterlaus@gmail.com 12 | 13 | Installation 14 | ============ 15 | The backend requires at least the 0.7 version of Cassandra. 0.7 has several features 16 | (e.g. programmatic creation/deletion of keyspaces & column families, secondary index 17 | support) that are useful for the Django database backend, so I targeted that 18 | instead of 0.6. Unfortunately, the Cassandra Thrift API changed between 0.6 and 0.7, 19 | so the two version are incompatible. 20 | 21 | I currently use the 1.0.10 release. That's the only version I test against, so no 22 | promises if you try it with a different version. I have tested earlier versions 23 | against the 0.7.x and 0.8.x versions of Cassandra with no problem, so I would expect 24 | that it would still work. 25 | 26 | If you're updating from a previous version of the Cassandra DB backend, then it's 27 | possible/likely that the format it stores models/fields in Cassandra has changed, 28 | so you should wipe your Cassandra database. If you're using the default locations 29 | for things, then this should involve executing something like "rm -rf /var/log/cassandra/*" 30 | and "rm -rf /var/lib/cassandra/*". At some point as the backend becomes more stable 31 | data format compatibility or migration will be supported, but for now it's not worth 32 | the effort. 33 | 34 | The backend also requires the Django-nonrel fork of Django and djangotoolbox. 35 | Both are available here: . 36 | I installed the Django-nonrel version of Django globally in site-packages and 37 | copied djangotoolbox into the directory where I'm testing the Cassandra backend, 38 | but there are probably other (better, e.g. virtualenv) ways to install those things. 39 | I'm using the current (as of 11/1/2011) version of both packages. The Django-nonrel is 40 | based on the 1.3 beta 1 release of Django and the version of djangotoolbox is 0.9.2. 41 | 42 | You also need to generate the Python Thrift API code as described in the Cassandra 43 | documentation and copy the generated "cassandra" directory (from Cassandra's 44 | interface/gen-py directory) over to the top-level Django project directory. 45 | You should use the 0.6.x version of Thrift if you're using the 0.8 or higher version 46 | of Cassandra. You should use the 0.5.x version of Thrift if you're using 0.7. 47 | 48 | To configure a project to use the Cassandra backend all you have to do is change 49 | the database settings in the settings.py file. Change the ENGINE value to be 50 | 'django_cassandra.db' and the NAME value to be the name of the keyspace to use. 51 | You also need to set the SUPPORTS_TRANSACTIONS setting to False, since Cassandra 52 | doesn't support transactions. You can set HOST and PORT to specify the host and 53 | port where the Cassandra daemon process is running. If these aren't specified 54 | then the backend uses default values of 'localhost' and 9160. You can also set 55 | USER and PASSWORD if you're using authentication with Cassandra. You can also set 56 | a few optional Cassandra-specific settings in the database settings. Set the 57 | CASSANDRA_REPLICATION_FACTOR and CASSANDRA_STRATEGY_CLASS settings to be the 58 | replication factor and strategy class value you want to use when the backend 59 | creates the keyspace during syncdb. The default values for these settings are 60 | 1 and "org.apache.cassandra.locator.SimpleStrategy". You can also define 61 | CASSANDRA_READ_CONSISTENCY_LEVEL and CASSANDRA_WRITE_CONSISTENCY_LEVEL to be 62 | the values you want to use for the consistency level for read and write 63 | operations. If you want to use different consistency level values for 64 | different operations or different column families then it should work to 65 | use the Django multiple database support to define different database 66 | settings with different consistency levels and use the appropriate one, 67 | but I haven't tested this to verify that it works. 68 | 69 | Configure Cassandra as described in the Cassandra documentation. 70 | If want to be able to do range queries over primary keys then you need to set the 71 | partitioner in the cassandra.yaml config file to be the OrderPreservingPartitioner. 72 | 73 | Once you're finished configuring Cassandra start up the Cassandra daemon process as 74 | described in the Cassandra documentation. 75 | 76 | Run syncdb. This creates the keyspace (if necessary) and the column families for the 77 | models in the installed apps. The Cassandra backend creates one column family per 78 | model. It uses the db_table value from the meta settings for the name of the 79 | column family if it's specified; otherwise it uses the default name similar to 80 | other backends. 81 | 82 | Now you should be able to use the normal model and query set calls from your 83 | Django code. 84 | 85 | The backend supports query set update operations. This doesn't have the same 86 | transactional semantics that it would have on a relational database, but it 87 | does mean that you can use the backend with code that depends on this feature. 88 | In particular it means that cascading deletes are now supported. For large data 89 | sets cascading deletes are typically a bad idea, so they are disabled by default. 90 | To enable them you define a value in the database settings dictionary named 91 | "CASSANDRA_ENABLE_CASCADING_DELETES" whose value is True. 92 | 93 | The backend supports automatic construction of compound id/pk fields that 94 | are composed of the values of other fields in the model. You would typically 95 | use this when you have some subset of the fields in the model that together 96 | uniquely identify that particular instance of the model. Compound key generation 97 | is enabled for a model by defining a class variable named COMPOUND_KEY_FIELDS 98 | in a nested class called "CassandraSettings" of the model. The value of the 99 | COMPOUND_KEY_FIELDS value is a tuple of the names of the fields that are used 100 | to form the compound key. By default the field values are separated by the '|' 101 | character, but this separator value can be overridden by defining a class 102 | variable in the CassandraSettings named COMPOUND_KEY_SEPARATOR whose value is 103 | the character to use as the separator. 104 | 105 | This release includes a test project and app. If you want to use the backend in 106 | another project you can copy the django_cassandra directory to the 107 | top-level directory of the project (along with the cassandra and djangotoolbox 108 | directories) or else make sure that these are installed in your environment. 109 | 110 | What Works 111 | ========== 112 | - the basics: creating model instances, querying (get/filter/exclude), count, 113 | update/save, delete, order_by 114 | - efficient queries for exact matches on the primary key. It can also do range 115 | queries on the primary key, but your Cassandra cluster must be configured to use the 116 | OrderPreservingPartitioner if you want to do that. Unfortunately, currently it 117 | doesn't fail gracefully if you try to do range queries when using the 118 | RandomPartitioner, so just don't do that :-) 119 | - inefficient queries for everything else that can't be done efficiently in 120 | Cassandra. The basic approach used in the query processing code is to first try 121 | to prune the number of rows to look at by finding a part of the query that can 122 | be evaluated efficiently (i.e. a primary key filter predicate or an exact match 123 | secondary index predicate). Then it evaluates the remaining filter 124 | predicates over the pruned rows to obtain the final result. If there's no part 125 | of the query that can be evaluated efficiently, then it just fetches the entire 126 | set of rows and does all of the filtering in the backend code. 127 | - programmatic creation of the keyspace & column families via syncdb 128 | - Django admin UI, except for users in the auth application (see below) 129 | - I think all of the filter operations (e.g. gt, startswith, regex, etc.) are supported 130 | although it's possible I missed something 131 | - complex queries with Q nodes 132 | - basic secondary index support. If the db_index attribute of a field is set to True, 133 | then the backend configures the column family to index on that field/column. 134 | Currently Cassandra only supports exact match queries with the secondary 135 | indexes, so the support is limited. Range queries on columns with secondary indexes 136 | will still be inefficient. 137 | - support for query update operations (and thus cascading deletes, but that's 138 | disabled by default) 139 | 140 | What Doesn't Work (Yet) 141 | ======================= 142 | - I haven't tested all of the different field types, so there are probably 143 | issues there with how the data is converted to and from Cassandra with some of the 144 | field types. My use case was mostly string fields, so most of the testing was with 145 | that. I've also tried out integer, float, boolean, date, datetime, time, text 146 | and decimal fields, so I think those should work too, but I haven't tested all 147 | of the possible field types. 148 | - joins 149 | - chunked queries. It just tries to get everything all at once from Cassandra. 150 | Currently the maximum number of keys/rows that it can fetch (i.e. the count 151 | value in the Cassandra Thrift API) defaults semi-arbitrarily to 1000000, so 152 | if you try to query over a column family with more returned rows than that 153 | it won't work (and if you're anywhere near approaching that limit you're going 154 | to be using gobs of memory). Similarly, there's a limit of 10000 for the number 155 | of columns returned in a given row. It's doubtful that anyone would come 156 | anywhere near that limit, since that is dictated by the number of fields there 157 | are in the Django model. You override either/both of these limits by setting 158 | the CASSANDRA_MAX_KEY_COUNT and/or CASSANDRA_MAX_COLUMN_COUNT settings in the 159 | database settings in settings.py. 160 | - ListModel/ListField support from djangotoolbox (I think?). I haven't 161 | investigated how this works and if it's feasible to support in Cassandra, 162 | although I'm guessing it probably wouldn't be too hard. For now, this means 163 | that several of the unit tests from djangotoolbox fail if you have that 164 | in your installed apps. I made a preliminary pass to try to get this to 165 | work, but it turned out to be more difficult than expected, so it exists 166 | in a partially-completed form in the source. 167 | - probably a lot of other stuff that I've forgotten or am unaware of :-) 168 | 169 | Known Issues 170 | ============ 171 | - I haven't been able to get the admin UI to work for users in the Django 172 | authentication middleware. I included djangotoolbox in my installed apps, as 173 | suggested on the Django-nonrel web site, which got my further, but I still get 174 | an error in some Django template code that tries to render a change list (I think). 175 | I still need to track down what's going on there. 176 | - There's a reported issues with using unicode strings. At this point it's 177 | still unclear whether this is a problem in the Django backend or in the 178 | Python Thrift bindings to Cassandra. I've think I've fixed all of the obvious 179 | places in the backend code to deal properly with Unicode strings, but it's 180 | possible/probable there are some remaining issues. The reported problem is with 181 | using non-ASCII characters in the model definitions. This triggers an exception 182 | during syncdb, so for now just don't do that. It hasn't been tested yet 183 | whether there's a problem with simply storing Unicode strings as the field 184 | values (as opposed to the model/field names). 185 | - There are a few unit tests that fail in the sites middleware. These don't fail 186 | with the other nonrel backends, so it's a bug/limitation in the Cassandra backend. 187 | - If you enable the authentication and session middleware a bunch of the 188 | associated unit tests fail if you run all of the unit tests. 189 | Waldemar says that it's expected that some of these unit tests will fail, 190 | because they rely on joins which aren't supported yet. I haven't verified yet 191 | that all of the failures are because of joins, though. 192 | - the code needs a cleanup pass for things like the exception handling/safety, 193 | some refactoring, more pydoc comments, etc. 194 | - I have a feeling there are some places where I haven't completely leveraged 195 | the code in djangotoolbox, so there may be places where I haven't done 196 | things in the optimal way 197 | - the error handling/messaging isn't great for things like the Cassandra 198 | daemon not running, a versioning mismatch between client and Cassandra 199 | daemon, etc. Currently you just get a somewhat uninformative exception in 200 | these cases. 201 | 202 | Changes for 0.2.4 203 | ================= 204 | - switch the timestamp format to use the system time in microseconds to be 205 | consistent with the standard Cassandra timestamps used by other Cassandra 206 | components (e.g. the Cassandra CLI) and to hopefully eliminate issues with 207 | timestamp collisions across multiple Django processes. 208 | 209 | Changes for 0.2.3 210 | ================= 211 | - fixed a bug with the retry/reconnect logic where it would use a stale Cassandra 212 | Client object. 213 | 214 | Changes for 0.2.2 215 | ================= 216 | - fixed a bug with handling delete operations where it would sometimes incorrectly 217 | delete all items whose values were a substring of the specified query value 218 | instead of only if there was an exact match. 219 | 220 | Changes for 0.2.1 221 | ================= 222 | 223 | - Fixed typo in the CassandraAccessError class 224 | - Added support for customizing the arguments that are used to create the 225 | keyspace. In particular this allows you to specify the durable_writes 226 | setting that was added in Cassandra 1.0 if you want to disable that for 227 | a keyspace. 228 | 229 | Changes for 0.2 230 | =============== 231 | - added support for automatic construction of compound id/pk fields that 232 | are composed of the values of other fields in the model. You would typically 233 | use this when you have some subset of the fields in the model that together 234 | uniquely identify that particular instance of the model. Compound key generation 235 | is enabled for a model by defining a class variable named COMPOUND_KEY_FIELDS 236 | in a nested class called "CassandraSettings" of the model. The value of the 237 | COMPOUND_KEY_FIELDS value is a tuple of the names of the fields that are used 238 | to form the compound key. By default the field values are separated by the '|' 239 | character, but this separator value can be overridden by defining a class 240 | variable in the CassandraSettings named COMPOUND_KEY_SEPARATOR whose value is 241 | the character to use as the separator. 242 | - added support for running under the 0.8 version of Cassandra. This included 243 | fixing a bug where the secondary index names were not properly scoped with 244 | its associated column family (which "worked" before because Cassandra wasn't 245 | properly checking for conflicts) and properly setting the replication factor 246 | as a strategy option instead of a field in the KsDef struct. The code checks 247 | the API version to detect whether it's running against the 0.7 or 0.8 version 248 | of Cassandra, so it still works under 0.7. 249 | - support for query set update operations 250 | - support for cascading deletes (disabled by default) 251 | - fixed some bugs in the constructors of some exception classes 252 | - cleaned up the code for handling reconnecting to Cassandra if there's a 253 | disruption in the connection (e.g. Cassandra restarting). 254 | 255 | Changes for 0.1.7 256 | ================= 257 | 258 | - Made the max key/column counts bigger as a temporary workaround for large queries. 259 | Really need to support chunked operations for this to work better. 260 | 261 | Changes for 0.1.6 262 | ================= 263 | 264 | - Fixed a bug with handling default values of fields 265 | 266 | Changes For 0.1.5 267 | ================= 268 | 269 | - Fixed a bug with the Cassandra reconnection logic 270 | 271 | Changes For 0.1.4 272 | ================= 273 | 274 | - Fixed a bug with the id field not being properly initialized if the model 275 | instance is created with no intialization arguments. 276 | - Added unit tests for the bugs that were fixed recently 277 | - Thanks to Abd Allah Diab for reporting this bug 278 | 279 | Changes For 0.1.3 280 | ================= 281 | 282 | - Fixed a bug with query set filter operations if there were multiple filters 283 | on indexed fields (e.g. foreign key fields) 284 | - Fixed a bug with order_by operations on foreign key fields 285 | - Thanks to Abd Allah Diab for reporting these bugs 286 | 287 | Changes For 0.1.2 288 | ================= 289 | 290 | - Added support for configuring the column family definition settings so that 291 | you can tune the various memtable, row/key cache, & compaction settings. 292 | You can configure global default settings in the datbase settings in 293 | settings.py and you can have per-model overrides for the column family 294 | associated with each model. For the global settings you define an item 295 | in the dictionary of database settings whose key is named 296 | CASSANDRA_COLUMN_FAMILY_DEF_DEFAULT_SETTINGS and whose value is a dictionary 297 | of the optional keyword arguments to be passed to the CfDef constructor. 298 | Consult the Cassandra docs for the list of valid keyword args to use. 299 | Currently the per-model settings overrides are specified inline in the models, 300 | which isn't a general solution but works in most cases. 301 | I'm also planning on adding a way to specify these settings for models 302 | non-intrusively. With the current inline mechanism you define a nested class 303 | inside the model called 'CassandraSettings'. The column family def settings 304 | are specified in a class variable named COLUMN_FAMILY_DEF_SETTINGS, which 305 | is a dictionary of any of the optional CfDef settings that you want to 306 | override from the default values. All of these things are optional, so if 307 | you don't need to override anything you don't need to define the 308 | CassandraSettings class. All of the required settings for the CfDef 309 | (e.g. keyspace, name, etc.) are determined by other means. 310 | - Fixed a bug in handling null/missing columns when converting from the 311 | value from Cassandra. 312 | - Fixed some bugs with reconnecting to Cassandra if connectivity to 313 | Cassandra is disrupted. 314 | - Added a few new tests and did some cleanup to the unit tests 315 | 316 | Changes For 0.1.1 317 | ================= 318 | - fixed some bugs in the cassandra reconnection logic where it was always 319 | retrying the operation even when it succeeded the first time. 320 | - fixed a nasty bug with deleting instances where it would delete all 321 | instances whose key was a substring of the key of the instance being deleted. 322 | 323 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/__init__.py -------------------------------------------------------------------------------- /django_cassandra/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/django_cassandra/__init__.py -------------------------------------------------------------------------------- /django_cassandra/db/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/django_cassandra/db/__init__.py -------------------------------------------------------------------------------- /django_cassandra/db/base.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from django.db.utils import DatabaseError 16 | 17 | from djangotoolbox.db.base import NonrelDatabaseFeatures, \ 18 | NonrelDatabaseOperations, NonrelDatabaseWrapper, NonrelDatabaseClient, \ 19 | NonrelDatabaseValidation, NonrelDatabaseIntrospection, \ 20 | NonrelDatabaseCreation 21 | 22 | import re 23 | import time 24 | from .creation import DatabaseCreation 25 | from .introspection import DatabaseIntrospection 26 | from .utils import CassandraConnection, CassandraConnectionError, CassandraAccessError 27 | from thrift.transport import TTransport 28 | from cassandra.ttypes import * 29 | 30 | 31 | class DatabaseFeatures(NonrelDatabaseFeatures): 32 | string_based_auto_field = True 33 | 34 | def __init__(self, connection): 35 | super(DatabaseFeatures, self).__init__(connection) 36 | self.supports_deleting_related_objects = connection.settings_dict.get('CASSANDRA_ENABLE_CASCADING_DELETES', False) 37 | 38 | 39 | class DatabaseOperations(NonrelDatabaseOperations): 40 | compiler_module = __name__.rsplit('.', 1)[0] + '.compiler' 41 | 42 | def pk_default_value(self): 43 | """ 44 | Use None as the value to indicate to the insert compiler that it needs 45 | to auto-generate a guid to use for the id. The case where this gets hit 46 | is when you create a model instance with no arguments. We override from 47 | the default implementation (which returns 'DEFAULT') because it's possible 48 | that someone would explicitly initialize the id field to be that value and 49 | we wouldn't want to override that. But None would never be a valid value 50 | for the id. 51 | """ 52 | return None 53 | 54 | def sql_flush(self, style, tables, sequence_list): 55 | for table_name in tables: 56 | self.connection.creation.flush_table(table_name) 57 | return "" 58 | 59 | class DatabaseClient(NonrelDatabaseClient): 60 | pass 61 | 62 | class DatabaseValidation(NonrelDatabaseValidation): 63 | pass 64 | 65 | class DatabaseWrapper(NonrelDatabaseWrapper): 66 | def __init__(self, *args, **kwds): 67 | super(DatabaseWrapper, self).__init__(*args, **kwds) 68 | 69 | # Set up the associated backend objects 70 | self.features = DatabaseFeatures(self) 71 | self.ops = DatabaseOperations(self) 72 | self.client = DatabaseClient(self) 73 | self.creation = DatabaseCreation(self) 74 | self.validation = DatabaseValidation(self) 75 | self.introspection = DatabaseIntrospection(self) 76 | 77 | self.read_consistency_level = self.settings_dict.get('CASSANDRA_READ_CONSISTENCY_LEVEL', ConsistencyLevel.ONE) 78 | self.write_consistency_level = self.settings_dict.get('CASSANDRA_WRITE_CONSISTENCY_LEVEL', ConsistencyLevel.ONE) 79 | self.max_key_count = self.settings_dict.get('CASSANDRA_MAX_KEY_COUNT', 1000000) 80 | self.max_column_count = self.settings_dict.get('CASSANDRA_MAX_COLUMN_COUNT', 10000) 81 | self.column_family_def_defaults = self.settings_dict.get('CASSANDRA_COLUMN_FAMILY_DEF_DEFAULT_SETTINGS', {}) 82 | 83 | self._db_connection = None 84 | self.determined_version = False 85 | 86 | def configure_connection(self, set_keyspace=False, login=False): 87 | 88 | if not self._db_connection.is_connected(): 89 | self._db_connection.open(False, False) 90 | self.determined_version = False 91 | 92 | if not self.determined_version: 93 | # Determine which version of Cassandra we're connected to 94 | version_string = self._db_connection.get_client().describe_version() 95 | try: 96 | # FIXME: Should do some version check here to make sure that we're 97 | # talking to a cassandra daemon that supports the operations we require 98 | m = re.match('^([0-9]+)\.([0-9]+)\.([0-9]+)$', version_string) 99 | major_version = int(m.group(1)) 100 | minor_version = int(m.group(2)) 101 | patch_version = int(m.group(3)) 102 | self.determined_version = True 103 | except Exception, e: 104 | raise DatabaseError('Invalid Thrift version string', e) 105 | 106 | # Determine supported features based on the API version 107 | self.supports_replication_factor_as_strategy_option = major_version >= 19 and minor_version >= 10 108 | 109 | if login: 110 | self._db_connection.login() 111 | 112 | if set_keyspace: 113 | try: 114 | self._db_connection.set_keyspace() 115 | except Exception, e: 116 | # Set up the default settings for the keyspace 117 | keyspace_def_settings = { 118 | 'name': self._db_connection.keyspace, 119 | 'strategy_class': 'org.apache.cassandra.locator.SimpleStrategy', 120 | 'strategy_options': {}, 121 | 'cf_defs': []} 122 | 123 | # Apply any overrides for the keyspace settings 124 | custom_keyspace_def_settings = self.settings_dict.get('CASSANDRA_KEYSPACE_DEF_SETTINGS') 125 | if custom_keyspace_def_settings: 126 | keyspace_def_settings.update(custom_keyspace_def_settings) 127 | 128 | # Apply any overrides for the replication strategy 129 | # Note: This could be done by the user using the 130 | # CASSANDRA_KEYSPACE_DEF_SETTINGS, but the following customizations are 131 | # still supported for backwards compatibility with older versions of the backend 132 | strategy_class = self.settings_dict.get('CASSANDRA_REPLICATION_STRATEGY') 133 | if strategy_class: 134 | keyspace_def_settings['strategy_class'] = strategy_class 135 | 136 | # Apply an override of the strategy options 137 | strategy_options = self.settings_dict.get('CASSANDRA_REPLICATION_STRATEGY_OPTIONS') 138 | if strategy_options: 139 | if type(strategy_options) != dict: 140 | raise DatabaseError('CASSANDRA_REPLICATION_STRATEGY_OPTIONS must be a dictionary') 141 | keyspace_def_settings['strategy_options'].update(strategy_options) 142 | 143 | # Apply an override of the replication factor. Depending on the version of 144 | # Cassandra this may be applied to either the strategy options or the top-level 145 | # keyspace def settings 146 | replication_factor = self.settings_dict.get('CASSANDRA_REPLICATION_FACTOR') 147 | replication_factor_parent = keyspace_def_settings['strategy_options'] \ 148 | if self.supports_replication_factor_as_strategy_option else keyspace_def_settings 149 | if replication_factor: 150 | replication_factor_parent['replication_factor'] = str(replication_factor) 151 | elif 'replication_factor' not in replication_factor_parent: 152 | replication_factor_parent['replication_factor'] = '1' 153 | 154 | keyspace_def = KsDef(**keyspace_def_settings) 155 | self._db_connection.get_client().system_add_keyspace(keyspace_def) 156 | self._db_connection.set_keyspace() 157 | 158 | 159 | def get_db_connection(self, set_keyspace=False, login=False): 160 | if not self._db_connection: 161 | # Get the host and port specified in the database backend settings. 162 | # Default to the standard Cassandra settings. 163 | host = self.settings_dict.get('HOST') 164 | if not host or host == '': 165 | host = 'localhost' 166 | 167 | port = self.settings_dict.get('PORT') 168 | if not port or port == '': 169 | port = 9160 170 | 171 | keyspace = self.settings_dict.get('NAME') 172 | if keyspace == None: 173 | keyspace = 'django' 174 | 175 | user = self.settings_dict.get('USER') 176 | password = self.settings_dict.get('PASSWORD') 177 | 178 | # Create our connection wrapper 179 | self._db_connection = CassandraConnection(host, port, keyspace, user, password) 180 | 181 | try: 182 | self.configure_connection(set_keyspace, login) 183 | except TTransport.TTransportException, e: 184 | raise CassandraConnectionError(e) 185 | except Exception, e: 186 | raise CassandraAccessError(e) 187 | 188 | return self._db_connection 189 | 190 | @property 191 | def db_connection(self): 192 | return self.get_db_connection(True, True) 193 | -------------------------------------------------------------------------------- /django_cassandra/db/compiler.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | import datetime 16 | import sys 17 | import traceback 18 | import datetime 19 | import decimal 20 | 21 | from django.db.models import ForeignKey 22 | from django.db.models.sql.where import AND, OR, WhereNode 23 | from django.db.models.sql.constants import MULTI 24 | from django.db.utils import DatabaseError 25 | 26 | from functools import wraps 27 | 28 | from djangotoolbox.db.basecompiler import NonrelQuery, NonrelCompiler, \ 29 | NonrelInsertCompiler, NonrelUpdateCompiler, NonrelDeleteCompiler 30 | 31 | from .utils import * 32 | from .predicate import * 33 | 34 | from uuid import uuid4 35 | from cassandra import Cassandra 36 | from cassandra.ttypes import * 37 | from thrift.transport.TTransport import TTransportException 38 | 39 | def safe_call(func): 40 | @wraps(func) 41 | def _func(*args, **kwargs): 42 | try: 43 | return func(*args, **kwargs) 44 | except Exception, e: 45 | raise DatabaseError, DatabaseError(*tuple(e)), sys.exc_info()[2] 46 | return _func 47 | 48 | class CassandraQuery(NonrelQuery): 49 | 50 | # FIXME: How do we set this value? What's the maximum value it can be? 51 | #MAX_FETCH_COUNT = 0x7ffffff 52 | MAX_FETCH_COUNT = 10000 53 | 54 | def __init__(self, compiler, fields): 55 | super(CassandraQuery, self).__init__(compiler, fields) 56 | 57 | self.pk_column = self.query.get_meta().pk.column 58 | self.column_family = self.query.get_meta().db_table 59 | self.root_predicate = None 60 | self.ordering_spec = None 61 | self.cached_results = None 62 | 63 | self.indexed_columns = [] 64 | self.field_name_to_column_name = {} 65 | for field in fields: 66 | column_name = field.db_column if field.db_column else field.column 67 | if field.db_index: 68 | self.indexed_columns.append(column_name) 69 | self.field_name_to_column_name[field.name] = column_name 70 | 71 | # This is needed for debugging 72 | def __repr__(self): 73 | # TODO: add some meaningful query string for debugging 74 | return '' 75 | 76 | def _convert_key_slice_to_rows(self, key_slice): 77 | rows = [] 78 | for element in key_slice: 79 | if element.columns: 80 | row = self._convert_column_list_to_row(element.columns, self.pk_column, element.key) 81 | rows.append(row) 82 | return rows 83 | 84 | def _convert_column_list_to_row(self, column_list, pk_column_name, pk_value): 85 | row = {} 86 | # FIXME: When we add code to allow primary keys that also are indexed, 87 | # then we can change this to not set the primary key column in that case. 88 | # row[pk_column_name] = pk_value 89 | for column in column_list: 90 | row[column.column.name] = column.column.value 91 | return row 92 | 93 | 94 | def _get_rows_by_pk(self, range_predicate): 95 | 96 | db_connection = self.connection.db_connection 97 | column_parent = ColumnParent(column_family=self.column_family) 98 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='', 99 | finish='', count=self.connection.max_column_count)) 100 | 101 | if range_predicate._is_exact(): 102 | column_list = call_cassandra_with_reconnect(db_connection, 103 | Cassandra.Client.get_slice, range_predicate.start, 104 | column_parent, slice_predicate, self.connection.read_consistency_level) 105 | if column_list: 106 | row = self._convert_column_list_to_row(column_list, self.pk_column, range_predicate.start) 107 | rows = [row] 108 | else: 109 | rows = [] 110 | else: 111 | if range_predicate.start != None: 112 | key_start = range_predicate.start 113 | if not range_predicate.start_inclusive: 114 | key_start = key_start + chr(1) 115 | else: 116 | key_start = '' 117 | 118 | if range_predicate.end != None: 119 | key_end = range_predicate.end 120 | if not range_predicate.end_inclusive: 121 | key_end = key_end[:-1] + chr(ord(key_end[-1])-1) + (chr(126) * 16) 122 | else: 123 | key_end = '' 124 | 125 | key_range = KeyRange(start_key=key_start, end_key=key_end, 126 | count=self.connection.max_key_count) 127 | key_slice = call_cassandra_with_reconnect(db_connection, 128 | Cassandra.Client.get_range_slices, column_parent, 129 | slice_predicate, key_range, self.connection.read_consistency_level) 130 | 131 | rows = self._convert_key_slice_to_rows(key_slice) 132 | 133 | return rows 134 | 135 | def _get_rows_by_indexed_column(self, range_predicate): 136 | # Construct the index expression for the range predicate 137 | index_expressions = [] 138 | if ((range_predicate.start != None) and 139 | (range_predicate.end == range_predicate.start) and 140 | range_predicate.start_inclusive and 141 | range_predicate.end_inclusive): 142 | index_expression = IndexExpression(range_predicate.column, IndexOperator.EQ, unicode(range_predicate.start)) 143 | index_expressions.append(index_expression) 144 | else: 145 | # NOTE: These range queries don't work with the current version of cassandra 146 | # that I'm using (0.7 beta3) 147 | # It looks like there are cassandra tickets to add support for this, but it's 148 | # unclear how soon it will be supported. We shouldn't hit this code for now, 149 | # though, because can_evaluate_efficiently was changed to disable range queries 150 | # on indexed columns (they still can be performed, just inefficiently). 151 | if range_predicate.start: 152 | index_op = IndexOperator.GTE if range_predicate.start_inclusive else IndexOperator.GT 153 | index_expression = IndexExpression(unicode(range_predicate.column), index_op, unicode(range_predicate.start)) 154 | index_expressions.append(index_expression) 155 | if range_predicate.end: 156 | index_op = IndexOperator.LTE if range_predicate.end_inclusive else IndexOperator.LT 157 | index_expression = IndexExpression(unicode(range_predicate.column), index_op, unicode(range_predicate.end)) 158 | index_expressions.append(index_expression) 159 | 160 | assert(len(index_expressions) > 0) 161 | 162 | # Now make the call to cassandra to get the key slice 163 | db_connection = self.connection.db_connection 164 | column_parent = ColumnParent(column_family=self.column_family) 165 | index_clause = IndexClause(index_expressions, '', self.connection.max_key_count) 166 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='', finish='', count=self.connection.max_column_count)) 167 | 168 | key_slice = call_cassandra_with_reconnect(db_connection, 169 | Cassandra.Client.get_indexed_slices, 170 | column_parent, index_clause, slice_predicate, 171 | self.connection.read_consistency_level) 172 | rows = self._convert_key_slice_to_rows(key_slice) 173 | 174 | return rows 175 | 176 | def get_row_range(self, range_predicate): 177 | pk_column = self.query.get_meta().pk.column 178 | if range_predicate.column == pk_column: 179 | rows = self._get_rows_by_pk(range_predicate) 180 | else: 181 | assert(range_predicate.column in self.indexed_columns) 182 | rows = self._get_rows_by_indexed_column(range_predicate) 183 | return rows 184 | 185 | def get_all_rows(self): 186 | # TODO: Could factor this code better 187 | db_connection = self.connection.db_connection 188 | column_parent = ColumnParent(column_family=self.column_family) 189 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='', finish='', count=self.connection.max_column_count)) 190 | key_range = KeyRange(start_token = '0', end_token = '0', count=self.connection.max_key_count) 191 | #end_key = u'\U0010ffff'.encode('utf-8') 192 | #key_range = KeyRange(start_key='\x01', end_key=end_key, count=self.connection.max_key_count) 193 | 194 | key_slice = call_cassandra_with_reconnect(db_connection, 195 | Cassandra.Client.get_range_slices, column_parent, 196 | slice_predicate, key_range, self.connection.read_consistency_level) 197 | rows = self._convert_key_slice_to_rows(key_slice) 198 | 199 | return rows 200 | 201 | def _get_query_results(self): 202 | if self.cached_results == None: 203 | assert(self.root_predicate != None) 204 | self.cached_results = self.root_predicate.get_matching_rows(self) 205 | if self.ordering_spec: 206 | sort_rows(self.cached_results, self.ordering_spec) 207 | return self.cached_results 208 | 209 | @safe_call 210 | def fetch(self, low_mark, high_mark): 211 | 212 | if self.root_predicate == None: 213 | raise DatabaseError('No root query node') 214 | 215 | try: 216 | if high_mark is not None and high_mark <= low_mark: 217 | return 218 | 219 | results = self._get_query_results() 220 | if low_mark is not None or high_mark is not None: 221 | results = results[low_mark:high_mark] 222 | except Exception, e: 223 | # FIXME: Can get rid of this exception handling code eventually, 224 | # but it's useful for debugging for now. 225 | #traceback.print_exc() 226 | raise e 227 | 228 | for entity in results: 229 | yield entity 230 | 231 | @safe_call 232 | def count(self, limit=None): 233 | # TODO: This could be implemented more efficiently for simple predicates 234 | # where we could call the count method in the Cassandra Thrift API. 235 | # We can optimize for that later 236 | results = self._get_query_results() 237 | return len(results) 238 | 239 | @safe_call 240 | def delete(self): 241 | results = self._get_query_results() 242 | timestamp = get_next_timestamp() 243 | column_family = self.query.get_meta().db_table 244 | mutation_map = {} 245 | for item in results: 246 | mutation_map[item[self.pk_column]] = {column_family: [Mutation(deletion=Deletion(timestamp=timestamp))]} 247 | db_connection = self.connection.db_connection 248 | call_cassandra_with_reconnect(db_connection, 249 | Cassandra.Client.batch_mutate, mutation_map, 250 | self.connection.write_consistency_level) 251 | 252 | 253 | @safe_call 254 | def order_by(self, ordering): 255 | self.ordering_spec = [] 256 | for order in ordering: 257 | if order.startswith('-'): 258 | field_name = order[1:] 259 | reversed = True 260 | else: 261 | field_name = order 262 | reversed = False 263 | column_name = self.field_name_to_column_name.get(field_name, field_name) 264 | #if column in self.foreign_key_columns: 265 | # column = column + '_id' 266 | self.ordering_spec.append((column_name, reversed)) 267 | 268 | def init_predicate(self, parent_predicate, node): 269 | if isinstance(node, WhereNode): 270 | if node.connector == OR: 271 | compound_op = COMPOUND_OP_OR 272 | elif node.connector == AND: 273 | compound_op = COMPOUND_OP_AND 274 | else: 275 | raise InvalidQueryOpException() 276 | predicate = CompoundPredicate(compound_op, node.negated) 277 | for child in node.children: 278 | child_predicate = self.init_predicate(predicate, child) 279 | if parent_predicate: 280 | parent_predicate.add_child(predicate) 281 | else: 282 | column, lookup_type, db_type, value = self._decode_child(node) 283 | db_value = self.convert_value_for_db(db_type, value) 284 | assert parent_predicate 285 | parent_predicate.add_filter(column, lookup_type, db_value) 286 | predicate = None 287 | 288 | return predicate 289 | 290 | # FIXME: This is bad. We're modifying the WhereNode object that's passed in to us 291 | # from the Django ORM. We should do the pruning as we build our predicates, not 292 | # munge the WhereNode. 293 | def remove_unnecessary_nodes(self, node, retain_root_node): 294 | if isinstance(node, WhereNode): 295 | child_count = len(node.children) 296 | for i in range(child_count): 297 | node.children[i] = self.remove_unnecessary_nodes(node.children[i], False) 298 | if (not retain_root_node) and (not node.negated) and (len(node.children) == 1): 299 | node = node.children[0] 300 | return node 301 | 302 | @safe_call 303 | def add_filters(self, filters): 304 | """ 305 | Traverses the given Where tree and adds the filters to this query 306 | """ 307 | 308 | #if filters.negated: 309 | # raise InvalidQueryOpException('Exclude queries not implemented yet.') 310 | assert isinstance(filters,WhereNode) 311 | self.remove_unnecessary_nodes(filters, True) 312 | self.root_predicate = self.init_predicate(None, filters) 313 | 314 | class SQLCompiler(NonrelCompiler): 315 | query_class = CassandraQuery 316 | 317 | SPECIAL_NONE_VALUE = "\b" 318 | 319 | # Override this method from NonrelCompiler to get around problem with 320 | # mixing the field default values with the field format as its stored 321 | # in the database (i.e. convert_value_from_db should only be passed 322 | # the database-specific storage format not the field default value. 323 | def _make_result(self, entity, fields): 324 | result = [] 325 | for field in fields: 326 | value = entity.get(field.column) 327 | if value is not None: 328 | value = self.convert_value_from_db( 329 | field.db_type(connection=self.connection), value) 330 | else: 331 | value = field.get_default() 332 | if not field.null and value is None: 333 | raise DatabaseError("Non-nullable field %s can't be None!" % field.name) 334 | result.append(value) 335 | 336 | return result 337 | 338 | # This gets called for each field type when you fetch() an entity. 339 | # db_type is the string that you used in the DatabaseCreation mapping 340 | def convert_value_from_db(self, db_type, value): 341 | 342 | if value == self.SPECIAL_NONE_VALUE or value is None: 343 | return None 344 | 345 | if db_type.startswith('ListField:'): 346 | db_sub_type = db_type.split(':', 1)[1] 347 | value = convert_string_to_list(value) 348 | if isinstance(value, (list, tuple)) and len(value): 349 | value = [self.convert_value_from_db(db_sub_type, subvalue) 350 | for subvalue in value] 351 | elif db_type == 'date': 352 | dt = datetime.datetime.strptime(value, '%Y-%m-%d') 353 | value = dt.date() 354 | elif db_type == 'datetime': 355 | value = datetime.datetime.strptime(value, '%Y-%m-%d %H:%M:%S.%f') 356 | elif db_type == 'time': 357 | dt = datetime.datetime.strptime(value, '%H:%M:%S.%f') 358 | value = dt.time() 359 | elif db_type == 'bool': 360 | value = value.lower() == 'true' 361 | elif db_type == 'int': 362 | value = int(value) 363 | elif db_type == 'long': 364 | value = long(value) 365 | elif db_type == 'float': 366 | value = float(value) 367 | #elif db_type == 'id': 368 | # value = unicode(value).decode('utf-8') 369 | elif db_type.startswith('decimal'): 370 | value = decimal.Decimal(value) 371 | elif isinstance(value, str): 372 | # always retrieve strings as unicode (it is possible that old datasets 373 | # contain non unicode strings, nevertheless work with unicode ones) 374 | value = value.decode('utf-8') 375 | 376 | return value 377 | 378 | # This gets called for each field type when you insert() an entity. 379 | # db_type is the string that you used in the DatabaseCreation mapping 380 | def convert_value_for_db(self, db_type, value): 381 | if value is None: 382 | return self.SPECIAL_NONE_VALUE 383 | 384 | if db_type.startswith('ListField:'): 385 | db_sub_type = db_type.split(':', 1)[1] 386 | if isinstance(value, (list, tuple)) and len(value): 387 | value = [self.convert_value_for_db(db_sub_type, subvalue) for subvalue in value] 388 | value = convert_list_to_string(value) 389 | elif type(value) is list: 390 | value = [self.convert_value_for_db(db_type, item) for item in value] 391 | elif db_type == 'datetime': 392 | value = value.strftime('%Y-%m-%d %H:%M:%S.%f') 393 | elif db_type == 'time': 394 | value = value.strftime('%H:%M:%S.%f') 395 | elif db_type == 'bool': 396 | value = str(value).lower() 397 | elif (db_type == 'int') or (db_type == 'long') or (db_type == 'float'): 398 | value = str(value) 399 | elif db_type == 'id': 400 | value = unicode(value) 401 | elif (type(value) is not unicode) and (type(value) is not str): 402 | value = unicode(value) 403 | 404 | # always store strings as utf-8 405 | if type(value) is unicode: 406 | value = value.encode('utf-8') 407 | 408 | return value 409 | 410 | # This handles both inserts and updates of individual entities 411 | class SQLInsertCompiler(NonrelInsertCompiler, SQLCompiler): 412 | 413 | @safe_call 414 | def insert(self, data, return_id=False): 415 | pk_column = self.query.get_meta().pk.column 416 | model = self.query.model 417 | compound_key_fields = None 418 | if hasattr(model, 'CassandraSettings'): 419 | if hasattr(model.CassandraSettings, 'ADJUSTED_COMPOUND_KEY_FIELDS'): 420 | compound_key_fields = model.CassandraSettings.ADJUSTED_COMPOUND_KEY_FIELDS 421 | elif hasattr(model.CassandraSettings, 'COMPOUND_KEY_FIELDS'): 422 | compound_key_fields = [] 423 | for field_name in model.CassandraSettings.COMPOUND_KEY_FIELDS: 424 | field_class = None 425 | for lf in model._meta.local_fields: 426 | if lf.name == field_name: 427 | field_class = lf 428 | break 429 | if field_class is None: 430 | raise DatabaseError('Invalid compound key field') 431 | if type(field_class) is ForeignKey: 432 | field_name += '_id' 433 | compound_key_fields.append(field_name) 434 | model.CassandraSettings.ADJUSTED_COMPOUND_KEY_FIELDS = compound_key_fields 435 | separator = model.CassandraSettings.COMPOUND_KEY_SEPARATOR \ 436 | if hasattr(model.CassandraSettings, 'COMPOUND_KEY_SEPARATOR') \ 437 | else self.connection.settings_dict.get('CASSANDRA_COMPOUND_KEY_SEPARATOR', '|') 438 | # See if the data arguments contain a value for the primary key. 439 | # FIXME: For now we leave the key data as a column too. This is 440 | # suboptimal, since the data is duplicated, but there are a couple of cases 441 | # where you need to keep the column. First, if you have a model with only 442 | # a single field that's the primary key (admittedly a semi-pathological case, 443 | # but I can imagine valid use cases where you have this), then it doesn't 444 | # work if the column is removed, because then there are no columns and that's 445 | # interpreted as a deleted row (i.e. the usual Cassandra tombstone issue). 446 | # Second, if there's a secondary index configured for the primary key field 447 | # (not particularly useful with the current Cassandra, but would be valid when 448 | # you can do a range query on indexed column) then you'd want to keep the 449 | # column. So for now, we just leave the column in there so these cases work. 450 | # Eventually we can optimize this and remove the column where it makes sense. 451 | key = data.get(pk_column) 452 | if key: 453 | if compound_key_fields is not None: 454 | compound_key_values = key.split(separator) 455 | for field_name, compound_key_value in zip(compound_key_fields, compound_key_values): 456 | if field_name in data and data[field_name] != compound_key_value: 457 | raise DatabaseError("The value of the compound key doesn't match the values of the individual fields") 458 | else: 459 | if compound_key_fields is not None: 460 | try: 461 | compound_key_values = [data.get(field_name) for field_name in compound_key_fields] 462 | key = separator.join(compound_key_values) 463 | except Exception, e: 464 | raise DatabaseError('The values of the fields used to form a compound key must be specified and cannot be null') 465 | else: 466 | key = str(uuid4()) 467 | # Insert the key as column data too 468 | # FIXME. See the above comment. When the primary key handling is optimized, 469 | # then we would not always add the key to the data here. 470 | data[pk_column] = key 471 | 472 | timestamp = get_next_timestamp() 473 | 474 | mutation_list = [] 475 | for name, value in data.items(): 476 | # FIXME: Do we need this check here? Or is the name always already a str instead of unicode. 477 | if type(name) is unicode: 478 | name = name.decode('utf-8') 479 | mutation = Mutation(column_or_supercolumn=ColumnOrSuperColumn(column=Column(name=name, value=value, timestamp=timestamp))) 480 | mutation_list.append(mutation) 481 | 482 | db_connection = self.connection.db_connection 483 | column_family = self.query.get_meta().db_table 484 | call_cassandra_with_reconnect(db_connection, 485 | Cassandra.Client.batch_mutate, {key: {column_family: mutation_list}}, 486 | self.connection.write_consistency_level) 487 | 488 | if return_id: 489 | return key 490 | 491 | class SQLUpdateCompiler(NonrelUpdateCompiler, SQLCompiler): 492 | def __init__(self, *args, **kwargs): 493 | super(SQLUpdateCompiler, self).__init__(*args, **kwargs) 494 | 495 | def execute_sql(self, result_type=MULTI): 496 | data = {} 497 | for field, model, value in self.query.values: 498 | assert field is not None 499 | if not field.null and value is None: 500 | raise DatabaseError("You can't set %s (a non-nullable " 501 | "field) to None!" % field.name) 502 | db_type = field.db_type(connection=self.connection) 503 | value = self.convert_value_for_db(db_type, value) 504 | data[field.column] = value 505 | 506 | # TODO: Add compound key check here -- ensure that we're not updating 507 | # any of the fields that are components in the compound key. 508 | 509 | # TODO: This isn't super efficient because executing the query will 510 | # fetch all of the columns for each row even though all we really need 511 | # is the key for the row. Should be pretty straightforward to change 512 | # the CassandraQuery class to support custom slice predicates. 513 | 514 | #model = self.query.model 515 | pk_column = self.query.get_meta().pk.column 516 | 517 | pk_index = -1 518 | fields = self.get_fields() 519 | for index in range(len(fields)): 520 | if fields[index].column == pk_column: 521 | pk_index = index; 522 | break 523 | if pk_index == -1: 524 | raise DatabaseError('Invalid primary key column') 525 | 526 | row_count = 0 527 | column_family = self.query.get_meta().db_table 528 | timestamp = get_next_timestamp() 529 | batch_mutate_data = {} 530 | for result in self.results_iter(): 531 | row_count += 1 532 | mutation_list = [] 533 | key = result[pk_index] 534 | for name, value in data.items(): 535 | # FIXME: Do we need this check here? Or is the name always already a str instead of unicode. 536 | if type(name) is unicode: 537 | name = name.decode('utf-8') 538 | mutation = Mutation(column_or_supercolumn=ColumnOrSuperColumn(column=Column(name=name, value=value, timestamp=timestamp))) 539 | mutation_list.append(mutation) 540 | batch_mutate_data[key] = {column_family: mutation_list} 541 | 542 | db_connection = self.connection.db_connection 543 | call_cassandra_with_reconnect(db_connection, 544 | Cassandra.Client.batch_mutate, batch_mutate_data, 545 | self.connection.write_consistency_level) 546 | 547 | return row_count 548 | 549 | class SQLDeleteCompiler(NonrelDeleteCompiler, SQLCompiler): 550 | pass 551 | -------------------------------------------------------------------------------- /django_cassandra/db/creation.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from django.db.backends.creation import TEST_DATABASE_PREFIX 16 | from django.db.utils import DatabaseError 17 | from djangotoolbox.db.creation import NonrelDatabaseCreation 18 | from cassandra import Cassandra 19 | from cassandra.ttypes import * 20 | from django.core.management import call_command 21 | from .utils import get_next_timestamp 22 | 23 | class DatabaseCreation(NonrelDatabaseCreation): 24 | 25 | data_types = { 26 | 'AutoField': 'text', 27 | 'BigIntegerField': 'long', 28 | 'BooleanField': 'bool', 29 | 'CharField': 'text', 30 | 'CommaSeparatedIntegerField': 'text', 31 | 'DateField': 'date', 32 | 'DateTimeField': 'datetime', 33 | 'DecimalField': 'decimal:%(max_digits)s,%(decimal_places)s', 34 | 'EmailField': 'text', 35 | 'FileField': 'text', 36 | 'FilePathField': 'text', 37 | 'FloatField': 'float', 38 | 'ImageField': 'text', 39 | 'IntegerField': 'int', 40 | 'IPAddressField': 'text', 41 | 'NullBooleanField': 'bool', 42 | 'OneToOneField': 'integer', 43 | 'PositiveIntegerField': 'int', 44 | 'PositiveSmallIntegerField': 'int', 45 | 'SlugField': 'text', 46 | 'SmallIntegerField': 'integer', 47 | 'TextField': 'text', 48 | 'TimeField': 'time', 49 | 'URLField': 'text', 50 | 'XMLField': 'text', 51 | 'GenericAutoField': 'id', 52 | 'StringForeignKey': 'id', 53 | 'AutoField': 'id', 54 | 'RelatedAutoField': 'id', 55 | } 56 | 57 | def sql_create_model(self, model, style, known_models=set()): 58 | 59 | db_connection = self.connection.db_connection 60 | keyspace = self.connection.settings_dict['NAME'] 61 | 62 | opts = model._meta 63 | column_metadata = [] 64 | 65 | # Browsing through fields to find indexed fields 66 | for field in opts.local_fields: 67 | if field.db_index: 68 | column_name = str(field.db_column if field.db_column else field.column) 69 | column_def = ColumnDef(name=column_name, validation_class='BytesType', 70 | index_type=IndexType.KEYS) 71 | column_metadata.append(column_def) 72 | 73 | cfdef_settings = self.connection.column_family_def_defaults.copy() 74 | 75 | if hasattr(model, 'CassandraSettings') and \ 76 | hasattr(model.CassandraSettings, 'COLUMN_FAMILY_DEF_SETTINGS'): 77 | cfdef_overrides = model.CassandraSettings.COLUMN_FAMILY_DEF_SETTINGS 78 | if type(cfdef_overrides) is not dict: 79 | raise DatabaseError('The value of COLUMN_FAMILY_DEF_SETTINGS in the ' 80 | 'CassandraSettings class must be a dictionary of the optional ' 81 | 'settings to use when creating the column family.') 82 | cfdef_settings.update(cfdef_overrides) 83 | 84 | cfdef_settings['keyspace'] = keyspace 85 | if not cfdef_settings.get('name'): 86 | cfdef_settings['name'] = opts.db_table 87 | if not cfdef_settings.get('comparator_type'): 88 | cfdef_settings['comparator_type'] = 'UTF8Type' 89 | cfdef_settings['column_metadata'] = column_metadata 90 | 91 | column_family_def = CfDef(**cfdef_settings) 92 | 93 | db_connection.get_client().system_add_column_family(column_family_def) 94 | 95 | return [], {} 96 | 97 | def drop_keyspace(self, keyspace_name, verbosity=1): 98 | """ 99 | Drop the specified keyspace from the cluster. 100 | """ 101 | 102 | db_connection = self.connection.get_db_connection(False, False) 103 | 104 | try: 105 | db_connection.get_client().system_drop_keyspace(keyspace_name) 106 | except Exception, e: 107 | # We want to succeed without complaining if the test db doesn't 108 | # exist yet, so we just assume that any exception that's raised 109 | # was for that reason and ignore it, except for printing a 110 | # message if verbose output is enabled 111 | # FIXME: Could probably be more specific about the Thrift 112 | # exception that we catch here. 113 | #if verbosity >= 1: 114 | # print "Exception thrown while trying to drop the test database/keyspace: ", e 115 | pass 116 | 117 | def create_test_db(self, verbosity, autoclobber): 118 | """ 119 | Create a new test database/keyspace. 120 | """ 121 | 122 | if verbosity >= 1: 123 | print "Creating test database '%s'..." % self.connection.alias 124 | 125 | # Replace the NAME field in the database settings with the test keyspace name 126 | settings_dict = self.connection.settings_dict 127 | if settings_dict.get('TEST_NAME'): 128 | test_keyspace_name = settings_dict['TEST_NAME'] 129 | else: 130 | test_keyspace_name = TEST_DATABASE_PREFIX + settings_dict['NAME'] 131 | 132 | settings_dict['NAME'] = test_keyspace_name 133 | 134 | # First make sure we've destroyed an existing test keyspace 135 | # FIXME: Should probably do something with autoclobber here, but why 136 | # would you ever not want to autoclobber when running the tests? 137 | self.drop_keyspace(test_keyspace_name, verbosity) 138 | 139 | # Call syncdb to create the necessary tables/column families 140 | call_command('syncdb', verbosity=False, interactive=False, database=self.connection.alias) 141 | 142 | return test_keyspace_name 143 | 144 | def destroy_test_db(self, old_database_name, verbosity=1): 145 | """ 146 | Destroy the test database/keyspace. 147 | """ 148 | 149 | if verbosity >= 1: 150 | print "Destroying test database '%s'..." % self.connection.alias 151 | 152 | settings_dict = self.connection.settings_dict 153 | test_keyspace_name = settings_dict.get('NAME') 154 | settings_dict['NAME'] = old_database_name 155 | 156 | self.drop_keyspace(test_keyspace_name, verbosity) 157 | 158 | def flush_table(self, table_name): 159 | 160 | db_connection = self.connection.db_connection 161 | 162 | # FIXME: Calling truncate here seems to corrupt the secondary indexes, 163 | # so for now the truncate call has been replaced with removing the 164 | # row one by one. When the truncate bug has been fixed in Cassandra 165 | # this should be switched back to use truncate. 166 | # NOTE: This should be fixed as of the 0.7.0-rc2 build, so we should 167 | # try this out again to see if it works now. 168 | # UPDATE: Tried it with rc2 and it worked calling truncate but it was 169 | # slower than using remove (at least for the unit tests), so for now 170 | # I'm leaving it alone pending further investigation. 171 | #db_connection.get_client().truncate(table_name) 172 | 173 | column_parent = ColumnParent(column_family=table_name) 174 | slice_predicate = SlicePredicate(column_names=[]) 175 | key_range = KeyRange(start_token = '0', end_token = '0', count = 1000) 176 | key_slice_list = db_connection.get_client().get_range_slices(column_parent, slice_predicate, key_range, ConsistencyLevel.ONE) 177 | column_path = ColumnPath(column_family=table_name) 178 | timestamp = get_next_timestamp() 179 | for key_slice in key_slice_list: 180 | db_connection.get_client().remove(key_slice.key, column_path, timestamp, ConsistencyLevel.ONE) 181 | 182 | 183 | def sql_indexes_for_model(self, model, style): 184 | """ 185 | We already handle creating the indexes in sql_create_model (above) so 186 | we don't need to do anything more here. 187 | """ 188 | return [] 189 | 190 | def set_autocommit(self): 191 | pass 192 | 193 | -------------------------------------------------------------------------------- /django_cassandra/db/introspection.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from djangotoolbox.db.base import NonrelDatabaseIntrospection 16 | from django.db.backends import BaseDatabaseIntrospection 17 | 18 | class DatabaseIntrospection(NonrelDatabaseIntrospection): 19 | def get_table_list(self, cursor): 20 | "Returns a list of names of all tables that exist in the database." 21 | db_connection = self.connection.db_connection 22 | ks_def = db_connection.get_client().describe_keyspace(db_connection.keyspace) 23 | result = [cf_def.name for cf_def in ks_def.cf_defs] 24 | return result 25 | 26 | def table_names(self): 27 | # NonrelDatabaseIntrospection has an implementation of this that returns 28 | # that all of the tables for the models already exist in the database, 29 | # so the DatabaseCreation code never gets called to create new tables, 30 | # which isn't how we want things to work for Cassandra, so we bypass the 31 | # nonrel implementation and go directly to the base introspection code. 32 | return BaseDatabaseIntrospection.table_names(self) 33 | 34 | def sequence_list(self): 35 | return [] 36 | 37 | # TODO: Implement these things eventually 38 | #=============================================================================== 39 | # def get_table_description(self, cursor, table_name): 40 | # "Returns a description of the table, with the DB-API cursor.description interface." 41 | # return "" 42 | # 43 | # def get_relations(self, cursor, table_name): 44 | # """ 45 | # Returns a dictionary of {field_index: (field_index_other_table, other_table)} 46 | # representing all relationships to the given table. Indexes are 0-based. 47 | # """ 48 | # relations = {} 49 | # return relations 50 | # 51 | # def get_indexes(self, cursor, table_name): 52 | # """ 53 | # Returns a dictionary of fieldname -> infodict for the given table, 54 | # where each infodict is in the format: 55 | # {'primary_key': boolean representing whether it's the primary key, 56 | # 'unique': boolean representing whether it's a unique index} 57 | # """ 58 | # indexes = {} 59 | # return indexes 60 | #=============================================================================== 61 | -------------------------------------------------------------------------------- /django_cassandra/db/predicate.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | import re 16 | from .utils import combine_rows 17 | 18 | SECONDARY_INDEX_SUPPORT_ENABLED = True 19 | 20 | class InvalidSortSpecException(Exception): 21 | def __init__(self): 22 | super(InvalidSortSpecException, self).__init__('The row sort spec must be a sort spec tuple/list or a tuple/list of sort specs') 23 | 24 | class InvalidRowCombinationOpException(Exception): 25 | def __init__(self): 26 | super(InvalidRowCombinationOpException, self).__init__('Invalid row combination operation') 27 | 28 | class InvalidPredicateOpException(Exception): 29 | def __init__(self): 30 | super(InvalidPredicateOpException, self).__init__('Invalid/unsupported query predicate operation') 31 | 32 | 33 | COMPOUND_OP_AND = 1 34 | COMPOUND_OP_OR = 2 35 | 36 | class RangePredicate(object): 37 | 38 | def __init__(self, column, start=None, start_inclusive=True, end=None, end_inclusive=True): 39 | self.column = column 40 | self.start = start 41 | self.start_inclusive = start_inclusive 42 | self.end = end 43 | self.end_inclusive = end_inclusive 44 | 45 | def __repr__(self): 46 | s = '(RANGE: ' 47 | if self.start: 48 | op = '<=' if self.start_inclusive else '<' 49 | s += (unicode(self.start) + op) 50 | s += self.column 51 | if self.end: 52 | op = '>=' if self.end_inclusive else '>' 53 | s += (op + unicode(self.end)) 54 | s += ')' 55 | return s 56 | 57 | def _is_exact(self): 58 | return (self.start != None) and (self.start == self.end) and self.start_inclusive and self.end_inclusive 59 | 60 | def can_evaluate_efficiently(self, pk_column, indexed_columns): 61 | # FIXME: There's some problem with secondary index support currently. 62 | # I'm suspicious that this is a bug in Cassandra but I haven't really verified that yet. 63 | # Anyway disabling the secondary index support for now. 64 | return ((self.column == pk_column) or 65 | (SECONDARY_INDEX_SUPPORT_ENABLED and ((self.column in indexed_columns) and self._is_exact()))) 66 | 67 | def incorporate_range_op(self, column, op, value, parent_compound_op): 68 | if column != self.column: 69 | return False 70 | 71 | # FIXME: The following logic could probably be tightened up a bit 72 | # (although perhaps at the expense of clarity?) 73 | if parent_compound_op == COMPOUND_OP_AND: 74 | if op == 'gt': 75 | if self.start == None or value >= self.start: 76 | self.start = value 77 | self.start_inclusive = False 78 | return True 79 | elif op == 'gte': 80 | if self.start == None or value > self.start: 81 | self.start = value 82 | self.start_inclusive = True 83 | return True 84 | elif op == 'lt': 85 | if self.end == None or value <= self.end: 86 | self.end = value 87 | self.end_inclusive = False 88 | return True 89 | elif op == 'lte': 90 | if self.end == None or value < self.end: 91 | self.end = value 92 | self.end_inclusive = True 93 | return True 94 | elif op == 'exact': 95 | if self._matches_value(value): 96 | self.start = self.end = value 97 | self.start_inclusive = self.end_inclusive = True 98 | return True 99 | elif op == 'startswith': 100 | # For the end value we increment the ordinal value of the last character 101 | # in the start value and make the end value not inclusive 102 | end_value = value[:-1] + chr(ord(value[-1])+1) 103 | if (((self.start == None) or (value > self.start)) and 104 | ((self.end == None) or (end_value <= self.end))): 105 | self.start = value 106 | self.end = end_value 107 | self.start_inclusive = True 108 | self.end_inclusive = False 109 | return True 110 | else: 111 | raise InvalidPredicateOpException() 112 | elif parent_compound_op == COMPOUND_OP_OR: 113 | if op == 'gt': 114 | if self.start == None or value < self.start: 115 | self.start = value 116 | self.start_inclusive = False 117 | return True 118 | elif op == 'gte': 119 | if self.start == None or value <= self.start: 120 | self.start = value 121 | self.start_inclusive = True 122 | return True 123 | elif op == 'lt': 124 | if self.end == None or value > self.end: 125 | self.end = value 126 | self.end_inclusive = False 127 | return True 128 | elif op == 'lte': 129 | if self.end == None or value >= self.end: 130 | self.end = value 131 | self.end_inclusive = True 132 | return True 133 | elif op == 'exact': 134 | if self._matches_value(value): 135 | return True 136 | elif op == 'startswith': 137 | # For the end value we increment the ordinal value of the last character 138 | # in the start value and make the end value not inclusive 139 | end_value = value[:-1] + chr(ord(value[-1])+1) 140 | if (((self.start == None) or (value <= self.start)) and 141 | ((self.end == None) or (end_value > self.end))): 142 | self.start = value 143 | self.end = end_value 144 | self.start_inclusive = True 145 | self.end_inclusive = False 146 | return True 147 | else: 148 | raise InvalidPredicateOpException() 149 | 150 | return False 151 | 152 | def _matches_value(self, value): 153 | if value == None: 154 | return False 155 | if self.start != None: 156 | if self.start_inclusive: 157 | if value < self.start: 158 | return False 159 | elif value <= self.start: 160 | return False 161 | if self.end != None: 162 | if self.end_inclusive: 163 | if value > self.end: 164 | return False 165 | elif value >= self.end: 166 | return False 167 | return True 168 | 169 | def row_matches(self, row): 170 | value = row.get(self.column, None) 171 | return self._matches_value(value) 172 | 173 | def get_matching_rows(self, query): 174 | rows = query.get_row_range(self) 175 | return rows 176 | 177 | class OperationPredicate(object): 178 | def __init__(self, column, op, value=None): 179 | self.column = column 180 | self.op = op 181 | self.value = value 182 | if op == 'regex' or op == 'iregex': 183 | flags = re.I if op == 'iregex' else 0 184 | self.pattern = re.compile(value, flags) 185 | 186 | def __repr__(self): 187 | return '(OP: ' + self.op + ':' + unicode(self.value) + ')' 188 | 189 | def can_evaluate_efficiently(self, pk_column, indexed_columns): 190 | return False 191 | 192 | def row_matches(self, row): 193 | row_value = row.get(self.column, None) 194 | if self.op == 'isnull': 195 | return row_value == None 196 | # FIXME: Not sure if the following test is correct in all cases 197 | if (row_value == None) or (self.value == None): 198 | return False 199 | if self.op == 'in': 200 | return row_value in self.value 201 | if self.op == 'istartswith': 202 | return row_value.lower().startswith(self.value.lower()) 203 | elif self.op == 'endswith': 204 | return row_value.endswith(self.value) 205 | elif self.op == 'iendswith': 206 | return row_value.lower().endswith(self.value.lower()) 207 | elif self.op == 'iexact': 208 | return row_value.lower() == self.value.lower() 209 | elif self.op == 'contains': 210 | return row_value.find(self.value) >= 0 211 | elif self.op == 'icontains': 212 | return row_value.lower().find(self.value.lower()) >= 0 213 | elif self.op == 'regex' or self.op == 'iregex': 214 | return self.pattern.match(row_value) != None 215 | else: 216 | raise InvalidPredicateOpException() 217 | 218 | def incorporate_range_op(self, column, op, value, parent_compound_op): 219 | return False 220 | 221 | def get_matching_rows(self, query): 222 | # get_matching_rows should only be called for predicates that can 223 | # be evaluated efficiently, which is not the case for OperationPredicate's 224 | raise NotImplementedError('get_matching_rows() called for inefficient predicate') 225 | 226 | class CompoundPredicate(object): 227 | def __init__(self, op, negated=False, children=None): 228 | self.op = op 229 | self.negated = negated 230 | self.children = children 231 | if self.children == None: 232 | self.children = [] 233 | 234 | def __repr__(self): 235 | s = '(' 236 | if self.negated: 237 | s += 'NOT ' 238 | s += ('AND' if self.op == COMPOUND_OP_AND else 'OR') 239 | s += ': ' 240 | first_time = True 241 | if self.children: 242 | for child_predicate in self.children: 243 | if first_time: 244 | first_time = False 245 | else: 246 | s += ',' 247 | s += unicode(child_predicate) 248 | s += ')' 249 | return s 250 | 251 | def can_evaluate_efficiently(self, pk_column, indexed_columns): 252 | if self.negated: 253 | return False 254 | if self.op == COMPOUND_OP_AND: 255 | for child in self.children: 256 | if child.can_evaluate_efficiently(pk_column, indexed_columns): 257 | return True 258 | else: 259 | return False 260 | elif self.op == COMPOUND_OP_OR: 261 | for child in self.children: 262 | if not child.can_evaluate_efficiently(pk_column, indexed_columns): 263 | return False 264 | else: 265 | return True 266 | else: 267 | raise InvalidPredicateOpException() 268 | 269 | def row_matches_subset(self, row, subset): 270 | if self.op == COMPOUND_OP_AND: 271 | for predicate in subset: 272 | if not predicate.row_matches(row): 273 | matches = False 274 | break 275 | else: 276 | matches = True 277 | elif self.op == COMPOUND_OP_OR: 278 | for predicate in subset: 279 | if predicate.row_matches(row): 280 | matches = True 281 | break 282 | else: 283 | matches = False 284 | else: 285 | raise InvalidPredicateOpException() 286 | 287 | if self.negated: 288 | matches = not matches 289 | 290 | return matches 291 | 292 | def row_matches(self, row): 293 | return self.row_matches_subset(row, self.children) 294 | 295 | def incorporate_range_op(self, column, op, value, parent_predicate): 296 | return False 297 | 298 | def add_filter(self, column, op, value): 299 | if op in ('lt', 'lte', 'gt', 'gte', 'exact', 'startswith'): 300 | for child in self.children: 301 | if child.incorporate_range_op(column, op, value, self.op): 302 | return 303 | else: 304 | child = RangePredicate(column) 305 | incorporated = child.incorporate_range_op(column, op, value, COMPOUND_OP_AND) 306 | assert incorporated 307 | self.children.append(child) 308 | else: 309 | child = OperationPredicate(column, op, value) 310 | self.children.append(child) 311 | 312 | def add_child(self, child_query_node): 313 | self.children.append(child_query_node) 314 | 315 | def get_matching_rows(self, query): 316 | pk_column = query.query.get_meta().pk.column 317 | #indexed_columns = query.indexed_columns 318 | 319 | # In the first pass we handle the query nodes that can be processed 320 | # efficiently. Hopefully, in most cases, this will result in a 321 | # subset of the rows that is much smaller than the overall number 322 | # of rows so we only have to run the inefficient query predicates 323 | # over this smaller number of rows. 324 | if self.can_evaluate_efficiently(pk_column, query.indexed_columns): 325 | inefficient_predicates = [] 326 | result = None 327 | for predicate in self.children: 328 | if predicate.can_evaluate_efficiently(pk_column, query.indexed_columns): 329 | rows = predicate.get_matching_rows(query) 330 | 331 | if result == None: 332 | result = rows 333 | else: 334 | result = combine_rows(result, rows, self.op, pk_column) 335 | else: 336 | inefficient_predicates.append(predicate) 337 | else: 338 | inefficient_predicates = self.children 339 | result = query.get_all_rows() 340 | 341 | if result == None: 342 | result = [] 343 | 344 | # Now 345 | if len(inefficient_predicates) > 0: 346 | result = [row for row in result if self.row_matches_subset(row, inefficient_predicates)] 347 | 348 | return result 349 | 350 | -------------------------------------------------------------------------------- /django_cassandra/db/utils.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | import time 16 | from thrift import Thrift 17 | from thrift.transport import TTransport 18 | from thrift.transport import TSocket 19 | from thrift.protocol import TBinaryProtocol 20 | from cassandra import Cassandra 21 | #from cassandra.ttypes import * 22 | from django.db.utils import DatabaseError 23 | 24 | def _cmp_to_key(comparison_function): 25 | """ 26 | Convert a cmp= function into a key= function. 27 | This is built in to Python 2.7, but we define it ourselves 28 | to work with older versions of Python 29 | """ 30 | class K(object): 31 | def __init__(self, obj, *args): 32 | self.obj = obj 33 | def __lt__(self, other): 34 | return comparison_function(self.obj, other.obj) < 0 35 | def __gt__(self, other): 36 | return comparison_function(self.obj, other.obj) > 0 37 | def __eq__(self, other): 38 | return comparison_function(self.obj, other.obj) == 0 39 | def __le__(self, other): 40 | return comparison_function(self.obj, other.obj) <= 0 41 | def __ge__(self, other): 42 | return comparison_function(self.obj, other.obj) >= 0 43 | def __ne__(self, other): 44 | return comparison_function(self.obj, other.obj) != 0 45 | return K 46 | 47 | def _compare_rows(row1, row2, sort_spec_list): 48 | for sort_spec in sort_spec_list: 49 | column_name = sort_spec[0] 50 | reverse = sort_spec[1] if len(sort_spec) > 1 else False 51 | row1_value = row1.get(column_name, None) 52 | row2_value = row2.get(column_name, None) 53 | result = cmp(row1_value, row2_value) 54 | if result != 0: 55 | if reverse: 56 | result = -result 57 | break; 58 | else: 59 | result = 0 60 | return result 61 | 62 | def sort_rows(rows, sort_spec): 63 | if sort_spec == None: 64 | return rows 65 | 66 | if (type(sort_spec) != list) and (type(sort_spec) != tuple): 67 | raise InvalidSortSpecException() 68 | 69 | # The sort spec can be either a single sort spec tuple or a list/tuple 70 | # of sort spec tuple. To simplify the code below we convert the case 71 | # where it's a single sort spec tuple to a 1-element tuple containing 72 | # the sort spec tuple here. 73 | if (type(sort_spec[0]) == list) or (type(sort_spec[0]) == tuple): 74 | sort_spec_list = sort_spec 75 | else: 76 | sort_spec_list = (sort_spec,) 77 | 78 | rows.sort(key=_cmp_to_key(lambda row1, row2: _compare_rows(row1, row2, sort_spec_list))) 79 | 80 | COMBINE_INTERSECTION = 1 81 | COMBINE_UNION = 2 82 | 83 | def combine_rows(rows1, rows2, op, primary_key_column): 84 | # Handle cases where rows1 and/or rows2 are None or empty 85 | if not rows1: 86 | return list(rows2) if rows2 and (op == COMBINE_UNION) else [] 87 | if not rows2: 88 | return list(rows1) if (op == COMBINE_UNION) else [] 89 | 90 | # We're going to iterate over the lists in parallel and 91 | # compare the elements so we need both lists to be sorted 92 | # Note that this means that the input arguments will be modified. 93 | # We could optionally clone the rows first, but then we'd incur 94 | # the overhead of the copy. For now, we'll just always sort 95 | # in place, and if it turns out to be a problem we can add the 96 | # option to copy 97 | sort_rows(rows1,(primary_key_column,)) 98 | sort_rows(rows2,(primary_key_column,)) 99 | 100 | combined_rows = [] 101 | iter1 = iter(rows1) 102 | iter2 = iter(rows2) 103 | update1 = update2 = True 104 | 105 | while True: 106 | # Get the next element from one or both of the lists 107 | if update1: 108 | try: 109 | row1 = iter1.next() 110 | except: 111 | row1 = None 112 | value1 = row1.get(primary_key_column, None) if row1 != None else None 113 | if update2: 114 | try: 115 | row2 = iter2.next() 116 | except: 117 | row2 = None 118 | value2 = row2.get(primary_key_column, None) if row2 != None else None 119 | 120 | if (op == COMBINE_INTERSECTION): 121 | # If we've reached the end of either list and we're doing an intersection, 122 | # then we're done 123 | if (row1 == None) or (row2 == None): 124 | break 125 | 126 | if value1 == value2: 127 | combined_rows.append(row1) 128 | elif (op == COMBINE_UNION): 129 | if row1 == None: 130 | if row2 == None: 131 | break; 132 | combined_rows.append(row2) 133 | elif (row2 == None) or (value1 <= value2): 134 | combined_rows.append(row1) 135 | else: 136 | combined_rows.append(row2) 137 | else: 138 | raise InvalidCombineRowsOpException() 139 | 140 | update1 = (row2 == None) or (value1 <= value2) 141 | update2 = (row1 == None) or (value2 <= value1) 142 | 143 | return combined_rows 144 | 145 | _last_timestamp = None 146 | 147 | def get_next_timestamp(): 148 | # The timestamp is a 64-bit integer 149 | # We now use the standard Cassandra timestamp format of the 150 | # current system time in microseconds. We also keep track of the 151 | # last timestamp we returned and if the current time is less than 152 | # that, then we just advance the timestamp by 1 to make sure we 153 | # return monotonically increasing timestamps. Note that this isn't 154 | # guaranteed to handle the fairly common Django deployment model of 155 | # having multiple Django processes that are dispatched to from a 156 | # web server like Apache. In practice I don't think that case will be 157 | # a problem though (at least with current hardware) because I don't 158 | # think you could have two consecutive calls to Django from another 159 | # process that would be dispatched to two different Django processes 160 | # that would happen in the same microsecond. 161 | 162 | global _last_timestamp 163 | 164 | timestamp = int(time.time() * 1000000) 165 | 166 | if (_last_timestamp != None) and (timestamp <= _last_timestamp): 167 | timestamp = _last_timestamp + 1 168 | 169 | _last_timestamp = timestamp 170 | 171 | return timestamp 172 | 173 | def convert_string_to_list(s): 174 | # FIXME: Shouldn't use eval here, because of security considerations 175 | # (i.e. if someone could modify the data in Cassandra they could 176 | # insert arbitrary Python code that would then get evaluated on 177 | # the client machine. Should have code that parses the list string 178 | # to construct the list or else validate the string before calling eval. 179 | # But for now, during development, we'll just use the quick & dirty eval. 180 | return eval(s) 181 | 182 | def convert_list_to_string(l): 183 | return unicode(l) 184 | 185 | 186 | class CassandraConnection(object): 187 | def __init__(self, host, port, keyspace, user, password): 188 | self.host = host 189 | self.port = port 190 | self.keyspace = keyspace 191 | self.user = user 192 | self.password = password 193 | self.transport = None 194 | self.client = None 195 | self.keyspace_set = False 196 | self.logged_in = False 197 | 198 | def commit(self): 199 | pass 200 | 201 | def set_keyspace(self): 202 | if not self.keyspace_set: 203 | try: 204 | if self.client: 205 | self.client.set_keyspace(self.keyspace) 206 | self.keyspace_set = True 207 | except Exception, e: 208 | # In this case we won't have set keyspace_set to true, so we'll throw the 209 | # exception below where it also handles the case that self.client 210 | # is not valid yet. 211 | pass 212 | if not self.keyspace_set: 213 | raise DatabaseError('Error setting keyspace: %s; %s' % (self.keyspace, str(e))) 214 | 215 | def login(self): 216 | # TODO: This user/password auth code hasn't been tested 217 | if not self.logged_in: 218 | if self.user: 219 | try: 220 | if self.client: 221 | credentials = {'username': self.user, 'password': self.password} 222 | self.client.login(AuthenticationRequest(credentials)) 223 | self.logged_in = True 224 | except Exception, e: 225 | # In this case we won't have set logged_in to true, so we'll throw the 226 | # exception below where it also handles the case that self.client 227 | # is not valid yet. 228 | pass 229 | if not self.logged_in: 230 | raise DatabaseError('Error logging in to keyspace: %s; %s' % (self.keyspace, str(e))) 231 | else: 232 | self.logged_in = True 233 | 234 | def open(self, set_keyspace=False, login=False): 235 | if self.transport == None: 236 | # Create the client connection to the Cassandra daemon 237 | socket = TSocket.TSocket(self.host, int(self.port)) 238 | transport = TTransport.TFramedTransport(TTransport.TBufferedTransport(socket)) 239 | protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport) 240 | transport.open() 241 | self.transport = transport 242 | self.client = Cassandra.Client(protocol) 243 | 244 | if login: 245 | self.login() 246 | 247 | if set_keyspace: 248 | self.set_keyspace() 249 | 250 | def close(self): 251 | if self.transport != None: 252 | try: 253 | self.transport.close() 254 | except Exception, e: 255 | pass 256 | self.transport = None 257 | self.client = None 258 | self.keyspace_set = False 259 | self.logged_in = False 260 | 261 | def is_connected(self): 262 | return self.transport != None 263 | 264 | def get_client(self): 265 | if self.client == None: 266 | self.open(True, True) 267 | return self.client 268 | 269 | def reopen(self): 270 | self.close() 271 | self.open(True, True) 272 | 273 | 274 | class CassandraConnectionError(DatabaseError): 275 | def __init__(self, message=None): 276 | msg = 'Error connecting to Cassandra database' 277 | if message: 278 | msg += '; ' + str(message) 279 | super(CassandraConnectionError,self).__init__(msg) 280 | 281 | 282 | class CassandraAccessError(DatabaseError): 283 | def __init__(self, message=None): 284 | msg = 'Error accessing Cassandra database' 285 | if message: 286 | msg += '; ' + str(message) 287 | super(CassandraAccessError,self).__init__(msg) 288 | 289 | 290 | def call_cassandra_with_reconnect(connection, fn, *args, **kwargs): 291 | try: 292 | try: 293 | results = fn(connection.get_client(), *args, **kwargs) 294 | except TTransport.TTransportException: 295 | connection.reopen() 296 | results = fn(connection.get_client(), *args, **kwargs) 297 | except TTransport.TTransportException, e: 298 | raise CassandraConnectionError(e) 299 | except Exception, e: 300 | raise CassandraAccessError(e) 301 | 302 | return results 303 | 304 | 305 | -------------------------------------------------------------------------------- /manage.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | from django.core.management import execute_manager 3 | try: 4 | import settings # Assumed to be in the same directory. 5 | except ImportError: 6 | import sys 7 | sys.stderr.write("Error: Can't find the file 'settings.py' in the directory containing %r. It appears you've customized things.\nYou'll have to run django-admin.py, passing it your settings module.\n(If the file settings.py does indeed exist, it's causing an ImportError somehow.)\n" % __file__) 8 | sys.exit(1) 9 | 10 | if __name__ == "__main__": 11 | execute_manager(settings) 12 | -------------------------------------------------------------------------------- /settings.py: -------------------------------------------------------------------------------- 1 | # Django settings for test_db_backend project. 2 | 3 | DEBUG = True 4 | TEMPLATE_DEBUG = DEBUG 5 | 6 | ADMINS = ( 7 | # ('Your Name', 'your_email@domain.com'), 8 | ) 9 | 10 | MANAGERS = ADMINS 11 | 12 | DATABASES = { 13 | 'default': { 14 | 'ENGINE': 'django_cassandra.db', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'. 15 | 'NAME': 'DjangoTest', # Or path to database file if using sqlite3. 16 | 'USER': '', # Not used with sqlite3. 17 | 'PASSWORD': '', # Not used with sqlite3. 18 | 'HOST': 'localhost', # Set to empty string for localhost. Not used with sqlite3. 19 | 'PORT': '9160', # Set to empty string for default. Not used with sqlite3. 20 | 'SUPPORTS_TRANSACTIONS': False, 21 | 'CASSANDRA_REPLICATION_FACTOR': 1, 22 | 'CASSANDRA_ENABLE_CASCADING_DELETES': True 23 | } 24 | } 25 | 26 | # Local time zone for this installation. Choices can be found here: 27 | # http://en.wikipedia.org/wiki/List_of_tz_zones_by_name 28 | # although not all choices may be available on all operating systems. 29 | # On Unix systems, a value of None will cause Django to use the same 30 | # timezone as the operating system. 31 | # If running in a Windows environment this must be set to the same as your 32 | # system time zone. 33 | TIME_ZONE = 'America/Chicago' 34 | 35 | # Language code for this installation. All choices can be found here: 36 | # http://www.i18nguy.com/unicode/language-identifiers.html 37 | LANGUAGE_CODE = 'en-us' 38 | 39 | SITE_ID = 1 40 | 41 | # If you set this to False, Django will make some optimizations so as not 42 | # to load the internationalization machinery. 43 | USE_I18N = True 44 | 45 | # If you set this to False, Django will not format dates, numbers and 46 | # calendars according to the current locale 47 | USE_L10N = True 48 | 49 | # Absolute path to the directory that holds media. 50 | # Example: "/home/media/media.lawrence.com/" 51 | MEDIA_ROOT = '' 52 | 53 | # URL that handles the media served from MEDIA_ROOT. Make sure to use a 54 | # trailing slash if there is a path component (optional in other cases). 55 | # Examples: "http://media.lawrence.com", "http://example.com/media/" 56 | MEDIA_URL = '' 57 | 58 | # URL prefix for admin media -- CSS, JavaScript and images. Make sure to use a 59 | # trailing slash. 60 | # Examples: "http://foo.com/media/", "/media/". 61 | ADMIN_MEDIA_PREFIX = '/media/' 62 | 63 | # Make this unique, and don't share it with anybody. 64 | SECRET_KEY = 'b^%)yd-d6s%pk16+1m@fx!jsry!alaes%)nmb^ma#rxz8+i_to' 65 | 66 | # List of callables that know how to import templates from various sources. 67 | TEMPLATE_LOADERS = ( 68 | 'django.template.loaders.filesystem.Loader', 69 | 'django.template.loaders.app_directories.Loader', 70 | # 'django.template.loaders.eggs.Loader', 71 | ) 72 | 73 | MIDDLEWARE_CLASSES = ( 74 | 'django.middleware.common.CommonMiddleware', 75 | 'django.contrib.sessions.middleware.SessionMiddleware', 76 | #'django.middleware.csrf.CsrfViewMiddleware', 77 | 'django.contrib.auth.middleware.AuthenticationMiddleware', 78 | #'django.contrib.messages.middleware.MessageMiddleware', 79 | ) 80 | 81 | ROOT_URLCONF = 'django_cassandra_backend.urls' 82 | 83 | TEMPLATE_DIRS = ( 84 | # Put strings here, like "/home/html/django_templates" or "C:/www/django/templates". 85 | # Always use forward slashes, even on Windows. 86 | # Don't forget to use absolute paths, not relative paths. 87 | ) 88 | 89 | INSTALLED_APPS = ( 90 | 'django.contrib.auth', 91 | 'django.contrib.contenttypes', 92 | 'django.contrib.sessions', 93 | 'django.contrib.sites', 94 | #'django.contrib.messages', 95 | 'django_cassandra_backend.django_cassandra', 96 | # Uncomment the next line to enable the admin: 97 | 'django.contrib.admin', 98 | 'django_cassandra_backend.tests', 99 | #'django_cassandra_backend.djangotoolbox' 100 | ) 101 | 102 | AUTHENTICATION_BACKENDS = ( 103 | 'django.contrib.auth.backends.ModelBackend', 104 | ) 105 | 106 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/tests/__init__.py -------------------------------------------------------------------------------- /tests/admin.py: -------------------------------------------------------------------------------- 1 | from .models import Host, Slice, Tag 2 | from django.contrib import admin 3 | 4 | admin.site.register(Host) 5 | admin.site.register(Slice) 6 | admin.site.register(Tag) 7 | -------------------------------------------------------------------------------- /tests/models.py: -------------------------------------------------------------------------------- 1 | from django.db import models 2 | from djangotoolbox.fields import ListField 3 | 4 | class Slice(models.Model): 5 | name = models.CharField(max_length=64) 6 | 7 | class Meta: 8 | db_table = 'Slice' 9 | ordering = ['id'] 10 | 11 | class Host(models.Model): 12 | mac = models.CharField(max_length=20, db_index=True) 13 | ip = models.CharField(max_length=20, db_index = True) 14 | slice = models.ForeignKey(Slice, db_index=True) 15 | 16 | class Meta: 17 | db_table = 'Host' 18 | ordering = ['id'] 19 | 20 | class Tag(models.Model): 21 | name = models.CharField(max_length=64) 22 | value = models.CharField(max_length=256) 23 | host = models.ForeignKey(Host, db_index=True) 24 | 25 | class Meta: 26 | db_table = 'Tag' 27 | ordering = ['id'] 28 | 29 | class Test(models.Model): 30 | test_date = models.DateField(null=True) 31 | test_datetime = models.DateTimeField(null=True) 32 | test_time = models.TimeField(null=True) 33 | test_decimal = models.DecimalField(null=True, max_digits=10, decimal_places=3) 34 | test_text = models.TextField(null=True) 35 | #test_list = ListField(models.CharField(max_length=500)) 36 | 37 | class Meta: 38 | db_table = 'Test' 39 | ordering = ['id'] 40 | 41 | 42 | 43 | class CompoundKeyModel(models.Model): 44 | name = models.CharField(max_length=64) 45 | index = models.IntegerField() 46 | extra = models.CharField(max_length=32, default='test') 47 | 48 | class CassandraSettings: 49 | COMPOUND_KEY_FIELDS = ('name', 'index') 50 | 51 | 52 | class CompoundKeyModel2(models.Model): 53 | slice = models.ForeignKey(Slice) 54 | name = models.CharField(max_length=64) 55 | index = models.IntegerField() 56 | extra = models.CharField(max_length=32) 57 | 58 | class CassandraSettings: 59 | COMPOUND_KEY_FIELDS = ('slice', 'name', 'index') 60 | COMPOUND_KEY_SEPARATOR = '#' 61 | 62 | class CompoundKeyModel3(models.Model): 63 | name = models.CharField(max_length=32) 64 | 65 | class CassandraSettings: 66 | COMPOUND_KEY_FIELDS = ('name') 67 | -------------------------------------------------------------------------------- /tests/tests.py: -------------------------------------------------------------------------------- 1 | # Copyright 2010 BSN, Inc. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | 15 | from django.test import TestCase 16 | from .models import * 17 | import datetime 18 | import decimal 19 | from django.db.models.query import Q 20 | from django.db.utils import DatabaseError 21 | 22 | class FieldsTest(TestCase): 23 | 24 | TEST_DATE = datetime.date(2007,3,5) 25 | TEST_DATETIME = datetime.datetime(2010,5,4,9,34,25) 26 | TEST_DATETIME2 = datetime.datetime(2010, 6, 6, 6, 20) 27 | TEST_TIME = datetime.time(10,14,29) 28 | TEST_DECIMAL = decimal.Decimal('33.55') 29 | TEST_TEXT = "Practice? We're talking about practice?" 30 | TEST_TEXT2 = "I'm a man. I'm 40." 31 | #TEST_LIST = [u'aaa',u'bbb',u'foobar',u'snafu',u'hello',u'goodbye'] 32 | 33 | def setUp(self): 34 | self.test = Test(id='key1', 35 | test_date=self.TEST_DATE, 36 | test_datetime=self.TEST_DATETIME, 37 | test_time=self.TEST_TIME, 38 | test_decimal=self.TEST_DECIMAL, 39 | test_text=self.TEST_TEXT 40 | #,test_list=self.TEST_LIST 41 | ) 42 | self.test.save() 43 | 44 | def test_fields(self): 45 | test1 = Test.objects.get(id='key1') 46 | self.assertEqual(test1.test_date, self.TEST_DATE) 47 | self.assertEqual(test1.test_datetime, self.TEST_DATETIME) 48 | self.assertEqual(test1.test_time, self.TEST_TIME) 49 | self.assertEqual(test1.test_decimal, self.TEST_DECIMAL) 50 | self.assertEqual(test1.test_text, self.TEST_TEXT) 51 | #self.assertEqual(test1.test_list, self.TEST_LIST) 52 | 53 | test1.test_datetime = self.TEST_DATETIME2 54 | test1.test_text = self.TEST_TEXT2 55 | test1.save() 56 | 57 | test1 = Test.objects.get(id='key1') 58 | self.assertEqual(test1.test_datetime, self.TEST_DATETIME2) 59 | self.assertEqual(test1.test_text, self.TEST_TEXT2) 60 | 61 | class BasicFunctionalityTest(TestCase): 62 | 63 | HOST_COUNT = 5 64 | 65 | def get_host_params_for_index(self, index): 66 | decimal_index = str(index) 67 | hex_index = hex(index)[2:] 68 | if len(hex_index) == 1: 69 | hex_index = '0' + hex_index 70 | id = 'key'+decimal_index 71 | mac = '00:01:02:03:04:'+hex_index 72 | ip = '10.0.0.'+decimal_index 73 | slice = self.s0 if index % 2 else self.s1 74 | 75 | return id, mac, ip, slice 76 | 77 | def setUp(self): 78 | # Create a couple slices 79 | self.s0 = Slice(id='key0',name='slice0') 80 | self.s0.save() 81 | self.s1 = Slice(id='key1',name='slice1') 82 | self.s1.save() 83 | 84 | # Create some hosts 85 | for i in range(self.HOST_COUNT): 86 | id, mac, ip, slice = self.get_host_params_for_index(i) 87 | h = Host(id=id, mac=mac,ip=ip,slice=slice) 88 | h.save() 89 | 90 | 91 | def test_create(self): 92 | """ 93 | Tests that we correctly created the model instances 94 | """ 95 | 96 | # Test that we have the slices we expect 97 | slice_query_set = Slice.objects.all() 98 | index = 0 99 | for slice in slice_query_set: 100 | self.assertEqual(slice.id, 'key' + str(index)) 101 | self.assertEqual(slice.name, 'slice' + str(index)) 102 | index += 1 103 | 104 | # There should have been exactly 2 slices created 105 | self.assertEqual(index, 2) 106 | 107 | host_query_set = Host.objects.all() 108 | index = 0 109 | for host in host_query_set: 110 | id, mac, ip, slice = self.get_host_params_for_index(index) 111 | index += 1 112 | 113 | # There should have been exactly 2 slices created 114 | self.assertEqual(index, self.HOST_COUNT) 115 | 116 | def test_update(self): 117 | s = Slice.objects.get(id='key0') 118 | s.name = 'foobar' 119 | s.save() 120 | #import time 121 | #time.sleep(5) 122 | s1 = Slice.objects.get(id='key0') 123 | #s2 = Slice.objects.get(id='key0') 124 | self.assertEqual(s1.name, 'foobar') 125 | #self.assertEqual(s2.name, 'foobar') 126 | 127 | def test_delete(self): 128 | host = Host.objects.get(id='key1') 129 | host.delete() 130 | hqs = Host.objects.filter(id='key1') 131 | count = hqs.count() 132 | self.assertEqual(count,0) 133 | 134 | def test_query_update(self): 135 | slice0 = Slice.objects.get(pk='key0') 136 | qs = Host.objects.filter(slice=slice0) 137 | qs.update(ip='192.168.1.1') 138 | qs = Host.objects.all() 139 | for host in qs: 140 | if host.slice.pk == 'key0': 141 | self.assertEqual(host.ip, '192.168.1.1') 142 | else: 143 | self.assertNotEqual(host.ip, '192.168.1.1') 144 | 145 | def test_cascading_delete(self): 146 | slice0 = Slice.objects.get(pk='key0') 147 | slice0.delete() 148 | hqs = Host.objects.all() 149 | count = hqs.count() 150 | self.assertEqual(count, 3) 151 | for host in hqs: 152 | self.assertEqual(host.slice_id, 'key1') 153 | 154 | def test_default_id(self): 155 | s = Slice(name='slice2') 156 | s.save() 157 | s2 = Slice.objects.get(name='slice2') 158 | self.assertEqual(s2.name, 'slice2') 159 | 160 | SLICE_DATA_1 = ('key1', 'PCI') 161 | SLICE_DATA_2 = ('key2', 'Eng1') 162 | SLICE_DATA_3 = ('key3', 'Finance') 163 | SLICE_DATA_4 = ('key4', 'blue') 164 | SLICE_DATA_5 = ('key5', 'bluf') 165 | SLICE_DATA_6 = ('key6', 'BLTSE') 166 | SLICE_DATA_7 = ('key7', 'ZNCE') 167 | SLICE_DATA_8 = ('key8', 'UNCLE') 168 | SLICE_DATA_9 = ('key9', 'increment') 169 | 170 | HOST_DATA_1 = ('key1', '00:01:02:03:04:05', '10.0.0.1', 'key1', (('foo3', 'bar3'), ('foo1','hello'), ('aaa', 'bbb'))) 171 | HOST_DATA_2 = ('key2', 'ff:fc:02:33:04:05', '192.168.0.55', 'key2', None) 172 | HOST_DATA_3 = ('key3', 'ff:fc:02:03:04:01', '192.168.0.1', 'key2', (('cfoo3', 'bar3'), ('cfoo1','hello'), ('ddd', 'bbb'))) 173 | HOST_DATA_4 = ('key4', '55:44:33:03:04:05', '10.0.0.6', 'key1',None) 174 | HOST_DATA_5 = ('key5', '10:01:02:03:04:05', '10.0.0.2', 'key1', None) 175 | HOST_DATA_6 = ('key6', '33:44:55:03:04:05', '10.0.0.7', 'key3',None) 176 | HOST_DATA_7 = ('key7', '10:01:02:03:04:05', '192.168.0.44', 'key1', None) 177 | 178 | def create_slices(slice_data_list): 179 | for sd in slice_data_list: 180 | id, name = sd 181 | s = Slice(id=id,name=name) 182 | s.save() 183 | 184 | def create_hosts(host_data_list): 185 | for hd in host_data_list: 186 | id,mac,ip,slice_id,tag_list = hd 187 | slice = Slice.objects.get(id=slice_id) 188 | h = Host(id=id,mac=mac,ip=ip,slice=slice) 189 | h.save() 190 | if tag_list != None: 191 | for tag in tag_list: 192 | name, value = tag 193 | t = Tag(name=name,value=value,host=h) 194 | t.save() 195 | 196 | class QueryTest(TestCase): 197 | 198 | def setUp(self): 199 | create_slices((SLICE_DATA_1, SLICE_DATA_2, SLICE_DATA_3)) 200 | create_hosts((HOST_DATA_1, HOST_DATA_6, HOST_DATA_5, HOST_DATA_7, HOST_DATA_3, HOST_DATA_2, HOST_DATA_4)) 201 | 202 | def check_host_data(self, host, data): 203 | expected_id, expected_mac, expected_ip, expected_slice, expected_tag_list = data 204 | self.assertEqual(host.id, expected_id) 205 | self.assertEqual(host.mac, expected_mac) 206 | self.assertEqual(host.ip, expected_ip) 207 | self.assertEqual(host.slice.id, expected_slice) 208 | # TODO: For now we don't check the tag list 209 | 210 | def test_pk_query(self): 211 | h = Host.objects.get(id='key3') 212 | self.check_host_data(h, HOST_DATA_3) 213 | 214 | hqs = Host.objects.filter(id='key6') 215 | count = hqs.count() 216 | self.assertEqual(count, 1) 217 | h6 = hqs[0] 218 | self.check_host_data(h6, HOST_DATA_6) 219 | 220 | hqs = Host.objects.filter(id__gt='key4') 221 | count = hqs.count() 222 | self.assertEqual(count, 3) 223 | h5, h6, h7 = hqs[:] 224 | self.check_host_data(h5, HOST_DATA_5) 225 | self.check_host_data(h6, HOST_DATA_6) 226 | self.check_host_data(h7, HOST_DATA_7) 227 | 228 | hqs = Host.objects.filter(id__lte='key3') 229 | count = hqs.count() 230 | self.assertEqual(count, 3) 231 | h1, h2, h3 = hqs[:] 232 | self.check_host_data(h1, HOST_DATA_1) 233 | self.check_host_data(h2, HOST_DATA_2) 234 | self.check_host_data(h3, HOST_DATA_3) 235 | 236 | hqs = Host.objects.filter(id__gte='key3', id__lt='key7') 237 | count = hqs.count() 238 | self.assertEqual(count, 4) 239 | h3, h4, h5, h6 = hqs[:] 240 | self.check_host_data(h3, HOST_DATA_3) 241 | self.check_host_data(h4, HOST_DATA_4) 242 | self.check_host_data(h5, HOST_DATA_5) 243 | self.check_host_data(h6, HOST_DATA_6) 244 | 245 | def test_indexed_query(self): 246 | h = Host.objects.get(ip='10.0.0.7') 247 | self.check_host_data(h, HOST_DATA_6) 248 | 249 | hqs = Host.objects.filter(ip='192.168.0.1') 250 | h = hqs[0] 251 | self.check_host_data(h, HOST_DATA_3) 252 | 253 | def test_complex_query(self): 254 | hqs = Host.objects.filter(Q(id='key1') | Q(id='key3') | Q(id='key4')).order_by('id') 255 | count = hqs.count() 256 | self.assertEqual(count, 3) 257 | h1, h3, h4 = hqs[:] 258 | self.check_host_data(h1, HOST_DATA_1) 259 | self.check_host_data(h3, HOST_DATA_3) 260 | self.check_host_data(h4, HOST_DATA_4) 261 | 262 | s1 = Slice.objects.get(id='key1') 263 | 264 | hqs = Host.objects.filter(ip__startswith='10.', slice=s1) 265 | count = hqs.count() 266 | self.assertEqual(count, 3) 267 | h1, h4, h5 = hqs[:] 268 | self.check_host_data(h1, HOST_DATA_1) 269 | self.check_host_data(h4, HOST_DATA_4) 270 | self.check_host_data(h5, HOST_DATA_5) 271 | 272 | hqs = Host.objects.filter(ip='10.0.0.6', slice=s1) 273 | count = hqs.count() 274 | self.assertEqual(count, 1) 275 | h4 = hqs[0] 276 | self.check_host_data(h4, HOST_DATA_4) 277 | 278 | tqs = Tag.objects.filter(name='foo3', value='bar3') 279 | self.assertEqual(tqs.count(), 1) 280 | t = tqs[0] 281 | self.assertEqual(t.name, 'foo3') 282 | self.assertEqual(t.value, 'bar3') 283 | self.assertEqual(t.host_id, 'key1') 284 | 285 | hqs = Host.objects.filter((Q(ip__startswith='10.0') & Q(slice=s1)) | Q(mac__startswith='ff')).order_by('id') 286 | count = hqs.count() 287 | self.assertEqual(count, 5) 288 | h1, h2, h3, h4, h5 = hqs[:] 289 | self.check_host_data(h1, HOST_DATA_1) 290 | self.check_host_data(h2, HOST_DATA_2) 291 | self.check_host_data(h3, HOST_DATA_3) 292 | self.check_host_data(h4, HOST_DATA_4) 293 | self.check_host_data(h5, HOST_DATA_5) 294 | 295 | def test_exclude_query(self): 296 | hqs = Host.objects.exclude(ip__startswith="10") 297 | count = hqs.count() 298 | self.assertEqual(count,3) 299 | h2, h3, h7 = hqs[:] 300 | self.check_host_data(h2, HOST_DATA_2) 301 | self.check_host_data(h3, HOST_DATA_3) 302 | self.check_host_data(h7, HOST_DATA_7) 303 | 304 | def test_count(self): 305 | 306 | count = Host.objects.count() 307 | self.assertEqual(count, 7) 308 | 309 | count = Host.objects.all().count() 310 | self.assertEqual(count, 7) 311 | 312 | slice1 = Slice.objects.get(id='key1') 313 | qs = Host.objects.filter(slice=slice1) 314 | count = qs.count() 315 | #if count == 4: 316 | # h1,h4,h5,h7 = qs[:] 317 | #else: 318 | # h1,h4,h5,h7,h = qs[:] 319 | self.assertEqual(count, 4) 320 | 321 | qs = Slice.objects.filter(name__startswith='P') 322 | count = qs.count() 323 | self.assertEqual(count, 1) 324 | 325 | qs = Host.objects.filter(ip__startswith='10').order_by('slice_id') 326 | count = qs.count() 327 | self.assertEqual(count, 4) 328 | 329 | def test_query_set_slice(self): 330 | hqs = Host.objects.all()[2:6] 331 | count = hqs.count() 332 | h3, h4, h5, h6 = hqs[:] 333 | self.assertEqual(h3.id, 'key3') 334 | self.assertEqual(h4.id, 'key4') 335 | self.assertEqual(h5.id, 'key5') 336 | self.assertEqual(h6.id, 'key6') 337 | 338 | def test_order_by(self): 339 | # Test ascending order of all of the hosts 340 | qs = Host.objects.all().order_by('ip') 341 | h1, h2, h3, h4, h5, h6, h7 = qs[:] 342 | self.assertEqual(h1.id, 'key1') 343 | self.assertEqual(h2.id, 'key5') 344 | self.assertEqual(h3.id, 'key4') 345 | self.assertEqual(h4.id, 'key6') 346 | self.assertEqual(h5.id, 'key3') 347 | self.assertEqual(h6.id, 'key7') 348 | self.assertEqual(h7.id, 'key2') 349 | 350 | # Test descending order of all of the hosts 351 | qs = Host.objects.all().order_by('-ip') 352 | h1, h2, h3, h4, h5, h6, h7 = qs[:] 353 | self.assertEqual(h1.id, 'key2') 354 | self.assertEqual(h2.id, 'key7') 355 | self.assertEqual(h3.id, 'key3') 356 | self.assertEqual(h4.id, 'key6') 357 | self.assertEqual(h5.id, 'key4') 358 | self.assertEqual(h6.id, 'key5') 359 | self.assertEqual(h7.id, 'key1') 360 | 361 | # Test multiple ordering criteria 362 | qs = Host.objects.all().order_by('slice_id', 'ip') 363 | h1, h2, h3, h4, h5, h6, h7 = qs[:] 364 | self.assertEqual(h1.id, 'key1') 365 | self.assertEqual(h2.id, 'key5') 366 | self.assertEqual(h3.id, 'key4') 367 | self.assertEqual(h4.id, 'key7') 368 | self.assertEqual(h5.id, 'key3') 369 | self.assertEqual(h6.id, 'key2') 370 | self.assertEqual(h7.id, 'key6') 371 | 372 | # Test multiple ordering criteria 373 | qs = Host.objects.all().order_by('-slice_id', 'ip') 374 | h1, h2, h3, h4, h5, h6, h7 = qs[:] 375 | self.assertEqual(h1.id, 'key6') 376 | self.assertEqual(h2.id, 'key3') 377 | self.assertEqual(h3.id, 'key2') 378 | self.assertEqual(h4.id, 'key1') 379 | self.assertEqual(h5.id, 'key5') 380 | self.assertEqual(h6.id, 'key4') 381 | self.assertEqual(h7.id, 'key7') 382 | 383 | # Currently the nonrel code doesn't handle ordering that spans tables/column families 384 | #======================================================================= 385 | # qs = Host.objects.all().order_by('slice__name', 'id') 386 | # h2, h3, h6, h1, h5, h4, h7 = qs[:] 387 | # self.assertEqual(h2.id, 'key2') 388 | # self.assertEqual(h3.id, 'key3') 389 | # self.assertEqual(h6.id, 'key6') 390 | # self.assertEqual(h1.id, 'key1') 391 | # self.assertEqual(h5.id, 'key5') 392 | # self.assertEqual(h4.id, 'key4') 393 | # self.assertEqual(h7.id, 'key7') 394 | #======================================================================= 395 | 396 | 397 | class OperationTest(TestCase): 398 | 399 | def setUp(self): 400 | create_slices((SLICE_DATA_1, SLICE_DATA_2, SLICE_DATA_3, SLICE_DATA_4, SLICE_DATA_5, 401 | SLICE_DATA_6, SLICE_DATA_7, SLICE_DATA_8, SLICE_DATA_9)) 402 | 403 | def test_range_ops(self): 404 | qs = Slice.objects.filter(name__gt='PCI') 405 | count = qs.count() 406 | self.assertEqual(count, 5) 407 | s4,s5,s7,s8,s9 = qs[:] 408 | self.assertEqual(s4.id,'key4') 409 | self.assertEqual(s5.id,'key5') 410 | self.assertEqual(s7.id,'key7') 411 | self.assertEqual(s8.id,'key8') 412 | self.assertEqual(s9.id,'key9') 413 | 414 | qs = Slice.objects.filter(name__gte='bluf',name__lte='bluf') 415 | count = qs.count() 416 | self.assertEqual(count, 1) 417 | s5 = qs[0] 418 | self.assertEqual(s5.id, 'key5') 419 | 420 | qs = Slice.objects.filter(name__gt='blue', name__lte='bluf') 421 | count = qs.count() 422 | self.assertEqual(count, 1) 423 | s5 = qs[0] 424 | self.assertEqual(s5.id, 'key5') 425 | 426 | qs = Slice.objects.filter(name__exact='blue') 427 | count = qs.count() 428 | self.assertEqual(count, 1) 429 | s4 = qs[0] 430 | self.assertEqual(s4.id, 'key4') 431 | 432 | def test_other_ops(self): 433 | 434 | qs = Slice.objects.filter(id__in=['key1','key4','key6','key9']) 435 | count = qs.count() 436 | self.assertEqual(count, 4) 437 | s1,s4,s6,s9 = qs[:] 438 | self.assertEqual(s1.id,'key1') 439 | self.assertEqual(s4.id,'key4') 440 | self.assertEqual(s6.id,'key6') 441 | self.assertEqual(s9.id,'key9') 442 | 443 | qs = Slice.objects.filter(name__startswith='bl') 444 | count = qs.count() 445 | self.assertEqual(count, 2) 446 | s4,s5 = qs[:] 447 | self.assertEqual(s4.id,'key4') 448 | self.assertEqual(s5.id,'key5') 449 | 450 | qs = Slice.objects.filter(name__endswith='E') 451 | count = qs.count() 452 | self.assertEqual(count, 3) 453 | s6,s7,s8 = qs[:] 454 | self.assertEqual(s6.id,'key6') 455 | self.assertEqual(s7.id,'key7') 456 | self.assertEqual(s8.id,'key8') 457 | 458 | qs = Slice.objects.filter(name__contains='NC') 459 | count = qs.count() 460 | self.assertEqual(count, 2) 461 | s7,s8 = qs[:] 462 | self.assertEqual(s7.id,'key7') 463 | self.assertEqual(s8.id,'key8') 464 | 465 | qs = Slice.objects.filter(name__istartswith='b') 466 | count = qs.count() 467 | self.assertEqual(count, 3) 468 | s4,s5,s6 = qs[:] 469 | self.assertEqual(s4.id,'key4') 470 | self.assertEqual(s5.id,'key5') 471 | self.assertEqual(s6.id,'key6') 472 | 473 | qs = Slice.objects.filter(name__istartswith='B') 474 | count = qs.count() 475 | self.assertEqual(count, 3) 476 | s4,s5,s6 = qs[:] 477 | self.assertEqual(s4.id,'key4') 478 | self.assertEqual(s5.id,'key5') 479 | self.assertEqual(s6.id,'key6') 480 | 481 | qs = Slice.objects.filter(name__iendswith='e') 482 | count = qs.count() 483 | self.assertEqual(count, 5) 484 | s3,s4,s6,s7,s8 = qs[:] 485 | self.assertEqual(s3.id,'key3') 486 | self.assertEqual(s4.id,'key4') 487 | self.assertEqual(s6.id,'key6') 488 | self.assertEqual(s7.id,'key7') 489 | self.assertEqual(s8.id,'key8') 490 | 491 | qs = Slice.objects.filter(name__icontains='nc') 492 | count = qs.count() 493 | self.assertEqual(count, 4) 494 | s3,s7,s8,s9 = qs[:] 495 | self.assertEqual(s3.id,'key3') 496 | self.assertEqual(s7.id,'key7') 497 | self.assertEqual(s8.id,'key8') 498 | self.assertEqual(s9.id,'key9') 499 | 500 | qs = Slice.objects.filter(name__regex='[PEZ].*') 501 | count = qs.count() 502 | self.assertEqual(count, 3) 503 | s1,s2,s7 = qs[:] 504 | self.assertEqual(s1.id,'key1') 505 | self.assertEqual(s2.id,'key2') 506 | self.assertEqual(s7.id,'key7') 507 | 508 | qs = Slice.objects.filter(name__iregex='bl.*e') 509 | count = qs.count() 510 | self.assertEqual(count, 2) 511 | s4,s6 = qs[:] 512 | self.assertEqual(s4.id,'key4') 513 | self.assertEqual(s6.id,'key6') 514 | 515 | class Department(models.Model): 516 | name = models.CharField(primary_key=True, max_length=256) 517 | 518 | def __unicode__(self): 519 | return self.title 520 | 521 | class DepartmentRequest(models.Model): 522 | from_department = models.ForeignKey(Department, related_name='froms') 523 | to_department = models.ForeignKey(Department, related_name='tos') 524 | 525 | class RestTestMultipleForeignKeys(TestCase): 526 | 527 | def test_it(self): 528 | 529 | for i in range(0,4): 530 | department = Department() 531 | department.name = "id_" + str(i) 532 | department.save() 533 | 534 | departments = Department.objects.order_by('name') 535 | d0 = departments[0] 536 | d1 = departments[1] 537 | d2 = departments[2] 538 | d3 = departments[3] 539 | 540 | req = DepartmentRequest() 541 | req.from_department = d0 542 | req.to_department = d1 543 | req.save() 544 | 545 | req = DepartmentRequest() 546 | req.from_department = d2 547 | req.to_department = d1 548 | req.save() 549 | 550 | rs = DepartmentRequest.objects.filter(from_department = d3, to_department = d1) 551 | self.assertEqual(rs.count(), 0) 552 | 553 | rs = DepartmentRequest.objects.filter(from_department=d0, to_department=d1) 554 | self.assertEqual(rs.count(), 1) 555 | req = rs[0] 556 | self.assertEqual(req.from_department, d0) 557 | self.assertEqual(req.to_department, d1) 558 | 559 | rs = DepartmentRequest.objects.filter(to_department=d1).order_by('from_department') 560 | self.assertEqual(rs.count(), 2) 561 | req = rs[0] 562 | self.assertEqual(req.from_department, d0) 563 | self.assertEqual(req.to_department, d1) 564 | req = rs[1] 565 | self.assertEqual(req.from_department, d2) 566 | self.assertEqual(req.to_department, d1) 567 | 568 | 569 | class EmptyModel(models.Model): 570 | pass 571 | 572 | class EmptyModelTest(TestCase): 573 | 574 | def test_empty_model(self): 575 | em = EmptyModel() 576 | em.save() 577 | qs = EmptyModel.objects.all() 578 | self.assertEqual(qs.count(), 1) 579 | em2 = qs[0] 580 | self.assertEqual(em.id, em2.id) 581 | 582 | class CompoundKeyTest(TestCase): 583 | 584 | def test_construct_with_no_id(self): 585 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello') 586 | ckm.save(); 587 | ckm = CompoundKeyModel.objects.all()[0] 588 | self.assertEqual(ckm.id, 'foo|6') 589 | 590 | def test_construct_with_id(self): 591 | ckm = CompoundKeyModel(id='foo|6', name='foo', index=6, extra='hello') 592 | ckm.save(); 593 | ckm = CompoundKeyModel.objects.all()[0] 594 | self.assertEqual(ckm.id, 'foo|6') 595 | 596 | def test_malformed_id(self): 597 | ckm = CompoundKeyModel(id='abc', name='foo', index=6, extra='hello') 598 | self.failUnlessRaises(DatabaseError, ckm.save) 599 | 600 | def test_construct_mismatched_id(self): 601 | ckm = CompoundKeyModel(id='foo|5', name='foo', index=6, extra='hello') 602 | self.failUnlessRaises(DatabaseError, ckm.save) 603 | 604 | def test_update_non_key_field(self): 605 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello') 606 | ckm.save(); 607 | ckm = CompoundKeyModel.objects.all()[0] 608 | ckm.extra = 'goodbye' 609 | ckm.save(); 610 | ckm = CompoundKeyModel.objects.all()[0] 611 | self.assertEqual(ckm.extra, 'goodbye') 612 | 613 | def test_update_no_id(self): 614 | ckm = CompoundKeyModel(id='foo|6', name='foo', index=6, extra='hello') 615 | ckm.save(); 616 | ckm = CompoundKeyModel(name='foo', index=6, extra='goodbye') 617 | ckm.save(); 618 | ckm = CompoundKeyModel.objects.all()[0] 619 | self.assertEqual(ckm.extra, 'goodbye') 620 | 621 | def test_update_mismatched_id(self): 622 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello') 623 | ckm.save(); 624 | ckm = CompoundKeyModel.objects.all()[0] 625 | ckm.name = 'bar' 626 | self.failUnlessRaises(DatabaseError, ckm.save) 627 | 628 | def test_delete_by_id(self): 629 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello') 630 | ckm.save(); 631 | ckm = CompoundKeyModel.objects.get(pk='foo|6') 632 | ckm.delete() 633 | qs = CompoundKeyModel.objects.all() 634 | self.assertEqual(len(qs), 0) 635 | 636 | def test_delete_by_fields(self): 637 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello') 638 | ckm.save() 639 | qs = CompoundKeyModel.objects.filter(name='foo', index=6) 640 | qs.delete() 641 | qs = CompoundKeyModel.objects.all() 642 | self.assertEqual(len(qs), 0) 643 | 644 | 645 | def test_custom_separator(self): 646 | s = Slice(id='default') 647 | s.save() 648 | ckm = CompoundKeyModel2(slice=s, name='foo', index=6, extra='hello') 649 | ckm.save(); 650 | ckm = CompoundKeyModel2.objects.all()[0] 651 | self.assertEqual(ckm.id, 'default#foo#6') 652 | -------------------------------------------------------------------------------- /tests/views.py: -------------------------------------------------------------------------------- 1 | # Create your views here. 2 | -------------------------------------------------------------------------------- /urls.py: -------------------------------------------------------------------------------- 1 | from django.conf.urls.defaults import * 2 | 3 | # Uncomment the next two lines to enable the admin: 4 | from django.contrib import admin 5 | admin.autodiscover() 6 | 7 | urlpatterns = patterns('', 8 | # Example: 9 | # (r'^django_cassandra_backend/', include('test_db_backend.foo.urls')), 10 | 11 | # Uncomment the admin/doc line below and add 'django.contrib.admindocs' 12 | # to INSTALLED_APPS to enable admin documentation: 13 | # (r'^admin/doc/', include('django.contrib.admindocs.urls')), 14 | 15 | # Uncomment the next line to enable the admin: 16 | (r'^admin/', include(admin.site.urls)), 17 | ) 18 | --------------------------------------------------------------------------------