├── .gitignore
├── .project
├── .pydevproject
├── README.txt
├── __init__.py
├── django_cassandra
│   ├── __init__.py
│   └── db
│       ├── __init__.py
│       ├── base.py
│       ├── compiler.py
│       ├── creation.py
│       ├── introspection.py
│       ├── predicate.py
│       └── utils.py
├── manage.py
├── settings.py
├── tests
│   ├── __init__.py
│   ├── admin.py
│   ├── models.py
│   ├── tests.py
│   └── views.py
└── urls.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *~
2 | *.pyc
3 | *.pyo
4 | *.egg-info/
5 |
--------------------------------------------------------------------------------
/.project:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <projectDescription>
3 | 	<name>django_cassandra_backend</name>
4 | 	<comment></comment>
5 | 	<projects>
6 | 	</projects>
7 | 	<buildSpec>
8 | 		<buildCommand>
9 | 			<name>org.python.pydev.PyDevBuilder</name>
10 | 			<arguments>
11 | 			</arguments>
12 | 		</buildCommand>
13 | 	</buildSpec>
14 | 	<natures>
15 | 		<nature>org.python.pydev.pythonNature</nature>
16 | 	</natures>
17 | </projectDescription>
18 | 
--------------------------------------------------------------------------------
/.pydevproject:
--------------------------------------------------------------------------------
1 | <?xml version="1.0" encoding="UTF-8"?>
2 | <?eclipse-pydev version="1.0"?>
3 | 
4 | <pydev_project>
5 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_INTERPRETER">Default</pydev_property>
6 | <pydev_property name="org.python.pydev.PYTHON_PROJECT_VERSION">python 2.6</pydev_property>
7 | </pydev_project>
8 | 
9 | 
--------------------------------------------------------------------------------
/README.txt:
--------------------------------------------------------------------------------
1 | Introduction
2 | ============
3 | This is an early development release of a Django backend for the Cassandra database.
4 | It has only been under development for a short time and there are almost certainly
5 | issues/bugs with this release -- see the end of this document for a list of known
6 | issues. Needless to say, you shouldn't use this release in a production setting: the
7 | format of the data stored in Cassandra may change in future versions, there's no
8 | promise of backwards compatibility with this version, and so on.
9 |
10 | Please let me know if you find any bugs or have any suggestions for how to improve
11 | the backend. You can contact me at: rob.vaterlaus@gmail.com
12 |
13 | Installation
14 | ============
15 | The backend requires at least the 0.7 version of Cassandra. 0.7 has several features
16 | (e.g. programmatic creation/deletion of keyspaces & column families, secondary index
17 | support) that are useful for the Django database backend, so I targeted that
18 | instead of 0.6. Unfortunately, the Cassandra Thrift API changed between 0.6 and 0.7,
19 | so the two versions are incompatible.
20 |
21 | I currently use the 1.0.10 release. That's the only version I test against, so no
22 | promises if you try it with a different version. I have tested earlier versions
23 | against the 0.7.x and 0.8.x versions of Cassandra with no problem, so I would expect
24 | that it would still work.
25 |
26 | If you're updating from a previous version of the Cassandra DB backend, then it's
27 | possible/likely that the format in which it stores models/fields in Cassandra has changed,
28 | so you should wipe your Cassandra database. If you're using the default locations
29 | for things, then this should involve executing something like "rm -rf /var/log/cassandra/*"
30 | and "rm -rf /var/lib/cassandra/*". At some point, as the backend becomes more stable,
31 | data format compatibility or migration will be supported, but for now it's not worth
32 | the effort.
33 |
34 | The backend also requires the Django-nonrel fork of Django and djangotoolbox.
35 | Both are available here: .
36 | I installed the Django-nonrel version of Django globally in site-packages and
37 | copied djangotoolbox into the directory where I'm testing the Cassandra backend,
38 | but there are probably other (better, e.g. virtualenv) ways to install those things.
39 | I'm using the current (as of 11/1/2011) version of both packages. Django-nonrel is
40 | based on the 1.3 beta 1 release of Django and the version of djangotoolbox is 0.9.2.
41 |
42 | You also need to generate the Python Thrift API code as described in the Cassandra
43 | documentation and copy the generated "cassandra" directory (from Cassandra's
44 | interface/gen-py directory) over to the top-level Django project directory.
45 | You should use the 0.6.x version of Thrift if you're using the 0.8 or higher version
46 | of Cassandra. You should use the 0.5.x version of Thrift if you're using 0.7.
47 |
48 | To configure a project to use the Cassandra backend all you have to do is change
49 | the database settings in the settings.py file. Change the ENGINE value to be
50 | 'django_cassandra.db' and the NAME value to be the name of the keyspace to use.
51 | You also need to set the SUPPORTS_TRANSACTIONS setting to False, since Cassandra
52 | doesn't support transactions. You can set HOST and PORT to specify the host and
53 | port where the Cassandra daemon process is running. If these aren't specified
54 | then the backend uses default values of 'localhost' and 9160. You can also set
55 | USER and PASSWORD if you're using authentication with Cassandra. You can also set
56 | a few optional Cassandra-specific settings in the database settings. Set the
57 | CASSANDRA_REPLICATION_FACTOR and CASSANDRA_STRATEGY_CLASS settings to be the
58 | replication factor and strategy class value you want to use when the backend
59 | creates the keyspace during syncdb. The default values for these settings are
60 | 1 and "org.apache.cassandra.locator.SimpleStrategy". You can also define
61 | CASSANDRA_READ_CONSISTENCY_LEVEL and CASSANDRA_WRITE_CONSISTENCY_LEVEL to be
62 | the values you want to use for the consistency level for read and write
63 | operations. If you want to use different consistency level values for
64 | different operations or different column families then it should work to
65 | use the Django multiple database support to define different database
66 | settings with different consistency levels and use the appropriate one,
67 | but I haven't tested this to verify that it works.
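Putting the settings described above together, a DATABASES entry in settings.py might look like the following sketch. The keyspace name is a placeholder, and the values shown for the optional Cassandra-specific settings are just the documented defaults:

```python
# Hypothetical settings.py DATABASES entry for the Cassandra backend.
# 'my_keyspace' is a placeholder; HOST/PORT show the default values.
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra.db',
        'NAME': 'my_keyspace',               # name of the keyspace to use
        'HOST': 'localhost',                 # defaults to 'localhost' if omitted
        'PORT': 9160,                        # defaults to 9160 if omitted
        'SUPPORTS_TRANSACTIONS': False,      # Cassandra doesn't support transactions
        'CASSANDRA_REPLICATION_FACTOR': 1,
        'CASSANDRA_STRATEGY_CLASS': 'org.apache.cassandra.locator.SimpleStrategy',
    }
}
```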
68 |
69 | Configure Cassandra as described in the Cassandra documentation.
70 | If you want to be able to do range queries over primary keys then you need to set the
71 | partitioner in the cassandra.yaml config file to be the OrderPreservingPartitioner.
72 |
73 | Once you're finished configuring Cassandra start up the Cassandra daemon process as
74 | described in the Cassandra documentation.
75 |
76 | Run syncdb. This creates the keyspace (if necessary) and the column families for the
77 | models in the installed apps. The Cassandra backend creates one column family per
78 | model. It uses the db_table value from the meta settings for the name of the
79 | column family if it's specified; otherwise it uses the default name similar to
80 | other backends.
81 |
82 | Now you should be able to use the normal model and query set calls from your
83 | Django code.
84 |
85 | The backend supports query set update operations. This doesn't have the same
86 | transactional semantics that it would have on a relational database, but it
87 | does mean that you can use the backend with code that depends on this feature.
88 | In particular it means that cascading deletes are now supported. For large data
89 | sets cascading deletes are typically a bad idea, so they are disabled by default.
90 | To enable them you define a value in the database settings dictionary named
91 | "CASSANDRA_ENABLE_CASCADING_DELETES" whose value is True.
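For example, enabling cascading deletes adds one more key to the database settings dictionary (sketch; the other settings are omitted for brevity):

```python
# Sketch: enabling cascading deletes for a database entry.
# Other required settings are omitted here for brevity.
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra.db',
        'NAME': 'my_keyspace',  # placeholder keyspace name
        'CASSANDRA_ENABLE_CASCADING_DELETES': True,
    }
}
```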
92 |
93 | The backend supports automatic construction of compound id/pk fields that
94 | are composed of the values of other fields in the model. You would typically
95 | use this when you have some subset of the fields in the model that together
96 | uniquely identify that particular instance of the model. Compound key generation
97 | is enabled for a model by defining a class variable named COMPOUND_KEY_FIELDS
98 | in a nested class called "CassandraSettings" of the model. The value of
99 | COMPOUND_KEY_FIELDS is a tuple of the names of the fields that are used
100 | to form the compound key. By default the field values are separated by the '|'
101 | character, but this separator value can be overridden by defining a class
102 | variable in the CassandraSettings named COMPOUND_KEY_SEPARATOR whose value is
103 | the character to use as the separator.
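A minimal sketch of how the compound key pieces fit together is shown below. The plain classes stand in for a Django model, the field names are hypothetical, and `make_compound_key` is an illustrative helper, not the backend's actual API:

```python
# Sketch of compound key construction, assuming the default '|' separator.
# 'hostname' and 'port' are hypothetical field names.
class CassandraSettings:
    COMPOUND_KEY_FIELDS = ('hostname', 'port')
    COMPOUND_KEY_SEPARATOR = '|'   # optional; '|' is the default

def make_compound_key(field_values, settings):
    # Join the named field values with the separator to form the row key.
    separator = getattr(settings, 'COMPOUND_KEY_SEPARATOR', '|')
    return separator.join(str(field_values[name])
                          for name in settings.COMPOUND_KEY_FIELDS)

key = make_compound_key({'hostname': 'web1', 'port': 8080}, CassandraSettings)
# key is 'web1|8080'
```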
104 |
105 | This release includes a test project and app. If you want to use the backend in
106 | another project you can copy the django_cassandra directory to the
107 | top-level directory of the project (along with the cassandra and djangotoolbox
108 | directories) or else make sure that these are installed in your environment.
109 |
110 | What Works
111 | ==========
112 | - the basics: creating model instances, querying (get/filter/exclude), count,
113 | update/save, delete, order_by
114 | - efficient queries for exact matches on the primary key. It can also do range
115 | queries on the primary key, but your Cassandra cluster must be configured to use the
116 | OrderPreservingPartitioner if you want to do that. Unfortunately, currently it
117 | doesn't fail gracefully if you try to do range queries when using the
118 | RandomPartitioner, so just don't do that :-)
119 | - inefficient queries for everything else that can't be done efficiently in
120 | Cassandra. The basic approach used in the query processing code is to first try
121 | to prune the number of rows to look at by finding a part of the query that can
122 | be evaluated efficiently (i.e. a primary key filter predicate or an exact match
123 | secondary index predicate). Then it evaluates the remaining filter
124 | predicates over the pruned rows to obtain the final result. If there's no part
125 | of the query that can be evaluated efficiently, then it just fetches the entire
126 | set of rows and does all of the filtering in the backend code.
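The prune-then-filter approach described above can be sketched generically. All of the names here are illustrative stand-ins for the backend's query processing, not its actual functions:

```python
# Illustrative sketch of the two-phase query evaluation described above.
def evaluate_query(fetch_all_rows, fetch_rows_efficiently,
                   efficient_predicate, remaining_predicates):
    # Phase 1: prune with a predicate Cassandra can evaluate efficiently
    # (a pk filter or exact-match secondary index predicate), if there is one.
    if efficient_predicate is not None:
        candidate_rows = fetch_rows_efficiently(efficient_predicate)
    else:
        # No efficient part: fetch everything and filter in the backend code.
        candidate_rows = fetch_all_rows()
    # Phase 2: evaluate the remaining filter predicates over the pruned rows.
    return [row for row in candidate_rows
            if all(pred(row) for pred in remaining_predicates)]

rows = [{'name': 'a', 'n': 1}, {'name': 'b', 'n': 2}, {'name': 'c', 'n': 3}]
result = evaluate_query(lambda: rows, None, None, [lambda r: r['n'] > 1])
# result contains the rows for 'b' and 'c'
```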
127 | - programmatic creation of the keyspace & column families via syncdb
128 | - Django admin UI, except for users in the auth application (see below)
129 | - I think all of the filter operations (e.g. gt, startswith, regex, etc.) are supported,
130 | although it's possible I missed something
131 | - complex queries with Q nodes
132 | - basic secondary index support. If the db_index attribute of a field is set to True,
133 | then the backend configures the column family to index on that field/column.
134 | Currently Cassandra only supports exact match queries with the secondary
135 | indexes, so the support is limited. Range queries on columns with secondary indexes
136 | will still be inefficient.
137 | - support for query update operations (and thus cascading deletes, but that's
138 | disabled by default)
139 |
140 | What Doesn't Work (Yet)
141 | =======================
142 | - I haven't tested all of the different field types, so there are probably
143 | issues there with how the data is converted to and from Cassandra with some of the
144 | field types. My use case was mostly string fields, so most of the testing was with
145 | that. I've also tried out integer, float, boolean, date, datetime, time, text
146 | and decimal fields, so I think those should work too, but I haven't tested all
147 | of the possible field types.
148 | - joins
149 | - chunked queries. It just tries to get everything all at once from Cassandra.
150 | Currently the maximum number of keys/rows that it can fetch (i.e. the count
151 | value in the Cassandra Thrift API) defaults semi-arbitrarily to 1000000, so
152 | if you try to query over a column family with more returned rows than that
153 | it won't work (and if you're anywhere near that limit you're going
154 | to be using gobs of memory). Similarly, there's a limit of 10000 for the number
155 | of columns returned in a given row. It's doubtful that anyone would come
156 | anywhere near that limit, since that is dictated by the number of fields there
157 | are in the Django model. You can override either/both of these limits by setting
158 | the CASSANDRA_MAX_KEY_COUNT and/or CASSANDRA_MAX_COLUMN_COUNT settings in the
159 | database settings in settings.py.
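For instance, raising both fetch limits in the database settings might look like this sketch (the values shown are arbitrary):

```python
# Sketch: overriding the key/column fetch limits; values are arbitrary.
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra.db',
        'NAME': 'my_keyspace',  # placeholder keyspace name
        'CASSANDRA_MAX_KEY_COUNT': 5000000,
        'CASSANDRA_MAX_COLUMN_COUNT': 20000,
    }
}
```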
160 | - ListModel/ListField support from djangotoolbox (I think?). I haven't
161 | investigated how this works and if it's feasible to support in Cassandra,
162 | although I'm guessing it probably wouldn't be too hard. For now, this means
163 | that several of the unit tests from djangotoolbox fail if you have that
164 | in your installed apps. I made a preliminary pass to try to get this to
165 | work, but it turned out to be more difficult than expected, so it exists
166 | in a partially-completed form in the source.
167 | - probably a lot of other stuff that I've forgotten or am unaware of :-)
168 |
169 | Known Issues
170 | ============
171 | - I haven't been able to get the admin UI to work for users in the Django
172 | authentication middleware. I included djangotoolbox in my installed apps, as
173 | suggested on the Django-nonrel web site, which got me further, but I still get
174 | an error in some Django template code that tries to render a change list (I think).
175 | I still need to track down what's going on there.
176 | - There's a reported issue with using unicode strings. At this point it's
177 | still unclear whether this is a problem in the Django backend or in the
178 | Python Thrift bindings to Cassandra. I think I've fixed all of the obvious
179 | places in the backend code to deal properly with Unicode strings, but it's
180 | possible/probable there are some remaining issues. The reported problem is with
181 | using non-ASCII characters in the model definitions. This triggers an exception
182 | during syncdb, so for now just don't do that. It hasn't been tested yet
183 | whether there's a problem with simply storing Unicode strings as the field
184 | values (as opposed to the model/field names).
185 | - There are a few unit tests that fail in the sites middleware. These don't fail
186 | with the other nonrel backends, so it's a bug/limitation in the Cassandra backend.
187 | - If you enable the authentication and session middleware a bunch of the
188 | associated unit tests fail if you run all of the unit tests.
189 | Waldemar says that it's expected that some of these unit tests will fail,
190 | because they rely on joins which aren't supported yet. I haven't verified yet
191 | that all of the failures are because of joins, though.
192 | - the code needs a cleanup pass for things like the exception handling/safety,
193 | some refactoring, more pydoc comments, etc.
194 | - I have a feeling there are some places where I haven't completely leveraged
195 | the code in djangotoolbox, so there may be places where I haven't done
196 | things in the optimal way
197 | - the error handling/messaging isn't great for things like the Cassandra
198 | daemon not running, a versioning mismatch between client and Cassandra
199 | daemon, etc. Currently you just get a somewhat uninformative exception in
200 | these cases.
201 |
202 | Changes for 0.2.4
203 | =================
204 | - switch the timestamp format to use the system time in microseconds to be
205 | consistent with the standard Cassandra timestamps used by other Cassandra
206 | components (e.g. the Cassandra CLI) and to hopefully eliminate issues with
207 | timestamp collisions across multiple Django processes.
208 |
209 | Changes for 0.2.3
210 | =================
211 | - fixed a bug with the retry/reconnect logic where it would use a stale Cassandra
212 | Client object.
213 |
214 | Changes for 0.2.2
215 | =================
216 | - fixed a bug with handling delete operations where it would sometimes incorrectly
217 | delete all items whose values were a substring of the specified query value
218 | instead of only if there was an exact match.
219 |
220 | Changes for 0.2.1
221 | =================
222 |
223 | - Fixed typo in the CassandraAccessError class
224 | - Added support for customizing the arguments that are used to create the
225 | keyspace. In particular this allows you to specify the durable_writes
226 | setting that was added in Cassandra 1.0 if you want to disable that for
227 | a keyspace.
228 |
229 | Changes for 0.2
230 | ===============
231 | - added support for automatic construction of compound id/pk fields that
232 | are composed of the values of other fields in the model. You would typically
233 | use this when you have some subset of the fields in the model that together
234 | uniquely identify that particular instance of the model. Compound key generation
235 | is enabled for a model by defining a class variable named COMPOUND_KEY_FIELDS
236 | in a nested class called "CassandraSettings" of the model. The value of
237 | COMPOUND_KEY_FIELDS is a tuple of the names of the fields that are used
238 | to form the compound key. By default the field values are separated by the '|'
239 | character, but this separator value can be overridden by defining a class
240 | variable in the CassandraSettings named COMPOUND_KEY_SEPARATOR whose value is
241 | the character to use as the separator.
242 | - added support for running under the 0.8 version of Cassandra. This included
243 | fixing a bug where the secondary index names were not properly scoped with
244 | their associated column families (which "worked" before because Cassandra wasn't
245 | properly checking for conflicts) and properly setting the replication factor
246 | as a strategy option instead of a field in the KsDef struct. The code checks
247 | the API version to detect whether it's running against the 0.7 or 0.8 version
248 | of Cassandra, so it still works under 0.7.
249 | - support for query set update operations
250 | - support for cascading deletes (disabled by default)
251 | - fixed some bugs in the constructors of some exception classes
252 | - cleaned up the code for handling reconnecting to Cassandra if there's a
253 | disruption in the connection (e.g. Cassandra restarting).
254 |
255 | Changes for 0.1.7
256 | =================
257 |
258 | - Made the max key/column counts bigger as a temporary workaround for large queries.
259 | Really need to support chunked operations for this to work better.
260 |
261 | Changes for 0.1.6
262 | =================
263 |
264 | - Fixed a bug with handling default values of fields
265 |
266 | Changes For 0.1.5
267 | =================
268 |
269 | - Fixed a bug with the Cassandra reconnection logic
270 |
271 | Changes For 0.1.4
272 | =================
273 |
274 | - Fixed a bug with the id field not being properly initialized if the model
275 | instance is created with no initialization arguments.
276 | - Added unit tests for the bugs that were fixed recently
277 | - Thanks to Abd Allah Diab for reporting this bug
278 |
279 | Changes For 0.1.3
280 | =================
281 |
282 | - Fixed a bug with query set filter operations if there were multiple filters
283 | on indexed fields (e.g. foreign key fields)
284 | - Fixed a bug with order_by operations on foreign key fields
285 | - Thanks to Abd Allah Diab for reporting these bugs
286 |
287 | Changes For 0.1.2
288 | =================
289 |
290 | - Added support for configuring the column family definition settings so that
291 | you can tune the various memtable, row/key cache, & compaction settings.
292 | You can configure global default settings in the database settings in
293 | settings.py and you can have per-model overrides for the column family
294 | associated with each model. For the global settings you define an item
295 | in the dictionary of database settings whose key is named
296 | CASSANDRA_COLUMN_FAMILY_DEF_DEFAULT_SETTINGS and whose value is a dictionary
297 | of the optional keyword arguments to be passed to the CfDef constructor.
298 | Consult the Cassandra docs for the list of valid keyword args to use.
299 | Currently the per-model settings overrides are specified inline in the models,
300 | which isn't a general solution but works in most cases.
301 | I'm also planning on adding a way to specify these settings for models
302 | non-intrusively. With the current inline mechanism you define a nested class
303 | inside the model called 'CassandraSettings'. The column family def settings
304 | are specified in a class variable named COLUMN_FAMILY_DEF_SETTINGS, which
305 | is a dictionary of any of the optional CfDef settings that you want to
306 | override from the default values. All of these things are optional, so if
307 | you don't need to override anything you don't need to define the
308 | CassandraSettings class. All of the required settings for the CfDef
309 | (e.g. keyspace, name, etc.) are determined by other means.
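As a sketch, the inline per-model override mechanism looks like the following. The CfDef keyword names shown are examples only; consult the Cassandra docs for the valid set, and note that a real model would subclass django.db.models.Model:

```python
# Sketch of per-model column family settings via a nested CassandraSettings
# class. The CfDef keyword arguments shown (key_cache_size, row_cache_size,
# gc_grace_seconds) are examples; check the Cassandra docs for valid names.
class MyModel:  # stands in for a django.db.models.Model subclass
    class CassandraSettings:
        COLUMN_FAMILY_DEF_SETTINGS = {
            'key_cache_size': 20000,
            'row_cache_size': 1000,
            'gc_grace_seconds': 86400,
        }
```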
310 | - Fixed a bug in handling null/missing columns when converting from the
311 | value from Cassandra.
312 | - Fixed some bugs with reconnecting to Cassandra if connectivity to
313 | Cassandra is disrupted.
314 | - Added a few new tests and did some cleanup to the unit tests
315 |
316 | Changes For 0.1.1
317 | =================
318 | - fixed some bugs in the cassandra reconnection logic where it was always
319 | retrying the operation even when it succeeded the first time.
320 | - fixed a nasty bug with deleting instances where it would delete all
321 | instances whose key was a substring of the key of the instance being deleted.
322 |
323 |
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/__init__.py
--------------------------------------------------------------------------------
/django_cassandra/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/django_cassandra/__init__.py
--------------------------------------------------------------------------------
/django_cassandra/db/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/django_cassandra/db/__init__.py
--------------------------------------------------------------------------------
/django_cassandra/db/base.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from django.db.utils import DatabaseError
16 |
17 | from djangotoolbox.db.base import NonrelDatabaseFeatures, \
18 | NonrelDatabaseOperations, NonrelDatabaseWrapper, NonrelDatabaseClient, \
19 | NonrelDatabaseValidation, NonrelDatabaseIntrospection, \
20 | NonrelDatabaseCreation
21 |
22 | import re
23 | import time
24 | from .creation import DatabaseCreation
25 | from .introspection import DatabaseIntrospection
26 | from .utils import CassandraConnection, CassandraConnectionError, CassandraAccessError
27 | from thrift.transport import TTransport
28 | from cassandra.ttypes import *
29 |
30 |
31 | class DatabaseFeatures(NonrelDatabaseFeatures):
32 | string_based_auto_field = True
33 |
34 | def __init__(self, connection):
35 | super(DatabaseFeatures, self).__init__(connection)
36 | self.supports_deleting_related_objects = connection.settings_dict.get('CASSANDRA_ENABLE_CASCADING_DELETES', False)
37 |
38 |
39 | class DatabaseOperations(NonrelDatabaseOperations):
40 | compiler_module = __name__.rsplit('.', 1)[0] + '.compiler'
41 |
42 | def pk_default_value(self):
43 | """
44 | Use None as the value to indicate to the insert compiler that it needs
45 | to auto-generate a guid to use for the id. The case where this gets hit
46 | is when you create a model instance with no arguments. We override from
47 | the default implementation (which returns 'DEFAULT') because it's possible
48 | that someone would explicitly initialize the id field to be that value and
49 | we wouldn't want to override that. But None would never be a valid value
50 | for the id.
51 | """
52 | return None
53 |
54 | def sql_flush(self, style, tables, sequence_list):
55 | for table_name in tables:
56 | self.connection.creation.flush_table(table_name)
57 | return ""
58 |
59 | class DatabaseClient(NonrelDatabaseClient):
60 | pass
61 |
62 | class DatabaseValidation(NonrelDatabaseValidation):
63 | pass
64 |
65 | class DatabaseWrapper(NonrelDatabaseWrapper):
66 | def __init__(self, *args, **kwds):
67 | super(DatabaseWrapper, self).__init__(*args, **kwds)
68 |
69 | # Set up the associated backend objects
70 | self.features = DatabaseFeatures(self)
71 | self.ops = DatabaseOperations(self)
72 | self.client = DatabaseClient(self)
73 | self.creation = DatabaseCreation(self)
74 | self.validation = DatabaseValidation(self)
75 | self.introspection = DatabaseIntrospection(self)
76 |
77 | self.read_consistency_level = self.settings_dict.get('CASSANDRA_READ_CONSISTENCY_LEVEL', ConsistencyLevel.ONE)
78 | self.write_consistency_level = self.settings_dict.get('CASSANDRA_WRITE_CONSISTENCY_LEVEL', ConsistencyLevel.ONE)
79 | self.max_key_count = self.settings_dict.get('CASSANDRA_MAX_KEY_COUNT', 1000000)
80 | self.max_column_count = self.settings_dict.get('CASSANDRA_MAX_COLUMN_COUNT', 10000)
81 | self.column_family_def_defaults = self.settings_dict.get('CASSANDRA_COLUMN_FAMILY_DEF_DEFAULT_SETTINGS', {})
82 |
83 | self._db_connection = None
84 | self.determined_version = False
85 |
86 | def configure_connection(self, set_keyspace=False, login=False):
87 |
88 | if not self._db_connection.is_connected():
89 | self._db_connection.open(False, False)
90 | self.determined_version = False
91 |
92 | if not self.determined_version:
93 | # Determine which version of Cassandra we're connected to
94 | version_string = self._db_connection.get_client().describe_version()
95 | try:
96 | # FIXME: Should do some version check here to make sure that we're
97 | # talking to a cassandra daemon that supports the operations we require
98 | m = re.match('^([0-9]+)\.([0-9]+)\.([0-9]+)$', version_string)
99 | major_version = int(m.group(1))
100 | minor_version = int(m.group(2))
101 | patch_version = int(m.group(3))
102 | self.determined_version = True
103 | except Exception, e:
104 | raise DatabaseError('Invalid Thrift version string', e)
105 |
106 | # Determine supported features based on the API version
107 | self.supports_replication_factor_as_strategy_option = (major_version, minor_version) >= (19, 10)
108 |
109 | if login:
110 | self._db_connection.login()
111 |
112 | if set_keyspace:
113 | try:
114 | self._db_connection.set_keyspace()
115 | except Exception, e:
116 | # Set up the default settings for the keyspace
117 | keyspace_def_settings = {
118 | 'name': self._db_connection.keyspace,
119 | 'strategy_class': 'org.apache.cassandra.locator.SimpleStrategy',
120 | 'strategy_options': {},
121 | 'cf_defs': []}
122 |
123 | # Apply any overrides for the keyspace settings
124 | custom_keyspace_def_settings = self.settings_dict.get('CASSANDRA_KEYSPACE_DEF_SETTINGS')
125 | if custom_keyspace_def_settings:
126 | keyspace_def_settings.update(custom_keyspace_def_settings)
127 |
128 | # Apply any overrides for the replication strategy
129 | # Note: This could be done by the user using the
130 | # CASSANDRA_KEYSPACE_DEF_SETTINGS, but the following customizations are
131 | # still supported for backwards compatibility with older versions of the backend
132 | strategy_class = self.settings_dict.get('CASSANDRA_REPLICATION_STRATEGY')
133 | if strategy_class:
134 | keyspace_def_settings['strategy_class'] = strategy_class
135 |
136 | # Apply an override of the strategy options
137 | strategy_options = self.settings_dict.get('CASSANDRA_REPLICATION_STRATEGY_OPTIONS')
138 | if strategy_options:
139 | if type(strategy_options) != dict:
140 | raise DatabaseError('CASSANDRA_REPLICATION_STRATEGY_OPTIONS must be a dictionary')
141 | keyspace_def_settings['strategy_options'].update(strategy_options)
142 |
143 | # Apply an override of the replication factor. Depending on the version of
144 | # Cassandra this may be applied to either the strategy options or the top-level
145 | # keyspace def settings
146 | replication_factor = self.settings_dict.get('CASSANDRA_REPLICATION_FACTOR')
147 | replication_factor_parent = keyspace_def_settings['strategy_options'] \
148 | if self.supports_replication_factor_as_strategy_option else keyspace_def_settings
149 | if replication_factor:
150 | replication_factor_parent['replication_factor'] = str(replication_factor)
151 | elif 'replication_factor' not in replication_factor_parent:
152 | replication_factor_parent['replication_factor'] = '1'
153 |
154 | keyspace_def = KsDef(**keyspace_def_settings)
155 | self._db_connection.get_client().system_add_keyspace(keyspace_def)
156 | self._db_connection.set_keyspace()
157 |
158 |
159 | def get_db_connection(self, set_keyspace=False, login=False):
160 | if not self._db_connection:
161 | # Get the host and port specified in the database backend settings.
162 | # Default to the standard Cassandra settings.
163 | host = self.settings_dict.get('HOST')
164 | if not host or host == '':
165 | host = 'localhost'
166 |
167 | port = self.settings_dict.get('PORT')
168 | if not port or port == '':
169 | port = 9160
170 |
171 | keyspace = self.settings_dict.get('NAME')
172 | if keyspace == None:
173 | keyspace = 'django'
174 |
175 | user = self.settings_dict.get('USER')
176 | password = self.settings_dict.get('PASSWORD')
177 |
178 | # Create our connection wrapper
179 | self._db_connection = CassandraConnection(host, port, keyspace, user, password)
180 |
181 | try:
182 | self.configure_connection(set_keyspace, login)
183 | except TTransport.TTransportException, e:
184 | raise CassandraConnectionError(e)
185 | except Exception, e:
186 | raise CassandraAccessError(e)
187 |
188 | return self._db_connection
189 |
190 | @property
191 | def db_connection(self):
192 | return self.get_db_connection(True, True)
193 |
--------------------------------------------------------------------------------
/django_cassandra/db/compiler.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | import datetime
16 | import sys
17 | import traceback
18 | import datetime
19 | import decimal
20 |
21 | from django.db.models import ForeignKey
22 | from django.db.models.sql.where import AND, OR, WhereNode
23 | from django.db.models.sql.constants import MULTI
24 | from django.db.utils import DatabaseError
25 |
26 | from functools import wraps
27 |
28 | from djangotoolbox.db.basecompiler import NonrelQuery, NonrelCompiler, \
29 | NonrelInsertCompiler, NonrelUpdateCompiler, NonrelDeleteCompiler
30 |
31 | from .utils import *
32 | from .predicate import *
33 |
34 | from uuid import uuid4
35 | from cassandra import Cassandra
36 | from cassandra.ttypes import *
37 | from thrift.transport.TTransport import TTransportException
38 |
39 | def safe_call(func):
40 | @wraps(func)
41 | def _func(*args, **kwargs):
42 | try:
43 | return func(*args, **kwargs)
44 | except Exception, e:
45 | raise DatabaseError, DatabaseError(*e.args), sys.exc_info()[2]
46 | return _func
47 |
48 | class CassandraQuery(NonrelQuery):
49 |
50 | # FIXME: How do we set this value? What's the maximum value it can be?
51 | #MAX_FETCH_COUNT = 0x7ffffff
52 | MAX_FETCH_COUNT = 10000
53 |
54 | def __init__(self, compiler, fields):
55 | super(CassandraQuery, self).__init__(compiler, fields)
56 |
57 | self.pk_column = self.query.get_meta().pk.column
58 | self.column_family = self.query.get_meta().db_table
59 | self.root_predicate = None
60 | self.ordering_spec = None
61 | self.cached_results = None
62 |
63 | self.indexed_columns = []
64 | self.field_name_to_column_name = {}
65 | for field in fields:
66 | column_name = field.db_column if field.db_column else field.column
67 | if field.db_index:
68 | self.indexed_columns.append(column_name)
69 | self.field_name_to_column_name[field.name] = column_name
70 |
71 | # This is needed for debugging
72 | def __repr__(self):
73 | # TODO: add some meaningful query string for debugging
74 | return ''
75 |
76 | def _convert_key_slice_to_rows(self, key_slice):
77 | rows = []
78 | for element in key_slice:
79 | if element.columns:
80 | row = self._convert_column_list_to_row(element.columns, self.pk_column, element.key)
81 | rows.append(row)
82 | return rows
83 |
84 | def _convert_column_list_to_row(self, column_list, pk_column_name, pk_value):
85 | row = {}
86 | # FIXME: When we add code to allow primary keys that also are indexed,
87 | # then we can change this to not set the primary key column in that case.
88 | # row[pk_column_name] = pk_value
89 | for column in column_list:
90 | row[column.column.name] = column.column.value
91 | return row
92 |
93 |
94 | def _get_rows_by_pk(self, range_predicate):
95 |
96 | db_connection = self.connection.db_connection
97 | column_parent = ColumnParent(column_family=self.column_family)
98 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='',
99 | finish='', count=self.connection.max_column_count))
100 |
101 | if range_predicate._is_exact():
102 | column_list = call_cassandra_with_reconnect(db_connection,
103 | Cassandra.Client.get_slice, range_predicate.start,
104 | column_parent, slice_predicate, self.connection.read_consistency_level)
105 | if column_list:
106 | row = self._convert_column_list_to_row(column_list, self.pk_column, range_predicate.start)
107 | rows = [row]
108 | else:
109 | rows = []
110 | else:
111 | if range_predicate.start is not None:
112 | key_start = range_predicate.start
113 | if not range_predicate.start_inclusive:
114 | key_start = key_start + chr(1) # KeyRange bounds are inclusive; use the next possible key
115 | else:
116 | key_start = ''
117 |
118 | if range_predicate.end is not None:
119 | key_end = range_predicate.end
120 | if not range_predicate.end_inclusive:
121 | key_end = key_end[:-1] + chr(ord(key_end[-1])-1) + (chr(126) * 16) # a high key that still sorts before the exclusive end
122 | else:
123 | key_end = ''
124 |
125 | key_range = KeyRange(start_key=key_start, end_key=key_end,
126 | count=self.connection.max_key_count)
127 | key_slice = call_cassandra_with_reconnect(db_connection,
128 | Cassandra.Client.get_range_slices, column_parent,
129 | slice_predicate, key_range, self.connection.read_consistency_level)
130 |
131 | rows = self._convert_key_slice_to_rows(key_slice)
132 |
133 | return rows
134 |
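The inclusive/exclusive bound adjustment used in `_get_rows_by_pk` above is easy to check in isolation. This is a standalone sketch of the same byte-string trick (the helper names are illustrative, not part of the backend): Cassandra's `KeyRange` bounds are inclusive, so an exclusive start becomes the next possible key and an exclusive end becomes a high key that still sorts before it.

```python
def adjust_exclusive_start(key_start):
    # Smallest key that sorts strictly after key_start.
    return key_start + chr(1)

def adjust_exclusive_end(key_end):
    # A high key that still sorts strictly before key_end.
    # Assumes key_end is non-empty and its last byte is > chr(0).
    return key_end[:-1] + chr(ord(key_end[-1]) - 1) + (chr(126) * 16)

start = adjust_exclusive_start('apple')
end = adjust_exclusive_end('cherry')
assert 'apple' < start           # the original start key is now excluded
assert end < 'cherry'            # the original end key is now excluded
assert start < 'banana' < end    # keys strictly in between are still covered
```

Note that the end adjustment is an approximation: a pathological key such as `'cherrx' + chr(126) * 17` would still slip past it, which is acceptable for the opaque UUID-style keys this backend generates.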
135 | def _get_rows_by_indexed_column(self, range_predicate):
136 | # Construct the index expression for the range predicate
137 | index_expressions = []
138 | if ((range_predicate.start != None) and
139 | (range_predicate.end == range_predicate.start) and
140 | range_predicate.start_inclusive and
141 | range_predicate.end_inclusive):
142 | index_expression = IndexExpression(unicode(range_predicate.column), IndexOperator.EQ, unicode(range_predicate.start))
143 | index_expressions.append(index_expression)
144 | else:
145 | # NOTE: These range queries don't work with the current version of Cassandra
146 | # that I'm using (0.7 beta3).
147 | # It looks like there are Cassandra tickets to add support for this, but it's
148 | # unclear how soon it will be supported. We shouldn't hit this code for now,
149 | # though, because can_evaluate_efficiently was changed to disable range queries
150 | # on indexed columns (they still can be performed, just inefficiently).
151 | if range_predicate.start:
152 | index_op = IndexOperator.GTE if range_predicate.start_inclusive else IndexOperator.GT
153 | index_expression = IndexExpression(unicode(range_predicate.column), index_op, unicode(range_predicate.start))
154 | index_expressions.append(index_expression)
155 | if range_predicate.end:
156 | index_op = IndexOperator.LTE if range_predicate.end_inclusive else IndexOperator.LT
157 | index_expression = IndexExpression(unicode(range_predicate.column), index_op, unicode(range_predicate.end))
158 | index_expressions.append(index_expression)
159 |
160 | assert(len(index_expressions) > 0)
161 |
162 | # Now make the call to cassandra to get the key slice
163 | db_connection = self.connection.db_connection
164 | column_parent = ColumnParent(column_family=self.column_family)
165 | index_clause = IndexClause(index_expressions, '', self.connection.max_key_count)
166 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='', finish='', count=self.connection.max_column_count))
167 |
168 | key_slice = call_cassandra_with_reconnect(db_connection,
169 | Cassandra.Client.get_indexed_slices,
170 | column_parent, index_clause, slice_predicate,
171 | self.connection.read_consistency_level)
172 | rows = self._convert_key_slice_to_rows(key_slice)
173 |
174 | return rows
175 |
176 | def get_row_range(self, range_predicate):
177 | pk_column = self.query.get_meta().pk.column
178 | if range_predicate.column == pk_column:
179 | rows = self._get_rows_by_pk(range_predicate)
180 | else:
181 | assert(range_predicate.column in self.indexed_columns)
182 | rows = self._get_rows_by_indexed_column(range_predicate)
183 | return rows
184 |
185 | def get_all_rows(self):
186 | # TODO: Could factor this code better
187 | db_connection = self.connection.db_connection
188 | column_parent = ColumnParent(column_family=self.column_family)
189 | slice_predicate = SlicePredicate(slice_range=SliceRange(start='', finish='', count=self.connection.max_column_count))
190 | key_range = KeyRange(start_token='0', end_token='0', count=self.connection.max_key_count)
191 | #end_key = u'\U0010ffff'.encode('utf-8')
192 | #key_range = KeyRange(start_key='\x01', end_key=end_key, count=self.connection.max_key_count)
193 |
194 | key_slice = call_cassandra_with_reconnect(db_connection,
195 | Cassandra.Client.get_range_slices, column_parent,
196 | slice_predicate, key_range, self.connection.read_consistency_level)
197 | rows = self._convert_key_slice_to_rows(key_slice)
198 |
199 | return rows
200 |
201 | def _get_query_results(self):
202 | if self.cached_results is None:
203 | assert self.root_predicate is not None
204 | self.cached_results = self.root_predicate.get_matching_rows(self)
205 | if self.ordering_spec:
206 | sort_rows(self.cached_results, self.ordering_spec)
207 | return self.cached_results
208 |
209 | @safe_call
210 | def fetch(self, low_mark, high_mark):
211 |
212 | if self.root_predicate is None:
213 | raise DatabaseError('No root query node')
214 |
215 | try:
216 | if high_mark is not None and high_mark <= low_mark:
217 | return
218 |
219 | results = self._get_query_results()
220 | if low_mark is not None or high_mark is not None:
221 | results = results[low_mark:high_mark]
222 | except Exception, e:
223 | # FIXME: Can get rid of this exception handling code eventually,
224 | # but it's useful for debugging for now.
225 | #traceback.print_exc()
226 | raise e
227 |
228 | for entity in results:
229 | yield entity
230 |
231 | @safe_call
232 | def count(self, limit=None):
233 | # TODO: This could be implemented more efficiently for simple predicates
234 | # where we could call the count method in the Cassandra Thrift API.
235 | # We can optimize for that later
236 | results = self._get_query_results()
237 | return len(results)
238 |
239 | @safe_call
240 | def delete(self):
241 | results = self._get_query_results()
242 | timestamp = get_next_timestamp()
243 | column_family = self.query.get_meta().db_table
244 | mutation_map = {}
245 | for item in results:
246 | mutation_map[item[self.pk_column]] = {column_family: [Mutation(deletion=Deletion(timestamp=timestamp))]}
247 | db_connection = self.connection.db_connection
248 | call_cassandra_with_reconnect(db_connection,
249 | Cassandra.Client.batch_mutate, mutation_map,
250 | self.connection.write_consistency_level)
251 |
252 |
253 | @safe_call
254 | def order_by(self, ordering):
255 | self.ordering_spec = []
256 | for order in ordering:
257 | if order.startswith('-'):
258 | field_name = order[1:]
259 | descending = True
260 | else:
261 | field_name = order
262 | descending = False
263 | column_name = self.field_name_to_column_name.get(field_name, field_name)
264 | #if column in self.foreign_key_columns:
265 | # column = column + '_id'
266 | self.ordering_spec.append((column_name, descending))
267 |
268 | def init_predicate(self, parent_predicate, node):
269 | if isinstance(node, WhereNode):
270 | if node.connector == OR:
271 | compound_op = COMPOUND_OP_OR
272 | elif node.connector == AND:
273 | compound_op = COMPOUND_OP_AND
274 | else:
275 | raise InvalidQueryOpException()
276 | predicate = CompoundPredicate(compound_op, node.negated)
277 | for child in node.children:
278 | self.init_predicate(predicate, child)
279 | if parent_predicate:
280 | parent_predicate.add_child(predicate)
281 | else:
282 | column, lookup_type, db_type, value = self._decode_child(node)
283 | db_value = self.convert_value_for_db(db_type, value)
284 | assert parent_predicate
285 | parent_predicate.add_filter(column, lookup_type, db_value)
286 | predicate = None
287 |
288 | return predicate
289 |
290 | # FIXME: This is bad. We're modifying the WhereNode object that's passed in to us
291 | # from the Django ORM. We should do the pruning as we build our predicates, not
292 | # munge the WhereNode.
293 | def remove_unnecessary_nodes(self, node, retain_root_node):
294 | if isinstance(node, WhereNode):
295 | child_count = len(node.children)
296 | for i in range(child_count):
297 | node.children[i] = self.remove_unnecessary_nodes(node.children[i], False)
298 | if (not retain_root_node) and (not node.negated) and (len(node.children) == 1):
299 | node = node.children[0]
300 | return node
301 |
302 | @safe_call
303 | def add_filters(self, filters):
304 | """
305 | Traverses the given Where tree and adds the filters to this query
306 | """
307 |
308 | #if filters.negated:
309 | # raise InvalidQueryOpException('Exclude queries not implemented yet.')
310 | assert isinstance(filters, WhereNode)
311 | self.remove_unnecessary_nodes(filters, True)
312 | self.root_predicate = self.init_predicate(None, filters)
313 |
314 | class SQLCompiler(NonrelCompiler):
315 | query_class = CassandraQuery
316 |
317 | SPECIAL_NONE_VALUE = "\b"
318 |
319 | # Override this method from NonrelCompiler to get around a problem with
320 | # mixing field default values with the database storage format (i.e.
321 | # convert_value_from_db should only be passed the database-specific
322 | # storage format, not the field default value).
323 | def _make_result(self, entity, fields):
324 | result = []
325 | for field in fields:
326 | value = entity.get(field.column)
327 | if value is not None:
328 | value = self.convert_value_from_db(
329 | field.db_type(connection=self.connection), value)
330 | else:
331 | value = field.get_default()
332 | if not field.null and value is None:
333 | raise DatabaseError("Non-nullable field %s can't be None!" % field.name)
334 | result.append(value)
335 |
336 | return result
337 |
338 | # This gets called for each field type when you fetch() an entity.
339 | # db_type is the string that you used in the DatabaseCreation mapping
340 | def convert_value_from_db(self, db_type, value):
341 |
342 | if value == self.SPECIAL_NONE_VALUE or value is None:
343 | return None
344 |
345 | if db_type.startswith('ListField:'):
346 | db_sub_type = db_type.split(':', 1)[1]
347 | value = convert_string_to_list(value)
348 | if isinstance(value, (list, tuple)) and len(value):
349 | value = [self.convert_value_from_db(db_sub_type, subvalue)
350 | for subvalue in value]
351 | elif db_type == 'date':
352 | dt = datetime.datetime.strptime(value, '%Y-%m-%d')
353 | value = dt.date()
354 | elif db_type == 'datetime':
355 | value = datetime.datetime.strptime(value, '%Y-%m-%d %H:%M:%S.%f')
356 | elif db_type == 'time':
357 | dt = datetime.datetime.strptime(value, '%H:%M:%S.%f')
358 | value = dt.time()
359 | elif db_type == 'bool':
360 | value = value.lower() == 'true'
361 | elif db_type == 'int':
362 | value = int(value)
363 | elif db_type == 'long':
364 | value = long(value)
365 | elif db_type == 'float':
366 | value = float(value)
367 | #elif db_type == 'id':
368 | # value = unicode(value).decode('utf-8')
369 | elif db_type.startswith('decimal'):
370 | value = decimal.Decimal(value)
371 | elif isinstance(value, str):
372 | # always retrieve strings as unicode (it is possible that old datasets
373 | # contain non unicode strings, nevertheless work with unicode ones)
374 | value = value.decode('utf-8')
375 |
376 | return value
377 |
378 | # This gets called for each field type when you insert() an entity.
379 | # db_type is the string that you used in the DatabaseCreation mapping
380 | def convert_value_for_db(self, db_type, value):
381 | if value is None:
382 | return self.SPECIAL_NONE_VALUE
383 |
384 | if db_type.startswith('ListField:'):
385 | db_sub_type = db_type.split(':', 1)[1]
386 | if isinstance(value, (list, tuple)) and len(value):
387 | value = [self.convert_value_for_db(db_sub_type, subvalue) for subvalue in value]
388 | value = convert_list_to_string(value)
389 | elif type(value) is list:
390 | value = [self.convert_value_for_db(db_type, item) for item in value]
391 | elif db_type == 'datetime':
392 | value = value.strftime('%Y-%m-%d %H:%M:%S.%f')
393 | elif db_type == 'time':
394 | value = value.strftime('%H:%M:%S.%f')
395 | elif db_type == 'bool':
396 | value = str(value).lower()
397 | elif (db_type == 'int') or (db_type == 'long') or (db_type == 'float'):
398 | value = str(value)
399 | elif db_type == 'id':
400 | value = unicode(value)
401 | elif (type(value) is not unicode) and (type(value) is not str):
402 | value = unicode(value)
403 |
404 | # always store strings as utf-8
405 | if type(value) is unicode:
406 | value = value.encode('utf-8')
407 |
408 | return value
409 |
410 | # This handles both inserts and updates of individual entities
411 | class SQLInsertCompiler(NonrelInsertCompiler, SQLCompiler):
412 |
413 | @safe_call
414 | def insert(self, data, return_id=False):
415 | pk_column = self.query.get_meta().pk.column
416 | model = self.query.model
417 | compound_key_fields = None
418 | if hasattr(model, 'CassandraSettings'):
419 | if hasattr(model.CassandraSettings, 'ADJUSTED_COMPOUND_KEY_FIELDS'):
420 | compound_key_fields = model.CassandraSettings.ADJUSTED_COMPOUND_KEY_FIELDS
421 | elif hasattr(model.CassandraSettings, 'COMPOUND_KEY_FIELDS'):
422 | compound_key_fields = []
423 | for field_name in model.CassandraSettings.COMPOUND_KEY_FIELDS:
424 | field_class = None
425 | for lf in model._meta.local_fields:
426 | if lf.name == field_name:
427 | field_class = lf
428 | break
429 | if field_class is None:
430 | raise DatabaseError('Invalid compound key field')
431 | if type(field_class) is ForeignKey:
432 | field_name += '_id'
433 | compound_key_fields.append(field_name)
434 | model.CassandraSettings.ADJUSTED_COMPOUND_KEY_FIELDS = compound_key_fields
435 | separator = model.CassandraSettings.COMPOUND_KEY_SEPARATOR \
436 | if hasattr(model.CassandraSettings, 'COMPOUND_KEY_SEPARATOR') \
437 | else self.connection.settings_dict.get('CASSANDRA_COMPOUND_KEY_SEPARATOR', '|')
438 | # See if the data arguments contain a value for the primary key.
439 | # FIXME: For now we leave the key data as a column too. This is
440 | # suboptimal, since the data is duplicated, but there are a couple of cases
441 | # where you need to keep the column. First, if you have a model with only
442 | # a single field that's the primary key (admittedly a semi-pathological case,
443 | # but I can imagine valid use cases where you have this), then it doesn't
444 | # work if the column is removed, because then there are no columns and that's
445 | # interpreted as a deleted row (i.e. the usual Cassandra tombstone issue).
446 | # Second, if there's a secondary index configured for the primary key field
447 | # (not particularly useful with the current Cassandra, but would be valid when
448 | # you can do a range query on indexed column) then you'd want to keep the
449 | # column. So for now, we just leave the column in there so these cases work.
450 | # Eventually we can optimize this and remove the column where it makes sense.
451 | key = data.get(pk_column)
452 | if key:
453 | if compound_key_fields is not None:
454 | compound_key_values = key.split(separator)
455 | for field_name, compound_key_value in zip(compound_key_fields, compound_key_values):
456 | if field_name in data and data[field_name] != compound_key_value:
457 | raise DatabaseError("The value of the compound key doesn't match the values of the individual fields")
458 | else:
459 | if compound_key_fields is not None:
460 | try:
461 | compound_key_values = [data.get(field_name) for field_name in compound_key_fields]
462 | key = separator.join(compound_key_values)
463 | except Exception, e:
464 | raise DatabaseError('The values of the fields used to form a compound key must be specified and cannot be null')
465 | else:
466 | key = str(uuid4())
467 | # Insert the key as column data too
468 | # FIXME. See the above comment. When the primary key handling is optimized,
469 | # then we would not always add the key to the data here.
470 | data[pk_column] = key
471 |
472 | timestamp = get_next_timestamp()
473 |
474 | mutation_list = []
475 | for name, value in data.items():
476 | # Column names must be byte strings for Thrift; encode unicode names as UTF-8.
477 | if type(name) is unicode:
478 | name = name.encode('utf-8')
479 | mutation = Mutation(column_or_supercolumn=ColumnOrSuperColumn(column=Column(name=name, value=value, timestamp=timestamp)))
480 | mutation_list.append(mutation)
481 |
482 | db_connection = self.connection.db_connection
483 | column_family = self.query.get_meta().db_table
484 | call_cassandra_with_reconnect(db_connection,
485 | Cassandra.Client.batch_mutate, {key: {column_family: mutation_list}},
486 | self.connection.write_consistency_level)
487 |
488 | if return_id:
489 | return key
490 |
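The compound-key handling above joins the configured field values with a separator on insert, and cross-checks a caller-supplied key against the individual field values. A standalone sketch of that logic (`make_compound_key` and `check_compound_key` are hypothetical helpers, not backend API):

```python
# Standalone sketch of the compound-key logic in SQLInsertCompiler.insert:
# the key is the separator-joined values of the configured fields, and a
# caller-supplied key must agree with those field values.

SEPARATOR = '|'  # backend default, overridable via COMPOUND_KEY_SEPARATOR

def make_compound_key(data, key_fields, separator=SEPARATOR):
    values = [data.get(name) for name in key_fields]
    if any(v is None for v in values):
        raise ValueError('compound key fields must be specified and non-null')
    return separator.join(values)

def check_compound_key(key, data, key_fields, separator=SEPARATOR):
    for name, part in zip(key_fields, key.split(separator)):
        if name in data and data[name] != part:
            raise ValueError("compound key doesn't match the field values")

row = {'country': 'us', 'zip': '94110'}
key = make_compound_key(row, ['country', 'zip'])
assert key == 'us|94110'
check_compound_key(key, row, ['country', 'zip'])  # consistent: no error
```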
491 | class SQLUpdateCompiler(NonrelUpdateCompiler, SQLCompiler):
492 | def __init__(self, *args, **kwargs):
493 | super(SQLUpdateCompiler, self).__init__(*args, **kwargs)
494 |
495 | def execute_sql(self, result_type=MULTI):
496 | data = {}
497 | for field, model, value in self.query.values:
498 | assert field is not None
499 | if not field.null and value is None:
500 | raise DatabaseError("You can't set %s (a non-nullable "
501 | "field) to None!" % field.name)
502 | db_type = field.db_type(connection=self.connection)
503 | value = self.convert_value_for_db(db_type, value)
504 | data[field.column] = value
505 |
506 | # TODO: Add compound key check here -- ensure that we're not updating
507 | # any of the fields that are components in the compound key.
508 |
509 | # TODO: This isn't super efficient because executing the query will
510 | # fetch all of the columns for each row even though all we really need
511 | # is the key for the row. Should be pretty straightforward to change
512 | # the CassandraQuery class to support custom slice predicates.
513 |
514 | #model = self.query.model
515 | pk_column = self.query.get_meta().pk.column
516 |
517 | pk_index = -1
518 | fields = self.get_fields()
519 | for index in range(len(fields)):
520 | if fields[index].column == pk_column:
521 | pk_index = index
522 | break
523 | if pk_index == -1:
524 | raise DatabaseError('Invalid primary key column')
525 |
526 | row_count = 0
527 | column_family = self.query.get_meta().db_table
528 | timestamp = get_next_timestamp()
529 | batch_mutate_data = {}
530 | for result in self.results_iter():
531 | row_count += 1
532 | mutation_list = []
533 | key = result[pk_index]
534 | for name, value in data.items():
535 | # Column names must be byte strings for Thrift; encode unicode names as UTF-8.
536 | if type(name) is unicode:
537 | name = name.encode('utf-8')
538 | mutation = Mutation(column_or_supercolumn=ColumnOrSuperColumn(column=Column(name=name, value=value, timestamp=timestamp)))
539 | mutation_list.append(mutation)
540 | batch_mutate_data[key] = {column_family: mutation_list}
541 |
542 | db_connection = self.connection.db_connection
543 | call_cassandra_with_reconnect(db_connection,
544 | Cassandra.Client.batch_mutate, batch_mutate_data,
545 | self.connection.write_consistency_level)
546 |
547 | return row_count
548 |
549 | class SQLDeleteCompiler(NonrelDeleteCompiler, SQLCompiler):
550 | pass
551 |
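The temporal conversions in `convert_value_for_db`/`convert_value_from_db` above are plain `strftime`/`strptime` round-trips. A minimal sketch using the same format strings (the helper names are illustrative, not part of the backend):

```python
import datetime

# The storage formats the compiler uses for temporal values.
DATETIME_FORMAT = '%Y-%m-%d %H:%M:%S.%f'
TIME_FORMAT = '%H:%M:%S.%f'

def dt_for_db(value):
    # datetime -> text, as stored in a Cassandra column value
    return value.strftime(DATETIME_FORMAT)

def dt_from_db(text):
    # text -> datetime, as reconstructed on fetch
    return datetime.datetime.strptime(text, DATETIME_FORMAT)

dt = datetime.datetime(2010, 12, 31, 23, 59, 59, 500000)
assert dt_from_db(dt_for_db(dt)) == dt  # lossless round-trip
```

Because the `%f` microseconds field is included, the round-trip is lossless; the fixed-width zero-padded format also means the stored strings sort lexicographically in chronological order.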
--------------------------------------------------------------------------------
/django_cassandra/db/creation.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from django.db.backends.creation import TEST_DATABASE_PREFIX
16 | from django.db.utils import DatabaseError
17 | from djangotoolbox.db.creation import NonrelDatabaseCreation
18 | from cassandra import Cassandra
19 | from cassandra.ttypes import *
20 | from django.core.management import call_command
21 | from .utils import get_next_timestamp
22 |
23 | class DatabaseCreation(NonrelDatabaseCreation):
24 |
25 | data_types = {
26 | # Note: 'AutoField' is mapped to 'id' with the other auto-field types below.
27 | 'BigIntegerField': 'long',
28 | 'BooleanField': 'bool',
29 | 'CharField': 'text',
30 | 'CommaSeparatedIntegerField': 'text',
31 | 'DateField': 'date',
32 | 'DateTimeField': 'datetime',
33 | 'DecimalField': 'decimal:%(max_digits)s,%(decimal_places)s',
34 | 'EmailField': 'text',
35 | 'FileField': 'text',
36 | 'FilePathField': 'text',
37 | 'FloatField': 'float',
38 | 'ImageField': 'text',
39 | 'IntegerField': 'int',
40 | 'IPAddressField': 'text',
41 | 'NullBooleanField': 'bool',
42 | 'OneToOneField': 'int',
43 | 'PositiveIntegerField': 'int',
44 | 'PositiveSmallIntegerField': 'int',
45 | 'SlugField': 'text',
46 | 'SmallIntegerField': 'int',
47 | 'TextField': 'text',
48 | 'TimeField': 'time',
49 | 'URLField': 'text',
50 | 'XMLField': 'text',
51 | 'GenericAutoField': 'id',
52 | 'StringForeignKey': 'id',
53 | 'AutoField': 'id',
54 | 'RelatedAutoField': 'id',
55 | }
56 |
57 | def sql_create_model(self, model, style, known_models=set()):
58 |
59 | db_connection = self.connection.db_connection
60 | keyspace = self.connection.settings_dict['NAME']
61 |
62 | opts = model._meta
63 | column_metadata = []
64 |
65 | # Browsing through fields to find indexed fields
66 | for field in opts.local_fields:
67 | if field.db_index:
68 | column_name = str(field.db_column if field.db_column else field.column)
69 | column_def = ColumnDef(name=column_name, validation_class='BytesType',
70 | index_type=IndexType.KEYS)
71 | column_metadata.append(column_def)
72 |
73 | cfdef_settings = self.connection.column_family_def_defaults.copy()
74 |
75 | if hasattr(model, 'CassandraSettings') and \
76 | hasattr(model.CassandraSettings, 'COLUMN_FAMILY_DEF_SETTINGS'):
77 | cfdef_overrides = model.CassandraSettings.COLUMN_FAMILY_DEF_SETTINGS
78 | if type(cfdef_overrides) is not dict:
79 | raise DatabaseError('The value of COLUMN_FAMILY_DEF_SETTINGS in the '
80 | 'CassandraSettings class must be a dictionary of the optional '
81 | 'settings to use when creating the column family.')
82 | cfdef_settings.update(cfdef_overrides)
83 |
84 | cfdef_settings['keyspace'] = keyspace
85 | if not cfdef_settings.get('name'):
86 | cfdef_settings['name'] = opts.db_table
87 | if not cfdef_settings.get('comparator_type'):
88 | cfdef_settings['comparator_type'] = 'UTF8Type'
89 | cfdef_settings['column_metadata'] = column_metadata
90 |
91 | column_family_def = CfDef(**cfdef_settings)
92 |
93 | db_connection.get_client().system_add_column_family(column_family_def)
94 |
95 | return [], {}
96 |
97 | def drop_keyspace(self, keyspace_name, verbosity=1):
98 | """
99 | Drop the specified keyspace from the cluster.
100 | """
101 |
102 | db_connection = self.connection.get_db_connection(False, False)
103 |
104 | try:
105 | db_connection.get_client().system_drop_keyspace(keyspace_name)
106 | except Exception, e:
107 | # We want to succeed without complaining if the test db doesn't
108 | # exist yet, so we just assume that any exception that's raised
109 | # was for that reason and ignore it, except for printing a
110 | # message if verbose output is enabled
111 | # FIXME: Could probably be more specific about the Thrift
112 | # exception that we catch here.
113 | #if verbosity >= 1:
114 | # print "Exception thrown while trying to drop the test database/keyspace: ", e
115 | pass
116 |
117 | def create_test_db(self, verbosity, autoclobber):
118 | """
119 | Create a new test database/keyspace.
120 | """
121 |
122 | if verbosity >= 1:
123 | print "Creating test database '%s'..." % self.connection.alias
124 |
125 | # Replace the NAME field in the database settings with the test keyspace name
126 | settings_dict = self.connection.settings_dict
127 | if settings_dict.get('TEST_NAME'):
128 | test_keyspace_name = settings_dict['TEST_NAME']
129 | else:
130 | test_keyspace_name = TEST_DATABASE_PREFIX + settings_dict['NAME']
131 |
132 | settings_dict['NAME'] = test_keyspace_name
133 |
134 | # First make sure we've destroyed an existing test keyspace
135 | # FIXME: Should probably do something with autoclobber here, but why
136 | # would you ever not want to autoclobber when running the tests?
137 | self.drop_keyspace(test_keyspace_name, verbosity)
138 |
139 | # Call syncdb to create the necessary tables/column families
140 | call_command('syncdb', verbosity=0, interactive=False, database=self.connection.alias)
141 |
142 | return test_keyspace_name
143 |
144 | def destroy_test_db(self, old_database_name, verbosity=1):
145 | """
146 | Destroy the test database/keyspace.
147 | """
148 |
149 | if verbosity >= 1:
150 | print "Destroying test database '%s'..." % self.connection.alias
151 |
152 | settings_dict = self.connection.settings_dict
153 | test_keyspace_name = settings_dict.get('NAME')
154 | settings_dict['NAME'] = old_database_name
155 |
156 | self.drop_keyspace(test_keyspace_name, verbosity)
157 |
158 | def flush_table(self, table_name):
159 |
160 | db_connection = self.connection.db_connection
161 |
162 | # FIXME: Calling truncate here seems to corrupt the secondary indexes,
163 | # so for now the truncate call has been replaced with removing the
164 | # row one by one. When the truncate bug has been fixed in Cassandra
165 | # this should be switched back to use truncate.
166 | # NOTE: This should be fixed as of the 0.7.0-rc2 build, so we should
167 | # try this out again to see if it works now.
168 | # UPDATE: Tried it with rc2 and it worked calling truncate but it was
169 | # slower than using remove (at least for the unit tests), so for now
170 | # I'm leaving it alone pending further investigation.
171 | #db_connection.get_client().truncate(table_name)
172 |
173 | column_parent = ColumnParent(column_family=table_name)
174 | slice_predicate = SlicePredicate(column_names=[])
175 | key_range = KeyRange(start_token='0', end_token='0', count=1000)
176 | key_slice_list = db_connection.get_client().get_range_slices(column_parent, slice_predicate, key_range, ConsistencyLevel.ONE)
177 | column_path = ColumnPath(column_family=table_name)
178 | timestamp = get_next_timestamp()
179 | for key_slice in key_slice_list:
180 | db_connection.get_client().remove(key_slice.key, column_path, timestamp, ConsistencyLevel.ONE)
181 |
182 |
183 | def sql_indexes_for_model(self, model, style):
184 | """
185 | We already handle creating the indexes in sql_create_model (above) so
186 | we don't need to do anything more here.
187 | """
188 | return []
189 |
190 | def set_autocommit(self):
191 | pass
192 |
193 |
--------------------------------------------------------------------------------
/django_cassandra/db/introspection.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from djangotoolbox.db.base import NonrelDatabaseIntrospection
16 | from django.db.backends import BaseDatabaseIntrospection
17 |
18 | class DatabaseIntrospection(NonrelDatabaseIntrospection):
19 | def get_table_list(self, cursor):
20 | "Returns a list of names of all tables that exist in the database."
21 | db_connection = self.connection.db_connection
22 | ks_def = db_connection.get_client().describe_keyspace(db_connection.keyspace)
23 | result = [cf_def.name for cf_def in ks_def.cf_defs]
24 | return result
25 |
26 | def table_names(self):
27 | # NonrelDatabaseIntrospection's implementation of this reports that all
28 | # of the tables for the models already exist in the database, so the
29 | # DatabaseCreation code never gets called to create new tables. That's
30 | # not how we want things to work for Cassandra, so we bypass the nonrel
31 | # implementation and go directly to the base introspection code.
32 | return BaseDatabaseIntrospection.table_names(self)
33 |
34 | def sequence_list(self):
35 | return []
36 |
37 | # TODO: Implement these things eventually
38 | #===============================================================================
39 | # def get_table_description(self, cursor, table_name):
40 | # "Returns a description of the table, with the DB-API cursor.description interface."
41 | # return ""
42 | #
43 | # def get_relations(self, cursor, table_name):
44 | # """
45 | # Returns a dictionary of {field_index: (field_index_other_table, other_table)}
46 | # representing all relationships to the given table. Indexes are 0-based.
47 | # """
48 | # relations = {}
49 | # return relations
50 | #
51 | # def get_indexes(self, cursor, table_name):
52 | # """
53 | # Returns a dictionary of fieldname -> infodict for the given table,
54 | # where each infodict is in the format:
55 | # {'primary_key': boolean representing whether it's the primary key,
56 | # 'unique': boolean representing whether it's a unique index}
57 | # """
58 | # indexes = {}
59 | # return indexes
60 | #===============================================================================
61 |
--------------------------------------------------------------------------------
/django_cassandra/db/predicate.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | import re
16 | from .utils import combine_rows
17 |
18 | SECONDARY_INDEX_SUPPORT_ENABLED = True
19 |
20 | class InvalidSortSpecException(Exception):
21 | def __init__(self):
22 | super(InvalidSortSpecException, self).__init__('The row sort spec must be a sort spec tuple/list or a tuple/list of sort specs')
23 |
24 | class InvalidRowCombinationOpException(Exception):
25 | def __init__(self):
26 | super(InvalidRowCombinationOpException, self).__init__('Invalid row combination operation')
27 |
28 | class InvalidPredicateOpException(Exception):
29 | def __init__(self):
30 | super(InvalidPredicateOpException, self).__init__('Invalid/unsupported query predicate operation')
31 |
32 |
33 | COMPOUND_OP_AND = 1
34 | COMPOUND_OP_OR = 2
35 |
36 | class RangePredicate(object):
37 |
38 | def __init__(self, column, start=None, start_inclusive=True, end=None, end_inclusive=True):
39 | self.column = column
40 | self.start = start
41 | self.start_inclusive = start_inclusive
42 | self.end = end
43 | self.end_inclusive = end_inclusive
44 |
45 | def __repr__(self):
46 | s = '(RANGE: '
47 | if self.start:
48 | op = '<=' if self.start_inclusive else '<'
49 | s += (unicode(self.start) + op)
50 | s += self.column
51 | if self.end:
52 |             op = '<=' if self.end_inclusive else '<'
53 | s += (op + unicode(self.end))
54 | s += ')'
55 | return s
56 |
57 | def _is_exact(self):
58 | return (self.start != None) and (self.start == self.end) and self.start_inclusive and self.end_inclusive
59 |
60 | def can_evaluate_efficiently(self, pk_column, indexed_columns):
61 |         # FIXME: There's some problem with secondary index support currently.
62 |         # I suspect this is a bug in Cassandra, but I haven't verified that yet.
63 |         # Set SECONDARY_INDEX_SUPPORT_ENABLED to False to disable it if needed.
64 | return ((self.column == pk_column) or
65 | (SECONDARY_INDEX_SUPPORT_ENABLED and ((self.column in indexed_columns) and self._is_exact())))
66 |
67 | def incorporate_range_op(self, column, op, value, parent_compound_op):
68 | if column != self.column:
69 | return False
70 |
71 | # FIXME: The following logic could probably be tightened up a bit
72 | # (although perhaps at the expense of clarity?)
73 | if parent_compound_op == COMPOUND_OP_AND:
74 | if op == 'gt':
75 | if self.start == None or value >= self.start:
76 | self.start = value
77 | self.start_inclusive = False
78 | return True
79 | elif op == 'gte':
80 | if self.start == None or value > self.start:
81 | self.start = value
82 | self.start_inclusive = True
83 | return True
84 | elif op == 'lt':
85 | if self.end == None or value <= self.end:
86 | self.end = value
87 | self.end_inclusive = False
88 | return True
89 | elif op == 'lte':
90 | if self.end == None or value < self.end:
91 | self.end = value
92 | self.end_inclusive = True
93 | return True
94 | elif op == 'exact':
95 | if self._matches_value(value):
96 | self.start = self.end = value
97 | self.start_inclusive = self.end_inclusive = True
98 | return True
99 | elif op == 'startswith':
100 | # For the end value we increment the ordinal value of the last character
101 | # in the start value and make the end value not inclusive
102 | end_value = value[:-1] + chr(ord(value[-1])+1)
103 | if (((self.start == None) or (value > self.start)) and
104 | ((self.end == None) or (end_value <= self.end))):
105 | self.start = value
106 | self.end = end_value
107 | self.start_inclusive = True
108 | self.end_inclusive = False
109 | return True
110 | else:
111 | raise InvalidPredicateOpException()
112 | elif parent_compound_op == COMPOUND_OP_OR:
113 | if op == 'gt':
114 | if self.start == None or value < self.start:
115 | self.start = value
116 | self.start_inclusive = False
117 | return True
118 | elif op == 'gte':
119 | if self.start == None or value <= self.start:
120 | self.start = value
121 | self.start_inclusive = True
122 | return True
123 | elif op == 'lt':
124 | if self.end == None or value > self.end:
125 | self.end = value
126 | self.end_inclusive = False
127 | return True
128 | elif op == 'lte':
129 | if self.end == None or value >= self.end:
130 | self.end = value
131 | self.end_inclusive = True
132 | return True
133 | elif op == 'exact':
134 | if self._matches_value(value):
135 | return True
136 | elif op == 'startswith':
137 | # For the end value we increment the ordinal value of the last character
138 | # in the start value and make the end value not inclusive
139 | end_value = value[:-1] + chr(ord(value[-1])+1)
140 | if (((self.start == None) or (value <= self.start)) and
141 | ((self.end == None) or (end_value > self.end))):
142 | self.start = value
143 | self.end = end_value
144 | self.start_inclusive = True
145 | self.end_inclusive = False
146 | return True
147 | else:
148 | raise InvalidPredicateOpException()
149 |
150 | return False
151 |
152 | def _matches_value(self, value):
153 | if value == None:
154 | return False
155 | if self.start != None:
156 | if self.start_inclusive:
157 | if value < self.start:
158 | return False
159 | elif value <= self.start:
160 | return False
161 | if self.end != None:
162 | if self.end_inclusive:
163 | if value > self.end:
164 | return False
165 | elif value >= self.end:
166 | return False
167 | return True
168 |
169 | def row_matches(self, row):
170 | value = row.get(self.column, None)
171 | return self._matches_value(value)
172 |
173 | def get_matching_rows(self, query):
174 | rows = query.get_row_range(self)
175 | return rows
176 |
177 | class OperationPredicate(object):
178 | def __init__(self, column, op, value=None):
179 | self.column = column
180 | self.op = op
181 | self.value = value
182 | if op == 'regex' or op == 'iregex':
183 | flags = re.I if op == 'iregex' else 0
184 | self.pattern = re.compile(value, flags)
185 |
186 | def __repr__(self):
187 | return '(OP: ' + self.op + ':' + unicode(self.value) + ')'
188 |
189 | def can_evaluate_efficiently(self, pk_column, indexed_columns):
190 | return False
191 |
192 | def row_matches(self, row):
193 | row_value = row.get(self.column, None)
194 | if self.op == 'isnull':
195 | return row_value == None
196 | # FIXME: Not sure if the following test is correct in all cases
197 | if (row_value == None) or (self.value == None):
198 | return False
199 | if self.op == 'in':
200 | return row_value in self.value
201 | if self.op == 'istartswith':
202 | return row_value.lower().startswith(self.value.lower())
203 | elif self.op == 'endswith':
204 | return row_value.endswith(self.value)
205 | elif self.op == 'iendswith':
206 | return row_value.lower().endswith(self.value.lower())
207 | elif self.op == 'iexact':
208 | return row_value.lower() == self.value.lower()
209 | elif self.op == 'contains':
210 | return row_value.find(self.value) >= 0
211 | elif self.op == 'icontains':
212 | return row_value.lower().find(self.value.lower()) >= 0
213 | elif self.op == 'regex' or self.op == 'iregex':
214 | return self.pattern.match(row_value) != None
215 | else:
216 | raise InvalidPredicateOpException()
217 |
218 | def incorporate_range_op(self, column, op, value, parent_compound_op):
219 | return False
220 |
221 | def get_matching_rows(self, query):
222 |         # get_matching_rows should only be called for predicates that can
223 |         # be evaluated efficiently, which is not the case for OperationPredicates
224 | raise NotImplementedError('get_matching_rows() called for inefficient predicate')
225 |
226 | class CompoundPredicate(object):
227 | def __init__(self, op, negated=False, children=None):
228 | self.op = op
229 | self.negated = negated
230 | self.children = children
231 | if self.children == None:
232 | self.children = []
233 |
234 | def __repr__(self):
235 | s = '('
236 | if self.negated:
237 | s += 'NOT '
238 | s += ('AND' if self.op == COMPOUND_OP_AND else 'OR')
239 | s += ': '
240 | first_time = True
241 | if self.children:
242 | for child_predicate in self.children:
243 | if first_time:
244 | first_time = False
245 | else:
246 | s += ','
247 | s += unicode(child_predicate)
248 | s += ')'
249 | return s
250 |
251 | def can_evaluate_efficiently(self, pk_column, indexed_columns):
252 | if self.negated:
253 | return False
254 | if self.op == COMPOUND_OP_AND:
255 | for child in self.children:
256 | if child.can_evaluate_efficiently(pk_column, indexed_columns):
257 | return True
258 | else:
259 | return False
260 | elif self.op == COMPOUND_OP_OR:
261 | for child in self.children:
262 | if not child.can_evaluate_efficiently(pk_column, indexed_columns):
263 | return False
264 | else:
265 | return True
266 | else:
267 | raise InvalidPredicateOpException()
268 |
269 | def row_matches_subset(self, row, subset):
270 | if self.op == COMPOUND_OP_AND:
271 | for predicate in subset:
272 | if not predicate.row_matches(row):
273 | matches = False
274 | break
275 | else:
276 | matches = True
277 | elif self.op == COMPOUND_OP_OR:
278 | for predicate in subset:
279 | if predicate.row_matches(row):
280 | matches = True
281 | break
282 | else:
283 | matches = False
284 | else:
285 | raise InvalidPredicateOpException()
286 |
287 | if self.negated:
288 | matches = not matches
289 |
290 | return matches
291 |
292 | def row_matches(self, row):
293 | return self.row_matches_subset(row, self.children)
294 |
295 | def incorporate_range_op(self, column, op, value, parent_predicate):
296 | return False
297 |
298 | def add_filter(self, column, op, value):
299 | if op in ('lt', 'lte', 'gt', 'gte', 'exact', 'startswith'):
300 | for child in self.children:
301 | if child.incorporate_range_op(column, op, value, self.op):
302 | return
303 | else:
304 | child = RangePredicate(column)
305 | incorporated = child.incorporate_range_op(column, op, value, COMPOUND_OP_AND)
306 | assert incorporated
307 | self.children.append(child)
308 | else:
309 | child = OperationPredicate(column, op, value)
310 | self.children.append(child)
311 |
312 | def add_child(self, child_query_node):
313 | self.children.append(child_query_node)
314 |
315 | def get_matching_rows(self, query):
316 | pk_column = query.query.get_meta().pk.column
317 | #indexed_columns = query.indexed_columns
318 |
319 | # In the first pass we handle the query nodes that can be processed
320 | # efficiently. Hopefully, in most cases, this will result in a
321 | # subset of the rows that is much smaller than the overall number
322 | # of rows so we only have to run the inefficient query predicates
323 | # over this smaller number of rows.
324 | if self.can_evaluate_efficiently(pk_column, query.indexed_columns):
325 | inefficient_predicates = []
326 | result = None
327 | for predicate in self.children:
328 | if predicate.can_evaluate_efficiently(pk_column, query.indexed_columns):
329 | rows = predicate.get_matching_rows(query)
330 |
331 | if result == None:
332 | result = rows
333 | else:
334 | result = combine_rows(result, rows, self.op, pk_column)
335 | else:
336 | inefficient_predicates.append(predicate)
337 | else:
338 | inefficient_predicates = self.children
339 | result = query.get_all_rows()
340 |
341 | if result == None:
342 | result = []
343 |
344 |         # Now filter those rows with the remaining inefficient predicates
345 | if len(inefficient_predicates) > 0:
346 | result = [row for row in result if self.row_matches_subset(row, inefficient_predicates)]
347 |
348 | return result
349 |
350 |
--------------------------------------------------------------------------------
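The 'startswith' handling in incorporate_range_op above turns a prefix match into a key range by incrementing the ordinal value of the prefix's last character. The conversion can be sketched standalone (the helper name is illustrative, not part of the backend):

```python
def prefix_to_range(prefix):
    # A string s starts with `prefix` iff start <= s < end, where end is
    # the prefix with its last character's ordinal incremented by one.
    # Like the backend code, this doesn't handle an empty prefix or a
    # last character already at the maximum ordinal value.
    end = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, end

start, end = prefix_to_range('foo')                 # ('foo', 'fop')
words = ['fo', 'foo', 'foobar', 'fop', 'fox']
matches = [w for w in words if start <= w < end]    # ['foo', 'foobar']
```

Because the range is half-open on the right, 'fop' itself is excluded while every string extending 'foo' sorts inside the range.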
/django_cassandra/db/utils.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | import time
16 | from thrift import Thrift
17 | from thrift.transport import TTransport
18 | from thrift.transport import TSocket
19 | from thrift.protocol import TBinaryProtocol
20 | from cassandra import Cassandra
21 | from cassandra.ttypes import AuthenticationRequest  # needed by CassandraConnection.login()
22 | from django.db.utils import DatabaseError
23 |
24 | def _cmp_to_key(comparison_function):
25 | """
26 | Convert a cmp= function into a key= function.
27 | This is built in to Python 2.7, but we define it ourselves
28 | to work with older versions of Python
29 | """
30 | class K(object):
31 | def __init__(self, obj, *args):
32 | self.obj = obj
33 | def __lt__(self, other):
34 | return comparison_function(self.obj, other.obj) < 0
35 | def __gt__(self, other):
36 | return comparison_function(self.obj, other.obj) > 0
37 | def __eq__(self, other):
38 | return comparison_function(self.obj, other.obj) == 0
39 | def __le__(self, other):
40 | return comparison_function(self.obj, other.obj) <= 0
41 | def __ge__(self, other):
42 | return comparison_function(self.obj, other.obj) >= 0
43 | def __ne__(self, other):
44 | return comparison_function(self.obj, other.obj) != 0
45 | return K
46 |
47 | def _compare_rows(row1, row2, sort_spec_list):
48 | for sort_spec in sort_spec_list:
49 | column_name = sort_spec[0]
50 | reverse = sort_spec[1] if len(sort_spec) > 1 else False
51 | row1_value = row1.get(column_name, None)
52 | row2_value = row2.get(column_name, None)
53 | result = cmp(row1_value, row2_value)
54 | if result != 0:
55 | if reverse:
56 | result = -result
57 |             break
58 | else:
59 | result = 0
60 | return result
61 |
62 | def sort_rows(rows, sort_spec):
63 | if sort_spec == None:
64 | return rows
65 |
66 |     if (type(sort_spec) != list) and (type(sort_spec) != tuple):
67 |         from .predicate import InvalidSortSpecException  # local import avoids a circular import
68 |         raise InvalidSortSpecException()
68 |
69 |     # The sort spec can be either a single sort spec tuple or a list/tuple
70 |     # of sort spec tuples. To simplify the code below we convert the
71 |     # single sort spec tuple case to a 1-element tuple containing
72 |     # the sort spec tuple here.
73 | if (type(sort_spec[0]) == list) or (type(sort_spec[0]) == tuple):
74 | sort_spec_list = sort_spec
75 | else:
76 | sort_spec_list = (sort_spec,)
77 |
78 | rows.sort(key=_cmp_to_key(lambda row1, row2: _compare_rows(row1, row2, sort_spec_list)))
79 |
80 | COMBINE_INTERSECTION = 1
81 | COMBINE_UNION = 2
82 |
83 | def combine_rows(rows1, rows2, op, primary_key_column):
84 | # Handle cases where rows1 and/or rows2 are None or empty
85 | if not rows1:
86 | return list(rows2) if rows2 and (op == COMBINE_UNION) else []
87 | if not rows2:
88 | return list(rows1) if (op == COMBINE_UNION) else []
89 |
90 |     # We're going to iterate over the lists in parallel and
91 |     # compare the elements, so we need both lists to be sorted.
92 |     # Note that this means that the input arguments will be modified.
93 |     # We could optionally clone the rows first, but then we'd incur
94 |     # the overhead of the copy. For now, we'll just always sort
95 |     # in place, and if it turns out to be a problem we can add the
96 |     # option to copy.
97 | sort_rows(rows1,(primary_key_column,))
98 | sort_rows(rows2,(primary_key_column,))
99 |
100 | combined_rows = []
101 | iter1 = iter(rows1)
102 | iter2 = iter(rows2)
103 | update1 = update2 = True
104 |
105 | while True:
106 | # Get the next element from one or both of the lists
107 | if update1:
108 | try:
109 | row1 = iter1.next()
110 |             except StopIteration:
111 | row1 = None
112 | value1 = row1.get(primary_key_column, None) if row1 != None else None
113 | if update2:
114 | try:
115 | row2 = iter2.next()
116 |             except StopIteration:
117 | row2 = None
118 | value2 = row2.get(primary_key_column, None) if row2 != None else None
119 |
120 | if (op == COMBINE_INTERSECTION):
121 | # If we've reached the end of either list and we're doing an intersection,
122 | # then we're done
123 | if (row1 == None) or (row2 == None):
124 | break
125 |
126 | if value1 == value2:
127 | combined_rows.append(row1)
128 | elif (op == COMBINE_UNION):
129 | if row1 == None:
130 | if row2 == None:
131 |                     break
132 | combined_rows.append(row2)
133 | elif (row2 == None) or (value1 <= value2):
134 | combined_rows.append(row1)
135 | else:
136 | combined_rows.append(row2)
137 | else:
138 |             from .predicate import InvalidRowCombinationOpException  # local import avoids a circular import
139 |             raise InvalidRowCombinationOpException()
139 |
140 | update1 = (row2 == None) or (value1 <= value2)
141 | update2 = (row1 == None) or (value2 <= value1)
142 |
143 | return combined_rows
144 |
145 | _last_timestamp = None
146 |
147 | def get_next_timestamp():
148 | # The timestamp is a 64-bit integer
149 | # We now use the standard Cassandra timestamp format of the
150 | # current system time in microseconds. We also keep track of the
151 | # last timestamp we returned and if the current time is less than
152 | # that, then we just advance the timestamp by 1 to make sure we
153 |     # return monotonically increasing timestamps. Note that this doesn't
154 |     # guarantee monotonicity across the fairly common Django deployment
155 |     # model of multiple Django processes dispatched to from a web server
156 |     # like Apache, since each process tracks its own last timestamp.
157 |     # In practice this shouldn't be a problem (at least with current
158 |     # hardware), because it's very unlikely that two consecutive calls
159 |     # from a client would be dispatched to two different Django
160 |     # processes within the same microsecond.
161 |
162 | global _last_timestamp
163 |
164 | timestamp = int(time.time() * 1000000)
165 |
166 | if (_last_timestamp != None) and (timestamp <= _last_timestamp):
167 | timestamp = _last_timestamp + 1
168 |
169 | _last_timestamp = timestamp
170 |
171 | return timestamp
172 |
173 | def convert_string_to_list(s):
174 |     # Use ast.literal_eval instead of eval so that only Python literals
175 |     # are accepted. With eval, someone who could modify the data in
176 |     # Cassandra could insert arbitrary Python code that would then get
177 |     # evaluated on the client machine. literal_eval parses the list
178 |     # string safely and raises an exception for anything else.
179 |     import ast
180 |     return ast.literal_eval(s)
181 |
182 | def convert_list_to_string(l):
183 | return unicode(l)
184 |
185 |
186 | class CassandraConnection(object):
187 | def __init__(self, host, port, keyspace, user, password):
188 | self.host = host
189 | self.port = port
190 | self.keyspace = keyspace
191 | self.user = user
192 | self.password = password
193 | self.transport = None
194 | self.client = None
195 | self.keyspace_set = False
196 | self.logged_in = False
197 |
198 | def commit(self):
199 | pass
200 |
201 |     def set_keyspace(self):
202 |         if not self.keyspace_set:
203 |             error_message = 'invalid connection'
204 |             try:
205 |                 if self.client:
206 |                     self.client.set_keyspace(self.keyspace)
207 |                     self.keyspace_set = True
208 |             except Exception, e:
209 |                 # Capture the message here; 'e' isn't defined below
210 |                 # when self.client is None and no exception occurred.
211 |                 error_message = str(e)
212 |             if not self.keyspace_set:
213 |                 raise DatabaseError('Error setting keyspace: %s; %s' % (self.keyspace, error_message))
214 |
215 |     def login(self):
216 |         # TODO: This user/password auth code hasn't been tested
217 |         if not self.logged_in:
218 |             if self.user:
219 |                 error_message = 'invalid connection'
220 |                 try:
221 |                     if self.client:
222 |                         credentials = {'username': self.user, 'password': self.password}
223 |                         self.client.login(AuthenticationRequest(credentials))
224 |                         self.logged_in = True
225 |                 except Exception, e:
226 |                     # Capture the message here; 'e' isn't defined below
227 |                     # when self.client is None and no exception occurred.
228 |                     error_message = str(e)
229 |                 if not self.logged_in:
230 |                     raise DatabaseError('Error logging in to keyspace: %s; %s' % (self.keyspace, error_message))
231 |             else:
232 |                 self.logged_in = True
233 |
234 | def open(self, set_keyspace=False, login=False):
235 | if self.transport == None:
236 | # Create the client connection to the Cassandra daemon
237 | socket = TSocket.TSocket(self.host, int(self.port))
238 | transport = TTransport.TFramedTransport(TTransport.TBufferedTransport(socket))
239 | protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
240 | transport.open()
241 | self.transport = transport
242 | self.client = Cassandra.Client(protocol)
243 |
244 | if login:
245 | self.login()
246 |
247 | if set_keyspace:
248 | self.set_keyspace()
249 |
250 | def close(self):
251 | if self.transport != None:
252 | try:
253 | self.transport.close()
254 | except Exception, e:
255 | pass
256 | self.transport = None
257 | self.client = None
258 | self.keyspace_set = False
259 | self.logged_in = False
260 |
261 | def is_connected(self):
262 | return self.transport != None
263 |
264 | def get_client(self):
265 | if self.client == None:
266 | self.open(True, True)
267 | return self.client
268 |
269 | def reopen(self):
270 | self.close()
271 | self.open(True, True)
272 |
273 |
274 | class CassandraConnectionError(DatabaseError):
275 | def __init__(self, message=None):
276 | msg = 'Error connecting to Cassandra database'
277 | if message:
278 | msg += '; ' + str(message)
279 | super(CassandraConnectionError,self).__init__(msg)
280 |
281 |
282 | class CassandraAccessError(DatabaseError):
283 | def __init__(self, message=None):
284 | msg = 'Error accessing Cassandra database'
285 | if message:
286 | msg += '; ' + str(message)
287 | super(CassandraAccessError,self).__init__(msg)
288 |
289 |
290 | def call_cassandra_with_reconnect(connection, fn, *args, **kwargs):
291 | try:
292 | try:
293 | results = fn(connection.get_client(), *args, **kwargs)
294 | except TTransport.TTransportException:
295 | connection.reopen()
296 | results = fn(connection.get_client(), *args, **kwargs)
297 | except TTransport.TTransportException, e:
298 | raise CassandraConnectionError(e)
299 | except Exception, e:
300 | raise CassandraAccessError(e)
301 |
302 | return results
303 |
304 |
305 |
--------------------------------------------------------------------------------
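combine_rows above walks two row lists, each sorted by primary key, in a single parallel pass, intersecting or unioning them on the key value. The same merge idea can be sketched in isolation with plain dicts (merge_by_key is an illustrative helper, not part of the backend):

```python
def merge_by_key(rows1, rows2, key, intersect=True):
    # Single-pass merge of two lists of dicts, each sorted by rows[key].
    # intersect=True keeps rows whose key appears in both lists;
    # intersect=False produces the union, preferring rows from rows1.
    result = []
    i = j = 0
    while i < len(rows1) and j < len(rows2):
        k1, k2 = rows1[i][key], rows2[j][key]
        if k1 == k2:
            result.append(rows1[i])
            i += 1
            j += 1
        elif k1 < k2:
            if not intersect:
                result.append(rows1[i])
            i += 1
        else:
            if not intersect:
                result.append(rows2[j])
            j += 1
    if not intersect:
        # For a union, any leftover rows from either list are included
        result.extend(rows1[i:])
        result.extend(rows2[j:])
    return result

a = [{'id': 1}, {'id': 3}, {'id': 5}]
b = [{'id': 3}, {'id': 4}, {'id': 5}]
both = merge_by_key(a, b, 'id')                    # ids [3, 5]
either = merge_by_key(a, b, 'id', intersect=False) # ids [1, 3, 4, 5]
```

The pre-sorting in combine_rows exists precisely so this O(n + m) merge is possible instead of an O(n * m) pairwise comparison.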
/manage.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | from django.core.management import execute_manager
3 | try:
4 | import settings # Assumed to be in the same directory.
5 | except ImportError:
6 | import sys
7 | sys.stderr.write("Error: Can't find the file 'settings.py' in the directory containing %r. It appears you've customized things.\nYou'll have to run django-admin.py, passing it your settings module.\n(If the file settings.py does indeed exist, it's causing an ImportError somehow.)\n" % __file__)
8 | sys.exit(1)
9 |
10 | if __name__ == "__main__":
11 | execute_manager(settings)
12 |
--------------------------------------------------------------------------------
/settings.py:
--------------------------------------------------------------------------------
1 | # Django settings for test_db_backend project.
2 |
3 | DEBUG = True
4 | TEMPLATE_DEBUG = DEBUG
5 |
6 | ADMINS = (
7 | # ('Your Name', 'your_email@domain.com'),
8 | )
9 |
10 | MANAGERS = ADMINS
11 |
12 | DATABASES = {
13 | 'default': {
14 | 'ENGINE': 'django_cassandra.db', # Add 'postgresql_psycopg2', 'postgresql', 'mysql', 'sqlite3' or 'oracle'.
15 | 'NAME': 'DjangoTest', # Or path to database file if using sqlite3.
16 | 'USER': '', # Not used with sqlite3.
17 | 'PASSWORD': '', # Not used with sqlite3.
18 | 'HOST': 'localhost', # Set to empty string for localhost. Not used with sqlite3.
19 | 'PORT': '9160', # Set to empty string for default. Not used with sqlite3.
20 | 'SUPPORTS_TRANSACTIONS': False,
21 | 'CASSANDRA_REPLICATION_FACTOR': 1,
22 | 'CASSANDRA_ENABLE_CASCADING_DELETES': True
23 | }
24 | }
25 |
26 | # Local time zone for this installation. Choices can be found here:
27 | # http://en.wikipedia.org/wiki/List_of_tz_zones_by_name
28 | # although not all choices may be available on all operating systems.
29 | # On Unix systems, a value of None will cause Django to use the same
30 | # timezone as the operating system.
31 | # If running in a Windows environment this must be set to the same as your
32 | # system time zone.
33 | TIME_ZONE = 'America/Chicago'
34 |
35 | # Language code for this installation. All choices can be found here:
36 | # http://www.i18nguy.com/unicode/language-identifiers.html
37 | LANGUAGE_CODE = 'en-us'
38 |
39 | SITE_ID = 1
40 |
41 | # If you set this to False, Django will make some optimizations so as not
42 | # to load the internationalization machinery.
43 | USE_I18N = True
44 |
45 | # If you set this to False, Django will not format dates, numbers and
46 | # calendars according to the current locale
47 | USE_L10N = True
48 |
49 | # Absolute path to the directory that holds media.
50 | # Example: "/home/media/media.lawrence.com/"
51 | MEDIA_ROOT = ''
52 |
53 | # URL that handles the media served from MEDIA_ROOT. Make sure to use a
54 | # trailing slash if there is a path component (optional in other cases).
55 | # Examples: "http://media.lawrence.com", "http://example.com/media/"
56 | MEDIA_URL = ''
57 |
58 | # URL prefix for admin media -- CSS, JavaScript and images. Make sure to use a
59 | # trailing slash.
60 | # Examples: "http://foo.com/media/", "/media/".
61 | ADMIN_MEDIA_PREFIX = '/media/'
62 |
63 | # Make this unique, and don't share it with anybody.
64 | SECRET_KEY = 'b^%)yd-d6s%pk16+1m@fx!jsry!alaes%)nmb^ma#rxz8+i_to'
65 |
66 | # List of callables that know how to import templates from various sources.
67 | TEMPLATE_LOADERS = (
68 | 'django.template.loaders.filesystem.Loader',
69 | 'django.template.loaders.app_directories.Loader',
70 | # 'django.template.loaders.eggs.Loader',
71 | )
72 |
73 | MIDDLEWARE_CLASSES = (
74 | 'django.middleware.common.CommonMiddleware',
75 | 'django.contrib.sessions.middleware.SessionMiddleware',
76 | #'django.middleware.csrf.CsrfViewMiddleware',
77 | 'django.contrib.auth.middleware.AuthenticationMiddleware',
78 | #'django.contrib.messages.middleware.MessageMiddleware',
79 | )
80 |
81 | ROOT_URLCONF = 'django_cassandra_backend.urls'
82 |
83 | TEMPLATE_DIRS = (
84 | # Put strings here, like "/home/html/django_templates" or "C:/www/django/templates".
85 | # Always use forward slashes, even on Windows.
86 | # Don't forget to use absolute paths, not relative paths.
87 | )
88 |
89 | INSTALLED_APPS = (
90 | 'django.contrib.auth',
91 | 'django.contrib.contenttypes',
92 | 'django.contrib.sessions',
93 | 'django.contrib.sites',
94 | #'django.contrib.messages',
95 | 'django_cassandra_backend.django_cassandra',
96 | # Uncomment the next line to enable the admin:
97 | 'django.contrib.admin',
98 | 'django_cassandra_backend.tests',
99 | #'django_cassandra_backend.djangotoolbox'
100 | )
101 |
102 | AUTHENTICATION_BACKENDS = (
103 | 'django.contrib.auth.backends.ModelBackend',
104 | )
105 |
106 |
--------------------------------------------------------------------------------
/tests/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/vaterlaus/django_cassandra_backend/4f6df2585df51a7b51fed59481c564c0aee74418/tests/__init__.py
--------------------------------------------------------------------------------
/tests/admin.py:
--------------------------------------------------------------------------------
1 | from .models import Host, Slice, Tag
2 | from django.contrib import admin
3 |
4 | admin.site.register(Host)
5 | admin.site.register(Slice)
6 | admin.site.register(Tag)
7 |
--------------------------------------------------------------------------------
/tests/models.py:
--------------------------------------------------------------------------------
1 | from django.db import models
2 | from djangotoolbox.fields import ListField
3 |
4 | class Slice(models.Model):
5 | name = models.CharField(max_length=64)
6 |
7 | class Meta:
8 | db_table = 'Slice'
9 | ordering = ['id']
10 |
11 | class Host(models.Model):
12 | mac = models.CharField(max_length=20, db_index=True)
13 | ip = models.CharField(max_length=20, db_index = True)
14 | slice = models.ForeignKey(Slice, db_index=True)
15 |
16 | class Meta:
17 | db_table = 'Host'
18 | ordering = ['id']
19 |
20 | class Tag(models.Model):
21 | name = models.CharField(max_length=64)
22 | value = models.CharField(max_length=256)
23 | host = models.ForeignKey(Host, db_index=True)
24 |
25 | class Meta:
26 | db_table = 'Tag'
27 | ordering = ['id']
28 |
29 | class Test(models.Model):
30 | test_date = models.DateField(null=True)
31 | test_datetime = models.DateTimeField(null=True)
32 | test_time = models.TimeField(null=True)
33 | test_decimal = models.DecimalField(null=True, max_digits=10, decimal_places=3)
34 | test_text = models.TextField(null=True)
35 | #test_list = ListField(models.CharField(max_length=500))
36 |
37 | class Meta:
38 | db_table = 'Test'
39 | ordering = ['id']
40 |
41 |
42 |
43 | class CompoundKeyModel(models.Model):
44 | name = models.CharField(max_length=64)
45 | index = models.IntegerField()
46 | extra = models.CharField(max_length=32, default='test')
47 |
48 | class CassandraSettings:
49 | COMPOUND_KEY_FIELDS = ('name', 'index')
50 |
51 |
52 | class CompoundKeyModel2(models.Model):
53 | slice = models.ForeignKey(Slice)
54 | name = models.CharField(max_length=64)
55 | index = models.IntegerField()
56 | extra = models.CharField(max_length=32)
57 |
58 | class CassandraSettings:
59 | COMPOUND_KEY_FIELDS = ('slice', 'name', 'index')
60 | COMPOUND_KEY_SEPARATOR = '#'
61 |
62 | class CompoundKeyModel3(models.Model):
63 | name = models.CharField(max_length=32)
64 |
65 | class CassandraSettings:
66 |         COMPOUND_KEY_FIELDS = ('name',)
67 |
--------------------------------------------------------------------------------
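The CassandraSettings inner classes above declare compound keys via COMPOUND_KEY_FIELDS and COMPOUND_KEY_SEPARATOR. Presumably the backend derives a row key by joining the named field values with the separator; a minimal sketch of that idea (the helper is illustrative, and any default separator is an assumption here, not necessarily what the backend uses):

```python
def make_compound_key(field_values, separator):
    # Illustrative only: join the COMPOUND_KEY_FIELDS values in order,
    # with COMPOUND_KEY_SEPARATOR between them, to form the row key.
    # CompoundKeyModel2 above sets the separator to '#'.
    return separator.join(str(v) for v in field_values)

# e.g. a CompoundKeyModel2 row keyed on (slice, name, index):
key = make_compound_key(['key0', 'eth0', 3], '#')  # 'key0#eth0#3'
```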
/tests/tests.py:
--------------------------------------------------------------------------------
1 | # Copyright 2010 BSN, Inc.
2 | #
3 | # Licensed under the Apache License, Version 2.0 (the "License");
4 | # you may not use this file except in compliance with the License.
5 | # You may obtain a copy of the License at
6 | #
7 | # http://www.apache.org/licenses/LICENSE-2.0
8 | #
9 | # Unless required by applicable law or agreed to in writing, software
10 | # distributed under the License is distributed on an "AS IS" BASIS,
11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12 | # See the License for the specific language governing permissions and
13 | # limitations under the License.
14 |
15 | from django.test import TestCase
16 | from .models import *
17 | import datetime
18 | import decimal
19 | from django.db.models.query import Q
20 | from django.db.utils import DatabaseError
21 |
22 | class FieldsTest(TestCase):
23 |
24 | TEST_DATE = datetime.date(2007,3,5)
25 | TEST_DATETIME = datetime.datetime(2010,5,4,9,34,25)
26 | TEST_DATETIME2 = datetime.datetime(2010, 6, 6, 6, 20)
27 | TEST_TIME = datetime.time(10,14,29)
28 | TEST_DECIMAL = decimal.Decimal('33.55')
29 | TEST_TEXT = "Practice? We're talking about practice?"
30 | TEST_TEXT2 = "I'm a man. I'm 40."
31 | #TEST_LIST = [u'aaa',u'bbb',u'foobar',u'snafu',u'hello',u'goodbye']
32 |
33 | def setUp(self):
34 | self.test = Test(id='key1',
35 | test_date=self.TEST_DATE,
36 | test_datetime=self.TEST_DATETIME,
37 | test_time=self.TEST_TIME,
38 | test_decimal=self.TEST_DECIMAL,
39 | test_text=self.TEST_TEXT
40 | #,test_list=self.TEST_LIST
41 | )
42 | self.test.save()
43 |
44 | def test_fields(self):
45 | test1 = Test.objects.get(id='key1')
46 | self.assertEqual(test1.test_date, self.TEST_DATE)
47 | self.assertEqual(test1.test_datetime, self.TEST_DATETIME)
48 | self.assertEqual(test1.test_time, self.TEST_TIME)
49 | self.assertEqual(test1.test_decimal, self.TEST_DECIMAL)
50 | self.assertEqual(test1.test_text, self.TEST_TEXT)
51 | #self.assertEqual(test1.test_list, self.TEST_LIST)
52 |
53 | test1.test_datetime = self.TEST_DATETIME2
54 | test1.test_text = self.TEST_TEXT2
55 | test1.save()
56 |
57 | test1 = Test.objects.get(id='key1')
58 | self.assertEqual(test1.test_datetime, self.TEST_DATETIME2)
59 | self.assertEqual(test1.test_text, self.TEST_TEXT2)
60 |
61 | class BasicFunctionalityTest(TestCase):
62 |
63 | HOST_COUNT = 5
64 |
65 | def get_host_params_for_index(self, index):
66 | decimal_index = str(index)
67 | hex_index = hex(index)[2:]
68 | if len(hex_index) == 1:
69 | hex_index = '0' + hex_index
70 | id = 'key'+decimal_index
71 | mac = '00:01:02:03:04:'+hex_index
72 | ip = '10.0.0.'+decimal_index
73 | slice = self.s0 if index % 2 else self.s1
74 |
75 | return id, mac, ip, slice
76 |
77 | def setUp(self):
78 | # Create a couple of slices
79 | self.s0 = Slice(id='key0',name='slice0')
80 | self.s0.save()
81 | self.s1 = Slice(id='key1',name='slice1')
82 | self.s1.save()
83 |
84 | # Create some hosts
85 | for i in range(self.HOST_COUNT):
86 | id, mac, ip, slice = self.get_host_params_for_index(i)
87 | h = Host(id=id, mac=mac,ip=ip,slice=slice)
88 | h.save()
89 |
90 |
91 | def test_create(self):
92 | """
93 | Tests that we correctly created the model instances
94 | """
95 |
96 | # Test that we have the slices we expect
97 | slice_query_set = Slice.objects.all()
98 | index = 0
99 | for slice in slice_query_set:
100 | self.assertEqual(slice.id, 'key' + str(index))
101 | self.assertEqual(slice.name, 'slice' + str(index))
102 | index += 1
103 |
104 | # There should have been exactly 2 slices created
105 | self.assertEqual(index, 2)
106 |
107 | host_query_set = Host.objects.all()
108 | index = 0
109 | for host in host_query_set:
110 | id, mac, ip, slice = self.get_host_params_for_index(index)
111 | index += 1
112 |
113 | # There should have been exactly HOST_COUNT hosts created
114 | self.assertEqual(index, self.HOST_COUNT)
115 |
116 | def test_update(self):
117 | s = Slice.objects.get(id='key0')
118 | s.name = 'foobar'
119 | s.save()
120 | #import time
121 | #time.sleep(5)
122 | s1 = Slice.objects.get(id='key0')
123 | #s2 = Slice.objects.get(id='key0')
124 | self.assertEqual(s1.name, 'foobar')
125 | #self.assertEqual(s2.name, 'foobar')
126 |
127 | def test_delete(self):
128 | host = Host.objects.get(id='key1')
129 | host.delete()
130 | hqs = Host.objects.filter(id='key1')
131 | count = hqs.count()
132 | self.assertEqual(count,0)
133 |
134 | def test_query_update(self):
135 | slice0 = Slice.objects.get(pk='key0')
136 | qs = Host.objects.filter(slice=slice0)
137 | qs.update(ip='192.168.1.1')
138 | qs = Host.objects.all()
139 | for host in qs:
140 | if host.slice.pk == 'key0':
141 | self.assertEqual(host.ip, '192.168.1.1')
142 | else:
143 | self.assertNotEqual(host.ip, '192.168.1.1')
144 |
145 | def test_cascading_delete(self):
146 | slice0 = Slice.objects.get(pk='key0')
147 | slice0.delete()
148 | hqs = Host.objects.all()
149 | count = hqs.count()
150 | self.assertEqual(count, 3)
151 | for host in hqs:
152 | self.assertEqual(host.slice_id, 'key1')
153 |
154 | def test_default_id(self):
155 | s = Slice(name='slice2')
156 | s.save()
157 | s2 = Slice.objects.get(name='slice2')
158 | self.assertEqual(s2.name, 'slice2')
159 |
160 | SLICE_DATA_1 = ('key1', 'PCI')
161 | SLICE_DATA_2 = ('key2', 'Eng1')
162 | SLICE_DATA_3 = ('key3', 'Finance')
163 | SLICE_DATA_4 = ('key4', 'blue')
164 | SLICE_DATA_5 = ('key5', 'bluf')
165 | SLICE_DATA_6 = ('key6', 'BLTSE')
166 | SLICE_DATA_7 = ('key7', 'ZNCE')
167 | SLICE_DATA_8 = ('key8', 'UNCLE')
168 | SLICE_DATA_9 = ('key9', 'increment')
169 |
170 | HOST_DATA_1 = ('key1', '00:01:02:03:04:05', '10.0.0.1', 'key1', (('foo3', 'bar3'), ('foo1','hello'), ('aaa', 'bbb')))
171 | HOST_DATA_2 = ('key2', 'ff:fc:02:33:04:05', '192.168.0.55', 'key2', None)
172 | HOST_DATA_3 = ('key3', 'ff:fc:02:03:04:01', '192.168.0.1', 'key2', (('cfoo3', 'bar3'), ('cfoo1','hello'), ('ddd', 'bbb')))
173 | HOST_DATA_4 = ('key4', '55:44:33:03:04:05', '10.0.0.6', 'key1', None)
174 | HOST_DATA_5 = ('key5', '10:01:02:03:04:05', '10.0.0.2', 'key1', None)
175 | HOST_DATA_6 = ('key6', '33:44:55:03:04:05', '10.0.0.7', 'key3', None)
176 | HOST_DATA_7 = ('key7', '10:01:02:03:04:05', '192.168.0.44', 'key1', None)
177 |
178 | def create_slices(slice_data_list):
179 | for sd in slice_data_list:
180 | id, name = sd
181 | s = Slice(id=id,name=name)
182 | s.save()
183 |
184 | def create_hosts(host_data_list):
185 | for hd in host_data_list:
186 | id,mac,ip,slice_id,tag_list = hd
187 | slice = Slice.objects.get(id=slice_id)
188 | h = Host(id=id,mac=mac,ip=ip,slice=slice)
189 | h.save()
190 | if tag_list is not None:
191 | for tag in tag_list:
192 | name, value = tag
193 | t = Tag(name=name,value=value,host=h)
194 | t.save()
195 |
196 | class QueryTest(TestCase):
197 |
198 | def setUp(self):
199 | create_slices((SLICE_DATA_1, SLICE_DATA_2, SLICE_DATA_3))
200 | create_hosts((HOST_DATA_1, HOST_DATA_6, HOST_DATA_5, HOST_DATA_7, HOST_DATA_3, HOST_DATA_2, HOST_DATA_4))
201 |
202 | def check_host_data(self, host, data):
203 | expected_id, expected_mac, expected_ip, expected_slice, expected_tag_list = data
204 | self.assertEqual(host.id, expected_id)
205 | self.assertEqual(host.mac, expected_mac)
206 | self.assertEqual(host.ip, expected_ip)
207 | self.assertEqual(host.slice.id, expected_slice)
208 | # TODO: For now we don't check the tag list
209 |
210 | def test_pk_query(self):
211 | h = Host.objects.get(id='key3')
212 | self.check_host_data(h, HOST_DATA_3)
213 |
214 | hqs = Host.objects.filter(id='key6')
215 | count = hqs.count()
216 | self.assertEqual(count, 1)
217 | h6 = hqs[0]
218 | self.check_host_data(h6, HOST_DATA_6)
219 |
220 | hqs = Host.objects.filter(id__gt='key4')
221 | count = hqs.count()
222 | self.assertEqual(count, 3)
223 | h5, h6, h7 = hqs[:]
224 | self.check_host_data(h5, HOST_DATA_5)
225 | self.check_host_data(h6, HOST_DATA_6)
226 | self.check_host_data(h7, HOST_DATA_7)
227 |
228 | hqs = Host.objects.filter(id__lte='key3')
229 | count = hqs.count()
230 | self.assertEqual(count, 3)
231 | h1, h2, h3 = hqs[:]
232 | self.check_host_data(h1, HOST_DATA_1)
233 | self.check_host_data(h2, HOST_DATA_2)
234 | self.check_host_data(h3, HOST_DATA_3)
235 |
236 | hqs = Host.objects.filter(id__gte='key3', id__lt='key7')
237 | count = hqs.count()
238 | self.assertEqual(count, 4)
239 | h3, h4, h5, h6 = hqs[:]
240 | self.check_host_data(h3, HOST_DATA_3)
241 | self.check_host_data(h4, HOST_DATA_4)
242 | self.check_host_data(h5, HOST_DATA_5)
243 | self.check_host_data(h6, HOST_DATA_6)
244 |
245 | def test_indexed_query(self):
246 | h = Host.objects.get(ip='10.0.0.7')
247 | self.check_host_data(h, HOST_DATA_6)
248 |
249 | hqs = Host.objects.filter(ip='192.168.0.1')
250 | h = hqs[0]
251 | self.check_host_data(h, HOST_DATA_3)
252 |
253 | def test_complex_query(self):
254 | hqs = Host.objects.filter(Q(id='key1') | Q(id='key3') | Q(id='key4')).order_by('id')
255 | count = hqs.count()
256 | self.assertEqual(count, 3)
257 | h1, h3, h4 = hqs[:]
258 | self.check_host_data(h1, HOST_DATA_1)
259 | self.check_host_data(h3, HOST_DATA_3)
260 | self.check_host_data(h4, HOST_DATA_4)
261 |
262 | s1 = Slice.objects.get(id='key1')
263 |
264 | hqs = Host.objects.filter(ip__startswith='10.', slice=s1)
265 | count = hqs.count()
266 | self.assertEqual(count, 3)
267 | h1, h4, h5 = hqs[:]
268 | self.check_host_data(h1, HOST_DATA_1)
269 | self.check_host_data(h4, HOST_DATA_4)
270 | self.check_host_data(h5, HOST_DATA_5)
271 |
272 | hqs = Host.objects.filter(ip='10.0.0.6', slice=s1)
273 | count = hqs.count()
274 | self.assertEqual(count, 1)
275 | h4 = hqs[0]
276 | self.check_host_data(h4, HOST_DATA_4)
277 |
278 | tqs = Tag.objects.filter(name='foo3', value='bar3')
279 | self.assertEqual(tqs.count(), 1)
280 | t = tqs[0]
281 | self.assertEqual(t.name, 'foo3')
282 | self.assertEqual(t.value, 'bar3')
283 | self.assertEqual(t.host_id, 'key1')
284 |
285 | hqs = Host.objects.filter((Q(ip__startswith='10.0') & Q(slice=s1)) | Q(mac__startswith='ff')).order_by('id')
286 | count = hqs.count()
287 | self.assertEqual(count, 5)
288 | h1, h2, h3, h4, h5 = hqs[:]
289 | self.check_host_data(h1, HOST_DATA_1)
290 | self.check_host_data(h2, HOST_DATA_2)
291 | self.check_host_data(h3, HOST_DATA_3)
292 | self.check_host_data(h4, HOST_DATA_4)
293 | self.check_host_data(h5, HOST_DATA_5)
294 |
295 | def test_exclude_query(self):
296 | hqs = Host.objects.exclude(ip__startswith="10")
297 | count = hqs.count()
298 | self.assertEqual(count,3)
299 | h2, h3, h7 = hqs[:]
300 | self.check_host_data(h2, HOST_DATA_2)
301 | self.check_host_data(h3, HOST_DATA_3)
302 | self.check_host_data(h7, HOST_DATA_7)
303 |
304 | def test_count(self):
305 |
306 | count = Host.objects.count()
307 | self.assertEqual(count, 7)
308 |
309 | count = Host.objects.all().count()
310 | self.assertEqual(count, 7)
311 |
312 | slice1 = Slice.objects.get(id='key1')
313 | qs = Host.objects.filter(slice=slice1)
314 | count = qs.count()
315 | #if count == 4:
316 | # h1,h4,h5,h7 = qs[:]
317 | #else:
318 | # h1,h4,h5,h7,h = qs[:]
319 | self.assertEqual(count, 4)
320 |
321 | qs = Slice.objects.filter(name__startswith='P')
322 | count = qs.count()
323 | self.assertEqual(count, 1)
324 |
325 | qs = Host.objects.filter(ip__startswith='10').order_by('slice_id')
326 | count = qs.count()
327 | self.assertEqual(count, 4)
328 |
329 | def test_query_set_slice(self):
330 | hqs = Host.objects.all()[2:6]
331 | count = hqs.count()
332 | h3, h4, h5, h6 = hqs[:]
333 | self.assertEqual(h3.id, 'key3')
334 | self.assertEqual(h4.id, 'key4')
335 | self.assertEqual(h5.id, 'key5')
336 | self.assertEqual(h6.id, 'key6')
337 |
338 | def test_order_by(self):
339 | # Test ascending order of all of the hosts
340 | qs = Host.objects.all().order_by('ip')
341 | h1, h2, h3, h4, h5, h6, h7 = qs[:]
342 | self.assertEqual(h1.id, 'key1')
343 | self.assertEqual(h2.id, 'key5')
344 | self.assertEqual(h3.id, 'key4')
345 | self.assertEqual(h4.id, 'key6')
346 | self.assertEqual(h5.id, 'key3')
347 | self.assertEqual(h6.id, 'key7')
348 | self.assertEqual(h7.id, 'key2')
349 |
350 | # Test descending order of all of the hosts
351 | qs = Host.objects.all().order_by('-ip')
352 | h1, h2, h3, h4, h5, h6, h7 = qs[:]
353 | self.assertEqual(h1.id, 'key2')
354 | self.assertEqual(h2.id, 'key7')
355 | self.assertEqual(h3.id, 'key3')
356 | self.assertEqual(h4.id, 'key6')
357 | self.assertEqual(h5.id, 'key4')
358 | self.assertEqual(h6.id, 'key5')
359 | self.assertEqual(h7.id, 'key1')
360 |
361 | # Test multiple ordering criteria
362 | qs = Host.objects.all().order_by('slice_id', 'ip')
363 | h1, h2, h3, h4, h5, h6, h7 = qs[:]
364 | self.assertEqual(h1.id, 'key1')
365 | self.assertEqual(h2.id, 'key5')
366 | self.assertEqual(h3.id, 'key4')
367 | self.assertEqual(h4.id, 'key7')
368 | self.assertEqual(h5.id, 'key3')
369 | self.assertEqual(h6.id, 'key2')
370 | self.assertEqual(h7.id, 'key6')
371 |
372 | # Test multiple ordering criteria
373 | qs = Host.objects.all().order_by('-slice_id', 'ip')
374 | h1, h2, h3, h4, h5, h6, h7 = qs[:]
375 | self.assertEqual(h1.id, 'key6')
376 | self.assertEqual(h2.id, 'key3')
377 | self.assertEqual(h3.id, 'key2')
378 | self.assertEqual(h4.id, 'key1')
379 | self.assertEqual(h5.id, 'key5')
380 | self.assertEqual(h6.id, 'key4')
381 | self.assertEqual(h7.id, 'key7')
382 |
383 | # Currently the nonrel code doesn't handle ordering that spans tables/column families
384 | #=======================================================================
385 | # qs = Host.objects.all().order_by('slice__name', 'id')
386 | # h2, h3, h6, h1, h5, h4, h7 = qs[:]
387 | # self.assertEqual(h2.id, 'key2')
388 | # self.assertEqual(h3.id, 'key3')
389 | # self.assertEqual(h6.id, 'key6')
390 | # self.assertEqual(h1.id, 'key1')
391 | # self.assertEqual(h5.id, 'key5')
392 | # self.assertEqual(h4.id, 'key4')
393 | # self.assertEqual(h7.id, 'key7')
394 | #=======================================================================
395 |
396 |
397 | class OperationTest(TestCase):
398 |
399 | def setUp(self):
400 | create_slices((SLICE_DATA_1, SLICE_DATA_2, SLICE_DATA_3, SLICE_DATA_4, SLICE_DATA_5,
401 | SLICE_DATA_6, SLICE_DATA_7, SLICE_DATA_8, SLICE_DATA_9))
402 |
403 | def test_range_ops(self):
404 | qs = Slice.objects.filter(name__gt='PCI')
405 | count = qs.count()
406 | self.assertEqual(count, 5)
407 | s4,s5,s7,s8,s9 = qs[:]
408 | self.assertEqual(s4.id,'key4')
409 | self.assertEqual(s5.id,'key5')
410 | self.assertEqual(s7.id,'key7')
411 | self.assertEqual(s8.id,'key8')
412 | self.assertEqual(s9.id,'key9')
413 |
414 | qs = Slice.objects.filter(name__gte='bluf',name__lte='bluf')
415 | count = qs.count()
416 | self.assertEqual(count, 1)
417 | s5 = qs[0]
418 | self.assertEqual(s5.id, 'key5')
419 |
420 | qs = Slice.objects.filter(name__gt='blue', name__lte='bluf')
421 | count = qs.count()
422 | self.assertEqual(count, 1)
423 | s5 = qs[0]
424 | self.assertEqual(s5.id, 'key5')
425 |
426 | qs = Slice.objects.filter(name__exact='blue')
427 | count = qs.count()
428 | self.assertEqual(count, 1)
429 | s4 = qs[0]
430 | self.assertEqual(s4.id, 'key4')
431 |
432 | def test_other_ops(self):
433 |
434 | qs = Slice.objects.filter(id__in=['key1','key4','key6','key9'])
435 | count = qs.count()
436 | self.assertEqual(count, 4)
437 | s1,s4,s6,s9 = qs[:]
438 | self.assertEqual(s1.id,'key1')
439 | self.assertEqual(s4.id,'key4')
440 | self.assertEqual(s6.id,'key6')
441 | self.assertEqual(s9.id,'key9')
442 |
443 | qs = Slice.objects.filter(name__startswith='bl')
444 | count = qs.count()
445 | self.assertEqual(count, 2)
446 | s4,s5 = qs[:]
447 | self.assertEqual(s4.id,'key4')
448 | self.assertEqual(s5.id,'key5')
449 |
450 | qs = Slice.objects.filter(name__endswith='E')
451 | count = qs.count()
452 | self.assertEqual(count, 3)
453 | s6,s7,s8 = qs[:]
454 | self.assertEqual(s6.id,'key6')
455 | self.assertEqual(s7.id,'key7')
456 | self.assertEqual(s8.id,'key8')
457 |
458 | qs = Slice.objects.filter(name__contains='NC')
459 | count = qs.count()
460 | self.assertEqual(count, 2)
461 | s7,s8 = qs[:]
462 | self.assertEqual(s7.id,'key7')
463 | self.assertEqual(s8.id,'key8')
464 |
465 | qs = Slice.objects.filter(name__istartswith='b')
466 | count = qs.count()
467 | self.assertEqual(count, 3)
468 | s4,s5,s6 = qs[:]
469 | self.assertEqual(s4.id,'key4')
470 | self.assertEqual(s5.id,'key5')
471 | self.assertEqual(s6.id,'key6')
472 |
473 | qs = Slice.objects.filter(name__istartswith='B')
474 | count = qs.count()
475 | self.assertEqual(count, 3)
476 | s4,s5,s6 = qs[:]
477 | self.assertEqual(s4.id,'key4')
478 | self.assertEqual(s5.id,'key5')
479 | self.assertEqual(s6.id,'key6')
480 |
481 | qs = Slice.objects.filter(name__iendswith='e')
482 | count = qs.count()
483 | self.assertEqual(count, 5)
484 | s3,s4,s6,s7,s8 = qs[:]
485 | self.assertEqual(s3.id,'key3')
486 | self.assertEqual(s4.id,'key4')
487 | self.assertEqual(s6.id,'key6')
488 | self.assertEqual(s7.id,'key7')
489 | self.assertEqual(s8.id,'key8')
490 |
491 | qs = Slice.objects.filter(name__icontains='nc')
492 | count = qs.count()
493 | self.assertEqual(count, 4)
494 | s3,s7,s8,s9 = qs[:]
495 | self.assertEqual(s3.id,'key3')
496 | self.assertEqual(s7.id,'key7')
497 | self.assertEqual(s8.id,'key8')
498 | self.assertEqual(s9.id,'key9')
499 |
500 | qs = Slice.objects.filter(name__regex='[PEZ].*')
501 | count = qs.count()
502 | self.assertEqual(count, 3)
503 | s1,s2,s7 = qs[:]
504 | self.assertEqual(s1.id,'key1')
505 | self.assertEqual(s2.id,'key2')
506 | self.assertEqual(s7.id,'key7')
507 |
508 | qs = Slice.objects.filter(name__iregex='bl.*e')
509 | count = qs.count()
510 | self.assertEqual(count, 2)
511 | s4,s6 = qs[:]
512 | self.assertEqual(s4.id,'key4')
513 | self.assertEqual(s6.id,'key6')
514 |
515 | class Department(models.Model):
516 | name = models.CharField(primary_key=True, max_length=256)
517 |
518 | def __unicode__(self):
519 | return self.name
520 |
521 | class DepartmentRequest(models.Model):
522 | from_department = models.ForeignKey(Department, related_name='froms')
523 | to_department = models.ForeignKey(Department, related_name='tos')
524 |
525 | class RestTestMultipleForeignKeys(TestCase):
526 |
527 | def test_it(self):
528 |
529 | for i in range(0,4):
530 | department = Department()
531 | department.name = "id_" + str(i)
532 | department.save()
533 |
534 | departments = Department.objects.order_by('name')
535 | d0 = departments[0]
536 | d1 = departments[1]
537 | d2 = departments[2]
538 | d3 = departments[3]
539 |
540 | req = DepartmentRequest()
541 | req.from_department = d0
542 | req.to_department = d1
543 | req.save()
544 |
545 | req = DepartmentRequest()
546 | req.from_department = d2
547 | req.to_department = d1
548 | req.save()
549 |
550 | rs = DepartmentRequest.objects.filter(from_department = d3, to_department = d1)
551 | self.assertEqual(rs.count(), 0)
552 |
553 | rs = DepartmentRequest.objects.filter(from_department=d0, to_department=d1)
554 | self.assertEqual(rs.count(), 1)
555 | req = rs[0]
556 | self.assertEqual(req.from_department, d0)
557 | self.assertEqual(req.to_department, d1)
558 |
559 | rs = DepartmentRequest.objects.filter(to_department=d1).order_by('from_department')
560 | self.assertEqual(rs.count(), 2)
561 | req = rs[0]
562 | self.assertEqual(req.from_department, d0)
563 | self.assertEqual(req.to_department, d1)
564 | req = rs[1]
565 | self.assertEqual(req.from_department, d2)
566 | self.assertEqual(req.to_department, d1)
567 |
568 |
569 | class EmptyModel(models.Model):
570 | pass
571 |
572 | class EmptyModelTest(TestCase):
573 |
574 | def test_empty_model(self):
575 | em = EmptyModel()
576 | em.save()
577 | qs = EmptyModel.objects.all()
578 | self.assertEqual(qs.count(), 1)
579 | em2 = qs[0]
580 | self.assertEqual(em.id, em2.id)
581 |
582 | class CompoundKeyTest(TestCase):
583 |
584 | def test_construct_with_no_id(self):
585 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello')
586 | ckm.save()
587 | ckm = CompoundKeyModel.objects.all()[0]
588 | self.assertEqual(ckm.id, 'foo|6')
589 |
590 | def test_construct_with_id(self):
591 | ckm = CompoundKeyModel(id='foo|6', name='foo', index=6, extra='hello')
592 | ckm.save()
593 | ckm = CompoundKeyModel.objects.all()[0]
594 | self.assertEqual(ckm.id, 'foo|6')
595 |
596 | def test_malformed_id(self):
597 | ckm = CompoundKeyModel(id='abc', name='foo', index=6, extra='hello')
598 | self.assertRaises(DatabaseError, ckm.save)
599 |
600 | def test_construct_mismatched_id(self):
601 | ckm = CompoundKeyModel(id='foo|5', name='foo', index=6, extra='hello')
602 | self.assertRaises(DatabaseError, ckm.save)
603 |
604 | def test_update_non_key_field(self):
605 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello')
606 | ckm.save()
607 | ckm = CompoundKeyModel.objects.all()[0]
608 | ckm.extra = 'goodbye'
609 | ckm.save()
610 | ckm = CompoundKeyModel.objects.all()[0]
611 | self.assertEqual(ckm.extra, 'goodbye')
612 |
613 | def test_update_no_id(self):
614 | ckm = CompoundKeyModel(id='foo|6', name='foo', index=6, extra='hello')
615 | ckm.save()
616 | ckm = CompoundKeyModel(name='foo', index=6, extra='goodbye')
617 | ckm.save()
618 | ckm = CompoundKeyModel.objects.all()[0]
619 | self.assertEqual(ckm.extra, 'goodbye')
620 |
621 | def test_update_mismatched_id(self):
622 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello')
623 | ckm.save()
624 | ckm = CompoundKeyModel.objects.all()[0]
625 | ckm.name = 'bar'
626 | self.assertRaises(DatabaseError, ckm.save)
627 |
628 | def test_delete_by_id(self):
629 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello')
630 | ckm.save()
631 | ckm = CompoundKeyModel.objects.get(pk='foo|6')
632 | ckm.delete()
633 | qs = CompoundKeyModel.objects.all()
634 | self.assertEqual(len(qs), 0)
635 |
636 | def test_delete_by_fields(self):
637 | ckm = CompoundKeyModel(name='foo', index=6, extra='hello')
638 | ckm.save()
639 | qs = CompoundKeyModel.objects.filter(name='foo', index=6)
640 | qs.delete()
641 | qs = CompoundKeyModel.objects.all()
642 | self.assertEqual(len(qs), 0)
643 |
644 |
645 | def test_custom_separator(self):
646 | s = Slice(id='default')
647 | s.save()
648 | ckm = CompoundKeyModel2(slice=s, name='foo', index=6, extra='hello')
649 | ckm.save()
650 | ckm = CompoundKeyModel2.objects.all()[0]
651 | self.assertEqual(ckm.id, 'default#foo#6')
652 |
--------------------------------------------------------------------------------
/tests/views.py:
--------------------------------------------------------------------------------
1 | # Create your views here.
2 |
--------------------------------------------------------------------------------
/urls.py:
--------------------------------------------------------------------------------
1 | from django.conf.urls.defaults import *
2 |
3 | # Uncomment the next two lines to enable the admin:
4 | from django.contrib import admin
5 | admin.autodiscover()
6 |
7 | urlpatterns = patterns('',
8 | # Example:
9 | # (r'^django_cassandra_backend/', include('test_db_backend.foo.urls')),
10 |
11 | # Uncomment the admin/doc line below and add 'django.contrib.admindocs'
12 | # to INSTALLED_APPS to enable admin documentation:
13 | # (r'^admin/doc/', include('django.contrib.admindocs.urls')),
14 |
15 | # Uncomment the next line to enable the admin:
16 | (r'^admin/', include(admin.site.urls)),
17 | )
18 |
--------------------------------------------------------------------------------